Rietveld Code Review Tool

Issue 6036054: code review 6036054: gc, go/parser: accept .go files with UTF-8 BOM, ignore ...

Created:
12 years, 6 months ago by liigo
Modified:
12 years, 6 months ago
Reviewers:
r, 0xjnml, kardia, golang-dev
Visibility:
Public.

Description

gc, go/parser: accept .go files with a UTF-8 BOM and ignore those bytes before compiling and parsing. Some text editors, especially on Windows, routinely add a UTF-8 BOM at the start of a file. Currently gc and go/parser treat the BOM as invalid characters; it should be ignored instead so that such files compile and parse properly. After this CL, cmd/go, godoc, gofmt, and the other Go tools also work with files that begin with a UTF-8 BOM.
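
For readers unfamiliar with the issue, here is a minimal sketch of the idea in the description. It is not the code from this CL: the helper name stripBOM is hypothetical, and it only assumes that the UTF-8 BOM is the three bytes EF BB BF (the UTF-8 encoding of U+FEFF) and that only a BOM at the very start of the file should be skipped.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the UTF-8 encoding of U+FEFF, the byte order mark.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// stripBOM is a hypothetical helper (not taken from the CL). It returns src
// without a leading UTF-8 BOM if one is present. A BOM anywhere other than
// the very start of the input is left untouched.
func stripBOM(src []byte) []byte {
	return bytes.TrimPrefix(src, utf8BOM)
}

func main() {
	// A tiny Go source file as an editor that prepends a BOM might save it.
	src := append([]byte{0xEF, 0xBB, 0xBF}, []byte("package main\n")...)
	clean := stripBOM(src)
	fmt.Printf("%d bytes before, %d bytes after\n", len(src), len(clean))
}
```

A tool could apply such a helper to the file contents before handing them to the scanner or parser, so the rest of the pipeline never sees the marker.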

Patch Set 1 #

Patch Set 2 : diff -r 6c1797405851 https://code.google.com/p/go #

Stats: +31 lines, -4 lines
M src/cmd/gc/lex.c: 2 chunks, +13 lines, -0 lines, 0 comments
M src/pkg/go/parser/interface.go: 2 chunks, +18 lines, -4 lines, 0 comments

Messages

Total messages: 5
liigo
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://code.google.com/p/go
12 years, 6 months ago (2012-04-15 17:37:47 UTC) #1
0xjnml
On Sunday, April 15, 2012 7:37:47 PM UTC+2, Liigo Zhuang wrote: > > Description: > ...
12 years, 6 months ago (2012-04-15 18:24:20 UTC) #2
r
Strictly speaking, a BOM is legal in UTF-8 but only as a marker for the ...
12 years, 6 months ago (2012-04-15 21:34:04 UTC) #3
liigo
BOM is commonly used to help text editors identify the encoding of a file. Some editors ...
12 years, 6 months ago (2012-04-16 07:17:53 UTC) #4
kardia
12 years, 6 months ago (2012-04-16 15:00:00 UTC) #5
Microsoft Notepad adds a BOM to files, as does Microsoft's Visual Studio 
editor. BUT, those are the only products I know of that do that. Notepad++ 
does not, nor does Sublime Text 2, ... I have no opinion on this topic, 
but in my experience, automatic BOM insertion happens only with Microsoft's 
products. 

On Monday, April 16, 2012 12:17:52 AM UTC-7, Liigo Zhuang wrote:
>
> BOM is commonly used to help text editors identify the encoding of a file. 
> Some editors add a BOM implicitly when saving UTF-8 encoded files. If there is 
> no BOM, a text editor may try to guess the encoding when opening a file, and 
> errors may occur. See Wikipedia for how BOMs appear and how widely they are 
> used:  http://en.wikipedia.org/wiki/Byte_order_mark
>
> So a .go source file with a UTF-8 BOM is a valid UTF-8 encoded file. But 
> Go's parser and compiler do not accept these files, which is very 
> strange. If someone insists on not accepting a UTF-8 BOM in .go files, I would 
> like to suggest modifying Go's language specification to state that 
> explicitly.
>
>
> 2012/4/16 Rob 'Commander' Pike <r@golang.org>
>
>> Strictly speaking, a BOM is legal in UTF-8 but only as a marker for
>> the type of the data stream, a magic number if you will. Since Go
>> source code is required to be UTF-8, a BOM is never necessary and
>> arguably erroneous. We've come this far without accepting BOMs and I'd
>> like to keep it that way.
>>
>> -rob
>>
>
>
>
> -- 
> by *Liigo*, http://blog.csdn.net/liigo/
> Google+  https://plus.google.com/105597640837742873343/
>
>  