Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(510)

Issue 5505090: lexer.ll: Warn about non-UTF-8 characters (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
12 years, 3 months ago by dak
Modified:
12 years, 3 months ago
Reviewers:
Keith, lemzwerg, carl.d.sorensen
CC:
lilypond-devel_gnu.org
Visibility:
Public.

Description

lexer.ll: Warn about non-UTF-8 characters Making the warnings point to the exact bad byte rather than the enclosing construct would be nice.

Patch Set 1 #

Total comments: 8

Patch Set 2 : "Better error message locator" #

Patch Set 3 : If critics followed the same standards they ask of others they would have no time left to complain. #

Unified diffs Side-by-side diffs Delta from patch set Stats (+151 lines, -23 lines) Patch
M lily/include/lily-lexer.hh View 1 chunk +1 line, -0 lines 0 comments Download
M lily/lexer.ll View 1 2 18 chunks +150 lines, -23 lines 0 comments Download

Messages

Total messages: 12
lemzwerg
LGTM. http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll File lily/lexer.ll (right): http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode1009 lily/lexer.ll:1009: case 0xc2: Wouldn't it be more effective to ...
12 years, 3 months ago (2011-12-31 18:28:00 UTC) #1
Keith
Works nicely. Showing the input location will probably be very helpful. We probably want to ...
12 years, 3 months ago (2012-01-01 02:01:11 UTC) #2
Keith
On 2012/01/01 02:01:11, Keith wrote: > > Showing the input location will probably be very ...
12 years, 3 months ago (2012-01-01 02:03:50 UTC) #3
Carl
LGTM Carl
12 years, 3 months ago (2012-01-01 02:08:54 UTC) #4
dak
On 2012/01/01 02:01:11, Keith wrote: > Works nicely. > > Showing the input location will ...
12 years, 3 months ago (2012-01-01 09:40:36 UTC) #5
dak
http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll File lily/lexer.ll (right): http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode134 lily/lexer.ll:134: A [a-zA-Z\200-\377] On 2012/01/01 02:01:11, Keith wrote: > non-ASCII ...
12 years, 3 months ago (2012-01-01 09:56:04 UTC) #6
Keith
On Sun, 01 Jan 2012 01:40:36 -0800, <dak@gnu.org> wrote: > Our lexer has been written ...
12 years, 3 months ago (2012-01-01 10:05:42 UTC) #7
dak
On 2012/01/01 10:05:42, Keith wrote: > On Sun, 01 Jan 2012 01:40:36 -0800, <mailto:dak@gnu.org> wrote: ...
12 years, 3 months ago (2012-01-01 10:12:27 UTC) #8
Keith
On Sun, 01 Jan 2012 01:56:04 -0800, <dak@gnu.org> wrote: > http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode134 > lily/lexer.ll:134: A [a-zA-Z\200-\377] ...
12 years, 3 months ago (2012-01-01 10:16:47 UTC) #9
dak
On 2012/01/01 10:16:47, Keith wrote: > On Sun, 01 Jan 2012 01:56:04 -0800, <mailto:dak@gnu.org> wrote: ...
12 years, 3 months ago (2012-01-01 11:07:53 UTC) #10
Keith
On 2012/01/01 10:12:27, dak wrote: > > Consider a comment in your case/switch statement that ...
12 years, 3 months ago (2012-01-01 22:06:52 UTC) #11
dak
12 years, 3 months ago (2012-01-02 09:11:24 UTC) #12
On 2012/01/01 22:06:52, Keith wrote:
> On 2012/01/01 10:12:27, dak wrote:
> 
> Sorry, I wasn't making much sense.  As a reader I want to *recognize* what the
> but switch/case is doing rather than trying to figure it out.  Maybe :
>  // Test if these bytes are a UTF-8 encoding of a Unicode character,
>  // and warn if not.   Trap overly-long UTF-8 encodings, but we don't
>  // need to worry about finer details like some filters do.

What are the finer details?  UTF-16 surrogates?  I was just too lazy to figure
out the patterns for them.

> because your test is almost but not quite equivalent to the regex in the back
of
> the Flex manual.

I can't find a regex there.  Got a link or section name?  Is this a released
version?
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b