Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(61)

Issue 20470044: fix backslash-newline handling the JS parser (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
12 years, 4 months ago by felix8a
Modified:
12 years, 4 months ago
Reviewers:
kpreid2
CC:
google-caja-discuss_googlegroups.com
Base URL:
http://google-caja.googlecode.com/svn/trunk/
Visibility:
Public.

Description

The JS lexer elides backslash-newline sequences at an early stage, before tokenizing. This seems to be fantasy. It's not in Ecmascript, and I can't find any JS implementation that treats backslash-newline as a continuation. That behavior causes unexpected effects when a // comment ends with a backslash, like in escodegen.js as described in issue 1868. So, this CL eliminates the weirdness. InputElementSplitter is where the backslash-newline elision happens. Deleting that is straightforward. Ecmascript does say that backslash-newline in strings gets elided, so the rest of this CL is about supporting that. Our JS lexical tokens hold the original source code's char sequence, so at the lexer level nothing needs to be done with the backslash-newline, except for fixing up the lexer tests to match reality. At the JS parser level, StringLiteral nodes have the logic to convert the source code text to an actual value. Everyone defers to that, and adding handling of backslash-newline there is straightforward. Ecmascript is strict about "use strict" directives. Escape sequences and backslash-newline are not allowed in the directive. Our existing logic handles that fine. I just added some testcases to verify that. Stray backslashes in JS will become a WORD token or part of a WORD token, which is consistent with how we handle \u escapes. These will be rejected at the parser level. I don't see any particular reason to reject backslashes in the lexer level, so I left that alone.

Patch Set 1 #

Unified diffs Side-by-side diffs Delta from patch set Stats (+158 lines, -172 lines) Patch
M src/com/google/caja/lexer/InputElementSplitter.java View 7 chunks +6 lines, -66 lines 0 comments Download
M src/com/google/caja/parser/js/StringLiteral.java View 2 chunks +2 lines, -1 line 0 comments Download
M src/com/google/caja/render/TokenClassification.java View 1 chunk +0 lines, -2 lines 0 comments Download
M tests/com/google/caja/lexer/lexergolden1.txt View 2 chunks +92 lines, -92 lines 0 comments Download
M tests/com/google/caja/lexer/lexertest1.js View 2 chunks +6 lines, -9 lines 0 comments Download
M tests/com/google/caja/parser/js/ParserTest.java View 2 chunks +11 lines, -2 lines 0 comments Download
M tests/com/google/caja/parser/js/StringLiteralTest.java View 1 chunk +1 line, -0 lines 0 comments Download
M tests/com/google/caja/parser/js/parsergolden10.txt View 1 chunk +15 lines, -0 lines 0 comments Download
M tests/com/google/caja/parser/js/parsertest10.js View 1 chunk +9 lines, -0 lines 0 comments Download
M tests/com/google/caja/render/JsMinimalPrinterTest.java View 1 chunk +16 lines, -0 lines 0 comments Download

Messages

Total messages: 3
felix8a
12 years, 4 months ago (2013-10-31 21:54:53 UTC) #1
kpreid2
LGTM
12 years, 4 months ago (2013-10-31 22:21:25 UTC) #2
felix8a
12 years, 4 months ago (2013-10-31 22:24:44 UTC) #3
@r5622
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b