Issue 20470044: fix backslash-newline handling in the JS parser

Issue 20470044: fix backslash-newline handling in the JS parser (Closed)

Created:
12 years, 4 months ago by felix8a

Modified:
12 years, 4 months ago

Reviewers:
kpreid2

CC:
google-caja-discuss_googlegroups.com

Base URL:
http://google-caja.googlecode.com/svn/trunk/

Visibility:
Public.

Description

The JS lexer elides backslash-newline sequences at an early stage, before tokenizing. This seems to be fantasy. It's not in ECMAScript, and I can't find any JS implementation that treats backslash-newline as a continuation. That behavior causes unexpected effects when a // comment ends with a backslash, as in escodegen.js (described in issue 1868). So this CL eliminates the weirdness.

InputElementSplitter is where the backslash-newline elision happens. Deleting that is straightforward.

ECMAScript does say that backslash-newline in strings gets elided, so the rest of this CL is about supporting that.

Our JS lexical tokens hold the original source code's char sequence, so at the lexer level nothing needs to be done with the backslash-newline, except for fixing up the lexer tests to match reality.

At the JS parser level, StringLiteral nodes have the logic to convert the source code text to an actual value. Everyone defers to that, and adding handling of backslash-newline there is straightforward.

ECMAScript is strict about "use strict" directives. Escape sequences and backslash-newline are not allowed in the directive. Our existing logic handles that fine. I just added some testcases to verify that.

Stray backslashes in JS will become a WORD token or part of a WORD token, which is consistent with how we handle \u escapes. These will be rejected at the parser level. I don't see any particular reason to reject backslashes at the lexer level, so I left that alone.
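For reference, the three behaviors described above can be checked against a real JS engine. This is a minimal sketch and not part of the CL: the variable names are made up for illustration, and it uses `new Function` only as a convenient way to evaluate source text that contains an embedded newline.

```javascript
// Illustration only; none of these names come from the Caja source.

// 1. A // comment ending in a backslash does NOT continue onto the next line,
//    so the assignment on the following line still runs.
const commentSrc = 'var a = 1; // comment ending in a backslash \\\na = 2; return a;';
const afterComment = new Function(commentSrc)();

// 2. Inside a string literal, backslash-newline is a line continuation and
//    contributes nothing to the string's value.
const joined = new Function('return "abc\\\ndef";')();

// 3. A string whose value is "use strict" but whose source text contains a
//    line continuation is not a strict-mode directive.
const strictPlain = new Function('"use strict"; return this === undefined;')();
const strictWithContinuation =
    new Function('"use\\\n strict"; return this === undefined;')();

console.log(afterComment);            // 2
console.log(joined);                  // abcdef
console.log(strictPlain);             // true
console.log(strictWithContinuation);  // false
```

The first case is the one that the old early elision got wrong: with the backslash-newline removed before tokenizing, `a = 2` would have been swallowed into the comment.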

Patch Set 1

Created: 12 years, 4 months ago

Stats: +158 lines, -172 lines

Status	File	Chunks	Lines added	Lines removed	Comments
M	src/com/google/caja/lexer/InputElementSplitter.java	7	6	66	0
M	src/com/google/caja/parser/js/StringLiteral.java	2	2	1	0
M	src/com/google/caja/render/TokenClassification.java	1	0	2	0
M	tests/com/google/caja/lexer/lexergolden1.txt	2	92	92	0
M	tests/com/google/caja/lexer/lexertest1.js	2	6	9	0
M	tests/com/google/caja/parser/js/ParserTest.java	2	11	2	0
M	tests/com/google/caja/parser/js/StringLiteralTest.java	1	1	0	0
M	tests/com/google/caja/parser/js/parsergolden10.txt	1	15	0	0
M	tests/com/google/caja/parser/js/parsertest10.js	1	9	0	0
M	tests/com/google/caja/render/JsMinimalPrinterTest.java	1	16	0	0

Messages

Total messages: 3

@r5622
