LGTM http://codereview.appspot.com/2006042/diff/2001/3001 File java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java (right): http://codereview.appspot.com/2006042/diff/2001/3001#newcode49 java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:49: parser.parseDom(""); Empty document should ideally not throw an ...
15 years, 5 months ago
(2010-08-25 07:49:24 UTC)
#2
LGTM
http://codereview.appspot.com/2006042/diff/2001/3001
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/2001/3001#newcode49
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:49:
parser.parseDom("");
Empty document should ideally not throw an exception, is there a way to handle
it?
Empty string could be returned as an empty page as
<html><head></head><body></body></html>
15 years, 5 months ago
(2010-08-25 07:59:25 UTC)
#3
On 2010/08/25 07:49:24, Kuntal Loya wrote:
> LGTM
>
> http://codereview.appspot.com/2006042/diff/2001/3001
> File
>
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
> (right):
>
> http://codereview.appspot.com/2006042/diff/2001/3001#newcode49
>
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:49:
> parser.parseDom("");
> Empty document should ideally not throw an exception, is there a way to handle
> it?
> Empty string could be returned as an empty page as
> <html><head></head><body></body></html>
We could take care of it by checking if the trimmed content is empty and return
the empty well formed html document in that case.
But that would defeat the purpose of VanillaCajaHtmlParser.
The idea here is that if there is something bad in caja parser, we should fix it
in caja codebase. This class should be a thin wrapper integrating caja parser
functionality.
Besides, we have tested VanillaCajaHtmlParser enough and we never ran into an
issue where we would like an empty document. If at all, we would like it to
throw an exception on non html content so we can detect that non html data is
being served out as text/html.
Ideally the Caja parser should take care of it. You can mark this test case ...
15 years, 5 months ago
(2010-08-25 08:24:45 UTC)
#4
Ideally the Caja parser should take care of it. You can mark this test case as a
bad test, which should eventually be fixed.
Btw, are we talking of non-html content, or no content?
On 2010/08/25 08:24:45, Kuntal Loya wrote: > Ideally the Caja parser should take care of ...
15 years, 5 months ago
(2010-08-25 09:25:58 UTC)
#5
On 2010/08/25 08:24:45, Kuntal Loya wrote:
> Ideally the Caja parser should take care of it. You can mark this test case as
a
> bad test, which should eventually be fixed.
> Btw, are we talking of non-html content, or no content?
On 2010/08/25 08:24:45, Kuntal Loya wrote: > Ideally the Caja parser should take care of ...
15 years, 5 months ago
(2010-08-25 09:26:30 UTC)
#6
On 2010/08/25 08:24:45, Kuntal Loya wrote:
> Ideally the Caja parser should take care of it. You can mark this test case as
a
> bad test, which should eventually be fixed.
> Btw, are we talking of non-html content, or no content?
Marked the test where non html content is getting parsed as a document as a bad
test case. Please take another look.
15 years, 5 months ago
(2010-08-26 06:12:13 UTC)
#9
http://codereview.appspot.com/2006042/diff/8001/9004
File
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java
(right):
http://codereview.appspot.com/2006042/diff/8001/9004#newcode41
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:41:
@Named("vanillaCaJaParser.needsDebugData")
On 2010/08/26 04:52:02, anupama.dutta wrote:
> Change CaJa to Caja?
> Also, don't we need to add this to shindig.properties or some such thing?
Nice catch.
Done.
http://codereview.appspot.com/2006042/diff/15001/16005 File java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java (right): http://codereview.appspot.com/2006042/diff/15001/16005#newcode49 java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:49: return parseDomImpl(source); There is some caching related code in ...
15 years, 5 months ago
(2010-08-26 09:34:55 UTC)
#10
http://codereview.appspot.com/2006042/diff/15001/16005
File
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java
(right):
http://codereview.appspot.com/2006042/diff/15001/16005#newcode49
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:49:
return parseDomImpl(source);
There is some caching related code in GadgetHtmlParser's parseDom method - don't
we need that in addition to this call? I agree that the extra glue code for
ensuring that head tags exist etc. may not be needed here.
http://codereview.appspot.com/2006042/diff/15001/16005#newcode57
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:57:
DomParser parser = new DomParser(tokenQueue, false, mq);
Can you add a comment beside the false parameter to clarify what it is?
http://codereview.appspot.com/2006042/diff/15001/16005#newcode72
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:72:
doc.setUserData(HtmlSerialization.KEY, serializer, null);
How about using HtmlSerializer.attach(doc, serializer, null) - setUserData
confused me in terms of terminology.
http://codereview.appspot.com/2006042/diff/15001/16005#newcode75
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:75:
throw new GadgetException(GadgetException.Code.INTERNAL_SERVER_ERROR, e);
Do you think throwing the e.getCajaMessage().toString() etc. stuff as done in
CajaHtmlParser would be a good strategy?
http://codereview.appspot.com/2006042/diff/15001/16005 File java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java (right): http://codereview.appspot.com/2006042/diff/15001/16005#newcode49 java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:49: return parseDomImpl(source); On 2010/08/26 09:34:55, anupama.dutta wrote: > There ...
15 years, 4 months ago
(2010-08-27 02:11:50 UTC)
#11
http://codereview.appspot.com/2006042/diff/15001/16005
File
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java
(right):
http://codereview.appspot.com/2006042/diff/15001/16005#newcode49
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:49:
return parseDomImpl(source);
On 2010/08/26 09:34:55, anupama.dutta wrote:
> There is some caching related code in GadgetHtmlParser's parseDom method -
don't
> we need that in addition to this call? I agree that the extra glue code for
> ensuring that head tags exist etc. may not be needed here.
I suggest we add caching code after making sure it helps and is not a
detrimental to performance. Since we have better ideas like caching rewritten
responses that we would like to try out in the future, would it make sense to
add a TODO here to add a Dom cache after more evaluation.
Also, we might want to extend VanillaCajaParser into a
CachedEnabledVanillaCajaParser if we later find out that document caching helps
in reducing latency. SO we need not decide on the strategy right away.
http://codereview.appspot.com/2006042/diff/15001/16005#newcode57
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:57:
DomParser parser = new DomParser(tokenQueue, false, mq);
On 2010/08/26 09:34:55, anupama.dutta wrote:
> Can you add a comment beside the false parameter to clarify what it is?
Done.
http://codereview.appspot.com/2006042/diff/15001/16005#newcode72
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:72:
doc.setUserData(HtmlSerialization.KEY, serializer, null);
On 2010/08/26 09:34:55, anupama.dutta wrote:
> How about using HtmlSerializer.attach(doc, serializer, null) - setUserData
> confused me in terms of terminology.
Done.
http://codereview.appspot.com/2006042/diff/15001/16005#newcode75
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:75:
throw new GadgetException(GadgetException.Code.INTERNAL_SERVER_ERROR, e);
On 2010/08/26 09:34:55, anupama.dutta wrote:
> Do you think throwing the e.getCajaMessage().toString() etc. stuff as done in
> CajaHtmlParser would be a good strategy?
Done.
LGTM. Please send to jasvir@, dev@ for review + commit. http://codereview.appspot.com/2006042/diff/20001/21002 File java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java (right): http://codereview.appspot.com/2006042/diff/20001/21002#newcode56 ...
15 years, 4 months ago
(2010-08-27 06:27:46 UTC)
#12
LGTM.
Please send to jasvir@, dev@ for review + commit.
http://codereview.appspot.com/2006042/diff/20001/21002
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/20001/21002#newcode56
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56:
// Bad behavior by Caja DomParser. Bug to be raised with Caja team.
Explain what should have happened here and make this a TODO?
http://codereview.appspot.com/2006042/diff/20001/21002#newcode93
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:93:
// Now test a bad case of normalization. Bug to be raised with Caja team.
Explain what should have happened here and make this a TODO?
http://codereview.appspot.com/2006042/diff/20001/21002 File java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java (right): http://codereview.appspot.com/2006042/diff/20001/21002#newcode56 java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56: // Bad behavior by Caja DomParser. Bug to be ...
15 years, 4 months ago
(2010-08-27 12:26:14 UTC)
#13
http://codereview.appspot.com/2006042/diff/20001/21002
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/20001/21002#newcode56
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56:
// Bad behavior by Caja DomParser. Bug to be raised with Caja team.
On 2010/08/27 06:27:47, anupama.dutta wrote:
> Explain what should have happened here and make this a TODO?
Done.
http://codereview.appspot.com/2006042/diff/20001/21002#newcode93
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:93:
// Now test a bad case of normalization. Bug to be raised with Caja team.
On 2010/08/27 06:27:47, anupama.dutta wrote:
> Explain what should have happened here and make this a TODO?
Done.
LGTM. http://codereview.appspot.com/2006042/diff/20002/21007 File java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java (right): http://codereview.appspot.com/2006042/diff/20002/21007#newcode56 java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56: // Bad behavior by Caja DomParser. Bug to ...
15 years, 4 months ago
(2010-08-27 13:36:47 UTC)
#14
LGTM.
http://codereview.appspot.com/2006042/diff/20002/21007
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/20002/21007#newcode56
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56:
// Bad behavior by Caja DomParser. Bug to be raised with Caja team.
Nitpick: Did you want to put a TODO: here and in the other bad test case as
well, so that these can be spotted easily?
Thanks for the thorough review Anupama. http://codereview.appspot.com/2006042/diff/20002/21007 File java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java (right): http://codereview.appspot.com/2006042/diff/20002/21007#newcode56 java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56: // Bad behavior ...
15 years, 4 months ago
(2010-08-27 14:08:20 UTC)
#15
Thanks for the thorough review Anupama.
http://codereview.appspot.com/2006042/diff/20002/21007
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/20002/21007#newcode56
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:56:
// Bad behavior by Caja DomParser. Bug to be raised with Caja team.
On 2010/08/27 13:36:47, anupama.dutta wrote:
> Nitpick: Did you want to put a TODO: here and in the other bad test case as
> well, so that these can be spotted easily?
Done.
15 years, 4 months ago
(2010-09-02 22:23:43 UTC)
#16
http://codereview.appspot.com/2006042/diff/37001/38005
File
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java
(right):
http://codereview.appspot.com/2006042/diff/37001/38005#newcode70
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:70:
DomParser parser = getDomParser(source + '\n', mq);
I don't think you need the extraneous EOL whitespace at the any more.
http://codereview.appspot.com/2006042/diff/37001/38002
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/37001/38002#newcode102
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:102:
public void testBadTagBalancing() throws Exception {
This is not correct both according to spec and in practice on common browsers.
The html5 spec suggests that the open script tag will consume till the next
close script tag or the end of file. 4/4 modern browsers also consume tokens
till the next close script tag and do not execute the script at all if a close
script is not found.
15 years, 4 months ago
(2010-09-05 04:50:49 UTC)
#18
http://codereview.appspot.com/2006042/diff/37001/38005
File
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java
(right):
http://codereview.appspot.com/2006042/diff/37001/38005#newcode70
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParser.java:70:
DomParser parser = getDomParser(source + '\n', mq);
On 2010/09/02 22:23:43, jasvir wrote:
> I don't think you need the extraneous EOL whitespace at the any more.
Done.
http://codereview.appspot.com/2006042/diff/37001/38002
File
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java
(right):
http://codereview.appspot.com/2006042/diff/37001/38002#newcode102
java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/VanillaCajaHtmlParserTest.java:102:
public void testBadTagBalancing() throws Exception {
On 2010/09/02 22:23:43, jasvir wrote:
> This is not correct both according to spec and in practice on common browsers.
> The html5 spec suggests that the open script tag will consume till the next
> close script tag or the end of file. 4/4 modern browsers also consume tokens
> till the next close script tag and do not execute the script at all if a close
> script is not found.
Taking your word for parsing unclosed script tags. I tried this example on
chrome and firefox, and neither of them executed the script element.
Issue 2006042: Vanilla Caja html parser
(Closed)
Created 15 years, 5 months ago by gagan.goku
Modified 15 years, 4 months ago
Reviewers: dev_shindig.apache.org, Jasvir, anupama.dutta, zhoresh, mhermanto
Base URL: http://svn.apache.org/repos/asf/shindig/trunk/
Comments: 21