|
|
|
Created:
15 years, 10 months ago by gagan.goku Modified:
15 years, 5 months ago CC:
anupama.dutta, pradnya, vikaas.arora, cool-shindig-committers_googlegroups.com Base URL:
http://svn.apache.org/repos/asf/shindig/trunk/ Visibility:
Public. |
DescriptionCurrently we have code that converts & to & and then & in url back to &.
There is a bug right now in the way that dynamic nodes are added (say from StyleConcatVisitor) for which the reconversion from & to & does not happen.
Patch Set 1 #Patch Set 2 : removing_unnecessary_imports #
Total comments: 2
Patch Set 3 : uploading_all_changes_before_svn_up #Patch Set 4 : adding_option_for_not_escaping_ampersand #Patch Set 5 : removing_unnecessary_files #
MessagesTotal messages: 6
Good catch in the MutableContent! http://codereview.appspot.com/1632042/diff/2001/3003 File java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java (left): http://codereview.appspot.com/1632042/diff/2001/3003#oldcode169 java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java:169: elem.getNamespaceURI() == null && Can you create a test case that show it is not needed? (or maybe I didn't see it) http://codereview.appspot.com/1632042/diff/2001/3005 File java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java (right): http://codereview.appspot.com/1632042/diff/2001/3005#newcode135 java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java:135: concatUri.getUri().toString()); I guess there is a reason why it was added, maybe the right solution is to fix the parser?
Sign in to reply to this message.
On 2010/06/09 23:04:12, zhoresh wrote: > Good catch in the MutableContent! > Thanks > http://codereview.appspot.com/1632042/diff/2001/3003 > File > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java > (left): > > http://codereview.appspot.com/1632042/diff/2001/3003#oldcode169 > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java:169: > elem.getNamespaceURI() == null && > Can you create a test case that show it is not needed? (or maybe I didn't see > it) > > http://codereview.appspot.com/1632042/diff/2001/3005 > File > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java > (right): > > http://codereview.appspot.com/1632042/diff/2001/3005#newcode135 > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java:135: > concatUri.getUri().toString()); > I guess there is a reason why it was added, maybe the right solution is to fix > the parser? I thought i understood what was going on, but now i don't. What I think is going on is: Elements created using Document.createElement() method have their getNamespaceURI as null. Elements that are parsed from html or copied / cloned around here and there have their getNamespaceURI as non-null. In both ConcatVistor and RenderingGadgetRewriter, we create elements using createElement and these will have getNamespaceURI() as null. We manually go and replace all occurances of '&' with '&' in the url. In DefaultHtmlSerializer, we check for url's and the logic states what for everything other than a url, replace '&' with '&'. But the logic for isUrl has the check element.getNamespaceURI() which will count only the elements created using createElement. So, all other urls that were parsed from html, get their '&' replaced by '&' Thus, the actual behavior seems to be to replace '&' by '&' but the intent of the code logic states otherwise. My main question is: why do we need this escaping & to & business. We should really not be doing such hardcoding stuff either. To make things even worse, in StyleTagExtractorVisitor, we create a <link> node but we dont seem to be replacing '&' with '&' here. So we don't seem to be consistent either. Here is a list of all places where we use '&': http://www.google.com/codesearch?hl=en&sa=N&filter=0&num=100&q=file:gadgets+f... To make things even worser, browsers seem to be treating & in urls differently. Using the changes in current patch, i loaded view-source:http://localhost:9003/gadgets/accel?container=default&url=http://www.cnn.com/ It has a concatenated script tag: <script src="http://localhost:9003/gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com%2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js" type="text/javascript"></script> Then i ran nc -l -p 9003 If i click on the img src link from within firefox, the request i get is this: GET /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com%2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js HTTP/1.1 that is, with & in url replaced by & Now if i directly copy and paste the exact link in the navigation bar, here is what i get: GET /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com%2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js HTTP/1.1 that is, with & as is. This url throws a 404 with the following error: MISSING_PARAMETER ..<url>.. Missing type. I have spent quite some time trying to think this through and come up with an acceptable solution. In the process, i ended up adding an option called gadgets.uri.escapeAmpersandToAndAmp in the shindig.properties file, hoping that i could encapsulate all the '&' logic with this flag and this would be a non breaking change. Alas, im not fluent enough in Guice to accomplish that [I could not inject Injector into some classes and ended up modifying lot of code :( ] My suggestion is to get rid of escaping '&' business for urls everywhere. Why do we need it anyways ? To make things worsest, codereview did not let me submit this reply and kept throwing invalid XSRF token or some crap. So end result = i'm confused and kind of giving up.
Sign in to reply to this message.
If I understand correctly, the html definition call for escaping & with &, if only for the small chance of something actually want a value like '<' as is. So I don't think we want the new logic in your change to disable it. What we need is to fix the places it is not converted back correctly. On Sat, Jun 12, 2010 at 10:20 PM, <gagan.goku@gmail.com> wrote: > Reviewers: johnfargo, shindig.remailer_gmail.com, zhoresh, > > Message: > > On 2010/06/09 23:04:12, zhoresh wrote: > >> Good catch in the MutableContent! >> > > Thanks > > > http://codereview.appspot.com/1632042/diff/2001/3003 >> File >> > > > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java > >> (left): >> > > http://codereview.appspot.com/1632042/diff/2001/3003#oldcode169 >> > > > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java:169: > >> elem.getNamespaceURI() == null && >> Can you create a test case that show it is not needed? (or maybe I >> > didn't see > >> it) >> > > http://codereview.appspot.com/1632042/diff/2001/3005 >> File >> > > > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java > >> (right): >> > > http://codereview.appspot.com/1632042/diff/2001/3005#newcode135 >> > > > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java:135: > >> concatUri.getUri().toString()); >> I guess there is a reason why it was added, maybe the right solution >> > is to fix > >> the parser? >> > > I thought i understood what was going on, but now i don't. > > What I think is going on is: > Elements created using Document.createElement() method have their > getNamespaceURI as null. > Elements that are parsed from html or copied / cloned around here and > there have their getNamespaceURI as non-null. > > In both ConcatVistor and RenderingGadgetRewriter, we create elements > using createElement and these will have getNamespaceURI() as null. We > manually go and replace all occurances of '&' with '&' in the url. > > In DefaultHtmlSerializer, we check for url's and the logic states what > for everything other than a url, replace '&' with '&'. But the logic > for isUrl has the check element.getNamespaceURI() which will count only > the elements created using createElement. > So, all other urls that were parsed from html, get their '&' replaced by > '&' > > Thus, the actual behavior seems to be to replace '&' by '&' but the > intent of the code logic states otherwise. > > My main question is: why do we need this escaping & to & business. We > should really not be doing such hardcoding stuff either. > > To make things even worse, in StyleTagExtractorVisitor, we create a > <link> node but we dont seem to be replacing '&' with '&' here. So > we don't seem to be consistent either. > > Here is a list of all places where we use '&': > > http://www.google.com/codesearch?hl=en&sa=N&filter=0&num=100&q=file:gadgets+f... > :\.java+\%26amp;+-file:Test.java > > > To make things even worser, browsers seem to be treating & in urls > differently. > Using the changes in current patch, i loaded > view-source: > http://localhost:9003/gadgets/accel?container=default&url=http://www.cnn.com/ > > It has a concatenated script tag: > <script > src=" > http://localhost:9003/gadgets/concat?container=default&gadget=http%3A%2F%... > " > type="text/javascript"></script> > > Then i ran nc -l -p 9003 > > If i click on the img src link from within firefox, the request i get is > this: > GET > /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com > %2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F% > 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F% > 2Fi.cdn.turner.com > %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F% > 2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com > %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F% > 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js > HTTP/1.1 > > that is, with & in url replaced by & > > > Now if i directly copy and paste the exact link in the navigation bar, > here is what i get: > > GET > /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com > %2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F% > 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F% > 2Fi.cdn.turner.com > %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F% > 2Fwww.cnn.com > %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F% > 2Fi.cdn.turner.com > %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F% > 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js > HTTP/1.1 > > that is, with & as is. > This url throws a 404 with the following error: > MISSING_PARAMETER ..<url>.. Missing type. > > > I have spent quite some time trying to think this through and come up > with an acceptable solution. In the process, i ended up adding an option > called gadgets.uri.escapeAmpersandToAndAmp in the shindig.properties > file, hoping that i could encapsulate all the '&' logic with this > flag and this would be a non breaking change. Alas, im not fluent enough > in Guice to accomplish that [I could not inject Injector into some > classes and ended up modifying lot of code :( ] > > > My suggestion is to get rid of escaping '&' business for urls > everywhere. Why do we need it anyways ? > To make things worsest, codereview did not let me submit this reply and > kept throwing invalid XSRF token or some crap. > So end result = i'm confused and kind of giving up. > > Description: > Currently we have code that converts & to & and then & in url > back to &. > There is a bug right now in the way that dynamic nodes are added (say > from StyleConcatVisitor) for which the reconversion from & to & does > not happen. > > > Please review this at http://codereview.appspot.com/1632042/show > > Affected files: > M .gitignore > M config/container.js > M java/common/conf/shindig.properties > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializer.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriter.java > A > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/AbsolutePathReferenceRewriter.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ProxyingContentRewriter.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/RewriteModule.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ScriptConcatContentRewriter.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/StyleConcatContentRewriter.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/HtmlAccelServlet.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/spec/MessageBundle.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManager.java > M > java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManager.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/http/AbstractHttpFetcherTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java > A > java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializerTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriterTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/render/SanitizingGadgetRewriterTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/render/old/SanitizingGadgetRewriterTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/ConcatVisitorTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/DomWalkerTestBase.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/RewriterTestBase.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/old/BaseRewriterTestCase.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManagerTest.java > M > java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManagerTest.java > > >
Sign in to reply to this message.
+1 to that. The reason you can't copy-and-paste the URL into your browser bar is that the URL is unescaped by the browser when it's the value of a target attribute ie src="value". Re: the XSRF issue on codereview, that happens occasionally with the tool and is orthogonal. I've seen it occur when you start an operation but don't complete it within a period of time (in the tool). A refresh usually fixes the problem. --j On Mon, Jun 14, 2010 at 10:12 AM, Ziv Horesh <zhoresh@gmail.com> wrote: > > If I understand correctly, the html definition call for escaping & with > &, if only for the small chance of something actually want a value like > '<' as is. > So I don't think we want the new logic in your change to disable it. > What we need is to fix the places it is not converted back correctly. > > > > On Sat, Jun 12, 2010 at 10:20 PM, <gagan.goku@gmail.com> wrote: > >> Reviewers: johnfargo, shindig.remailer_gmail.com, zhoresh, >> >> Message: >> >> On 2010/06/09 23:04:12, zhoresh wrote: >> >>> Good catch in the MutableContent! >>> >> >> Thanks >> >> >> http://codereview.appspot.com/1632042/diff/2001/3003 >>> File >>> >> >> >> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java >> >>> (left): >>> >> >> http://codereview.appspot.com/1632042/diff/2001/3003#oldcode169 >>> >> >> >> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java:169: >> >>> elem.getNamespaceURI() == null && >>> Can you create a test case that show it is not needed? (or maybe I >>> >> didn't see >> >>> it) >>> >> >> http://codereview.appspot.com/1632042/diff/2001/3005 >>> File >>> >> >> >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java >> >>> (right): >>> >> >> http://codereview.appspot.com/1632042/diff/2001/3005#newcode135 >>> >> >> >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java:135: >> >>> concatUri.getUri().toString()); >>> I guess there is a reason why it was added, maybe the right solution >>> >> is to fix >> >>> the parser? >>> >> >> I thought i understood what was going on, but now i don't. >> >> What I think is going on is: >> Elements created using Document.createElement() method have their >> getNamespaceURI as null. >> Elements that are parsed from html or copied / cloned around here and >> there have their getNamespaceURI as non-null. >> >> In both ConcatVistor and RenderingGadgetRewriter, we create elements >> using createElement and these will have getNamespaceURI() as null. We >> manually go and replace all occurances of '&' with '&' in the url. >> >> In DefaultHtmlSerializer, we check for url's and the logic states what >> for everything other than a url, replace '&' with '&'. But the logic >> for isUrl has the check element.getNamespaceURI() which will count only >> the elements created using createElement. >> So, all other urls that were parsed from html, get their '&' replaced by >> '&' >> >> Thus, the actual behavior seems to be to replace '&' by '&' but the >> intent of the code logic states otherwise. >> >> My main question is: why do we need this escaping & to & business. We >> should really not be doing such hardcoding stuff either. >> >> To make things even worse, in StyleTagExtractorVisitor, we create a >> <link> node but we dont seem to be replacing '&' with '&' here. So >> we don't seem to be consistent either. >> >> Here is a list of all places where we use '&': >> >> http://www.google.com/codesearch?hl=en&sa=N&filter=0&num=100&q=file:gadgets+f... >> :\.java+\%26amp;+-file:Test.java >> >> >> To make things even worser, browsers seem to be treating & in urls >> differently. >> Using the changes in current patch, i loaded >> view-source: >> http://localhost:9003/gadgets/accel?container=default&url=http://www.cnn.com/ >> >> It has a concatenated script tag: >> <script >> src=" >> http://localhost:9003/gadgets/concat?container=default&gadget=http%3A%2F%... >> " >> type="text/javascript"></script> >> >> Then i ran nc -l -p 9003 >> >> If i click on the img src link from within firefox, the request i get is >> this: >> GET >> /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com >> %2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F% >> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F% >> 2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F% >> 2Fwww.cnn.com >> %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F% >> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js >> HTTP/1.1 >> >> that is, with & in url replaced by & >> >> >> Now if i directly copy and paste the exact link in the navigation bar, >> here is what i get: >> >> GET >> /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com >> %2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F% >> 2Fwww.cnn.com >> %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F% >> 2Fi.cdn.turner.com >> %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F% >> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js >> HTTP/1.1 >> >> that is, with & as is. >> This url throws a 404 with the following error: >> MISSING_PARAMETER ..<url>.. Missing type. >> >> >> I have spent quite some time trying to think this through and come up >> with an acceptable solution. In the process, i ended up adding an option >> called gadgets.uri.escapeAmpersandToAndAmp in the shindig.properties >> file, hoping that i could encapsulate all the '&' logic with this >> flag and this would be a non breaking change. Alas, im not fluent enough >> in Guice to accomplish that [I could not inject Injector into some >> classes and ended up modifying lot of code :( ] >> >> >> My suggestion is to get rid of escaping '&' business for urls >> everywhere. Why do we need it anyways ? >> To make things worsest, codereview did not let me submit this reply and >> kept throwing invalid XSRF token or some crap. >> So end result = i'm confused and kind of giving up. >> >> Description: >> Currently we have code that converts & to & and then & in url >> back to &. >> There is a bug right now in the way that dynamic nodes are added (say >> from StyleConcatVisitor) for which the reconversion from & to & does >> not happen. >> >> >> Please review this at http://codereview.appspot.com/1632042/show >> >> Affected files: >> M .gitignore >> M config/container.js >> M java/common/conf/shindig.properties >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializer.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriter.java >> A >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/AbsolutePathReferenceRewriter.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ProxyingContentRewriter.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/RewriteModule.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ScriptConcatContentRewriter.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/StyleConcatContentRewriter.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/HtmlAccelServlet.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/spec/MessageBundle.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManager.java >> M >> java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManager.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/http/AbstractHttpFetcherTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java >> A >> java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializerTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriterTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/SanitizingGadgetRewriterTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/old/SanitizingGadgetRewriterTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/ConcatVisitorTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/DomWalkerTestBase.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/RewriterTestBase.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/old/BaseRewriterTestCase.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManagerTest.java >> M >> java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManagerTest.java >> >> >> >
Sign in to reply to this message.
On 2010/06/14 17:22:23, johnfargo wrote:
> +1 to that. The reason you can't copy-and-paste the URL into your browser
> bar is that the URL is unescaped by the browser when it's the value of a
> target attribute ie src="value".
>
I chatted with zhoresh and i think now i understand why the browser would want
all &'s to be replaced by &
So ideally, the best thing would be to take care of all the & escaping in html
serializer itself.
However, I would not want to change something that breaks current functionality
:(
I'l try enumerating all possible cases and try to change only the
DefaultHtmlSerializer file. This way, all the logic for escaping "&" and other
escape characters like "quotes" will reside in only place and it would be
cleaner.
If you feel that approach is better, we can slowly start removing
stringreplace("&", "&") from places like ConcatVisitor and
RenderingGadgetRewriter after we are sure that we nailed down all possible
scenarios.
Sound reasonable ?
Btw, the Copy Location url not working when pasted into browser navigation bar
is really annoying. The first thing that comes to mind is that somehow the url
was not generated correctly.
Maybe we should find out why google.com does not choose to escape &. Does google
not bother about html standard, or does it assume that most famous browsers
already take care of it ?
> Re: the XSRF issue on codereview, that happens occasionally with the tool
> and is orthogonal. I've seen it occur when you start an operation but don't
> complete it within a period of time (in the tool). A refresh usually fixes
> the problem.
>
Yeah this went away when i refreshed, but my original comment was lost :(
> --j
>
> On Mon, Jun 14, 2010 at 10:12 AM, Ziv Horesh <mailto:zhoresh@gmail.com> wrote:
>
> >
Sign in to reply to this message.
On Mon, Jun 14, 2010 at 3:06 PM, <gagan.goku@gmail.com> wrote: > On 2010/06/14 17:22:23, johnfargo wrote: > >> +1 to that. The reason you can't copy-and-paste the URL into your >> > browser > >> bar is that the URL is unescaped by the browser when it's the value of >> > a > >> target attribute ie src="value". >> > > I chatted with zhoresh and i think now i understand why the browser > would want all &'s to be replaced by & > So ideally, the best thing would be to take care of all the & escaping > in html serializer itself. > However, I would not want to change something that breaks current > functionality :( > > I'l try enumerating all possible cases and try to change only the > DefaultHtmlSerializer file. This way, all the logic for escaping "&" > and other escape characters like "quotes" will reside in only place and > it would be cleaner. > If you feel that approach is better, we can slowly start removing > stringreplace("&", "&") from places like ConcatVisitor and > RenderingGadgetRewriter after we are sure that we nailed down all > possible scenarios. > > Sound reasonable ? > That sounds reasonable to me. I agree that proper encapsulation has HtmlSerializer taking care of this exclusively. > > Btw, the Copy Location url not working when pasted into browser > navigation bar is really annoying. The first thing that comes to mind is > that somehow the url was not generated correctly. > Maybe we should find out why google.com does not choose to escape &. > Does google not bother about html standard, or does it assume that most > famous browsers already take care of it ? Good question, and worth asking to various parties. Anyone from Shindig know why this is generally-accepted practice? > > > Re: the XSRF issue on codereview, that happens occasionally with the >> > tool > >> and is orthogonal. I've seen it occur when you start an operation but >> > don't > >> complete it within a period of time (in the tool). A refresh usually >> > fixes > >> the problem. >> > > Yeah this went away when i refreshed, but my original comment was lost > :( Sadness :( > > --j >> > > On Mon, Jun 14, 2010 at 10:12 AM, Ziv Horesh >> > <mailto:zhoresh@gmail.com> wrote: > > > >> > > > http://codereview.appspot.com/1632042/show >
Sign in to reply to this message.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
