|
|
|
Created:
15 years, 5 months ago by Jasvir Modified:
13 years, 11 months ago CC:
dev-remailer_shindig.apache.org, shindig.remailer_gmail.com Base URL:
https://svn.apache.org/repos/asf/shindig/trunk/ Visibility:
Public. |
DescriptionReplace the resource attribute list in ProxyVisitor with on that derives it programmatically from the html schema.
https://issues.apache.org/jira/browse/SHINDIG-1390
Patch Set 1 #
MessagesTotal messages: 6
Hi Jasvir, This code looks great. I however have a couple of doubts - * The tags you mentioned in the list contain a lot of elements-attribute combination which are no longer supported by the current browsers. Given that this is a one time initialization it doesn't really hurt us to include them. However, the HTML5 elements are still missing from the list. I noticed that you are handling "embed" separately, but that means we are still holding back to the older design. * There is another problem we are working on - We want some of the element attributes to be rewritten only when they meet certain criteria. Say for example, we want to rewrite a LINK HREF tag only when its TYPE is "text/css" and REL is "stylesheet" i.e. <LINK HREF="foo.css" TEXT="text/css" REL="stylesheet"></LINK> should be rewritten while <LINK HREF="foo-zh.html" HREFLANG="zh" REL="alternate></LINK> should not be rewritten. I am not very sure as to how this problem will be addressed with this code change. * Lastly, we still want the application to have the flexibility of configuring what attributes of what elements should be rewritten. E.g. we ourselves don't want to rewrite any of html or flash resources, like IFRAME SRC or EMBED SRC. I don't know if providing a config file with a list of tagsToRewrite is the best approach. - Kuntal
Sign in to reply to this message.
On 2010/07/28 12:02:21, Kuntal Loya wrote: > Hi Jasvir, > > This code looks great. I however have a couple of doubts - > > * The tags you mentioned in the list contain a lot of elements-attribute > combination which are no longer supported by the current browsers. Given that > this is a one time initialization it doesn't really hurt us to include them. > However, the HTML5 elements are still missing from the list. I noticed that you > are handling "embed" separately, but that means we are still holding back to the > older design. > > * There is another problem we are working on - We want some of the element > attributes to be rewritten only when they meet certain criteria. Say for > example, we want to rewrite a LINK HREF tag only when its TYPE is "text/css" and > REL is "stylesheet" i.e. <LINK HREF="foo.css" TEXT="text/css" > REL="stylesheet"></LINK> should be rewritten while <LINK HREF="foo-zh.html" > HREFLANG="zh" REL="alternate></LINK> should not be rewritten. > I am not very sure as to how this problem will be addressed with this code > change. > > * Lastly, we still want the application to have the flexibility of configuring > what attributes of what elements should be rewritten. E.g. we ourselves don't > want to rewrite any of html or flash resources, like IFRAME SRC or EMBED SRC. I > don't know if providing a config file with a list of tagsToRewrite is the best > approach. To elaborate more, proxying a resource (like css , js) which cannot have javascript inside it is probably fine. But proxying an html page (text/html) through is error prone. If you have an <a href> tag for some webpage like this: <a href="www.example.org">hello</a> and we rewrite it to <a href="http://localhost:8080/gadgets/proxy?url=www.example.org">hello</a> then we open up problems for xhr requests made by javascript included in the page. For example, a javascript on the page might be making an xmlHttpRequest for a relative url say "/foo" which earlier resolved to "www.example.org/foo" but will now resolve to "localhost:8080/foo" We found similar problems with flash as well. Hence the decision to not rewrite <iframe src> and <object> tags for now. > > > - Kuntal
Sign in to reply to this message.
Retracting this CL at least for now because of the problems pointed out. On 2010/07/28 12:28:29, gagan.goku wrote: > On 2010/07/28 12:02:21, Kuntal Loya wrote: > > Hi Jasvir, > > > > This code looks great. I however have a couple of doubts - > > > > * The tags you mentioned in the list contain a lot of elements-attribute > > combination which are no longer supported by the current browsers. Given that > > this is a one time initialization it doesn't really hurt us to include them. > > However, the HTML5 elements are still missing from the list. I noticed that > you > > are handling "embed" separately, but that means we are still holding back to > the > > older design. > > > > * There is another problem we are working on - We want some of the element > > attributes to be rewritten only when they meet certain criteria. Say for > > example, we want to rewrite a LINK HREF tag only when its TYPE is "text/css" > and > > REL is "stylesheet" i.e. <LINK HREF="foo.css" TEXT="text/css" > > REL="stylesheet"></LINK> should be rewritten while <LINK HREF="foo-zh.html" > > HREFLANG="zh" REL="alternate></LINK> should not be rewritten. > > I am not very sure as to how this problem will be addressed with this code > > change. > > > > * Lastly, we still want the application to have the flexibility of configuring > > what attributes of what elements should be rewritten. E.g. we ourselves don't > > want to rewrite any of html or flash resources, like IFRAME SRC or EMBED SRC. > I > > don't know if providing a config file with a list of tagsToRewrite is the best > > approach. > To elaborate more, proxying a resource (like css , js) which cannot have > javascript inside it is probably fine. But proxying an html page (text/html) > through is error prone. If you have an <a href> tag for some webpage like this: I am not sure I understand - both css files (in expression() statements) and js files can contain javascript. Is the problem that not all relative links buried inside scripts can be found and rewritten statically? > <a href="www.example.org">hello</a> > > and we rewrite it to > <a href="http://localhost:8080/gadgets/proxy?url=www.example.org">hello</a> > > then we open up problems for xhr requests made by javascript included in the > page. For example, a javascript on the page might be making an xmlHttpRequest > for a relative url say "/foo" which earlier resolved to "www.example.org/foo" > but will now resolve to "localhost:8080/foo" > > We found similar problems with flash as well. Hence the decision to not rewrite > <iframe src> and <object> tags for now. > > > > > > > - Kuntal
Sign in to reply to this message.
Hi Jasvir On Fri, Jul 30, 2010 at 12:58 AM, <jasvir@gmail.com> wrote: > Retracting this CL at least for now because of the problems pointed out. > > > On 2010/07/28 12:28:29, gagan.goku wrote: > >> On 2010/07/28 12:02:21, Kuntal Loya wrote: >> > Hi Jasvir, >> > >> > This code looks great. I however have a couple of doubts - >> > >> > * The tags you mentioned in the list contain a lot of >> > elements-attribute > >> > combination which are no longer supported by the current browsers. >> > Given that > >> > this is a one time initialization it doesn't really hurt us to >> > include them. > >> > However, the HTML5 elements are still missing from the list. I >> > noticed that > >> you >> > are handling "embed" separately, but that means we are still holding >> > back to > >> the >> > older design. >> > >> > * There is another problem we are working on - We want some of the >> > element > >> > attributes to be rewritten only when they meet certain criteria. Say >> > for > >> > example, we want to rewrite a LINK HREF tag only when its TYPE is >> > "text/css" > >> and >> > REL is "stylesheet" i.e. <LINK HREF="foo.css" TEXT="text/css" >> > REL="stylesheet"></LINK> should be rewritten while <LINK >> > HREF="foo-zh.html" > >> > HREFLANG="zh" REL="alternate></LINK> should not be rewritten. >> > I am not very sure as to how this problem will be addressed with >> > this code > >> > change. >> > >> > * Lastly, we still want the application to have the flexibility of >> > configuring > >> > what attributes of what elements should be rewritten. E.g. we >> > ourselves don't > >> > want to rewrite any of html or flash resources, like IFRAME SRC or >> > EMBED SRC. > >> I >> > don't know if providing a config file with a list of tagsToRewrite >> > is the best > >> > approach. >> To elaborate more, proxying a resource (like css , js) which cannot >> > have > >> javascript inside it is probably fine. But proxying an html page >> > (text/html) > >> through is error prone. If you have an <a href> tag for some webpage >> > like this: > > I am not sure I understand - both css files (in expression() statements) > and js files can contain javascript. Is the problem that not all > relative links buried inside scripts can be found and rewritten > statically? > > Yes, css can also have javascript in the expression statement, but we are kind of ignoring that for since its only ie specific and is kind of not widely used. The other issue is more problematic though. As you correctly stated, the crux of the problem is that its not possible to statically find relative links that the browser is going to follow (that i know of). Hence the decision to not rewrite <iframe src> and <object> tags for now. > <a href="www.example.org">hello</a> >> > > and we rewrite it to >> <a >> > href="http://localhost:8080/gadgets/proxy?url=www.example.org">hello</a> > > then we open up problems for xhr requests made by javascript included >> > in the > >> page. For example, a javascript on the page might be making an >> > xmlHttpRequest > >> for a relative url say "/foo" which earlier resolved to >> > "www.example.org/foo" > >> but will now resolve to "localhost:8080/foo" >> > > We found similar problems with flash as well. Hence the decision to >> > not rewrite > >> <iframe src> and <object> tags for now. >> > > > >> > >> > - Kuntal >> > > > > http://codereview.appspot.com/1900042/show >
Sign in to reply to this message.
Hi Jasvir, We can retract this CL for now, but we'd still like to work on an elegant solution to the problems mentioned. For the 1st and the 3rd issues mentioned - I can't think of anything better than populating the list of elements as we are doing now. But instead of mentioning the element names as string, we can use something like nu.validator.htmlparser.impl.ElementName and AttributeName (we actually just need an enum and not the whole class) For the 2nd issue mentioned, what I could think of is - In the map, along with the list of attributes to rewrite, we can have a Function which takes the element as a parameter and returns true if the attributes can be rewritten else false. Here we assume, that if there are multiple attributes, they will all be rewritten if the function returns true, a separate function for each attribute seems a little too much. We will need a builder or have to find some other way to populate this resource map. Thoughts? In the meantime, can we proceed with the original CL http://codereview.appspot.com/1806044/show? Thanks, Kuntal On Fri, Jul 30, 2010 at 12:58 AM, <jasvir@gmail.com> wrote: > Retracting this CL at least for now because of the problems pointed out. > > > On 2010/07/28 12:28:29, gagan.goku wrote: > >> On 2010/07/28 12:02:21, Kuntal Loya wrote: >> > Hi Jasvir, >> > >> > This code looks great. I however have a couple of doubts - >> > >> > * The tags you mentioned in the list contain a lot of >> > elements-attribute > >> > combination which are no longer supported by the current browsers. >> > Given that > >> > this is a one time initialization it doesn't really hurt us to >> > include them. > >> > However, the HTML5 elements are still missing from the list. I >> > noticed that > >> you >> > are handling "embed" separately, but that means we are still holding >> > back to > >> the >> > older design. >> > >> > * There is another problem we are working on - We want some of the >> > element > >> > attributes to be rewritten only when they meet certain criteria. Say >> > for > >> > example, we want to rewrite a LINK HREF tag only when its TYPE is >> > "text/css" > >> and >> > REL is "stylesheet" i.e. <LINK HREF="foo.css" TEXT="text/css" >> > REL="stylesheet"></LINK> should be rewritten while <LINK >> > HREF="foo-zh.html" > >> > HREFLANG="zh" REL="alternate></LINK> should not be rewritten. >> > I am not very sure as to how this problem will be addressed with >> > this code > >> > change. >> > >> > * Lastly, we still want the application to have the flexibility of >> > configuring > >> > what attributes of what elements should be rewritten. E.g. we >> > ourselves don't > >> > want to rewrite any of html or flash resources, like IFRAME SRC or >> > EMBED SRC. > >> I >> > don't know if providing a config file with a list of tagsToRewrite >> > is the best > >> > approach. >> To elaborate more, proxying a resource (like css , js) which cannot >> > have > >> javascript inside it is probably fine. But proxying an html page >> > (text/html) > >> through is error prone. If you have an <a href> tag for some webpage >> > like this: > > I am not sure I understand - both css files (in expression() statements) > and js files can contain javascript. Is the problem that not all > relative links buried inside scripts can be found and rewritten > statically? > > > <a href="www.example.org">hello</a> >> > > and we rewrite it to >> <a >> > href="http://localhost:8080/gadgets/proxy?url=www.example.org">hello</a> > > then we open up problems for xhr requests made by javascript included >> > in the > >> page. For example, a javascript on the page might be making an >> > xmlHttpRequest > >> for a relative url say "/foo" which earlier resolved to >> > "www.example.org/foo" > >> but will now resolve to "localhost:8080/foo" >> > > We found similar problems with flash as well. Hence the decision to >> > not rewrite > >> <iframe src> and <object> tags for now. >> > > > >> > >> > - Kuntal >> > > > > http://codereview.appspot.com/1900042/show >
Sign in to reply to this message.
|
|||||||||||||||||||||||||||||||||||||||||||||||
