Rietveld Code Review Tool

Issue 1952044: PHP Shindig: non-UTF-8 gadgets lose all non-ASCII characters

Created:
13 years, 8 months ago by bashofmann
Modified:
13 years, 8 months ago
Reviewers:
Paul Lindner
Base URL:
http://svn.apache.org/repos/asf/shindig/trunk/
Visibility:
Public.

Description

Fix for a bug report from Justin Wyllie:

The original problem, which I posted to the users list, was that gadgets with non-UTF-8 encodings (I used iso-8859-1 to test) were losing all non-ASCII characters in both the title (metadata call) and the content (gadget rendering call). Details of the problem and solution are as follows.

In BasicRemoteContentFetcher this line:

    $content = mb_convert_encoding($content, 'UTF-8', $charset);

converts the fetched XML string to UTF-8, whatever encoding it was in ($charset is the source encoding). But the XML declaration line is not touched. So after this we may have a gadget like this:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <Module>
      <ModulePrefs title="IñtërnâtiônàlizætiønX" />
      <Content type="html"> <![CDATA[ ]]> </Content>
    </Module>

which is UTF-8 encoded but still carries an iso-8859-1 encoding attribute.

Later in the call (metadata request or gadget rendering), GadgetSpecParser->parse() loads the XML content into an XML DOM object. This is where the error occurs, naturally, because the UTF-8 content is flagged as being in iso-8859-1.

My fix is as follows. In BasicRemoteContentFetcher->parseResult replace:

    $content = mb_convert_encoding($content, 'UTF-8', $charset);

with:

    $content = mb_convert_encoding($content, 'UTF-8', $charset);
    $pattern = 'encoding=\s*([' . '\'"])' . $charset . '\s*\1';
    $content = mb_ereg_replace($pattern,'encoding="UTF-8"',$content,"i") ;

Now the XML is UTF-8 encoded and has the correct UTF-8 encoding attribute.

Justin
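The fix described above can be tried in isolation with a small standalone sketch. This is not the committed patch: the function name convertGadgetXmlToUtf8 is hypothetical, and the sketch swaps mb_ereg_replace for preg_replace with preg_quote (so that regex metacharacters in the charset name are escaped) and limits the replacement to the first match. The sample gadget uses the Latin-1 byte 0xE9 ("é") so the result does not depend on how this file itself is encoded.

```php
<?php
// Hypothetical helper, not part of Shindig: converts fetched gadget XML to
// UTF-8 and rewrites the matching encoding attribute so the declaration
// agrees with the actual bytes.
function convertGadgetXmlToUtf8(string $content, string $charset): string
{
    // Re-encode the raw bytes from the source charset to UTF-8.
    $content = mb_convert_encoding($content, 'UTF-8', $charset);

    // Match encoding="<charset>" or encoding='<charset>', case-insensitively.
    // preg_quote() is an addition in this sketch; the patch in this issue
    // interpolates $charset into the pattern directly.
    $pattern = '/encoding=\s*([\'"])' . preg_quote($charset, '/') . '\s*\1/i';

    // Replace only the first match, which is normally the XML declaration.
    return preg_replace($pattern, 'encoding="UTF-8"', $content, 1);
}

// A gadget as it might arrive from the remote server: declared and encoded
// as iso-8859-1, with a single non-ASCII byte (0xE9) in the title.
$latin1Xml = '<?xml version="1.0" encoding="iso-8859-1"?>'
    . "<Module><ModulePrefs title=\"caf\xE9\" /></Module>";

$fixed = convertGadgetXmlToUtf8($latin1Xml, 'iso-8859-1');

// Load the result the way a spec parser might, and read the title back.
$doc = new DOMDocument();
$doc->loadXML($fixed);
$title = $doc->getElementsByTagName('ModulePrefs')->item(0)->getAttribute('title');
```

With the declaration rewritten, the DOM parser treats the bytes as UTF-8 and $title should come back with the accented character intact. Without the rewrite, the same UTF-8 bytes would be decoded as iso-8859-1, which is the mismatch the bug report describes.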

Patch Set 1

Stats: +22 lines, -11 lines
sample/BasicRemoteContentFetcher.php: 2 chunks, +22 lines, -11 lines, 0 comments

Messages

Total messages: 2
bashofmann
13 years, 8 months ago (2010-08-18 14:07:13 UTC) #1
Paul Lindner
13 years, 8 months ago (2010-08-18 15:47:32 UTC) #2
lgtm..
