Description
Fix for bug report from Justin Wyllie:
The original problem, which I posted to the users list, was that gadgets with non-UTF-8 encodings (I tested with iso-8859-1) lost all non-ASCII characters in both the title (metadata call) and the content (gadget rendering call).
Details of the problem and the solution are as follows:
In BasicRemoteContentFetcher this line:
$content = mb_convert_encoding($content, 'UTF-8', $charset);
converts the fetched XML string to UTF-8, whatever encoding it was in ($charset is the source encoding).
However, the XML declaration line is not touched, so after this we may have a gadget like this:
<?xml version="1.0" encoding="iso-8859-1"?>
<Module>
  <ModulePrefs title="IñtërnâtiônàlizætiønX" />
  <Content type="html">
    <![CDATA[ ]]>
  </Content>
</Module>
which is UTF-8 encoded but still carries an iso-8859-1 encoding attribute.
Later in the call (metadata request or gadget rendering), GadgetSpecParser->parse() loads the XML content into an XML DOM object. This is where the error occurs, because the UTF-8 content is flagged as iso-8859-1.
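The mismatch can be reproduced outside Shindig with a short sketch. It assumes the mbstring and DOM extensions are available. The sample document and variable names are illustrative only and are not taken from the Shindig code.

```php
<?php
// Illustrative sample only: an ISO-8859-1 encoded gadget spec ("\xF1" is "ñ").
$charset = 'iso-8859-1';
$fetched = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>"
         . "<Module><ModulePrefs title=\"I\xF1t\" /></Module>";

// Same conversion as in the fetcher: the bytes become UTF-8,
// but the declaration still claims iso-8859-1.
$content = mb_convert_encoding($fetched, 'UTF-8', $charset);

$doc = new DOMDocument();
$doc->loadXML($content);
$title = $doc->getElementsByTagName('ModulePrefs')->item(0)->getAttribute('title');

// The parser decodes the UTF-8 bytes as iso-8859-1, so the title
// no longer equals the intended UTF-8 string "Iñt".
$titleMatches = ($title === "I\xC3\xB1t");
var_dump($titleMatches);
```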
My fix is as follows:
In BasicRemoteContentFetcher->parseResult, replace:
$content = mb_convert_encoding($content, 'UTF-8', $charset);
with
$content = mb_convert_encoding($content, 'UTF-8', $charset);
$pattern = 'encoding=\s*([' . '\'"])' . $charset . '\s*\1';
$content = mb_ereg_replace($pattern, 'encoding="UTF-8"', $content, "i");
Now the XML is UTF-8 encoded and has the correct UTF-8 encoding attribute.
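For anyone who wants to try the idea in isolation, here is a minimal standalone sketch. It is not the patch itself: the function name convertXmlToUtf8 is hypothetical, and it swaps mb_ereg_replace for preg_replace with preg_quote so that regex metacharacters in the charset name are escaped and only the XML declaration is rewritten. It assumes the mbstring and DOM extensions are available.

```php
<?php
// Hypothetical helper, not part of Shindig: converts fetched XML to UTF-8
// and rewrites the encoding attribute of the XML declaration to match.
function convertXmlToUtf8($content, $charset) {
    $content = mb_convert_encoding($content, 'UTF-8', $charset);
    // Match the encoding attribute inside the "<?xml ... ?>" declaration only.
    // preg_quote escapes any regex metacharacters in the charset name.
    $pattern = '/(<\?xml[^>]*?encoding\s*=\s*)([\'"])'
             . preg_quote($charset, '/') . '\s*\2/i';
    // Keep the text before the attribute value and substitute "UTF-8".
    return preg_replace($pattern, '${1}"UTF-8"', $content, 1);
}

// Usage with an illustrative ISO-8859-1 sample ("\xF1" is "ñ").
$fetched = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>"
         . "<Module><ModulePrefs title=\"I\xF1t\" /></Module>";
$fixed = convertXmlToUtf8($fetched, 'iso-8859-1');

$doc = new DOMDocument();
$doc->loadXML($fixed);
$fixedTitle = $doc->getElementsByTagName('ModulePrefs')->item(0)->getAttribute('title');
```

With the declaration and the bytes in agreement, the title read back from the DOM should be the intended UTF-8 string.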
Justin