Description
Inspect byte sequences read in xml.text() for the presence of
characters that XML disallows.
Uses a second bytes.Buffer to inspect the bytes read into the data
slice rune by rune. A new var, CharacterRange []unicode.Range
(taken from http://www.xml.com/axml/testaxml.htm Section 2.2,
Characters), was created to assist in the inspection.
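A minimal sketch of this kind of rune-by-rune check, not the code from this change. It assumes the Char production of XML 1.0 Section 2.2 (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF) and uses unicode.RangeTable from current Go rather than the []unicode.Range named above. The names xmlChar and firstDisallowed are hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// xmlChar is a hypothetical table (not the CharacterRange var from the
// change) holding the XML 1.0 Section 2.2 Char production.
var xmlChar = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: 0x0009, Hi: 0x000A, Stride: 1},
		{Lo: 0x000D, Hi: 0x000D, Stride: 1},
		{Lo: 0x0020, Hi: 0xD7FF, Stride: 1},
		{Lo: 0xE000, Hi: 0xFFFD, Stride: 1},
	},
	R32: []unicode.Range32{
		{Lo: 0x10000, Hi: 0x10FFFF, Stride: 1},
	},
}

// firstDisallowed decodes data rune by rune and returns the byte offset
// of the first invalid UTF-8 sequence or rune outside xmlChar.
// It returns -1 if every rune is allowed.
func firstDisallowed(data []byte) int {
	for i := 0; i < len(data); {
		r, size := utf8.DecodeRune(data[i:])
		// RuneError with size 1 means the bytes are not valid UTF-8.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		if !unicode.Is(xmlChar, r) {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	fmt.Println(firstDisallowed([]byte("<a>ok</a>")))
	fmt.Println(firstDisallowed([]byte("a\x00b")))
}
```

A table lookup keeps the allowed ranges in one place; a real parser would report the offending offset as a syntax error rather than return it.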
Test case "TestDisallowedCharacters" added to xml_test.go.
This change adds time to parses: locally, processing a sample NLM
Medline file of 59 megabytes averages 9.3 seconds over 5 consecutive
runs without the inspection and 10.0 seconds with it, an increase
of about 7.5%. Very short files appear to take a few thousand
nanoseconds longer. This is not a scientific analysis of the
difference.
Fixes issue 1259.
Patch Set 1
Patch Set 2 : code review 2967041: Inspect byte sequences read in xml.text() for presence ...
Total comments: 6
Patch Set 3 : code review 2967041: Inspect byte sequences read in xml.text() for presence ...
Total comments: 12
Patch Set 4 : code review 2967041: Inspect byte sequences read in xml.text() for presence ...
Patch Set 5 : This is an attempt to get xml.go and xml_test.go back into the change list.
Patch Set 6 : code review 2967041: Inspect byte sequences read in xml.text() for presence ...
Total comments: 4
Patch Set 7 : code review 2967041: Inspect byte sequences read in xml.text() for presence ...
Messages
Total messages: 16