|
|
Created:
13 years, 3 months ago by arthurhsu Modified:
13 years, 3 months ago CC:
skia-review_googlegroups.com Base URL:
http://skia.googlecode.com/svn/trunk Visibility:
Public. |
DescriptionUse bfrange to shrink ToUnicode table.
BUG=258
TEST=none
Patch Set 1 #
Total comments: 1
Patch Set 2 : Update per code review #Patch Set 3 : Update per code review #
Total comments: 26
Patch Set 4 : Update per code review #Patch Set 5 : Add missing file #
Total comments: 18
Patch Set 6 : Update per code review #Patch Set 7 : Update per code review #
Total comments: 2
MessagesTotal messages: 11
How much shrinkage does this give us?
Sign in to reply to this message.
bfrange could probably use some isolated unittests, just to be sure we handle all of its trickiness. LGTM http://codereview.appspot.com/4844043/diff/1/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/1/src/pdf/SkPDFFont.cpp#newcode461 src/pdf/SkPDFFont.cpp:461: bfcharEntries[index].fUnicode = base.fUnicode; Slightly more common, and more efficient, usage pattern for this is: BFChar* entry = bfcharEntries.append(); entry->fGlyphId = base.fStart; entry->base.fUnicode; This way the code (and the reader) doesn't have to reinterpret index each time.
Sign in to reply to this message.
I've added unit tests and updated code based on Mike's comment. The unit tests unveiled two bugs (the last entry is not published, and high byte of range is wrongly determined) and I fixed them also.
Sign in to reply to this message.
http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode373 src/pdf/SkPDFFont.cpp:373: namespace { We don't need a namespace here - the type will already be private to the .cpp file. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode389 src/pdf/SkPDFFont.cpp:389: for (int i = 0; i < bfchar.count() / 100 + 1; ++i) { Would it be cleaner to do i = 0; i < bfchar.count(); i += 100 ? http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode391 src/pdf/SkPDFFont.cpp:391: count = (count > 100) ? 100 : count; if (count > 100) { count = 100; } or count = min(count, 100); http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode396 src/pdf/SkPDFFont.cpp:396: cmap->writeHexAsText(bfchar[i * 100 + j].fGlyphId, 4); with i+= 100, you can just use i here. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode428 src/pdf/SkPDFFont.cpp:428: static void append_cmap_sections(const SkTDArray<SkUnichar>& glyphUnicode, glyphUnicode could be named better. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode447 src/pdf/SkPDFFont.cpp:447: if (i == base.fStart + continuousEntries && I think we can do better than this when you consider subsetting. Consider glyphs 1-10 that map to unicode 101-110. But we only have 1,3,5,7,9 in the subset: <1> <101> <3> <103> <5> <105> <7> <107> <9> <109> is much longer than <1> <10> <101> It doesn't hurt to map an entry that we won't refer to. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode482 src/pdf/SkPDFFont.cpp:482: // bfchar must present before bfrange per spec. nit: full sentence: The spec requires bfchars before bfranges. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: SkPDFStream* generate_tounicode_cmap(const SkTDArray<SkUnichar>& glyphUnicode, this should still be static. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: SkPDFStream* generate_tounicode_cmap(const SkTDArray<SkUnichar>& glyphUnicode, glyphUnicode could be named better.
Sign in to reply to this message.
http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: SkPDFStream* generate_tounicode_cmap(const SkTDArray<SkUnichar>& glyphUnicode, this should still be static. Oh... it's so you can test it. Maybe add a comment. http://codereview.appspot.com/4844043/diff/7001/tests/PDFPrimitivesTest.cpp File tests/PDFPrimitivesTest.cpp (right): http://codereview.appspot.com/4844043/diff/7001/tests/PDFPrimitivesTest.cpp#n... tests/PDFPrimitivesTest.cpp:238: static void TestToUnicode(skiatest::Reporter* reporter) { Let's put this in a new file. http://codereview.appspot.com/4844043/diff/7001/tests/PDFPrimitivesTest.cpp#n... tests/PDFPrimitivesTest.cpp:278: "<</Length 455\n\ This test is fragile to dictionary ordering. Maybe calling just append_cmap_sections would be better?
Sign in to reply to this message.
http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode373 src/pdf/SkPDFFont.cpp:373: namespace { On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > We don't need a namespace here - the type will already be private to the .cpp > file. Done. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode389 src/pdf/SkPDFFont.cpp:389: for (int i = 0; i < bfchar.count() / 100 + 1; ++i) { On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > Would it be cleaner to do i = 0; i < bfchar.count(); i += 100 ? Done. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode391 src/pdf/SkPDFFont.cpp:391: count = (count > 100) ? 100 : count; On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > if (count > 100) { > count = 100; > } > or > count = min(count, 100); changed to use SkMin32() http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode396 src/pdf/SkPDFFont.cpp:396: cmap->writeHexAsText(bfchar[i * 100 + j].fGlyphId, 4); On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > with i+= 100, you can just use i here. Done. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode428 src/pdf/SkPDFFont.cpp:428: static void append_cmap_sections(const SkTDArray<SkUnichar>& glyphUnicode, On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > glyphUnicode could be named better. Done. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode447 src/pdf/SkPDFFont.cpp:447: if (i == base.fStart + continuousEntries && On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > I think we can do better than this when you consider subsetting. Consider > glyphs 1-10 that map to unicode 101-110. But we only have 1,3,5,7,9 in the > subset: > <1> <101> <3> <103> <5> <105> <7> <107> <9> <109> > is much longer than > <1> <10> <101> > It doesn't hurt to map an entry that we won't refer to. We won't have the info in the near future when subset info is honored in advanced type metrics, therefore I did not attempt to do further optimization like this. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode482 src/pdf/SkPDFFont.cpp:482: // bfchar must present before bfrange per spec. On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > nit: full sentence: The spec requires bfchars before bfranges. Done. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: SkPDFStream* generate_tounicode_cmap(const SkTDArray<SkUnichar>& glyphUnicode, On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > this should still be static. Done. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: SkPDFStream* generate_tounicode_cmap(const SkTDArray<SkUnichar>& glyphUnicode, On 2011/08/03 23:35:03, Steve VanDeBogart wrote: > this should still be static. Oh... it's so you can test it. Maybe add a > comment. Changed to open append_cmap_sections since the original test is fragile. http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: SkPDFStream* generate_tounicode_cmap(const SkTDArray<SkUnichar>& glyphUnicode, On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > glyphUnicode could be named better. Done. http://codereview.appspot.com/4844043/diff/7001/tests/PDFPrimitivesTest.cpp File tests/PDFPrimitivesTest.cpp (right): http://codereview.appspot.com/4844043/diff/7001/tests/PDFPrimitivesTest.cpp#n... tests/PDFPrimitivesTest.cpp:238: static void TestToUnicode(skiatest::Reporter* reporter) { On 2011/08/03 23:35:03, Steve VanDeBogart wrote: > Let's put this in a new file. Done. http://codereview.appspot.com/4844043/diff/7001/tests/PDFPrimitivesTest.cpp#n... tests/PDFPrimitivesTest.cpp:278: "<</Length 455\n\ On 2011/08/03 23:35:03, Steve VanDeBogart wrote: > This test is fragile to dictionary ordering. Maybe calling just > append_cmap_sections would be better? Done.
Sign in to reply to this message.
http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode447 src/pdf/SkPDFFont.cpp:447: if (i == base.fStart + continuousEntries && On 2011/08/04 00:13:58, arthurhsu wrote: > On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > > I think we can do better than this when you consider subsetting. Consider > > glyphs 1-10 that map to unicode 101-110. But we only have 1,3,5,7,9 in the > > subset: > > <1> <101> <3> <103> <5> <105> <7> <107> <9> <109> > > is much longer than > > <1> <10> <101> > > It doesn't hurt to map an entry that we won't refer to. > > We won't have the info in the near future when subset info is honored in > advanced type metrics, therefore I did not attempt to do further optimization > like this. As we discussed, please add a comment about the spec being unclear if we can do better, but the savings being bounded (416k pre-compressed worse-case by my calculation). http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode394 src/pdf/SkPDFFont.cpp:394: cmap->writeHexAsText(bfchar[i + j].fGlyphId, 4); Does this need to be at least four bytes, or can we just use the natural length of the number? http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode435 src/pdf/SkPDFFont.cpp:435: BFRange base; base -> currentRangeEntry http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode441 src/pdf/SkPDFFont.cpp:441: // PDF spec mentioned that bfrange can not change the higher byte, nit: mentioned that->requires nit: bytes, -> byte. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode481 src/pdf/SkPDFFont.cpp:481: // The spec requires bfchar must present before bfrange per spec. nit: The spec requires that all bfchar entries must come before bfrange entries. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode482 src/pdf/SkPDFFont.cpp:482: if (bfcharEntries.count()) append_bfchar_section(bfcharEntries, cmap); nit: these don't have to be conditional / skia style requires {}'s on ifs. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: const SkTDArray<SkUnichar>& glyphToUnicode, nit: indent is 8 spaces. http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp File tests/ToUnicode.cpp (right): http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp#newcode27 tests/ToUnicode.cpp:27: void append_cmap_sections(const SkTDArray<SkUnichar>& glyphToUnicode, Hmm, if this gets out of sync with the definition, the compiler won't complain and we'll get some strange results... Maybe make it a static private function of SkPDFFont, and make TestToUnicode a friend? http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp#newcode32 tests/ToUnicode.cpp:32: SkTDArray<uint16_t> glyphIDs; nit: glyphsInSubset http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp#newcode49 tests/ToUnicode.cpp:49: glyphToUnicode.push(0x2F); // 8 Can you also add a range with an entry missing, like 9,10,11,12
Sign in to reply to this message.
http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/7001/src/pdf/SkPDFFont.cpp#newcode447 src/pdf/SkPDFFont.cpp:447: if (i == base.fStart + continuousEntries && On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > On 2011/08/04 00:13:58, arthurhsu wrote: > > On 2011/08/03 23:30:22, Steve VanDeBogart wrote: > > > I think we can do better than this when you consider subsetting. Consider > > > glyphs 1-10 that map to unicode 101-110. But we only have 1,3,5,7,9 in the > > > subset: > > > <1> <101> <3> <103> <5> <105> <7> <107> <9> <109> > > > is much longer than > > > <1> <10> <101> > > > It doesn't hurt to map an entry that we won't refer to. > > > > We won't have the info in the near future when subset info is honored in > > advanced type metrics, therefore I did not attempt to do further optimization > > like this. > > As we discussed, please add a comment about the spec being unclear if we can do > better, but the savings being bounded (416k pre-compressed worse-case by my > calculation). Done. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode394 src/pdf/SkPDFFont.cpp:394: cmap->writeHexAsText(bfchar[i + j].fGlyphId, 4); On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > Does this need to be at least four bytes, or can we just use the natural length > of the number? I tried that at the very beginning of implementing ToUnicode and Adobe Reader does not like it. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode435 src/pdf/SkPDFFont.cpp:435: BFRange base; On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > base -> currentRangeEntry Done. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode441 src/pdf/SkPDFFont.cpp:441: // PDF spec mentioned that bfrange can not change the higher byte, On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > nit: mentioned that->requires > nit: bytes, -> byte. Done. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode481 src/pdf/SkPDFFont.cpp:481: // The spec requires bfchar must present before bfrange per spec. On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > nit: The spec requires that all bfchar entries must come before bfrange entries. Done. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode482 src/pdf/SkPDFFont.cpp:482: if (bfcharEntries.count()) append_bfchar_section(bfcharEntries, cmap); On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > nit: these don't have to be conditional / skia style requires {}'s on ifs. Done. http://codereview.appspot.com/4844043/diff/4003/src/pdf/SkPDFFont.cpp#newcode487 src/pdf/SkPDFFont.cpp:487: const SkTDArray<SkUnichar>& glyphToUnicode, On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > nit: indent is 8 spaces. Done. http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp File tests/ToUnicode.cpp (right): http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp#newcode27 tests/ToUnicode.cpp:27: void append_cmap_sections(const SkTDArray<SkUnichar>& glyphToUnicode, On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > Hmm, if this gets out of sync with the definition, the compiler won't complain > and we'll get some strange results... Maybe make it a static private function of > SkPDFFont, and make TestToUnicode a friend? I am hesitated of doing so. I'd like to keep testing related stuff out of production code's header. If the test failed, the dev who changed the signature of this function should fix it, and that's why we unit test this function. http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp#newcode32 tests/ToUnicode.cpp:32: SkTDArray<uint16_t> glyphIDs; On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > nit: glyphsInSubset Done. http://codereview.appspot.com/4844043/diff/4003/tests/ToUnicode.cpp#newcode49 tests/ToUnicode.cpp:49: glyphToUnicode.push(0x2F); // 8 On 2011/08/04 17:15:53, Steve VanDeBogart wrote: > Can you also add a range with an entry missing, like 9,10,11,12 Done.
Sign in to reply to this message.
LGTM with nits... fixed and committed as r2075. http://codereview.appspot.com/4844043/diff/2003/src/pdf/SkPDFFont.cpp File src/pdf/SkPDFFont.cpp (right): http://codereview.appspot.com/4844043/diff/2003/src/pdf/SkPDFFont.cpp#newcode470 src/pdf/SkPDFFont.cpp:470: (i >> 8) == (currentRangeEntry.fStart >> 8) && nit: 470-472 in 8 more http://codereview.appspot.com/4844043/diff/2003/src/pdf/SkPDFFont.cpp#newcode505 src/pdf/SkPDFFont.cpp:505: // The spec requires all bfchar entries for a font must present before nit: present -> come
Sign in to reply to this message.
|