Created: 9 years, 3 months ago by pkx166h
Modified: 9 years, 2 months ago
CC: lilypond-devel_gnu.org
Visibility: Public.
Description
Reduce size of PDF files when inc. in *TeX docs
Issue 4251
This changes the way lilypond uses fonts to draw glyphs.
It avoids using glyphshow for all emmentaler glyphs and
adds encoding vectors to the emmentaler fonts before they
are used. It also changes the ghostscript parameters used
to generate pdfs from postscript code.
These changes help to reduce pdf file sizes if you include
lilypond snippets in *TeX documents. The pdfs generated by
a patched lilypond and *tex themselves are _much_ bigger,
but if you run ghostscript and pdfsizeopt.py on those
files they implode.
Added a command line option
--bigpdf / -b, and documented that option in the German
and English versions of usage.pdf.
Patch Set 1 #
Total comments: 10
Patch Set 2 : Format corrections from Werner, rewrite of English entry for Usage. #
Total comments: 16
Patch Set 3 : Corrections as per Werner, plus new edit to the English doc #
Total comments: 16
Patch Set 4 : Corrections from Werner, some scm formatting and TexInfo syntax fixes #
Patch Set 5 : Change of German Translation - if and when this compiles this will get pushed. The rest of the patc… #
Messages
Total messages: 47
Added Knut as I 'own' this patch while it is being reviewed. James
The English Documentation is not very well written - which I completely understand of course - so once this patch has passed all the tests, and because I have the patch file (as I am managing this patch for Knut), I will re-do the English documentation. As I cannot speak German I won't touch that.
LGTM, thanks! I only have some minor comments regarding improved legibility of the source code.

https://codereview.appspot.com/194090043/diff/1/Documentation/de/usage/runnin...
File Documentation/de/usage/running.itely (right):

https://codereview.appspot.com/194090043/diff/1/Documentation/de/usage/runnin...
Documentation/de/usage/running.itely:160: pdftex-, xetex-, oder luatex-Dokumente eingebettet werden,
I would end this line with `;' or `:' instead of a comma.

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps
File ps/encodingdefs.ps (right):

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps#newcode8
ps/encodingdefs.ps:8: /LilyNoteHeadEncoding [ /.notdef /noteheads.d0doFunk /noteheads.d0fa
For better orientation, please reformat this to have a fixed number of entries per line (I suggest 4 items), together with comments that indicate the current index (something like `% 0x50').

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps#newcode91
ps/encodingdefs.ps:91: /noteheads.d0doFunk {<01> show} def /noteheads.d0fa {<02> show} def
Here, I would prefer one entry per line.

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps#newcode214
ps/encodingdefs.ps:214: /LilyScriptEncoding [ /.notdef /clefs.blackmensural.c
The same comment as above.

https://codereview.appspot.com/194090043/diff/1/scm/output-ps.scm
File scm/output-ps.scm (right):

https://codereview.appspot.com/194090043/diff/1/scm/output-ps.scm#newcode126
scm/output-ps.scm:126: (ly:format "currentpoint ~4f ~4f rmoveto ~a moveto ~4f 0 rmoveto" x y g w)))
Please reformat this (and similar) code to stay within the 80-characters-per-line limit if possible.
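The one-entry-per-line layout requested for the `{<NN> show} def` procedures in ps/encodingdefs.ps could be generated mechanically rather than edited by hand. The following is only a minimal Python sketch under stated assumptions: the glyph names are available as a list in encoding order, slot 0 holds /.notdef, and the helper name `show_procs` is hypothetical and not part of the patch.

```python
def show_procs(glyph_names):
    """Return one PostScript line per glyph: /name {<code> show} def.

    Assumption: index 0 holds .notdef and gets no procedure.
    """
    lines = []
    for code, name in enumerate(glyph_names):
        if code == 0:
            continue  # skip the .notdef slot
        if code > 0xFF:
            raise ValueError("an encoding vector holds at most 256 entries")
        # A one-byte hex string selects the glyph through the encoding vector.
        lines.append("/%s {<%02x> show} def" % (name, code))
    return lines


# The first entries quoted in the review comment above.
encoding = [".notdef", "noteheads.d0doFunk", "noteheads.d0fa"]
for line in show_procs(encoding):
    print(line)
```

Generating the lines from the same list as the encoding array would also keep the codes and the array in step.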
Format corrections from Werner, rewrite of English entry for Usage.
Thanks

https://codereview.appspot.com/194090043/diff/1/Documentation/de/usage/runnin...
File Documentation/de/usage/running.itely (right):

https://codereview.appspot.com/194090043/diff/1/Documentation/de/usage/runnin...
Documentation/de/usage/running.itely:160: pdftex-, xetex-, oder luatex-Dokumente eingebettet werden,
On 2015/01/10 10:02:07, lemzwerg wrote:
> I would end this line with `;' or `:' instead of a comma.

Done.

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps
File ps/encodingdefs.ps (right):

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps#newcode8
ps/encodingdefs.ps:8: /LilyNoteHeadEncoding [ /.notdef /noteheads.d0doFunk /noteheads.d0fa
On 2015/01/10 10:02:07, lemzwerg wrote:
> For better orientation, please reformat this to have a fixed number of entries
> per line (I suggest 4 items),

I did 3 items maximum and kept things within the line length.

> together with comments that indicate the current
> index (something like `% 0x50').

Werner, I don't understand what you mean by "...together with comments that indicate the current index (something like `% 0x50')." As I am helping shepherd this patch, and because of problems with Knut's patch applying, if this is something you can easily explain to me I can do this for Knut and save some time for the patch.

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps#newcode91
ps/encodingdefs.ps:91: /noteheads.d0doFunk {<01> show} def /noteheads.d0fa {<02> show} def
On 2015/01/10 10:02:07, lemzwerg wrote:
> Here, I would prefer one entry per line.

Done.

https://codereview.appspot.com/194090043/diff/1/ps/encodingdefs.ps#newcode214
ps/encodingdefs.ps:214: /LilyScriptEncoding [ /.notdef /clefs.blackmensural.c
On 2015/01/10 10:02:07, lemzwerg wrote:
> The same comment as above.

Done.

https://codereview.appspot.com/194090043/diff/1/scm/output-ps.scm
File scm/output-ps.scm (right):

https://codereview.appspot.com/194090043/diff/1/scm/output-ps.scm#newcode126
scm/output-ps.scm:126: (ly:format "currentpoint ~4f ~4f rmoveto ~a moveto ~4f 0 rmoveto" x y g w)))
On 2015/01/10 10:02:07, lemzwerg wrote:
> Please reformat this (and similar) code to stay within the
> 80-characters-per-line limit if possible.

Hope this is better (I don't code so am not sure if there are specific 'rules' about how scm is formatted here).
Thanks for your help and assistance!

> > For better orientation, please reformat this to have a fixed number
> > of entries per line (I suggest 4 items),
>
> I did 3 items maximum and kept things within the line length.

Basically, this is fine. However, three is incommensurable to 16, ...

> Werner, I don't understand what you mean by "...together with
> comments that indicate the current index (something like `% 0x50')."

Example:

  % 0x00
  /foo /bar /baz
  /bla /buf /urgh
  /pfft /zap /boinck
  % 0x09
  ...

> > Please reformat this (and similar) code to stay within the
> > 80-characters-per-line limit if possible.
>
> Hope this is better (I don't code so am not sure if there are
> specific 'rules' about how scm is formatted here).

It looks better, thanks. In case of doubt, the formatting produced by Emacs is the one we follow.
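The layout discussed here (a fixed number of entries per line plus comments giving the current index) can be sketched as follows. This is a hedged illustration in Python, not code from the patch: the function name `format_encoding`, the closing `] def`, and the choice of a comment every 16 entries are assumptions. Four entries per line are used because 4 divides 16, so the index comments always fall on a row boundary.

```python
def format_encoding(name, glyph_names, per_line=4, comment_every=16):
    """Lay out a PostScript encoding array with index comments.

    Assumptions: per_line divides comment_every, and the array is
    closed with "] def" (the closing form is not shown in the review).
    """
    rows = ["/%s [" % name]
    for start in range(0, len(glyph_names), per_line):
        if start % comment_every == 0:
            # Index of the first entry in the following rows, e.g. "% 0x10".
            rows.append("  %% 0x%02X" % start)
        chunk = glyph_names[start:start + per_line]
        rows.append("  " + " ".join("/" + g for g in chunk))
    rows.append("] def")
    return "\n".join(rows)


# Placeholder glyph names g1..g19 stand in for the real Emmentaler names.
names = [".notdef"] + ["g%d" % i for i in range(1, 20)]
print(format_encoding("LilyNoteHeadEncoding", names))
```

With three entries per line the comments would drift relative to the 16-entry blocks, which is the point of the remark about 3 and 16 above.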
https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
File Documentation/de/usage/running.itely (right):

https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
Documentation/de/usage/running.itely:160: pdftex-, xetex-, oder luatex-Dokumente eingebettet werden;
I suggest to write this as

  @w{pdftex-}, @w{xetex-}, ...

to indicate that the hyphen is a non-breakable one. This probably doesn't work correctly with HTML, but IMHO such cases should be tagged.

https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
Documentation/de/usage/running.itely:161: es ermöglicht ghostscript im Anschluß, duplizierte Font-Daten
According to new German orthography, this should be `Anschluss'.

https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
Documentation/de/usage/running.itely:166: notation.pdf (Lilypond 2.18.2) ist ca. 26MB groß, erzeugt
... ca.@: 26MB ...
On 2015/01/11 20:13:46, lemzwerg wrote:
> https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
> File Documentation/de/usage/running.itely (right):
>
> https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
> Documentation/de/usage/running.itely:160: pdftex-, xetex-, oder luatex-Dokumente
> eingebettet werden;
> I suggest to write this as
>
> @w{pdftex-}, @w{xetex-}, ...
>
> to indicate that the hyphen is a non-breakable one. This probably doesn't work
> correctly with HTML, but IMHO such cases should be tagged.
>
> https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
> Documentation/de/usage/running.itely:161: es ermöglicht ghostscript im Anschluß,
> duplizierte Font-Daten
> According to new German orthography, this should be `Anschluss'.
>
> https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru...
> Documentation/de/usage/running.itely:166: notation.pdf (Lilypond 2.18.2) ist ca.
> 26MB groß, erzeugt
> ... ca.@: 26MB ...

Actually I simplified the English version of the document and took out a lot of unnecessary information from it. It'd be better if someone could translate the few lines of German here and I can apply them myself (along with the newer examples).

Thanks
James
https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni...
File Documentation/usage/running.itely (right):

https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni...
Documentation/usage/running.itely:187: font data which can make significant reductions in file size.
One point is missing: Why is the option called --bigpdfs if we get significant reductions in file size?

https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni...
Documentation/usage/running.itely:199: Using @code{pdfsizeopt.py} can then be used to further optimize the size
I would start with

  Optionally, using @code{pdfsizeopt.py} ...

to indicate that this is an extra bonus.

https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni...
Documentation/usage/running.itely:206:
Should the (current) problem with external links be mentioned here too?
On 12.01.2015 06:56, lemzwerg@googlemail.com wrote:
>
> https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni...
> File Documentation/usage/running.itely (right):
>
> https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni...
> Documentation/usage/running.itely:187: font data which can make
> significant reductions in file size.
> One point is missing: Why is the option called --bigpdfs if we get
> significant reductions in file size?

We include up to three full copies (with different encoding vectors) for every emmentaler font used instead of one newly constructed font that includes only the subset of emmentaler glyphs used by the document. For all other fonts ghostscript is told to avoid subsetting whenever it is possible. All of this means that the file size of the pdfs produced by lilypond itself increases dramatically.

--bigpdfs is a name that describes the most visible effect of this code. It discourages the normal user, but that's not a bad effect. If you don't know why you want to use this option you probably don't have a reason to use it.

--optpdfs would be misleading, and I have some ideas how to optimize pdf output.

A class of possible names is related to the implementation details: Something like --dont_subset_fonts_use_emmentaler_via-encoding_vectors would be technically correct but much too long. --nofontsubsetting, --useencodings, --avoidglyphshow would emphasize only parts of the technical details.

Another class of possible names would emphasize the intended use: --optimze_for_inclusion_in_TeX_if_you_include_multiple_lilypond_pdfs would be much too long but self-explanatory. --texsnip would be short enough and based on the intended use, and -t isn't used up to now.

A name that discourages Fred Foobar from using it unless he has read about it in the documentation and knows why he wants to use it is a good name, therefore I still like --bigpdfs.

cu,
Knut
I don't object to the name! I only state that the option's name doesn't have an explanation in the English documentation, and I think it would be good if it gets added.
On 2015/01/12 08:52:02, lemzwerg wrote:
> I don't object to the name! I only state that the option's name doesn't have an
> explanation in the English documentation, and I think it would be good if it
> gets added.

Knut already did and I 'rewrote' this (as per the stated summary of patch #2).

Documentation/usage/running.itely

?!

James
Hmm. I can't find in your description that `--bigpdfs' creates *big* output files that later get reduced to small ones by running ghostscript again.
On 14.01.2015 08:15, lemzwerg@googlemail.com wrote:
> Hmm. I can't find in your description that `--bigpdfs' creates *big*
> output files that get later reduced to small one by running ghostscript
> again.
>
> https://codereview.appspot.com/194090043/
>

No? Have a look at the patch sent to lilypond-devel on 8-1. There I wrote:

    With this patch pdfs generated by lilypond are much bigger, but if you
    include more than two of them in a TeX document and feed the pdfs to
    ghostscript's pdfwrite device, the resulting pdf is much smaller.
    The files produced by ghostscript can then be processed by pdfsizeopt.py
    for even better results.

    One example: notation.pdf of lilypond 2.19.16

       28.0 MB  original size
      300.0 MB  built with --bigpdf enabled
        5.9 MB  as above + postprocessing with gs
        4.3 MB  as above + postprocessing with gs and pdfsizeopt

I think that already is pretty clear.

A day later, in the message to James Lowe, I published statistics about the "one page with 4 small snippets" TeX document I attached to that message:

    Then translate test.tex to pdf with

      lualatex --shell-escape test
      gs -sDEVICE=pdfwrite -o testtmp.pdf test.pdf
      pdfsizeopt.py --use-multivalent=no testtmp.pdf testfinal.pdf
      dir test*pdf --sort=time | tac

    With line 23 "\def\lilyparms{ }" I get

      -rw-r--r-- 1 knut users 173538  9. Jan 11:53 test.pdf
      -rw-r--r-- 1 knut users 157303  9. Jan 11:53 testtmp.pdf
      -rw-r--r-- 1 knut users 149945  9. Jan 11:53 testfinal.pdf

    With line 23 "\def\lilyparms{ --bigpdf }" I get

      -rw-r--r-- 1 knut users 879441  9. Jan 11:55 test.pdf
      -rw-r--r-- 1 knut users  63359  9. Jan 11:55 testtmp.pdf
      -rw-r--r-- 1 knut users  59437  9. Jan 11:55 testfinal.pdf

The original patch to running.itely included:

+@item -b, --bigpdfs
+Generate really big pdf files with as less as possible
+optimization of font data.

cu,
Knut
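To make the effect of the byte counts quoted above easier to compare, the relative savings can be worked out directly. The following Python sketch only does arithmetic on the figures from the message; the helper name `reduction` and the dictionary layout are illustrative assumptions.

```python
def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Byte counts copied from the directory listings in the message above.
default_run = {"test.pdf": 173538, "testtmp.pdf": 157303, "testfinal.pdf": 149945}
bigpdf_run = {"test.pdf": 879441, "testtmp.pdf": 63359, "testfinal.pdf": 59437}

for label, run in (("default", default_run), ("--bigpdf", bigpdf_run)):
    after_gs = reduction(run["test.pdf"], run["testtmp.pdf"])
    after_opt = reduction(run["test.pdf"], run["testfinal.pdf"])
    print("%-9s gs: %5.1f%%   gs + pdfsizeopt: %5.1f%%" % (label, after_gs, after_opt))

# How much smaller the final --bigpdf result is than the final default result.
ratio = default_run["testfinal.pdf"] / bigpdf_run["testfinal.pdf"]
print("final default / final --bigpdf = %.2f" % ratio)
```

The intermediate test.pdf is roughly five times larger with --bigpdf, yet the final file is less than half the size of the default result, which is the trade-off the option name is meant to signal.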
Knut, *your* patch set has this, but James's version (in patch set 2) misses it.
On 2015/01/14 08:42:27, lemzwerg wrote:
> Knut, *your* patch set has this, but James's version (in patch set 2) misses it.

Yes that's true. I thought this was a 'lost in translation' error. So we have:

+Generate really big pdf files with as less as possible
+optimization of font data.

but now we have the additional information

"With this patch pdfs generated by lilypond are much bigger, but if you include more than two of them in a TeX document and feed the pdfs to ghostscripts pdfwrite device, the resulting pdf is much smaller."

which now makes more sense in context. Assuming this command is acceptable (there were some comments by David K in this thread that made me pause my format work on the patch so as not to waste more time), then I can include it in a new patch for review.
David's concerns are very specific to the Lilypond documentation, not covering the general case. Many programs simply can't process PS output at all, so the suggestion to collect PS data that gets reduced later on is not applicable. The only valid alternative is to make Lilypond natively produce PDF, but this is a long-term solution. And it seems to me that even then we will need a '--bigpdf' option (but implemented in a different way) to allow optimal PDF merging later on by post-processing tools. For this reason I vote to include Knut's work right now, since it quickly solves the given issue in a reliable way, with the only ugliness of having very large intermediate files.
On 2015/01/15 07:08:33, lemzwerg wrote:
> David's concerns are very specific to the Lilypond documentation, not covering
> the general case. Many programs simply can't process PS output at all, so the
> suggestion to collect PS data that gets reduced later on is not applicable.
>
> The only valid alternative is to make Lilypond natively produce PDF, but this is
> a long-term solution. And it seems to me that even then we will need a
> '--bigpdf' option (but implemented in a different way) to allow optimal PDF
> merging later on by post-processing tools.
>
> For this reason I vote to include Knut's work right now, since it quickly solves
> the given issue in a reliable way, with the only ugliness of having very large
> intermediate files.

Reliable? If I remember correctly, the tool used for combining the fonts (pdfsizeopt.py) fails on the PDF files from PDFLaTeX, so there must be an additional iteration through GhostScript. This additional iteration will reencode and resample included bitmap graphics at some command line option dependent resolution, correct?

What happens with hyperlinks? Has anybody checked those?

At any rate, I've taken a look at the description of pdfsizeopt, and it would appear that it is optimized for working on PDF files created by PDFTeX. That would imply that it would be

a) really a good idea to get along without using Ghostscript as an intermediary. That seems like it would require fixing pdfsizeopt. Its project page contains a link "Doesn't pdfsizeopt work with your PDF? Report the issue". Now there is a remarkable dearth of names on the web pages, but from other projects and content under this account and the account's name I should be surprised if this project is not owned by Szabó Péter. And I should be surprised if he does not manage to fix the problem when reported or suggest a full quality workaround.

b) in a similar vein, I'd ask Péter for suggestions about the best course for having the font compaction work without blowing up the intermediate files all too much. Of course I am speculating on him just making pdfsizeopt do all the work, but even if not, he'll be likely to come up with a good plan.

The downside to the choice of using pdfsizeopt here is that it does not currently seem to be easily available preinstalled for Ubuntu (and it has a number of dependencies making preinstallation desirable). Maybe that will change in future.

With regard to a PDF file example for pdfsizeopt, maybe reporting the Notation manual is a bit unwieldy. The "Learning" manual should likely have the same kind of problems, right?
Well, we get a large size reduction even if we don't use pdfsizeopt! Using this program is an extra bonus but not mandatory. And you are right, I hope that Péter fixes the reported issues, provided someone is going to add them to the bug tracker (which hasn't happened yet, looking at https://code.google.com/p/pdfsizeopt/issues/list). The hyperlink issue is not related to the --bigpdf option (since it is a bug in ghostscript), so I don't think that this is a showstopper. Regarding your b) issue: I fully agree. Contacting Péter might be very helpful. Nevertheless, this takes time. Given that it should be straightforward to make --bigpdf a no-op in case it is no longer useful, I still vote for incorporating the patch.
On 15.01.2015 10:45, dak@gnu.org wrote:
>
> Reliable? If I remember correctly, the tool used for combining the
> fonts (ppdfsizeopt.py)

Ghostscript does the font merging.

> fails on the PDF files from PDFLaTeX, so there
> must be an additional iteration through GhostScript. This additional
> iteration will reencode and resample included bitmap graphics at some
> command line option dependent resolution, correct?

pdfsizeopt.py does some optimization of the remaining fonts, it tries to find better compression for images, etc.

> What happens with hyperlinks? Has anybody checked those?

BTW: All this has been documented in the commit message of the git-formatted patch sent to lilypond-devel:

    Internal hyperlinks are fully preserved with current ghostscript git master.

    External hyperlinks (GoToR) _to_ a file processed this way are broken.
    Fixing this would require major changes to ghostscript.

    External hyperlinks _from_ a file processed this way to other pdfs are
    preserved if the reader program isn't broken (acroread is not broken
    in this respect, evince is).

    For more details see Ghostscript bug #695747
    <http://bugs.ghostscript.com/show_bug.cgi?id=695747#c22>

> At any rate, I've taken a look at the description of pdfsizeopt, and it
> would appear that it is optimized for working on PDF files created by
> PDFTeX. That would imply that it would be
> a) really a good idea to get along without using Ghostscript as an
> intermediary. That seems like it would require fixing pdfsizeopt. Its
> project page contains a link "Doesn't pdfsizeopt work with your PDF?
> Report the issue". Now there is a remarkable dearth of names on the web
> pages, but from other projects and content under this account and the
> account's name I should be surprised if this project is not owned by
> Szabó Péter. And I should be surprised if he does not manage to fix the
> problem when reported or suggest a full quality workaround.
> b) in a similar vein, I'd ask Péter for suggestions about the best
> course for having the font compaction work without blowing up the
> intermediate files all too much. Of course I am speculating on him just
> making pdfsizeopt do all the work, but even if not, he'll be likely to
> come up with a good plan.

The pdfsizeopt.py problems we run into (at least issues 2 and 18) have been reported since 2009, and a fix is still missing. No, I won't rely on Peter to enhance and fix his tool fast.

ghostscript is the tool that does the main work, pdfsizeopt.py is an option.

cu,
Knut
On 15.01.2015 12:49, lemzwerg@googlemail.com wrote:
>
> The hyperlink issue is not related to the --bigpdf option (since it is a
> bug in ghostscript), so I don't think that this is a showstopper.

Well, it means that the code currently cannot be used to build lilypond's own documentation.

cu,
Knut
On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
> On 15.01.2015 10:45, mailto:dak@gnu.org wrote:
> >
> > Reliable? If I remember correctly, the tool used for combining the
> > fonts (ppdfsizeopt.py)
>
> Ghostscript does the font merging.

Ok.

> BTW: All this has been documented in the commit message of the git-formatted
> patch sent to lilypond-devel:
>
> Internal hyperlinks are fully preserved with current ghostscript git master.
>
> External hyperlinks (GoToR) _to_ a file processed this way are broken.
> Fixing this would require major changes to ghostscript.
>
> External hyperlinks _from_ a file processed this way to other pdfs are
> preserved if the reader program isn't broken (acroread is not broken
> in this respect, evince is).
>
> For more details see Ghostscript bug #695747
> <http://bugs.ghostscript.com/show_bug.cgi?id=695747#c22>

If external hyperlinks from our documentation PDF to other files stop working, we cannot make this the default way of building our documentation.

The version of Ghostscript that is pertinent here is not the development master but mainly the version used in GUB, and secondarily the version of Ghostscript we expect to be current in GNU/Linux or other distributions that build LilyPond natively. There is some settling-down time as they are unlikely to use any 2.19 version (it's a development version, after all), but basically there needs to be a reasonable chance of the Ghostscript versions being fine by the time we release version 2.20.
On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
> On 15.01.2015 10:45, mailto:dak@gnu.org wrote:
> >
> > Reliable? If I remember correctly, the tool used for combining the
> > fonts (ppdfsizeopt.py)
>
> Ghostscript does the font merging.

Any idea whether something could be done to make PDFTeX do the font merging instead when including all the PDF files?
On 15.01.2015 13:15, dak@gnu.org wrote:
> On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
>
>> Ghostscript does the font merging.
>
> Any idea whether something could be done to make PDFTeX do the font
> merging instead when including all the PDF files?

No, not really. That would require a lot of work.

cu,
Knut

>
> https://codereview.appspot.com/194090043/
>
On 15/01/15 13:18, Knut Petersen wrote:
> On 15.01.2015 13:15, dak@gnu.org wrote:
>> On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
>>
>>> Ghostscript does the font merging.
>>
>> Any idea whether something could be done to make PDFTeX do the font
>> merging instead when including all the PDF files?
>
> No, not really. That would require a lot of work.
>
> cu,
> Knut
>>
>> https://codereview.appspot.com/194090043/
>>
>

So do I go ahead and continue helping Knut with this patch?

James
On 15.01.2015 13:12, dak@gnu.org wrote:
>
> If external hyperlinks from our documentation PDF to other files stop
> working, we cannot make this the default way of building our
> documentation.

Indeed. Building lilypond with --bigpdfs enabled by default is a good test for that code, nothing more, nothing less. It passes that test.

If you use pdftex, xetex, luatex or other TeX dialects that are able to directly include pdfs produced by lilypond and if you use that feature a lot, the --bigpdfs code will help you to reduce the file size of your final document significantly. Typical use would be a dissertation, a song book, etc.

cu,
Knut
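For the typical use described here (a TeX document that includes many lilypond snippets), the post-processing steps quoted earlier in the thread can be assembled in a small script. The Python sketch below only builds and prints the command lines; it does not run anything. The function name `postprocess_commands` and the `tmp`/`final` file naming are assumptions taken from the test example in the thread, and the pdfsizeopt step is optional, as noted by the reviewers.

```python
import shlex


def postprocess_commands(tex_job, use_pdfsizeopt=True):
    """Return argv lists for the steps quoted in the thread.

    tex_job is the TeX job name without extension, e.g. "test".
    """
    pdf = tex_job + ".pdf"
    tmp = tex_job + "tmp.pdf"
    final = tex_job + "final.pdf"
    cmds = [
        # Compile the document; lilypond snippets are built via shell escape.
        ["lualatex", "--shell-escape", tex_job],
        # Let ghostscript's pdfwrite device merge the duplicated font data.
        ["gs", "-sDEVICE=pdfwrite", "-o", tmp, pdf],
    ]
    if use_pdfsizeopt:
        # Optional extra pass, as invoked in the thread.
        cmds.append(["pdfsizeopt.py", "--use-multivalent=no", tmp, final])
    return cmds


for argv in postprocess_commands("test"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each argv list could be handed to `subprocess.run(argv, check=True)` to actually execute the step, assuming the tools are installed and on the PATH.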
On 2015/01/15 13:18:46, Knut_Petersen_t-online.de wrote:
> On 15.01.2015 13:15, mailto:dak@gnu.org wrote:
> > On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
> >
> >> Ghostscript does the font merging.
> >
> > Any idea whether something could be done to make PDFTeX do the font
> > merging instead when including all the PDF files?
>
> No, not really. That would require a lot of work.

Judging from the documentation, that should be the default (namely, when \pdfinclusioncopyfonts is at its default value of 0 and we are talking about Type1 fonts). Cf
<URL:http://tex.stackexchange.com/questions/136574/merging-duplicate-embedded-fonts#138726>
for example. So the question is what is keeping this from happening. Maybe we need to call ps2pdf (when converting the fragments for inclusion) with some particular options to keep the fonts in a mergeable state?
On 15.01.2015 14:47, dak@gnu.org wrote:
> On 2015/01/15 13:18:46, Knut_Petersen_t-online.de wrote:
>> On 15.01.2015 13:15, mailto:dak@gnu.org wrote:
>> > On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
>> >
>> >> Ghostscript does the font merging.
>> >
>> > Any idea whether something could be done to make PDFTeX do the font
>> > merging instead when including all the PDF files?
>
>> No, not really. That would require a lot of work.
>
> Judging from the documentation, that should be the default (namely, when
> \pdfinclusioncopyfonts is at its default value of 0 and we are talking
> about Type1 fonts). Cf
> <URL:http://tex.stackexchange.com/questions/136574/merging-duplicate-embedded-fonts#138726>
> for example. So the question is what is keeping this from happening.
> Maybe we need to call ps2pdf (when converting the fragments for
> inclusion) with some particular options to keep the fonts in a mergeable

Current lilypond uses glyphshow to draw glyphs in postscript, encoding vectors are not present.

As there is no direct equivalent of the postscript glyphshow operator in the pdf language, ghostscript constructs a _new_ font with an encoding vector, including only the subset of glyphs used in the document. Ghostscript then uses the glyphs in that font indexed by the new encoding vector. The name of that new font is derived from the original.

It's pretty unlikely that two lilypond scores use exactly the same subset of glyphs in exactly the same order, so it's pretty likely that the two new fonts are not identical. But they share (aside from the prefix) their name.

pdflatex would need to inspect all glyphs in those fonts, detect which are identical and construct up to three fonts (remember the size limit of encoding vectors and the number of emmentaler glyphs) with encodings from the fonts found in the lilypond pdfs. It then would need to recode the data stream and everything would be fine.

Identical fonts could be expected to be merged by default and without problems (my interpretation of the pdftex documentation is different). But without --bigpdfs you do not have identical fonts in the lilypond pdfs, even if you instruct ghostscript not to subset fonts. It ignores that order because it cannot obey.

cu,
Knut
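The argument above, that per-document subset fonts rarely come out identical while fonts with a fixed encoding vector do, can be illustrated with a toy model. This Python sketch is not what ghostscript does internally: assigning codes in order of first use is an assumption made only for illustration, and the function names are hypothetical. The glyph names are the ones quoted from ps/encodingdefs.ps earlier in the review.

```python
def subset_encoding(glyphs_in_order_of_use):
    """Toy model of a per-document subset font.

    Assumption: codes are handed out in order of first use, starting at 1.
    """
    encoding = [".notdef"]
    for glyph in glyphs_in_order_of_use:
        if glyph not in encoding:
            encoding.append(glyph)
    return encoding


def fixed_encoding(all_glyphs):
    """A document-independent encoding vector shared by every score."""
    return [".notdef"] + sorted(all_glyphs)


# Two scores that use overlapping glyphs in a different order.
score_a = ["noteheads.d0fa", "clefs.blackmensural.c"]
score_b = ["clefs.blackmensural.c", "noteheads.d0doFunk"]

enc_a = subset_encoding(score_a)
enc_b = subset_encoding(score_b)
print("subset fonts identical:", enc_a == enc_b)
print("code of clefs.blackmensural.c:",
      enc_a.index("clefs.blackmensural.c"),
      "vs", enc_b.index("clefs.blackmensural.c"))

# With one fixed vector, every score refers to a glyph by the same code.
shared = fixed_encoding(set(score_a) | set(score_b))
print("shared code:", shared.index("clefs.blackmensural.c"))
```

In the model the same glyph ends up at different codes in the two subset fonts, so a later tool cannot simply treat the fonts as duplicates; with a shared vector the code is the same in every score.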
On 2015/01/15 15:06:06, Knut_Petersen_t-online.de wrote:
> On 15.01.2015 14:47, mailto:dak@gnu.org wrote:
> > On 2015/01/15 13:18:46, Knut_Petersen_t-online.de wrote:
> >> On 15.01.2015 13:15, mailto:dak@gnu.org wrote:
> >> > On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:
> >> >
> >> >> Ghostscript does the font merging.
> >> >
> >> > Any idea whether something could be done to make PDFTeX do the font
> >> > merging instead when including all the PDF files?
> >
> >> No, not really. That would require a lot of work.
> >
> > Judging from the documentation, that should be the default (namely, when
> > \pdfinclusioncopyfonts is at its default value of 0 and we are talking
> > about Type1 fonts). Cf
> >
> <URL:http://tex.stackexchange.com/questions/136574/merging-duplicate-embedded-fonts#138726>
> > for example. So the question is what is keeping this from happening.
> > Maybe we need to call ps2pdf (when converting the fragments for
> > inclusion) with some particular options to keep the fonts in a mergeable
>
> Current lilypond uses glyphshow to draw glyphs in postscript,
> encoding vectors are not present.
>
> As there is no direct equivalent of the postscript glyphshow operator
> in the pdf language, ghostscript constructs a _new_ font with an
> encoding vector, including only the subset of glyphs used in the document.
> Ghostscript then uses the glyphs in that font indexed by the new encoding
> vector. The name of that new font is derived from the original.
>
> It's pretty unlikely that two lilypond scores use the exactly same
> subset of glyphs in exactly the same order, so it's pretty likely that
> the two new fonts are not identical. But they share (aside from the
> prefix) their name.
>
> pdflatex would need to inspect all glyphs in those fonts, detect which are
> identical and construct up to three fonts (remember the size limit of
> encoding vectors and the number of emmentaler glyphs) with encodings
> from the fonts found in the lilypond pdfs. It then would need to recode
> the data stream and everything would be fine.
>
> Identical fonts could be expected to be merged by default and without
> problems (my interpretation of the pdftex documentation is different).

PDFTeX apparently does merge subsetted fonts, so I don't think we should need to include the complete fonts in order to get font merging. But we probably should work with coding vectors so that we can use identical font names, just sparsely populated. I would then expect PDFTeX to merge the sparsely populated fonts of identical name unless \pdfinclusioncopyfonts is set.

Redundant coding vectors should have much less of an impact on the intermediate file size than the full Emmentaler fonts. Right?
On 15.01.2015 17:01, dak@gnu.org wrote:
>
> PDFTeX apparently does merge subsetted fonts,

Which version?

> so I don't think we should
> need to include the complete fonts in order to get font merging. But we
> probably should work with coding vectors so that we can use identical
> font names, just sparsely populated.

ghostscript, called by lilypond to produce pdfs, needs three encoding vectors for the emmentaler glyphs, and writes three copies of the font to the pdf file. I don't see a way to avoid that.

> I would then expect PDFTeX to
> merge the sparsely populated fonts of identical name unless
> \pdfinclusioncopyfonts is set.
>
> Redundant coding vectors should have much less of an impact on the
> intermediate file size than the full Emmentaler fonts. Right?

That would mean to change e.g.

    /Emmentaler-18 findfont dup length dict copy begin
    /Encoding LilyNoteHeadEncoding def
    /Emmentaler-18-N currentdict definefont pop
    end

    /Emmentaler-18 findfont dup length dict copy begin
    /Encoding LilyScriptEncoding def
    /Emmentaler-18-S currentdict definefont pop
    end

    /Emmentaler-18 findfont dup length dict copy begin
    /Encoding LilyOtherEncoding def
    /Emmentaler-18-O currentdict definefont pop
    end

in a way that gs does not include three copies of the Emmentaler-18 font. I don't think that is possible.

cu,
Knut

>
> https://codereview.appspot.com/194090043/
>
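For readers following the "three encoding vectors" point: a PostScript encoding vector has 256 slots, so a font with more glyphs than that must be addressed through several vectors. The Python sketch below only illustrates that arithmetic and is not part of the patch. The function name `split_into_encodings`, the placeholder glyph names and the count of 600 are assumptions made for the example; the vector names in the message above (LilyNoteHeadEncoding, LilyScriptEncoding, LilyOtherEncoding) suggest the patch groups glyphs by kind rather than by position.

```python
def split_into_encodings(glyph_names, slots=256):
    """Split glyph names into encoding vectors of `slots` entries each.

    Slot 0 of every vector is reserved for .notdef, so each vector
    carries at most slots - 1 real glyphs.
    """
    usable = slots - 1
    vectors = []
    for start in range(0, len(glyph_names), usable):
        chunk = glyph_names[start:start + usable]
        vector = [".notdef"] + chunk
        # Pad unused slots so every vector has exactly `slots` entries.
        vector += [".notdef"] * (slots - len(vector))
        vectors.append(vector)
    return vectors


# Hypothetical glyph list: the real Emmentaler glyph count is not stated
# in this thread, 600 is only a stand-in larger than two vectors can hold.
glyphs = [f"glyph{i}" for i in range(600)]
vectors = split_into_encodings(glyphs)
print(len(vectors), "encoding vectors needed")
```

Under these assumptions every glyph is reachable through a one-byte code in exactly one vector, which is what plain `show` needs.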
Knut Petersen <Knut_Petersen@t-online.de> writes:

> On 15.01.2015 17:01, dak@gnu.org wrote:
>>
>> PDFTeX apparently does merge subsetted fonts,
>
> Which version?

Those versions having the \pdfinclusioncopyfonts setting?

>> so I don't think we should
>> need to include the complete fonts in order to get font merging. But we
>> probably should work with coding vectors so that we can use identical
>> font names, just sparsely populated.
>
> ghostscript, called by lilypond to produce pdfs, needs three encoding
> vectors for the emmentaler glyphs, and writes three copies of the font
> to the pdf file. I don't see a way to avoid that.

There must be some workable solution for Asian fonts I should hope. I can't believe that some 10000 character font would be included 10000/256 times in a PDF file when thoroughly used.

I don't have any workable experience myself. It's just a "this can't possibly be the whole truth" feeling. Sometimes it just overtaxes my imagination what kind of thing people are willing to put up with. But I sure hope this is not another such case.
I have not yet used LilyPond with TeX, so I have no opinion.

I looked up the history: Use of PostScript glyphshow, rather than show, for all characters seems to have started with
http://git.savannah.gnu.org/gitweb/?p=lilypond.git;a=commitdiff;h=c3d64971575...
after some discussion at
http://lists.gnu.org/archive/html/lilypond-devel/2004-10/msg00150.html

The Ghostscript maintainer explains that for purposes of creating a PDF, ghostscript emulates glyphshow by collecting the glyphshow-ed glyphs into a font with an encoding:
http://bugs.ghostscript.com/show_bug.cgi?id=695728

When several segments of Lilypond output are merged into one PDF, several of these fonts created on-the-fly take up space in the output file.

https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps
File ps/encodingdefs.ps (right):

https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps#newcode8
ps/encodingdefs.ps:8: /LilyNoteHeadEncoding [
This is a little different from "FetaNoteheadsEncoding" in 'mf/out/feta-noteheads11.enc' generated by 'scripts/build/mf-to-table.py'

https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps#newcod...
ps/encodingdefs.ps:108: /noteheads.d0doFunk {<01> show} def
I suppose you use PS definitions here because the Scheme code does not know the encoding table, so the Scheme does not know what number to write in '\?? show'

https://codereview.appspot.com/194090043/diff/20001/scm/output-ps.scm
File scm/output-ps.scm (right):

https://codereview.appspot.com/194090043/diff/20001/scm/output-ps.scm#newcode64
scm/output-ps.scm:64: (ly:inexact->string i 8)))
This is the old code that output PS 'show' rather than 'glyphshow'
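The effect of those on-the-fly fonts can be shown with a small Python sketch. It is only a model of the idea described in this thread, not Ghostscript's actual algorithm: `subset_encoding` assigns codes in order of first use, while `fixed_codes` looks codes up in a shared table. The function names, the two example scores and the code numbers in `FIXED` are all assumptions made for illustration.

```python
def subset_encoding(glyphs_used):
    """Assign codes in order of first use (code 0 is left for .notdef)."""
    encoding = {}
    for name in glyphs_used:
        if name not in encoding:
            encoding[name] = len(encoding) + 1
    return encoding


def fixed_codes(glyphs_used, table):
    """Look up each used glyph in a shared, predefined encoding table."""
    return {name: table[name] for name in glyphs_used}


# Hypothetical shared table; the code numbers are invented for the example.
FIXED = {"noteheads.s1": 1, "noteheads.s2": 2, "clefs.G": 3, "rests.0": 4}

# Two hypothetical scores using overlapping glyphs in different order.
score_a = ["noteheads.s2", "clefs.G", "noteheads.s2", "rests.0"]
score_b = ["clefs.G", "noteheads.s1", "noteheads.s2"]

enc_a = subset_encoding(score_a)
enc_b = subset_encoding(score_b)
print("on the fly:", enc_a["clefs.G"], enc_b["clefs.G"])
print("fixed:", fixed_codes(score_a, FIXED)["clefs.G"],
      fixed_codes(score_b, FIXED)["clefs.G"])
```

In this model the same glyph receives different codes in the two per-score encodings, so the resulting fonts differ even though their names match; with a shared table the codes agree, and the subsets differ only in which slots are populated.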
On 2015/01/11 20:13:29, lemzwerg wrote:
> Thanks for your help and assistance!
>
> > > For better orientation, please reformat this to have a fixed number
> > > of entries per line (I suggest 4 items),
> >
> > I did 3 items maximum and kept things within the line length.
>
> Basically, this is fine. However, three is incommensurable to 16, ...
>
> > Werner, I don't understand what you mean by "...together with
> > comments that indicate the current index (something like `% 0x50')."
>
> Example:
>
> % 0x00
> /foo /bar /baz
> /bla /buf /urgh
> /pfft /zap /boinck
> % 0x09
> ...

Done.

> > > Please reformat this (and similar) code to stay within the
> > > 80-characters-per-line limit if possible.
> >
> > Hope this is better (I don't code so am not sure if there are
> > specific 'rules' about how scm is formatted here.
>
> It looks better, thanks. In case of doubt, the formatting produced by Emacs is
> the one we follow.

Can you point me to some pages that explain this on the web?
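The layout Werner asks for (a fixed number of entries per line, plus comments giving the current index) could be generated rather than typed by hand. The following Python sketch shows one way to do so; `format_encoding` is a hypothetical helper, not something in the LilyPond tree, and it writes an index comment before every row with four entries per row so that the indices line up with multiples of 16.

```python
def format_encoding(name, glyph_names, per_line=4):
    """Render a PostScript encoding array with index comments per row."""
    lines = [f"/{name} ["]
    for index in range(0, len(glyph_names), per_line):
        row = glyph_names[index:index + per_line]
        # Comment with the index of the first entry in this row.
        lines.append(f"  % 0x{index:02X}")
        lines.append("  " + " ".join("/" + glyph for glyph in row))
    lines.append("] def")
    return "\n".join(lines)


# The first three names come from the patch; "foo" and "bar" are fillers.
sample = [".notdef", "noteheads.d0doFunk", "noteheads.d0fa", "foo", "bar"]
formatted = format_encoding("LilyNoteHeadEncoding", sample)
print(formatted)
```

With four entries per row, every fourth comment falls on a multiple of 0x10, which is the point of the remark that three is incommensurable to 16.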
Corrections as per Werner, plus new edit to the English doc
Thanks for your patience. https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru... File Documentation/de/usage/running.itely (right): https://codereview.appspot.com/194090043/diff/20001/Documentation/de/usage/ru... Documentation/de/usage/running.itely:166: notation.pdf (Lilypond 2.18.2) ist ca. 26MB groß, erzeugt On 2015/01/11 20:13:45, lemzwerg wrote: > ... ca.@: 26MB ... See comment above https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni... File Documentation/usage/running.itely (right): https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni... Documentation/usage/running.itely:187: font data which can make significant reductions in file size. On 2015/01/12 05:56:14, lemzwerg wrote: > One point is missing: Why is the option called --bigpdfs if the we get > significant reductions in file size? I think this has now been explained and I have also rewritten the explanation based on the main thread of the tracker. https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni... Documentation/usage/running.itely:199: Using @code{pdfsizeopt.py} can then be used to further optimize the size On 2015/01/12 05:56:14, lemzwerg wrote: > I would start with > > Optionally, using @code{pdfsizeopt.py} ... > > to indicate that this is an extra bonus. See above. https://codereview.appspot.com/194090043/diff/20001/Documentation/usage/runni... Documentation/usage/running.itely:206: On 2015/01/12 05:56:14, lemzwerg wrote: > Should the (current) problem with external links be mentioned here too? See above. 
https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps File ps/encodingdefs.ps (right): https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps#newcode8 ps/encodingdefs.ps:8: /LilyNoteHeadEncoding [ On 2015/01/18 06:33:02, Keith wrote: > This is a little different from "FetaNoteheadsEncoding" in > 'mf/out/feta-noteheads11.enc' generated by 'scripts/build/mf-to-table.py' I need some advice on what to do here from Knut. https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps#newcod... ps/encodingdefs.ps:108: /noteheads.d0doFunk {<01> show} def On 2015/01/18 06:33:02, Keith wrote: > I suppose you use PS definitions here because the Scheme code does not know the > encoding table, so the Scheme does not know what number to write in '\?? show' Again I cannot comment - Knut? https://codereview.appspot.com/194090043/diff/20001/scm/output-ps.scm File scm/output-ps.scm (right): https://codereview.appspot.com/194090043/diff/20001/scm/output-ps.scm#newcode64 scm/output-ps.scm:64: (ly:inexact->string i 8))) On 2015/01/18 06:33:02, Keith wrote: > This is the old code that output PS 'show' rather than 'glyphshow' I don't know what to do here (if anything).
Thank *you* for your hard work. Here's the next round of comments. https://codereview.appspot.com/194090043/diff/40001/Documentation/de/usage/ru... File Documentation/de/usage/running.itely (right): https://codereview.appspot.com/194090043/diff/40001/Documentation/de/usage/ru... Documentation/de/usage/running.itely:160: pdftex-, xetex-, oder luatex-Dokumente eingebettet werden; @w{pdftex-}, @w{xetex-}, https://codereview.appspot.com/194090043/diff/40001/Documentation/de/usage/ru... Documentation/de/usage/running.itely:162: zusammenzufassen. Mit pdfsizeopt.py sind dann noch weitere @file{pdfsizeopt.py} https://codereview.appspot.com/194090043/diff/40001/Documentation/de/usage/ru... Documentation/de/usage/running.itely:166: notation.pdf (Lilypond 2.18.2) ist ca. 26MB groß, erzeugt @file{notation.pdf} https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... File Documentation/usage/running.itely (right): https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:185: @code{PDF} files generated will be much larger than normal (due to I think there is no reason to use @code{...} here. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:186: little or no font optimization). However, if two or more files are Please always use two spaces after a full stop. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:187: to be included within @w{@code{pdftex}}, @w{@code{xetex} or Ditto: no @code necessary, since you don't talk about the binary itself but the respective program in general. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:189: gostscript) resulting @code{PDF} in @emph{significantly} smaller files. s/gostscript/ghostscript. And somehow the whole sentence seems to be mangled... 
https://codereview.appspot.com/194090043/diff/40001/lily/general-scheme.cc File lily/general-scheme.cc (right): https://codereview.appspot.com/194090043/diff/40001/lily/general-scheme.cc#ne... lily/general-scheme.cc:307: "Return true if the command line includes the --bigpdf parameter." @option{--bigpdf} https://codereview.appspot.com/194090043/diff/40001/lily/main.cc File lily/main.cc (right): https://codereview.appspot.com/194090043/diff/40001/lily/main.cc#newcode161 lily/main.cc:161: {0, "bigpdfs", 'b', _i("generate big pdf files")}, s/pdf/PDF/
>> In case of doubt, the formatting produced by Emacs is the one we >> follow. > > Can you point me to some pages that explain this on the web? Well, I can describe what I do: I simply load the Scheme file into emacs, which automatically selects Scheme mode. Then I press the tab key line per line to force a re-indentation. I could do this in one rush, but I prefer doing it more slowly... Werner
On Tue, 20 Jan 2015 13:19:26 -0800, <pkx166h@gmail.com> wrote:

> https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps#newcode8
> ps/encodingdefs.ps:8: /LilyNoteHeadEncoding [
> On 2015/01/18 06:33:02, Keith wrote:
>> This is a little different from "FetaNoteheadsEncoding" in
>> 'mf/out/feta-noteheads11.enc' generated by 'scripts/build/mf-to-table.py'
>
> I need some advice on what to do here from Knut.
>
> https://codereview.appspot.com/194090043/diff/20001/ps/encodingdefs.ps#newcod...
> ps/encodingdefs.ps:108: /noteheads.d0doFunk {<01> show} def
> On 2015/01/18 06:33:02, Keith wrote:
>> I suppose you use PS definitions here because the Scheme code does not know the
>> encoding table, so the Scheme does not know what number to write in '\?? show'
>
> Again I cannot comment - Knut?
>
> https://codereview.appspot.com/194090043/diff/20001/scm/output-ps.scm
> File scm/output-ps.scm (right):
>
> https://codereview.appspot.com/194090043/diff/20001/scm/output-ps.scm#newcode64
> scm/output-ps.scm:64: (ly:inexact->string i 8)))
> On 2015/01/18 06:33:02, Keith wrote:
>> This is the old code that output PS 'show' rather than 'glyphshow'
>
> I don't know what to do here (if anything).

Nothing need be done about these three. They were only suggestions for simplification -- and not even suggestions, just hints that simplification might be possible.
Corrections from Werner, some scm formatting and TexInfo syntax fixes
Thanks to all https://codereview.appspot.com/194090043/diff/40001/Documentation/de/usage/ru... File Documentation/de/usage/running.itely (right): https://codereview.appspot.com/194090043/diff/40001/Documentation/de/usage/ru... Documentation/de/usage/running.itely:160: pdftex-, xetex-, oder luatex-Dokumente eingebettet werden; On 2015/01/20 22:38:04, lemzwerg wrote: > @w{pdftex-}, @w{xetex-}, Could someone translate the English version I have now done? Then I can put it into the patch. Else the German documentation will bear no resemblance to the English. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... File Documentation/usage/running.itely (right): https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:185: @code{PDF} files generated will be much larger than normal (due to On 2015/01/20 22:38:05, lemzwerg wrote: > I think there is no reason to use @code{...} here. Done. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:186: little or no font optimization). However, if two or more files are On 2015/01/20 22:38:05, lemzwerg wrote: > Please always use two spaces after a full stop. Done. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:187: to be included within @w{@code{pdftex}}, @w{@code{xetex} or On 2015/01/20 22:38:04, lemzwerg wrote: > Ditto: no @code necessary, since you don't talk about the binary itself but the > respective program in general. Done. https://codereview.appspot.com/194090043/diff/40001/Documentation/usage/runni... Documentation/usage/running.itely:189: gostscript) resulting @code{PDF} in @emph{significantly} smaller files. On 2015/01/20 22:38:05, lemzwerg wrote: > s/gostscript/ghostscript. And somehow the whole sentence seems to be mangled... OK yes, I guess I had been looking at this too long. 
I've reworded bits of this again, see if it makes more sense now. https://codereview.appspot.com/194090043/diff/40001/lily/general-scheme.cc File lily/general-scheme.cc (right): https://codereview.appspot.com/194090043/diff/40001/lily/general-scheme.cc#ne... lily/general-scheme.cc:307: "Return true if the command line includes the --bigpdf parameter." On 2015/01/20 22:38:05, lemzwerg wrote: > @option{--bigpdf} Done. https://codereview.appspot.com/194090043/diff/40001/lily/main.cc File lily/main.cc (right): https://codereview.appspot.com/194090043/diff/40001/lily/main.cc#newcode161 lily/main.cc:161: {0, "bigpdfs", 'b', _i("generate big pdf files")}, On 2015/01/20 22:38:05, lemzwerg wrote: > s/pdf/PDF/ Done.
On 2015/01/20 22:41:38, wl_gnu.org wrote:
> >> In case of doubt, the formatting produced by Emacs is the one we
> >> follow.
> >
> > Can you point me to some pages that explain this on the web?
>
> Well, I can describe what I do: I simply load the Scheme file into
> emacs, which automatically selects Scheme mode. Then I press the tab
> key line per line to force a re-indentation. I could do this in one
> rush, but I prefer doing it more slowly...
>
> Werner

Thanks for that, that seemed easy enough.

James
On 2015/01/24 14:10:46, J_lowe wrote:
> On 2015/01/20 22:41:38, wl_gnu.org wrote:
> > >> In case of doubt, the formatting produced by Emacs is the one we
> > >> follow.
> > >
> > > Can you point me to some pages that explain this on the web?
> >
> > Well, I can describe what I do: I simply load the Scheme file into
> > emacs, which automatically selects Scheme mode. Then I press the tab
> > key line per line to force a re-indentation. I could do this in one
> > rush, but I prefer doing it more slowly...
> >
> > Werner
>
> Thanks for that, that seemed easy enough.

Probably not as easy as running scripts/auxiliar/fixscm.sh on the file which does approximately the same.
Ah, nice! I wasn't aware of this script. Thanks for mentioning it.
Patch on countdown for Jan 29th
Can someone give me the translation in German for: "PDF files generated will be much larger than normal (due to little or no font optimization). However, if two or more PDF files are included within pdftex, xetex or luatex documents they can then be processed further via ghostscript (merging duplicated font data) resulting in significantly smaller PDF files." I can then put the TexInfo syntax around the appropriate parts and push this patch.
Change of German Translation - if and when this compiles this will get pushed. The rest of the patch has done the countdown.
Knut,

author Knut Petersen <knut_petersen@online.de>
Thu, 8 Jan 2015 18:00:44 +0000 (18:00 +0000)
committer James Lowe <pkx166h@gmail.com>
Sat, 31 Jan 2015 12:57:08 +0000 (12:57 +0000)
commit cd5b559ab016dad5100eab3105218df94ab9f402

Thanks for your work.

James