|
|
Created:
9 years, 11 months ago by Valentin Villenave Modified:
6 years ago Reviewers:
dak CC:
lilypond-devel_gnu.org Visibility:
Public. |
DescriptionEmbed LilyPond source files inside generated PDF
This feature has been requested by Samuel Da Mota,
who even wrote a patch in 2012. Unlike his approach,
this attempt is fully integrated in Lily’s pipeline and
does not involve external programs (namely bash and the
Java-based pdftk).
What I’m using internally is the pdfmark EMBED feature,
unlike Reinhold’s comment on issue 2643 which involved
ANN (and therefore led to a visible modification
of the PDF). It should be handled by most PDF viewers,
and even those who don’t should degrade gracefully.
One caveat though: EMBED is not handled by the old version
of GhostScript we’re still bundling; it requires a newer
version.
I also had to mess with some of the source code and create
a new ly:source-files function. It involves a mixture
of Scheme and C++ code (no smobs though), as well as one
new toplevel variable. By default all files in DATADIR
will be disregarded (ly/*-init.ly etc.), but there’s a
boolean switch in case one would need them too.
Patch Set 1 #Patch Set 2 : (Syntax consistency nitpick) #
Total comments: 14
Patch Set 3 : Some improvements (but problems still exist) #Patch Set 4 : Use mark ___ pdfmark instead of [ ____ pdfmark #Patch Set 5 : Retrieve source file paths directly from Includable-lexer:: class #
Total comments: 1
Patch Set 6 : Access source files only through LilyPond. #
Total comments: 8
Patch Set 7 : Make parser arg optional #Patch Set 8 : Remove source file heading #
MessagesTotal messages: 24
Greetings everybody, since there have been some discussions about updating GhostScript lately, perhaps now would be a convenient time to resurrect this feature... https://lists.gnu.org/archive/html/lilypond-user/2012-07/msg00105.html https://code.google.com/p/lilypond/issues/detail?id=2643 https://lists.gnu.org/archive/html/lilypond-devel/2012-07/msg00603.html Unlike Sam Da Mota’s patch (or what Frescobaldi offered once upon a time), my approach doesn’t rely on external programs (pdftk is heavy and Java-centric, and has been banned from some distros due to licensing concerns). Instead, I’m using internally the pdfmark EMBED feature, a bit like Reinhold’s comment on issue 2643 (but he used ANN and a visible PDF annotation, where as my patch produces more cleanly-attached source files. Therefore there’s a good chance it *could* work on other OSes too (although that’s untested as of yet). EMBED is not supported by the old version of GhostScript we’re still bundling, so now would definitely be a good time for an upgrade. I also had to mess with some of the source code and create a new ly:source-files function. It involves a mixture of Scheme and C++ code (no smobs though), as well as one new toplevel variable. Unlike Sam’s patch, I didn’t bother to make a list of distribution-included init files; I’m just discarding everything in LILYPOND_DATADIR (ly/*-init.ly etc.). I don’t believe that ly:parser-include-strings hacks will be handled properly, however. Cheers, Valentin.
Sign in to reply to this message.
(Syntax consistency nitpick)
Sign in to reply to this message.
https://codereview.appspot.com/225040043/diff/40001/lily/include/sources.hh File lily/include/sources.hh (right): https://codereview.appspot.com/225040043/diff/40001/lily/include/sources.hh#n... lily/include/sources.hh:27: static SCM source_file_list; static? That means that every compilation unit including this header file gets a private copy of source_file_list. Hardly a good idea. https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc File lily/sources.cc (right): https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc#newcode78 lily/sources.cc:78: source_file_list = scm_cons (ly_string2scm (file_string), source_file_list); source_file_list is a global variable here (and I might add that you forget to protect it from garbage collection, likely inducing crashes, but that's a different problem). So if I do lilypond -fembed-source-files a.ly b.ly the PDF for b.ly will include the source(s) for a.ly. That's undesirable. Is there any reason you don't rely on the per-parser array source_files_ for your list of source files? https://codereview.appspot.com/225040043/diff/40001/scm/backend-library.scm File scm/backend-library.scm (right): https://codereview.appspot.com/225040043/diff/40001/scm/backend-library.scm#n... scm/backend-library.scm:85: "-P" I read in "man gs" -P Makes Ghostscript to look first in the current directory for library files. By default, Ghostscript no longer looks in the current directory, unless, of course, the first explicitly sup‐ plied directory is "." in -I. See also the INITIALIZATION FILES section below, and bundled Use.htm for detailed discussion on search paths and how Ghostcript finds files. This seems quite unrelated to the issue of this patch and the rationale for specifying this option is totally undocumented. What is this for, and should this not rather be a separate patch if necessary for some reason for newer versions of GhostScript? https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:421: (map (lambda (fname) Use of `map' instead of `for-each'. https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:422: (set! idx (1+ idx)) quite unschemish. It seems cleaner to just give two list arguments to for-each, the second being (iota (length source-list)). Either that, or use a named let for your loop. https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:425: [ /_objdef {ly~a_stream} /type /stream /OBJ pdfmark [ ... pdfmark is a really ugly pairing. I'd rather use mark ... pdfmark
Sign in to reply to this message.
Here’s a slightly better-looking version (but the C++ part needs more work once I figure out scoping and GC-related questions). https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc File lily/sources.cc (right): https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc#newcode78 lily/sources.cc:78: source_file_list = scm_cons (ly_string2scm (file_string), source_file_list); On 2015/04/09 13:37:42, dak wrote: > any reason you don't rely on the per-parser array source_files_ for your > list of source files? Assuming you’re referring to the sourcefiles_ vector, that was my plan initially. How can I access it outside its (Sources constructor/destructor) scope so that I can convert it into a Scheme list? Every attempt I made results in segfaulting. > if I do lilypond > -fembed-source-files a.ly b.ly > the PDF for b.ly will include the source(s) for a.ly. Hm, indeed. (And with more than two .ly files, Guile GC complains.) I hoped the SCM_INIT_FUNC mechanism would prevent it, but it obviously gets initialized only once. > I might add that you forget to protect it from garbage collection, likely inducing crashes I couldn’t find an example of a protected global variable (protected: is always used inside a class, and scm_gc_mark in engravers or iterators). But I’ll be happy to learn more. https://codereview.appspot.com/225040043/diff/40001/scm/backend-library.scm File scm/backend-library.scm (right): https://codereview.appspot.com/225040043/diff/40001/scm/backend-library.scm#n... scm/backend-library.scm:85: "-P" On 2015/04/09 13:37:42, dak wrote: > This seems quite unrelated to the issue of this patch and the rationale for > specifying this option is totally undocumented. The patch just doesn’t work without it. (Not does it work, as I said, with older Ghostscript releases like 8.7 that we’re distributing with GUB.) From what I understand, -P makes Ghostscript look for external files (not just libraries but also, as is the case here, embed-able files) in the current directory (or using paths relatively to the current dir). Which is the desired behavior here, since we want it to find the ly file (as well as any \includ-ed source files). https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:421: (map (lambda (fname) On 2015/04/09 13:37:42, dak wrote: > Use of `map' instead of `for-each'. Done. https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:422: (set! idx (1+ idx)) On 2015/04/09 13:37:42, dak wrote: > seems cleaner to just give two list arguments to for-each, > the second being (iota (length source-list)). Done. That’s beautiful. https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:425: [ /_objdef {ly~a_stream} /type /stream /OBJ pdfmark On 2015/04/09 13:37:42, dak wrote: > [ ... pdfmark is a really ugly pairing. I'd rather use mark ... pdfmark I’m not sure I can use anything else than pdfmark, since it appears to be the official keyword in Adobe’s specification: http://milan.kupcevic.net/ghostscript-ps-pdf/#marks
Sign in to reply to this message.
Some improvements (but problems still exist)
Sign in to reply to this message.
https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/40001/scm/framework-ps.scm#newc... scm/framework-ps.scm:425: [ /_objdef {ly~a_stream} /type /stream /OBJ pdfmark On 2015/04/10 08:10:57, Valentin Villenave wrote: > On 2015/04/09 13:37:42, dak wrote: > > [ ... pdfmark is a really ugly pairing. I'd rather use mark ... pdfmark > > I’m not sure I can use anything else than pdfmark, since it appears to be the > official keyword in Adobe’s specification: > http://milan.kupcevic.net/ghostscript-ps-pdf/#marks Oh, pdfmark is fine. I was talking about replacing "[" with "mark" here. They are equivalent in effect (namely pushing a mark on stack), but "[" is syntacticelly much more appropriate for pairing with "]" (which scoops up everything on the stack up to a mark which is removed and replaced by a vector containing the scooped content). PostScript, being a stack machine language, does not have much in the line of syntax (which is a feature it shares with Scheme) but there is little point in using the existing commands in a manner that appear to create unmatched brackets.
Sign in to reply to this message.
Use mark ___ pdfmark instead of [ ____ pdfmark
Sign in to reply to this message.
On 2015/04/10 08:58:36, dak wrote: > I was talking about replacing "[" with "mark" here. They > are equivalent in effect (namely pushing a mark on stack), but "[" is > syntactically much more appropriate for pairing with "]" Oh, I get it now. Thanks! (Adobe’s documentation only uses [, which does gives a weird feeling à la XKCD 859).
Sign in to reply to this message.
https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc File lily/sources.cc (right): https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc#newcode78 lily/sources.cc:78: source_file_list = scm_cons (ly_string2scm (file_string), source_file_list); On 2015/04/10 08:10:57, Valentin Villenave wrote: > On 2015/04/09 13:37:42, dak wrote: > > any reason you don't rely on the per-parser array source_files_ for your > > list of source files? > > Assuming you’re referring to the sourcefiles_ vector, that was my plan > initially. How can I access it outside its (Sources constructor/destructor) > scope so that I can convert it into a Scheme list? You cannot access any value after its destructor has been called or equivalently its dynamic scope ended. The destructor of some Smob data structure is called by the Scheme garbage collector. Sources is not a Scheme controlled structure, but its containing lexer is, and its contained Source_file elements are. Somewhat awkwardly, they are not protected through the usual mark procedures but by retaining them in protected state until the destructor of Sources is called. At any rate: while a lexer is Scheme-accessible, the elements of its sourcefiles_ structure should be accessible as well. In this particular case, it would appear that Includable_lexer::file_name_strings_ already contains properly expanded file names after the file name search has succeeded, so that would seem to be the smarter thing to reference. > > if I do lilypond > > -fembed-source-files a.ly b.ly > > the PDF for b.ly will include the source(s) for a.ly. > > Hm, indeed. (And with more than two .ly files, Guile GC complains.) I hoped the > SCM_INIT_FUNC mechanism would prevent it, but it obviously gets initialized only > once. > > > I might add that you forget to protect it from garbage collection, likely > inducing crashes > > I couldn’t find an example of a protected global variable (protected: is always > used inside a class, and scm_gc_mark in engravers or iterators). But I’ll be > happy to learn more. protected: has nothing to do with protection from garbage collection: it is a C++ visibility construct. A global variable that is automatically protected from garbage collection can be created as type Protected_scm (after including the relevant header file). Funnily, this type is barely used in the LilyPond code base even though available since 1998. https://codereview.appspot.com/225040043/diff/40001/lily/sources.cc#newcode126 lily/sources.cc:126: } There is a lot wrong with this loop. You destructively change source_file_list here. Which is bad. But any elements you remove from the start of lst are _not_ removed from source_file_list. Since you throw away lst after the loop ends, they will still be present in source_file_list and returned by this function. I think you would be better advised to just return the list of all source files (presumably when given a parser as argument) and do any pruning at the Scheme level using `filter'. Less opportunity for messing up.
Sign in to reply to this message.
Retrieve source file paths directly from Includable-lexer:: class
Sign in to reply to this message.
On 2015/04/10 09:42:38, dak wrote: > In this particular case, it would appear that > Includable_lexer::file_name_strings_ already contains properly expanded file > names after the file name search has succeeded, so that would seem to be the > smarter thing to reference. Here’s what I came up with; not sure how proper it is. > I think you would be better advised to just return the list of all > source files (presumably when given a parser as argument) and do any pruning at > the Scheme level using `filter'. Less opportunity for messing up. Requiring a parser arg might be safer and cleaner, but it’s a PITA to handle afterwards because until now the PDF preamble was written without any parser-specific information (and I don’t feel comfortable adding a parser arg to a dozen toplevel functions just to bring it where I need it). So, I’m not really sure what to do without heavily modifying the PDF output framework. (As a result, this patch doesn’t work in the current state. There’s a FIXME in the code where I got stuck.)
Sign in to reply to this message.
https://codereview.appspot.com/225040043/diff/100001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/100001/scm/framework-ps.scm#new... scm/framework-ps.scm:427: /F (~a) (r) file def I don't think it is a good idea to let Ghostscript read the source files. For one thing, we need to mess with its general search path and include the local directory, leading to all sorts of problems when there exist overlaps with its internal file names. For another, Ghostscript needs read permission for the files in question which is incompatible with settings like -dSAFER. Instead, LilyPond should be reading the files itself and embed them into the PostScript files.
Sign in to reply to this message.
Access source files only through LilyPond.
Sign in to reply to this message.
On 2016/02/23 18:07:39, Valentin Villenave wrote: > Access source files only through LilyPond. It would be nice if we could specify an author/description for each source file... but as far as I can see, there are a bunch of "not implemented yet" in Ghostscript: http://git.ghostscript.com/?p=ghostpdl.git;a=blob;f=devices/vector/gdevpdfm.c... (plus, line 1531 makes me cry :-)
Sign in to reply to this message.
On 2016/02/23 18:15:59, Valentin Villenave wrote: > <font><font>Ngày 2016/02/23 18:07:39, Valentin Villenave đã viết:</font></font><font></font><font><font> > > File nguồn truy cập chỉ qua LilyPond.</font></font><font></font> > <font></font><font><font> > Nó sẽ được tốt đẹp nếu chúng ta có thể chỉ định một tác giả / mô tả cho từng nguồn</font></font><font></font><font><font> > file ... nhưng như xa như tôi có thể thấy, có một loạt các "chưa được thực hiện" trong</font></font><font></font><font><font> > Ghostscript:</font></font><font></font> > http://git.ghostscript.com/?p=ghostpdl.git;a=blob;f=devices/vector/gdevpdfm.c... > (cộng, dòng 1531 làm cho tôi khóc :-)</font></font>
Sign in to reply to this message.
https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc File lily/sources.cc (right): https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc#newcode93 lily/sources.cc:93: (SCM parser_smob), Maybe make the parser_smob argument optional? It can be deduced via the %parser fluid in the same manner a number of other functions do. https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc#newcode107 lily/sources.cc:107: return scm_reverse (lst); I'd use scm_reverse_x (lst, SCM_EOL) here since you built the list completely yourself, so there is no point in copying it for reversal. https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm#new... scm/framework-ps.scm:504: (ps-quote It's that simple? Wow.
Sign in to reply to this message.
On 2016/02/28 14:09:52, dak wrote: > https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc<font></f... > Nộp lily / sources.cc (bên phải):</font></font><font></font> > <font></font> > https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc#newcode9... > lily / sources.cc: 93: (SCM parser_smob),</font></font><font></font><font><font> > Có thể đưa ra lập luận parser_smob tùy chọn? </font><font>Nó có thể được rút ra thông qua phân tích cú pháp%</font></font><font></font><font><font> > chất lỏng trong cùng một cách thức một số chức năng khác làm.</font></font><font></font> > <font></font> > https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc#newcode1... > lily / sources.cc: 107: trở về scm_reverse (lst);</font></font><font></font><font><font> > Tôi muốn sử dụng scm_reverse_x (lst, SCM_EOL) ở đây kể từ khi bạn xây dựng danh sách hoàn toàn</font></font><font></font><font><font> > mình, vì vậy không có điểm trong việc sao chép nó để đảo ngược.</font></font><font></font> > <font></font> > https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm<fon... > Tập tin SCM / framework-ps.scm (bên phải):</font></font><font></font> > <font></font> > https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm#new... > SCM / framework-ps.scm: 504: (ps-quote</font></font><font></font><font><font> > Nó là đơn giản? </font><font>Wow.</font></font>
Sign in to reply to this message.
Make parser arg optional
Sign in to reply to this message.
OK, new version. The new "fluid" parser variable is noticeably nicer to work with than what we had before. https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc File lily/sources.cc (right): https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc#newcode93 lily/sources.cc:93: (SCM parser_smob), On 2016/02/28 14:09:51, dak wrote: > Maybe make the parser_smob argument optional? It can be deduced via the %parser > fluid in the same manner a number of other functions do. Done. https://codereview.appspot.com/225040043/diff/120001/lily/sources.cc#newcode107 lily/sources.cc:107: return scm_reverse (lst); On 2016/02/28 14:09:51, dak wrote: > I'd use scm_reverse_x (lst, SCM_EOL) here since you built the list completely > yourself, so there is no point in copying it for reversal. Done. https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm#new... scm/framework-ps.scm:468: (format "% LilyPond source file: ~a\n\n" filename)) While I find it wise to add something here in case someone having downloaded the PDF would wonder what these embedded files are, I’m not entirely sure if it’s wise to alter the source file. Thoughts? https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm#new... scm/framework-ps.scm:504: (ps-quote On 2016/02/28 14:09:51, dak wrote: > It's that simple? Wow. Yeah, it’s pretty neat. We can thank Bertrand and Reinhold: http://git.savannah.gnu.org/cgit/lilypond.git/commit/?id=ee02cd08e6bf81884323... http://git.savannah.gnu.org/cgit/lilypond.git/commit/?id=dffe04e8e1a7c0b94fdf...
Sign in to reply to this message.
https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm File scm/framework-ps.scm (right): https://codereview.appspot.com/225040043/diff/120001/scm/framework-ps.scm#new... scm/framework-ps.scm:468: (format "% LilyPond source file: ~a\n\n" filename)) On 2016/02/28 19:15:03, Valentin Villenave wrote: > While I find it wise to add something here in case someone > having downloaded the PDF would wonder what these embedded > files are, I’m not entirely sure if it’s wise to alter the > source file. Thoughts? Oh, didn't see it. I definitely would refrain from modifying files in any way. If the author wants identifying information, he can add it to the top of the files on her own. Using this option should always be a conscious choice, and that includes making the files suitable for exporting.
Sign in to reply to this message.
On 2016/02/29 11:22:32, dak wrote: > Oh, didn't see it. I definitely would refrain from modifying files in any way. > If the author wants identifying information, he can add it to the top of the > files on her own. Using this option should always be a conscious choice, and > that includes making the files suitable for exporting. Besides, the main purpose of adding such files is that the recipient is able to recompile them on his own in the same manner. If every recompilation adds another crud header to the included files, that will eventually become annoying.
Sign in to reply to this message.
On 2016/02/29 11:25:37, dak wrote: > On 2016/02/29 11:22:32, dak wrote: > > > Oh, didn't see it. I definitely would refrain from modifying files in any > way. ^This, and: > Besides, the main purpose of adding such files is that the recipient is able to > recompile them on his own in the same manner. If every recompilation adds > another crud header to the included files, that will eventually become annoying. Yes, absolutely. (Then again, we can be smart about it and refrain from adding a header whenever we detect an already-existing one.) However, I’m not sure about that: > > If the author wants identifying information, he can add it to the top of the > > files on her own. Using this option should always be a conscious choice, and > > that includes making the files suitable for exporting. This is not a discussion we need to have right now, but I’d very much like for us to consider eventually turning this option on by default (just like we already do with point-and-click, which needs to be deliberately disabled by the user to produce a "normal" PDF file). Granted, this may well add security issues (but we already are producing potentially unsafe PDF files). Another option would be to embed an additional README (or NFO :-) text file, which would only contain a couple of lines explaining what these files are, why they’re there and how to download LilyPond. That way the original source code would be left in a pristine state. Then again, that may well be outside of the scope of this patch. V.
Sign in to reply to this message.
|