|
|
Created:
12 years, 5 months ago by shanemhansen Modified:
12 years, 1 month ago Reviewers:
CC:
dsymonds, dave_cheney.net, rog, remyoudompheng, chressie1, rsc, golang-dev Visibility:
Public. |
Descriptionarchive/tar: read/write extended pax/gnu tar archives
Support reading pax archives and applying extended attributes
like long names, subsecond mtime resolutions, etc. Default to
writing pax archives for long file and link names.
Support reading gnu archives using the ././@LongLink extended
header for file name and link target.
Fixes issue 3300
Patch Set 1 #Patch Set 2 : diff -r 39b31d81b947 https://code.google.com/p/go #
Total comments: 23
Patch Set 3 : diff -r 39b31d81b947 https://code.google.com/p/go #
Total comments: 18
Patch Set 4 : diff -r 39b31d81b947 https://code.google.com/p/go #
Total comments: 10
Patch Set 5 : diff -r 39b31d81b947 https://code.google.com/p/go #
Total comments: 22
Patch Set 6 : diff -r 39b31d81b947 https://code.google.com/p/go #
Total comments: 24
Patch Set 7 : diff -r 93dc7f0e302b https://code.google.com/p/go #
Total comments: 12
Patch Set 8 : diff -r 93dc7f0e302b https://code.google.com/p/go #
Total comments: 8
Patch Set 9 : diff -r 93dc7f0e302b https://code.google.com/p/go #
Total comments: 4
Patch Set 10 : diff -r 135a1d6e0a91 https://code.google.com/p/go #Patch Set 11 : diff -r 135a1d6e0a91 https://code.google.com/p/go #
Total comments: 8
Patch Set 12 : diff -r 1399878c6731 https://code.google.com/p/go #Patch Set 13 : diff -r 1399878c6731 https://code.google.com/p/go #
Total comments: 9
Patch Set 14 : diff -r 439cb8bad388 https://code.google.com/p/go #Patch Set 15 : diff -r 439cb8bad388 https://code.google.com/p/go #Patch Set 16 : diff -r 439cb8bad388 https://code.google.com/p/go #Patch Set 17 : diff -r 439cb8bad388 https://code.google.com/p/go #Patch Set 18 : diff -r 439cb8bad388 https://code.google.com/p/go #
Total comments: 19
Patch Set 19 : diff -r 439cb8bad388 https://code.google.com/p/go #Patch Set 20 : diff -r 439cb8bad388 https://code.google.com/p/go #
Total comments: 4
Patch Set 21 : diff -r 439cb8bad388 https://code.google.com/p/go #
Total comments: 9
Patch Set 22 : diff -r 439cb8bad388 https://code.google.com/p/go #Patch Set 23 : diff -r 439cb8bad388 https://code.google.com/p/go #
MessagesTotal messages: 74
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://code.google.com/p/go
Sign in to reply to this message.
Please read http://golang.org/doc/contribute.html and http://golang.org/doc/effective_go.html, especially the part about matching the existing coding style. The new code here stands out like a sore thumb. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/common.go File src/pkg/archive/tar/common.go (right): https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/common.g... src/pkg/archive/tar/common.go:37: TypeGnuLongName = 'L' // Next file has a long name keep "GNU" capitalised. ("Fifo" has an unfortunate capitalisation here that can't be changed now) https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:1: // Copyright 2009 The Go Authors. All rights resesrc/pkg/archive/tar/reader.gorved. uh? https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:63: if hdr != nil { if hdr == nil { return hdr, tr.err } // Check for PAX/GNU header. ... https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:106: //Note: the spec specifies that "hdrcharset" specifies the put a space after "//". https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:108: //which is what go already gives us or binary which would Go does not enforce that strings are UTF-8. What goes wrong if it's not? https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:166: func parseFloatToTime(t string) (time.Time, error) { why are you not using strconv.ParseFloat? This doesn't need to be extremely performant code.
Sign in to reply to this message.
Thank you for taking a stab at this. Some initial comments below. What is the licence for the pax.tar file ? Does this also resolve issue 3864 ? http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... src/pkg/archive/tar/reader.go:1: // Copyright 2009 The Go Authors. All rights resesrc/pkg/archive/tar/reader.gorved. fat finger error http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... src/pkg/archive/tar/reader.go:66: //We have a pax extended header. See http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html two line please. space after // here and everywhere else http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... src/pkg/archive/tar/reader.go:78: //We have a gnu long name header. It's contents are the real file name. Its http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... src/pkg/archive/tar/reader.go:106: //Note: the spec specifies that "hdrcharset" specifies the s/Note: // this is a already a note http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... src/pkg/archive/tar/reader.go:182: //pad as needed before converting to a decimal. space after // http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader_te... File src/pkg/archive/tar/reader_test.go (right): http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader_te... src/pkg/archive/tar/reader_test.go:168: continue continue testLoop http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/writer.go... src/pkg/archive/tar/writer.go:169: tw.Flush() should this go below the error check ?
Sign in to reply to this message.
On 2012/10/16 00:54:38, dfc wrote: > Thank you for taking a stab at this. Some initial comments below. > > What is the licence for the pax.tar file ? > > Does this also resolve issue 3864 ? > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go > File src/pkg/archive/tar/reader.go (right): > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... > src/pkg/archive/tar/reader.go:1: // Copyright 2009 The Go Authors. All rights > resesrc/pkg/archive/tar/reader.gorved. > fat finger error > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... > src/pkg/archive/tar/reader.go:66: //We have a pax extended header. See > http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html > two line please. space after // here and everywhere else > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... > src/pkg/archive/tar/reader.go:78: //We have a gnu long name header. It's > contents are the real file name. > Its > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... > src/pkg/archive/tar/reader.go:106: //Note: the spec specifies that "hdrcharset" > specifies the > s/Note: // > > this is a already a note > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go... > src/pkg/archive/tar/reader.go:182: //pad as needed before converting to a > decimal. > space after // > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader_te... > File src/pkg/archive/tar/reader_test.go (right): > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader_te... > src/pkg/archive/tar/reader_test.go:168: continue > continue testLoop > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/writer.go > File src/pkg/archive/tar/writer.go (right): > > http://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/writer.go... > src/pkg/archive/tar/writer.go:169: tw.Flush() > should this go below the error check ? dsymonds and dfs, thank you both for taking the time to review this and help me get the hang of the style used in the go std libs, I really appreciate your feedback. 1) This doesn't resolve issue 3864. Sparse files have a typeflag of 'S'. I'd be happy to take a crack at it if there's interest. 2) The license for pax.tar is whatever you want it to be ; ). It was created using tar --format=pax. 3) What can I do to bring the coding style more in line with the surrounding code? parsePAX could be a method on tar.Reader and mergePAX could be a method on tar.Header. Would reflection and some sort of map be more idiogomatic for mergePAX? Most of the parsing I'm doing is not for fixed length formats, so I didn't see any opportunity to use slicer other than maybe for the final extraction of one of the PAX values.
Sign in to reply to this message.
Sorry for the spam, just realized my comments didn't make it out with my last email. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/common.go File src/pkg/archive/tar/common.go (right): https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/common.g... src/pkg/archive/tar/common.go:37: TypeGnuLongName = 'L' // Next file has a long name On 2012/10/16 00:53:13, dsymonds wrote: > keep "GNU" capitalised. ("Fifo" has an unfortunate capitalisation here that > can't be changed now) Done. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:1: // Copyright 2009 The Go Authors. All rights resesrc/pkg/archive/tar/reader.gorved. On 2012/10/16 00:54:39, dfc wrote: > fat finger error Done. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:63: if hdr != nil { On 2012/10/16 00:53:13, dsymonds wrote: > if hdr == nil { > return hdr, tr.err > } > > // Check for PAX/GNU header. > ... Done. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:66: //We have a pax extended header. See http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html On 2012/10/16 00:54:39, dfc wrote: > two line please. space after // here and everywhere else Done. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:78: //We have a gnu long name header. It's contents are the real file name. On 2012/10/16 00:54:39, dfc wrote: > Its Done. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:108: //which is what go already gives us or binary which would Thanks for correcting me. A re-reading of the spec reveals that I was waaay off. I was under the impression that a string was a sequence of unicode points (encoded as UTF-8) and s[i] for a string was a rune. On 2012/10/16 00:53:13, dsymonds wrote: > Go does not enforce that strings are UTF-8. What goes wrong if it's not? https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:166: func parseFloatToTime(t string) (time.Time, error) { I actually just wrote this method to get the tests to pass since the reader tests do a byte for byte comparison. ParseFloat introduced representation/rounding errors that caused the tests to fail. Parsing seconds and nanoseconds separately serves 2 purposes: making the tests pass and reproducing the timestamp exactly (when written be tar --format=pax). On 2012/10/16 00:53:13, dsymonds wrote: > why are you not using strconv.ParseFloat? This doesn't need to be extremely > performant code. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:182: //pad as needed before converting to a decimal. On 2012/10/16 00:54:39, dfc wrote: > space after // Done. https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader_t... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/reader_t... src/pkg/archive/tar/reader_test.go:168: continue On 2012/10/16 00:54:39, dfc wrote: > continue testLoop Done. Am I right in thinking the original statement (break) was a bug? https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/1001/src/pkg/archive/tar/writer.g... src/pkg/archive/tar/writer.go:169: tw.Flush() On 2012/10/16 00:54:39, dfc wrote: > should this go below the error check ? Done. Not really sure on this one. Also realized I was ignoring the return value of flush. I guess there's no point in finishing the write of a record known to be broken.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:153: hdr.Size = int64(size) Should there be a default condition here ? default: return errors.New("corrupt archive") https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:164: func parseFloatToTime(t string) (time.Time, error) { Is it really a float ? or a decimal ? https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:182: if len(nano_buf) < 9 { //nano==10^9 remove the comment, add a const for the magic number 9 https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:211: //or the header was empty to start with. space after // https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer.go#n... src/pkg/archive/tar/writer.go:135: // slip an extra header in there. some of your comments start with an upper case letter, others with a lower, please normalise. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer.go#n... src/pkg/archive/tar/writer.go:143: pid := os.Getpid() this should be cached. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test.go File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test... src/pkg/archive/tar/writer_test.go:201: err = writer.WriteHeader(hdr) if err := writer.WriteHeader(hdr) ; err != nil { t.Fatal(err) } https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test... src/pkg/archive/tar/writer_test.go:206: if err != nil { ditto https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test... src/pkg/archive/tar/writer_test.go:209: writer.Close() are there any trailing bytes written by writer.Close()? I think you should check the err return here.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:153: hdr.Size = int64(size) On 2012/10/16 03:25:12, dfc wrote: > Should there be a default condition here ? > > default: > return errors.New("corrupt archive") I believe here it makes sense to ignore headers we don't understand, but it's your call. Various versions of gnu have used these headers for sparse files. Another example would be the comment header. We have no useful way to bubble that up to the user, but it's a valid field. http://www.gnu.org/software/tar/manual/html_section/Sparse-Formats.html https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:164: func parseFloatToTime(t string) (time.Time, error) { On 2012/10/16 03:25:12, dfc wrote: > Is it really a float ? or a decimal ? An example (found in pax.tar) would be: 1350244992.023960108 While this is a float, it's obviously from clock_gettime, which is to say it's the string representation of a integer number of seconds and an integer number of nanoseconds. Treating this as a result of clock_gettime(3) allows us to perfectly reconstruct the timestamp so that byte by byte comparisons in the unit tests work. I don't mind not treating this as: "A float formed by combining seconds and nanoseconds", but we'll have to rework the unit tests to not to compare byte by byte. So the alternative is to use ParseFloat and take pax.tar out of the test harness and have pax.tar specific tests, would that be preferable? https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:182: if len(nano_buf) < 9 { //nano==10^9 On 2012/10/16 03:25:12, dfc wrote: > remove the comment, add a const for the magic number 9 Done. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/reader.go#n... src/pkg/archive/tar/reader.go:211: //or the header was empty to start with. On 2012/10/16 03:25:12, dfc wrote: > space after // Done. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer.go#n... src/pkg/archive/tar/writer.go:135: // slip an extra header in there. On 2012/10/16 03:25:12, dfc wrote: > some of your comments start with an upper case letter, others with a lower, > please normalise. Done. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer.go#n... src/pkg/archive/tar/writer.go:143: pid := os.Getpid() On 2012/10/16 03:25:12, dfc wrote: > this should be cached. Done. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test.go File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test... src/pkg/archive/tar/writer_test.go:201: err = writer.WriteHeader(hdr) On 2012/10/16 03:25:12, dfc wrote: > if err := writer.WriteHeader(hdr) ; err != nil { > t.Fatal(err) > } I wasn't familiar with that idiom. Thanks. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test... src/pkg/archive/tar/writer_test.go:206: if err != nil { On 2012/10/16 03:25:12, dfc wrote: > ditto Done. https://codereview.appspot.com/6700047/diff/5/src/pkg/archive/tar/writer_test... src/pkg/archive/tar/writer_test.go:209: writer.Close() On 2012/10/16 03:25:12, dfc wrote: > are there any trailing bytes written by writer.Close()? I think you should check > the err return here. Done. Close calls flush, so my understanding is that it can both write out the trailing bytes for the record and then the 2 null records that indicate tar EOF.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Thank you. I would like to see unit tests for the parsing routines. https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:166: func parseFloatToTime(t string) (time.Time, error) { http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html#tag_04_100_... mentions nothing about floats. I suggest this method be renamed parsePAXTime() and the comment adjusted.
Sign in to reply to this message.
A few trivial comments on the documentation style. https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:104: // Merge well known headers according to PAX standard. s/Merge/mergePAX merges/ https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:162: // Take a string of the form %d.%d and convert it to unix standard time. We // parseFloatToTime converts a string of the form %d.%d to // unix standard time. ... etc I think I'd move the second sentence inside the function (it describes an implementation detail) https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:200: // Parse PAX headers from tr following the rules outlined // parsePax parses PAX headers from tr ... etc https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:245: // Parse bytes as a NUL-terminated C-style string. // cString parses ...
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:104: // Merge well known headers according to PAX standard. On 2012/10/16 10:52:03, rog wrote: > s/Merge/mergePAX merges/ Done. https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:162: // Take a string of the form %d.%d and convert it to unix standard time. We On 2012/10/16 10:52:03, rog wrote: > // parseFloatToTime converts a string of the form %d.%d to > // unix standard time. ... etc > > I think I'd move the second sentence inside the function (it describes an > implementation detail) Done. https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:166: func parseFloatToTime(t string) (time.Time, error) { On 2012/10/16 04:24:56, dfc wrote: > http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html#tag_04_100_... > mentions nothing about floats. I suggest this method be renamed parsePAXTime() > and the comment adjusted. Done. https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:200: // Parse PAX headers from tr following the rules outlined On 2012/10/16 10:52:03, rog wrote: > // parsePax parses PAX headers from tr ... etc Done. https://codereview.appspot.com/6700047/diff/5009/src/pkg/archive/tar/reader.g... src/pkg/archive/tar/reader.go:245: // Parse bytes as a NUL-terminated C-style string. On 2012/10/16 10:52:03, rog wrote: > // cString parses ... Done.
Sign in to reply to this message.
Sorry for the delay. There's a lot to digest here. I've only had a good review of the parsing code. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:71: // We have a PAX extended header. See http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html no need to repeat the reference here. // PAX extended header https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:198: // parsePAX parses PAX headers from tr following the rules outlined // parsePAX parses a PAX header. No need for extra detail; it's not hard to find this in pax.html. No need to spell out the particular error return; nothing is distinguishing that error. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:201: func parsePAX(tr io.Reader) (map[string]string, error) { tr -> r ("tr" here returns to the local Reader type, not io.Reader) https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:207: // We are looking for records of the type: This is a bit confusing. // Each record is constructed as // "%d %s=%s\n", length, keyword, value (that is much closer to what pax.html says) https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:212: if len(buf) == 0 { make this the loop condition. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:217: pos = bytes.IndexByte(buf, ' ') pos -> sp https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:222: length, err := strconv.ParseInt(string(buf[:pos]), 10, 0) length -> n https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:228: record := buf[pos+1 : int(length)-1] slice off the record and shrink buf in the same step to make it clearer what the operation is. var rec []byte rec, buf = buf[sp+1:int(n)-1], buf[n:] (are you sure you need this int cast?) https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:231: pos = bytes.IndexByte(record, '=') pos -> eq https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:235: key := record[:pos] key, value := rec[:pos], rec[pos+1:] https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:298: correctHeaders := []byte(`30 mtime=1350244992.023960108 make this a "" string, with a \n instead of a literal line break.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:243: // cString parse bytes as a NUL-terminated C-style string. s/parse/parses
Sign in to reply to this message.
dsymonds, I just have one question about ReadAll(tr) vs ReadAll(r). https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:71: // We have a PAX extended header. See http://pubs.opengroup.org/onlinepubs/009604599/utilities/pax.html On 2012/10/22 06:29:14, dsymonds wrote: > no need to repeat the reference here. > // PAX extended header Done. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:198: // parsePAX parses PAX headers from tr following the rules outlined On 2012/10/22 06:29:14, dsymonds wrote: > // parsePAX parses a PAX header. > > No need for extra detail; it's not hard to find this in pax.html. No need to > spell out the particular error return; nothing is distinguishing that error. Done. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:201: func parsePAX(tr io.Reader) (map[string]string, error) { On 2012/10/22 06:29:14, dsymonds wrote: > tr -> r > ("tr" here returns to the local Reader type, not io.Reader) I actually think this is what I want. ReadAll(tr) gives me the contents of the extended header, which is put in the body of the pax record. wouldn't ReadlAll(r) read the underlying reader until EOF? https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:207: // We are looking for records of the type: On 2012/10/22 06:29:14, dsymonds wrote: > This is a bit confusing. > > // Each record is constructed as > // "%d %s=%s\n", length, keyword, value > > (that is much closer to what pax.html says) Done. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:212: if len(buf) == 0 { On 2012/10/22 06:29:14, dsymonds wrote: > make this the loop condition. Done. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:217: pos = bytes.IndexByte(buf, ' ') On 2012/10/22 06:29:14, dsymonds wrote: > pos -> sp Done. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:222: length, err := strconv.ParseInt(string(buf[:pos]), 10, 0) On 2012/10/22 06:29:14, dsymonds wrote: > length -> n Done. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:228: record := buf[pos+1 : int(length)-1] On 2012/10/22 06:29:14, dsymonds wrote: > slice off the record and shrink buf in the same step to make it clearer what the > operation is. > > var rec []byte > rec, buf = buf[sp+1:int(n)-1], buf[n:] > > (are you sure you need this int cast?) Done.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Sorry again for the delay. I think I've got a handle on this; the rest of the review should proceed quicker. https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:201: func parsePAX(tr io.Reader) (map[string]string, error) { On 2012/10/23 02:54:14, shanemhansen wrote: > On 2012/10/22 06:29:14, dsymonds wrote: > > tr -> r > > ("tr" here returns to the local Reader type, not io.Reader) > > I actually think this is what I want. ReadAll(tr) gives me the contents of the > extended header, which is put in the body of the pax record. wouldn't > ReadlAll(r) read the underlying reader until EOF? I mean here, rather than the call site. This function only uses the functionality of io.Reader (the type it takes), so it should name its parameter accordingly: "r". The call site correctly invokes this with tr. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:208: for len(buf) != 0 { for len(buf) > 0 { https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:343: "mtime": "1350244992.023960108"} put } on its own line https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:348: // Assert the values were written correctly. want := &Header{ ... } if !reflect.DeepEqual(hdr, want) { t.Errorf("incorrect merge: got %+v, want %+v", hdr, want) } https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:26: // The pid of the current process. Assumes go doesn't fork. Don't make this assumption. Just call os.Getpid() as needed; it's not that expensive compared to all the other things going on. If this was a big concern, capturing it in NewWriter would be the better approach, but I don't think it's warranted. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:143: use_pax_header := len(hdr.Name) > 100 || len(hdr.Linkname) > 100 these are not Go names. Drop your underscores and use camelCase. See if you can shorten the names too. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:144: if use_pax_header { I think the contents of this if could be its own method, and then you can just inline the condition. if len(hdr.Name) > 100 || len(hdr.Linkname) > 100 { if err := tw.writePAXHeader(hdr); err != nil { return err } } https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:186: // Create a pretend large archive. s/pretend // It looks real enough to me. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:189: t.Fatal(err) Please use a better error message throughout here. If we get a test failure that just says "permission denied" then it's not going to be as easy to track down as "os.Stat: permission denied". https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:196: long_name := strings.Repeat("0", 200) longName something more interesting would be good. strings.Repeat("ab", 51)? https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:214: // Test that we can get a nice long name back out of the archive. s/nice // https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:215: reader := NewReader(bytes.NewBuffer(buf.Bytes())) you shouldn't have to wrap it in a new buffer. NewReader(&buf) should work. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:220: if !strings.EqualFold(hdr.Name, long_name) { why fold?
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/18001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:201: func parsePAX(tr io.Reader) (map[string]string, error) { Ok, that makes sense. Change made. On 2012/10/31 02:44:37, dsymonds wrote: > On 2012/10/23 02:54:14, shanemhansen wrote: > > On 2012/10/22 06:29:14, dsymonds wrote: > > > tr -> r > > > ("tr" here returns to the local Reader type, not io.Reader) > > > > I actually think this is what I want. ReadAll(tr) gives me the contents of the > > extended header, which is put in the body of the pax record. wouldn't > > ReadlAll(r) read the underlying reader until EOF? > > I mean here, rather than the call site. This function only uses the > functionality of io.Reader (the type it takes), so it should name its parameter > accordingly: "r". The call site correctly invokes this with tr. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:208: for len(buf) != 0 { On 2012/10/31 02:44:37, dsymonds wrote: > for len(buf) > 0 { Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:343: "mtime": "1350244992.023960108"} On 2012/10/31 02:44:37, dsymonds wrote: > put } on its own line Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:348: // Assert the values were written correctly. On 2012/10/31 02:44:37, dsymonds wrote: > want := &Header{ > ... > } > if !reflect.DeepEqual(hdr, want) { > t.Errorf("incorrect merge: got %+v, want %+v", hdr, want) > } Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:26: // The pid of the current process. Assumes go doesn't fork. No problem. Someone asked me to cache the value, I'll change it back. On 2012/10/31 02:44:37, dsymonds wrote: > Don't make this assumption. Just call os.Getpid() as needed; it's not that > expensive compared to all the other things going on. > > If this was a big concern, capturing it in NewWriter would be the better > approach, but I don't think it's warranted. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:143: use_pax_header := len(hdr.Name) > 100 || len(hdr.Linkname) > 100 On 2012/10/31 02:44:37, dsymonds wrote: > these are not Go names. Drop your underscores and use camelCase. See if you can > shorten the names too. Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:144: if use_pax_header { On 2012/10/31 02:44:37, dsymonds wrote: > I think the contents of this if could be its own method, and then you can just > inline the condition. > > if len(hdr.Name) > 100 || len(hdr.Linkname) > 100 { > if err := tw.writePAXHeader(hdr); err != nil { > return err > } > } Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:186: // Create a pretend large archive. On 2012/10/31 02:44:37, dsymonds wrote: > s/pretend // > > It looks real enough to me. Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:189: t.Fatal(err) On 2012/10/31 02:44:37, dsymonds wrote: > Please use a better error message throughout here. If we get a test failure that > just says "permission denied" then it's not going to be as easy to track down as > "os.Stat: permission denied". Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:196: long_name := strings.Repeat("0", 200) On 2012/10/31 02:44:37, dsymonds wrote: > longName > > something more interesting would be good. strings.Repeat("ab", 51)? Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:214: // Test that we can get a nice long name back out of the archive. On 2012/10/31 02:44:37, dsymonds wrote: > s/nice // Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:215: reader := NewReader(bytes.NewBuffer(buf.Bytes())) On 2012/10/31 02:44:37, dsymonds wrote: > you shouldn't have to wrap it in a new buffer. NewReader(&buf) should work. Done. https://codereview.appspot.com/6700047/diff/27001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:220: if !strings.EqualFold(hdr.Name, long_name) { Ignorance. I was looking for a strings.Equal function. A closer reading of the spec reveals that == is defined for the string type. Thanks for pointing that out. On 2012/10/31 02:44:37, dsymonds wrote: > why fold?
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:184: extendedHdr := new(Header) extendedHdr -> ext https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:189: // The spec asks that we namespace our psuedo files "pseudo" https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:202: fmt.Fprintf(&buf, "%d%s", size, msg) I am confused by this construction. Shouldn't it be something more like this? msg := fmt.Sprintf(" path=%s\n", hdr.Name) fmt.Fprintf(&buf, "%d %s", len(msg), msg) https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:212: err := tw.WriteHeader(extendedHdr) snuggle these where possible. if err := tw.WriteHeader(ext); err != nil { return err } etc. https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:193: t.Fatal("os.Stat: " + err.Error()) t.Fatalf("os.Stat: %v", err) https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:196: long_name := strings.Repeat("ab", 100) long_name -> longName https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:220: if !(hdr.Name == long_name) { if hdr.Name != longName
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:202: fmt.Fprintf(&buf, "%d%s", size, msg) The pax standard is weird here. It states that the %d must be the length of the entire message (including) %d. So somehow we have to add that length on. I'm convinced what I have here is wrong, I'll try and submit something better. Here is an example of a real live pax header: "30 mtime=1350244992.023960108\n" I'm submitting code which I think does the length adjustment correctly. On 2012/10/31 12:16:25, dsymonds wrote: > I am confused by this construction. Shouldn't it be something more like this? > > msg := fmt.Sprintf(" path=%s\n", hdr.Name) > fmt.Fprintf(&buf, "%d %s", len(msg), msg) https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:212: err := tw.WriteHeader(extendedHdr) On 2012/10/31 12:16:25, dsymonds wrote: > snuggle these where possible. > if err := tw.WriteHeader(ext); err != nil { > return err > } > etc. Done. https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:193: t.Fatal("os.Stat: " + err.Error()) On 2012/10/31 12:16:25, dsymonds wrote: > t.Fatalf("os.Stat: %v", err) Done. https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:196: long_name := strings.Repeat("ab", 100) On 2012/10/31 12:16:25, dsymonds wrote: > long_name -> longName Done. https://codereview.appspot.com/6700047/diff/29004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:220: if !(hdr.Name == long_name) { On 2012/10/31 12:16:25, dsymonds wrote: > if hdr.Name != longName Done.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
On Thu, Nov 1, 2012 at 1:40 PM, <shanemhansen@gmail.com> wrote: > The pax standard is weird here. It states that the %d must be the length > of the entire message (including) %d. So somehow we have to add that > length on. I'm convinced what I have here is wrong, I'll try and submit > something better. Here is an example of a real live pax header: > > "30 mtime=1350244992.023960108\n" > > I'm submitting code which I think does the length adjustment correctly. That seems nuts. That would mean "9 a=1234\n" and "10 a=1234\n" mean the same thing. Did that header you pasted come from a real life tar implementation? Which one?
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:198: msg := fmt.Sprintf(" path=%s\n", hdr.Name) pull this block into its own function // paxHeader returns a PAX header for the given "key=value" message. func paxHeader(m string) string { ... } and just use fmt.Fprint(&buf, paxHeader(fmt.Sprintf("path=%s", hdr.Name))) I think such a function would warrant a test of its own. (also, please use "hg gofmt" before uploading new snapshots) https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:213: _, err := tw.Write(buf.Bytes()) snuggle
Sign in to reply to this message.
It was produced by tar (GNU tar) 1.26 Here's a script to reproduce the behaviour. http://pastebin.com/WVDnyGhR It's pretty weird. I also got that specific example from the pax.tar file in testdata/ On Wed, Oct 31, 2012 at 8:44 PM, David Symonds <dsymonds@golang.org> wrote: > On Thu, Nov 1, 2012 at 1:40 PM, <shanemhansen@gmail.com> wrote: > > > The pax standard is weird here. It states that the %d must be the length > > of the entire message (including) %d. So somehow we have to add that > > length on. I'm convinced what I have here is wrong, I'll try and submit > > something better. Here is an example of a real live pax header: > > > > "30 mtime=1350244992.023960108\n" > > > > I'm submitting code which I think does the length adjustment correctly. > > That seems nuts. That would mean "9 a=1234\n" and "10 a=1234\n" mean > the same thing. Did that header you pasted come from a real life tar > implementation? Which one? >
Sign in to reply to this message.
Okay, just pull it out into a function as I suggest and carefully test it.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:192: dir, file := filepath.Split(hdr.Name) i think the delimiter in these archives is always a plain '/'. at least i couldn't find a counter example :)
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:193: ext.Name = filepath.Join(dir, another issue i can see is that Join cleans the resulting path. maybe it makes sense to path.Clean(hdr.Name) anyway? but that seems to be an incompatibility with the go1 promise. any thoughts?
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:192: dir, file := filepath.Split(hdr.Name) According to my reading of the python tarfile implementation, they use /. On 2012/11/05 15:51:08, chressie1 wrote: > i think the delimiter in these archives is always a plain '/'. at least i > couldn't find a counter example :) https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:193: ext.Name = filepath.Join(dir, Sorry, I'm not sure what you mean by cleaning the path. On 2012/11/05 17:44:47, chressie1 wrote: > another issue i can see is that Join cleans the resulting path. maybe it makes > sense to path.Clean(hdr.Name) anyway? but that seems to be an incompatibility > with the go1 promise. any thoughts?
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/30004/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/30004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:198: fmt.Fprint(&buf, paxHeader(fmt.Sprintf("path=%s", hdr.Name))) no need for fmt.Sprintf just to concatenate two strings. "path=" + hdr.Name same for hdr.Linkname https://codereview.appspot.com/6700047/diff/30004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:219: padding := 2 // Extra padding for space and newline const padding = 2 https://codereview.appspot.com/6700047/diff/30004/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:221: size += len(strconv.Itoa(size)) This still can't be right. What if this += involves a decimal carry? Then the output will be one longer than what is computed here. please add a test case that covers this, and fix this code. https://codereview.appspot.com/6700047/diff/30004/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/30004/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:228: paxTests := map[string]string{"name=/etc/hosts": "19 name=/etc/hosts\n", one test case per line so we can read it please. also add a case that yields a single digit length.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:192: dir, file := filepath.Split(hdr.Name) On 2012/11/07 00:59:48, shanemhansen wrote: > According to my reading of the python tarfile implementation, they use /. exactly. and therefore you should use the path package (which handles / separated paths) and not path/filepath (which uses os.PathSeparator separated paths) https://codereview.appspot.com/6700047/diff/34007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:193: ext.Name = filepath.Join(dir, On 2012/11/07 00:59:48, shanemhansen wrote: > Sorry, I'm not sure what you mean by cleaning the path. From [0]: "Clean returns the shortest path name equivalent to path by purely lexical processing." that means if someone wants to have a path like "a/b/../../c///" in his tarball, he'll get instead the cleaned path equivalent "c". that is bad because of 2 things: (1) it's not what the tarball creator expects, because its not documented; (2) the trailing slash is gone. i think cleaning the path would make sense, but then it needs to be done (1) for all hdr.Name (not only the ones which are >100 chars) and (2) the trailing slash needs to be appended (that's what gnu tar 1.26 does).. but this might break go1 compatibility. [0]: http://golang.org/pkg/path/#Clean
Sign in to reply to this message.
Hi Shane, Any progress on this? It seems like this CL is close to being done.
Sign in to reply to this message.
On 2012/12/04 00:00:03, dsymonds wrote: > Hi Shane, > > Any progress on this? It seems like this CL is close to being done. My apologies, didn't mean to be flaky. Thanksgiving and a laptop failure got in the way, thanks for reminding me.
Sign in to reply to this message.
ping. Time for another progress report? This CL is pretty close to being done.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
This looks pretty close to me. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:248: // Check for binary format first. This is already in the tree; I think you need to sync your client and do an 'hg upload'. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:205: dir, file := filepath.Split(hdr.Name) yeah, as mentioned by chressie1 these should use path rather than filepath. tar names are always / separated. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:220: _, err := tw.Write(buf.Bytes()) please snuggle this. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:232: {"a=name", "10 a=name\n"}, //Test case involving multiple acceptable lengths a single space after //
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:248: // Check for binary format first. On 2013/01/09 02:13:17, dsymonds wrote: > This is already in the tree; I think you need to sync your client and do an 'hg > upload'. Done. I think. I certainly did a hg sync and hg upload. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:205: dir, file := filepath.Split(hdr.Name) On 2013/01/09 02:13:17, dsymonds wrote: > yeah, as mentioned by chressie1 these should use path rather than filepath. tar > names are always / separated. Done. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:220: _, err := tw.Write(buf.Bytes()) On 2013/01/09 02:13:17, dsymonds wrote: > please snuggle this. Done. https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/51001/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:232: {"a=name", "10 a=name\n"}, //Test case involving multiple acceptable lengths On 2013/01/09 02:13:17, dsymonds wrote: > a single space after // Done.
Sign in to reply to this message.
LGTM This looks good to me. Given the subtlety and complexity of this CL I'll let it sit for a couple of days to allow others to chime in (even with just a "LGTM"); I'll submit it after that if there's no problems turned up. Thanks for working through this!
Sign in to reply to this message.
On 2013/01/09 02:42:34, dsymonds wrote: > LGTM > > This looks good to me. > > Given the subtlety and complexity of this CL I'll let it sit for a couple of > days to allow others to chime in (even with just a "LGTM"); I'll submit it after > that if there's no problems turned up. > > Thanks for working through this! Thanks for everyone's patience and feedback, it's been a great introduction to an unfamiliar toolset and process. I've appreciated getting a better understanding of idiomatic go.
Sign in to reply to this message.
FYI https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:179: nano_buf := string(buf[pos+1:]) nanoBuf https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:213: if sp == -1 { sp < 0 https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:228: if eq == -1 { eq < 0 https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:136: // Decide whether or not to use PAX extensions Please don't use PAX headers for names that are only 101 bytes long. An option that should be used before then is to use the split ustar header form, in which a prefix of at most 155 bytes appears after the Devminor bytes. The full name is prefix + "/" + name. I expect that more readers will be able to handle this than will know about PAX extensions. I am looking at one right now. My suggestion also matches what BSD tar does: $ tar c tmp/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/long/name | xd -c | sed 200q 0000000 v e r y / v e r y / v e r y / v 0000010 e r y / v e r y / v e r y / v e 0000020 r y / v e r y / v e r y / v e r 0000030 y / v e r y / v e r y / v e r y 0000040 / v e r y / v e r y / v e r y / 0000050 v e r y / v e r y / l o n g / n 0000060 a m e 00 0 0 0 6 4 4 00 0 5 2 4 0000070 5 5 00 0 0 0 0 0 0 00 0 0 0 0 0000080 0 0 0 0 0 0 3 1 2 0 7 3 3 4 3 0000090 4 0 1 0 5 1 5 7 2 00 0 00 00 00 00000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000100 00 u s t a r 00 0 0 r s c 00 00 00 00 0000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000120 00 00 00 00 00 00 00 00 00 w h e e l 00 00 0000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000140 00 00 00 00 00 00 00 00 00 0 0 0 0 0 0 0000150 00 0 0 0 0 0 0 00 t m p / v e r 0000160 y / v e r y / v e r y / v e r y 0000170 / v e r y / v e r y / v e r y / 0000180 v e r y / v e r y / v e r y / v 0000190 e r y / v e r y / v e r y 00 00 00 00001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 (OS X). I tried the same on Linux and I got a tar file using the GNU ././@LongLink extension: 0000000 . / . / @ L o n g L i n k 00 00 00 0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000060 00 00 00 00 0 0 0 0 0 0 0 00 0 0 0 0 0000070 0 0 0 00 0 0 0 0 0 0 0 00 0 0 0 0 0000080 0 0 0 0 2 5 1 00 0 0 0 0 0 0 0 0 0000090 0 0 0 00 0 1 1 5 6 3 00 L 00 00 00 00000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000100 00 u s t a r 00 r o o t 00 00 00 0000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000120 00 00 00 00 00 00 00 00 00 r o o t 00 00 00 0000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000200 t m p / v e r y / v e r y / v e 0000210 r y / v e r y / v e r y / v e r 0000220 y / v e r y / v e r y / v e r y 0000230 / v e r y / v e r y / v e r y / 0000240 v e r y / v e r y / v e r y / v 0000250 e r y / v e r y / v e r y / v e 0000260 r y / v e r y / v e r y / v e r 0000270 y / v e r y / v e r y / v e r y 0000280 / v e r y / v e r y / v e r y / 0000290 v e r y / v e r y / v e r y / l 00002a0 o n g / n a m e 00 00 00 00 00 00 00 00 00002b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00002c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00002d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00002e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00002f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000300 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 In neither case did I get a PAX header. I am worried that nothing generates PAX headers and therefore nothing will read them. I would rather see either of these forms (preferably the BSD one, which I believe is more widespread) before the PAX header here. I have no problem with this package reading the PAX stuff, only with writing it.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.go File src/pkg/archive/tar/reader.go (right): https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:179: nano_buf := string(buf[pos+1:]) On 2013/01/09 19:43:49, rsc wrote: > nanoBuf Done. https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:213: if sp == -1 { On 2013/01/09 19:43:49, rsc wrote: > sp < 0 Done. https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/reader.... src/pkg/archive/tar/reader.go:228: if eq == -1 { On 2013/01/09 19:43:49, rsc wrote: > eq < 0 Done. https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:136: // Decide whether or not to use PAX extensions On 2013/01/09 19:43:49, rsc wrote: > Please don't use PAX headers for names that are only 101 bytes long. > An option that should be used before then is to use the split ustar header form, > in which a prefix of at most 155 bytes appears after the Devminor bytes. The > full name is prefix + "/" + name. I expect that more readers will be able to > handle this than will know about PAX extensions. I am looking at one right now. > > My suggestion also matches what BSD tar does: > > $ tar c > tmp/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/very/long/name > | xd -c | sed 200q > 0000000 v e r y / v e r y / v e r y / v > 0000010 e r y / v e r y / v e r y / v e > 0000020 r y / v e r y / v e r y / v e r > 0000030 y / v e r y / v e r y / v e r y > 0000040 / v e r y / v e r y / v e r y / > 0000050 v e r y / v e r y / l o n g / n > 0000060 a m e 00 0 0 0 6 4 4 00 0 5 2 4 > 0000070 5 5 00 0 0 0 0 0 0 00 0 0 0 0 > 0000080 0 0 0 0 0 0 3 1 2 0 7 3 3 4 3 > 0000090 4 0 1 0 5 1 5 7 2 00 0 00 00 00 > 00000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000100 00 u s t a r 00 0 0 r s c 00 00 00 00 > 0000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000120 00 00 00 00 00 00 00 00 00 w h e e l 00 00 > 0000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000140 00 00 00 00 00 00 00 00 00 0 0 0 0 0 0 > 0000150 00 0 0 0 0 0 0 00 t m p / v e r > 0000160 y / v e r y / v e r y / v e r y > 0000170 / v e r y / v e r y / v e r y / > 0000180 v e r y / v e r y / v e r y / v > 0000190 e r y / v e r y / v e r y 00 00 00 > 00001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > (OS X). > > I tried the same on Linux and I got a tar file using the GNU ././@LongLink > extension: > > 0000000 . / . / @ L o n g L i n k 00 00 00 > 0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000060 00 00 00 00 0 0 0 0 0 0 0 00 0 0 0 0 > 0000070 0 0 0 00 0 0 0 0 0 0 0 00 0 0 0 0 > 0000080 0 0 0 0 2 5 1 00 0 0 0 0 0 0 0 0 > 0000090 0 0 0 00 0 1 1 5 6 3 00 L 00 00 00 > 00000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000100 00 u s t a r 00 r o o t 00 00 00 > 0000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000120 00 00 00 00 00 00 00 00 00 r o o t 00 00 00 > 0000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000200 t m p / v e r y / v e r y / v e > 0000210 r y / v e r y / v e r y / v e r > 0000220 y / v e r y / v e r y / v e r y > 0000230 / v e r y / v e r y / v e r y / > 0000240 v e r y / v e r y / v e r y / v > 0000250 e r y / v e r y / v e r y / v e > 0000260 r y / v e r y / v e r y / v e r > 0000270 y / v e r y / v e r y / v e r y > 0000280 / v e r y / v e r y / v e r y / > 0000290 v e r y / v e r y / v e r y / l > 00002a0 o n g / n a m e 00 00 00 00 00 00 00 00 > 00002b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00002c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00002d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00002e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00002f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0000300 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > In neither case did I get a PAX header. I am worried that nothing generates > PAX headers and therefore nothing will read them. I would rather see either > of these forms (preferably the BSD one, which I believe is more widespread) > before the PAX header here. > > I have no problem with this package reading the PAX stuff, only with writing it. I'm not super familiar with bsd. I've downloaded freebsd, and my experiments seem to indicate that for filenames < 100 characters, you get a standard tar file. For 100 < len(name) < 255, you get ustar headers. BSD tar goes ahead and uses pax extensions for larger files. Is that an acceptable behavior for go?
Sign in to reply to this message.
Sorry for the late reply. Using PAX is fine as long as it's restricted to namelen >= 255.
Sign in to reply to this message.
On 2013/01/23 04:47:56, rsc wrote: > Sorry for the late reply. Using PAX is fine as long as it's restricted to > namelen >= 255. i think this is very close to be finished. if there's help required, i'd be happy to update this CL..
Sign in to reply to this message.
On 2013/01/30 13:04:24, chressie1 wrote: > i think this is very close to be finished. if there's help required, i'd be > happy to update this CL.. to be clear, i'm thinking about merging https://codereview.appspot.com/6814084/ into this CL.
Sign in to reply to this message.
Hi Chressie1, Sorry I've been so busy, but I haven't gone mia. I'd be happy to have you merge your changes into this review. I'm generally on #go-nuts if you have any questions. On 2013/01/30 13:12:54, chressie1 wrote: > On 2013/01/30 13:04:24, chressie1 wrote: > > > i think this is very close to be finished. if there's help required, i'd be > > happy to update this CL.. > > to be clear, i'm thinking about merging https://codereview.appspot.com/6814084/ > into this CL.
Sign in to reply to this message.
On 2013/01/31 03:00:18, shanemhansen wrote: > Hi Chressie1, > Sorry I've been so busy, but I haven't gone mia. I'd be happy to have you merge > your changes into this review. I'm generally on #go-nuts if you have any > questions. no worries :) unfortunately i cannot update this CL, so i created a merge CL here: https://codereview.appspot.com/7229066 dsymonds, i'd suppose to submit this now as is and to address rsc's concern in the other CL? thanks, chressie
Sign in to reply to this message.
I think we should just get Shane's CL finished off (I believe it's very close), and add chressie's changes in separately.
Sign in to reply to this message.
Sounds good to me. I'll work on adding support for ustar long names. Thanks for your patience. On Thu, Jan 31, 2013 at 3:56 PM, David Symonds <dsymonds@golang.org> wrote: > I think we should just get Shane's CL finished off (I believe it's > very close), and add chressie's changes in separately. >
Sign in to reply to this message.
sorry, i'm afraid, i wasn't clear in the first place. the merge CL https://codereview.appspot.com/7229066 is exactly about adding ustar long names (i.e. using the 155 chars prefix field) to the current CL. feel free to copy'n'paste it, if you want. On Wed, Feb 6, 2013 at 4:48 AM, Shane Hansen <shanemhansen@gmail.com> wrote: > Sounds good to me. I'll work on adding support for ustar long names. > > Thanks for your patience. > On Thu, Jan 31, 2013 at 3:56 PM, David Symonds <dsymonds@golang.org> wrote: >> >> I think we should just get Shane's CL finished off (I believe it's >> very close), and add chressie's changes in separately. > >
Sign in to reply to this message.
one last thing.. https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:225: func aTestPAXHeader(t *testing.T) { this test is never executed (due to the leading 'a').. i noticed that it fails, because of the ambiguity of the "multiple acceptable lengths" tests. this should be cleaned up before activating the test.
Sign in to reply to this message.
Good find. This would be done last night except I switched to an OSX system and learned that: 1) bsd tar uses ' ' (20) to pad octal group names instead of \0 (0) 2) bsd and gnu tar use different heuristics for where to split the filename. So it looks like in order to be go tar compatible we need to use gnu tar for the test ustar file (for \0 padding), which means I have to update the file splitting heuristic. For a brief mention of \0 and ' ' see: http://en.wikipedia.org/wiki/Tar_(computing)#UStar_format Should be done soon. On 2013/02/06 07:22:17, chressie1 wrote: > one last thing.. > > https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer_... > File src/pkg/archive/tar/writer_test.go (right): > > https://codereview.appspot.com/6700047/diff/53007/src/pkg/archive/tar/writer_... > src/pkg/archive/tar/writer_test.go:225: func aTestPAXHeader(t *testing.T) { > this test is never executed (due to the leading 'a').. > > i noticed that it fails, because of the ambiguity of the "multiple acceptable > lengths" tests. this should be cleaned up before activating the test.
Sign in to reply to this message.
Hello dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Hello dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Sorry for all the spam. I had a little trouble doing a hg file XXXX somefile to add the new ustar test data. On Wed, Feb 6, 2013 at 8:36 PM, <shanemhansen@gmail.com> wrote: > Hello dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, > remyoudompheng@gmail.com, chressie@gmail.com, rsc@golang.org (cc: > golang-dev@googlegroups.com), > > Please take another look. > > > https://codereview.appspot.**com/6700047/<https://codereview.appspot.com/6700... >
Sign in to reply to this message.
You can use hg upload NNNN to avoid sending mail until you are happy with the results, at that point hg mail. On 07/02/2013, at 14:57, Shane jt Hansen <shanemhansen@gmail.com> wrote: > Sorry for all the spam. I had a little trouble doing a hg file XXXX somefile to add the new ustar test data. > > On Wed, Feb 6, 2013 at 8:36 PM, <shanemhansen@gmail.com> wrote: >> Hello dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, >> remyoudompheng@gmail.com, chressie@gmail.com, rsc@golang.org (cc: >> golang-dev@googlegroups.com), >> >> Please take another look. >> >> >> https://codereview.appspot.com/6700047/ >
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/common.go File src/pkg/archive/tar/common.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/common.... src/pkg/archive/tar/common.go:62: FileNameSize = 100 these don't need to be exported. they could also do with slightly more detailed comments. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:308: t.Fatalf("Couldn't parse correctly formatted headers %s", err) "...: %v" at the end of the error string. t.Errorf + continue would be better here too so that we get to see all the failures, not just the first. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:311: t.Fatalf("mtime header incorrectly parsed") include the got/want in the error output, and make this t.Errorf+continue too. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:116: // Calling afte a Close will return ErrWriteAfterClose. "after" https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:137: // or the link name is too long, use PAX headers. s/,/;/ https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:142: } any other error should be returned, no? https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:149: tw.cString(s.next(FileNameSize), suffix) if this is a cstring, don't the limits above need to be >= instead of >? https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:177: copy(header[257:265], []byte("ustar\000")) please add a comment explaining why this magic is different to the one a few lines above. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:196: // writeUSTARLongName splits a USTAR long name hdr.Name add a period, then add "name" to the start of the next line. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:200: func (tw *Writer) splitUSTARLongName(hdr *Header) (prefix, suffix string, err error) { make the argument "name string", and pass hdr.Name above. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer_... File src/pkg/archive/tar/writer_test.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer_... src/pkg/archive/tar/writer_test.go:252: {"a=names", "11 a=names\n"}, // Test case involving carrys "carries"
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/common.go File src/pkg/archive/tar/common.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/common.... src/pkg/archive/tar/common.go:62: FileNameSize = 100 On 2013/02/07 04:16:27, dsymonds wrote: > these don't need to be exported. they could also do with slightly more detailed > comments. Done. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:308: t.Fatalf("Couldn't parse correctly formatted headers %s", err) On 2013/02/07 04:16:27, dsymonds wrote: > "...: %v" at the end of the error string. > > t.Errorf + continue would be better here too so that we get to see all the > failures, not just the first. Done. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:311: t.Fatalf("mtime header incorrectly parsed") On 2013/02/07 04:16:27, dsymonds wrote: > include the got/want in the error output, and make this t.Errorf+continue too. Done. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:137: // or the link name is too long, use PAX headers. On 2013/02/07 04:16:27, dsymonds wrote: > s/,/;/ Done. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:149: tw.cString(s.next(FileNameSize), suffix) I could be wrong, but I don't think we need to change the limits. cString() doesn't necessarily place a null terminator in the if the value fills the space in the struct. That's why you can have 100 character names and not 99 character names in standard tar. On 2013/02/07 04:16:27, dsymonds wrote: > if this is a cstring, don't the limits above need to be >= instead of >? https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:177: copy(header[257:265], []byte("ustar\000")) Not sure what would be appropriate. It's the ustar (POSIX 1998) magic rather than the gnu magic. We're using it because we are using ustar long names. Should I add a reference to the POSIX 1998 spec? On 2013/02/07 04:16:27, dsymonds wrote: > please add a comment explaining why this magic is different to the one a few > lines above. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:196: // writeUSTARLongName splits a USTAR long name hdr.Name On 2013/02/07 04:16:27, dsymonds wrote: > add a period, then add "name" to the start of the next line. Done. https://codereview.appspot.com/6700047/diff/65002/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:200: func (tw *Writer) splitUSTARLongName(hdr *Header) (prefix, suffix string, err error) { On 2013/02/07 04:16:27, dsymonds wrote: > make the argument "name string", and pass hdr.Name above. Done.
Sign in to reply to this message.
Hello dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
LGTM patchset #20 now looks fine to me. Let's see what other people say. https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:308: t.Errorf("Couldn't parse correctly formatted headers %v", err) still missing a colon before the " %v" https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:178: // indicating ah, I see. This is fine. drop this line and add a period to the end of the previous one.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/reader_... File src/pkg/archive/tar/reader_test.go (right): https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/reader_... src/pkg/archive/tar/reader_test.go:308: t.Errorf("Couldn't parse correctly formatted headers %v", err) On 2013/02/07 04:45:29, dsymonds wrote: > still missing a colon before the " %v" Done. https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/69010/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:178: // indicating On 2013/02/07 04:45:29, dsymonds wrote: > ah, I see. This is fine. drop this line and add a period to the end of the > previous one. Done.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:202: func (tw *Writer) splitUSTARLongName(name string) (prefix, suffix string, err error) { it looks like the method is splitting also the file names < 100 chars.. if length <= 100 { return "", name, nil } https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:207: length-- shouldn't this be length++ ? (explanation below) https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:210: nlen := length - i - 1 if name ends with a slash and the length of name is <= fileNamePrefixSize, nlen will be -2.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:202: func (tw *Writer) splitUSTARLongName(name string) (prefix, suffix string, err error) { It is, it assumes only long filenames are being passed to it. On 2013/02/07 08:03:39, chressie1 wrote: > it looks like the method is splitting also the file names < 100 chars.. > > if length <= 100 { > return "", name, nil > } https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:210: nlen := length - i - 1 You have an excellent point. I transliterated this algorithm from gnu tar since our test suite requires we be compatible with a real world tar file. Here's the original source: http://git.gag.com/?p=debian/tar;a=blob;f=src/create.c#l591 Did I make a mistake in transliterating, or do you think there's a bug in the gnu implementation? On 2013/02/07 08:03:39, chressie1 wrote: > if name ends with a slash and the length of name is <= fileNamePrefixSize, nlen > will be -2.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:210: nlen := length - i - 1 On 2013/02/07 19:50:27, shanemhansen wrote: > You have an excellent point. I transliterated this algorithm from gnu tar since > our test suite requires we be compatible with a real world tar file. Here's the > original source: > http://git.gag.com/?p=debian/tar;a=blob;f=src/create.c#l591 > > Did I make a mistake in transliterating, or do you think there's a bug in the > gnu implementation? they use length to init the for loop, so if name ends with a slash, we want to look for the 2nd last slash in effect. i think the equivalent should be something like: strings.LastIndex(name[:length], "/") https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:211: if i == 0 || nlen > fileNameSize || nlen == 0 { in addition, i think you need to check for i <= 0, because if i == -1, we have a single really long name which can't be splitted.
Sign in to reply to this message.
https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.go File src/pkg/archive/tar/writer.go (right): https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:210: nlen := length - i - 1 On 2013/02/08 07:32:10, chressie1 wrote: > On 2013/02/07 19:50:27, shanemhansen wrote: > > You have an excellent point. I transliterated this algorithm from gnu tar > since > > our test suite requires we be compatible with a real world tar file. Here's > the > > original source: > > http://git.gag.com/?p=debian/tar;a=blob;f=src/create.c#l591 > > > > Did I make a mistake in transliterating, or do you think there's a bug in the > > gnu implementation? > > they use length to init the for loop, so if name ends with a slash, we want to > look for the 2nd last slash in effect. i think the equivalent should be > something like: > > strings.LastIndex(name[:length], "/") > Done. https://codereview.appspot.com/6700047/diff/71008/src/pkg/archive/tar/writer.... src/pkg/archive/tar/writer.go:211: if i == 0 || nlen > fileNameSize || nlen == 0 { On 2013/02/08 07:32:10, chressie1 wrote: > in addition, i think you need to check for i <= 0, because if i == -1, we have a > single really long name which can't be splitted. Done.
Sign in to reply to this message.
Hello dsymonds@golang.org, dave@cheney.net, rogpeppe@gmail.com, remyoudompheng@gmail.com, chressie@gmail.com, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
LGTM thanks!
Sign in to reply to this message.
LGTM Unless anyone pounces with yet more changes to make, I'll submit this later today. Thanks for battling through this, Shane!
Sign in to reply to this message.
Thanks Shane, I think all the blood has been squeezed from this stone.
Sign in to reply to this message.
*** Submitted as https://code.google.com/p/go/source/detail?r=7ad748c63169 *** archive/tar: read/write extended pax/gnu tar archives Support reading pax archives and applying extended attributes like long names, subsecond mtime resolutions, etc. Default to writing pax archives for long file and link names. Support reading gnu archives using the ././@LongLink extended header for file name and link target. Fixes issue 3300 R=dsymonds, dave, rogpeppe, remyoudompheng, chressie, rsc CC=golang-dev https://codereview.appspot.com/6700047 Committer: David Symonds <dsymonds@golang.org>
Sign in to reply to this message.
Yay. Thanks for all the helpful feedback along the way, it was much appreciated. --Shane On Sun, Feb 10, 2013 at 5:36 PM, <dsymonds@golang.org> wrote: > *** Submitted as > https://code.google.com/p/go/**source/detail?r=7ad748c63169<https://code.goog... > > archive/tar: read/write extended pax/gnu tar archives > > Support reading pax archives and applying extended attributes > like long names, subsecond mtime resolutions, etc. Default to > writing pax archives for long file and link names. > Support reading gnu archives using the ././@LongLink extended > header for file name and link target. > > Fixes issue 3300 > > R=dsymonds, dave, rogpeppe, remyoudompheng, chressie, rsc > CC=golang-dev > https://codereview.appspot.**com/6700047<https://codereview.appspot.com/6700047> > > Committer: David Symonds <dsymonds@golang.org> > > > https://codereview.appspot.**com/6700047/<https://codereview.appspot.com/6700... >
Sign in to reply to this message.
|