|
|
Created:
10 years, 7 months ago by james.robinson Modified:
10 years, 2 months ago CC:
golang-codereviews, bradfitz, kortschak, adg, nigeltao, jamesr1 Visibility:
Public. |
Descriptioncompress/flate: add Reset() to allow reusing large buffers to compress multiple buffers
This adds a Reset() to compress/flate's decompressor and plumbs that through
to compress/zlib and compress/gzip's Readers so callers can avoid large
allocations when performing many inflate operations. In particular this
preserves the allocation of the decompressor.hist buffer, which is 32kb and
overwritten as needed while inflating.
On the benchmark described in issue 6317, produces the following speedup on
my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03
15:14:59 2014 -0700 darwin/amd64:
blocked.text w/out patch vs blocked.text w/ patch:
benchmark old ns/op new ns/op delta
BenchmarkGunzip 8371577533 7927917687 -5.30%
benchmark old allocs new allocs delta
BenchmarkGunzip 176818 148519 -16.00%
benchmark old bytes new bytes delta
BenchmarkGunzip 292184936 12739528 -95.64%
flat.text vs blocked.text w/patch:
benchmark old ns/op new ns/op delta
BenchmarkGunzip 7939447827 7927917687 -0.15%
benchmark old allocs new allocs delta
BenchmarkGunzip 90702 148519 +63.74%
benchmark old bytes new bytes delta
BenchmarkGunzip 9959528 12739528 +27.91%
Similar speedups to those bradfitz saw in https://codereview.appspot.com/13416045.
Fixes issue 6317.
Fixes issue 7950.
Patch Set 1 #Patch Set 2 : diff -r e473e77e84ff https://code.google.com/p/go #Patch Set 3 : diff -r e473e77e84ff https://code.google.com/p/go #Patch Set 4 : diff -r e473e77e84ff https://code.google.com/p/go #Patch Set 5 : diff -r 6b696a34e0af https://code.google.com/p/go #Patch Set 6 : diff -r 6b696a34e0af https://code.google.com/p/go #Patch Set 7 : diff -r 6b696a34e0af https://code.google.com/p/go #
Total comments: 1
Patch Set 8 : diff -r e1a081e6ddf8a77b267eb0d2b9747b2181524106 https://code.google.com/p/go #Patch Set 9 : diff -r e1a081e6ddf8a77b267eb0d2b9747b2181524106 https://code.google.com/p/go #
Total comments: 23
Patch Set 10 : diff -r e1a081e6ddf8a77b267eb0d2b9747b2181524106 https://code.google.com/p/go #
Total comments: 7
Patch Set 11 : diff -r e1a081e6ddf8a77b267eb0d2b9747b2181524106 https://code.google.com/p/go #
Total comments: 2
Patch Set 12 : diff -r e1a081e6ddf8a77b267eb0d2b9747b2181524106 https://code.google.com/p/go #
Total comments: 14
Patch Set 13 : diff -r e1a081e6ddf8a77b267eb0d2b9747b2181524106 https://code.google.com/p/go #Patch Set 14 : diff -r b91b31b57771240403a02a9257fbf8f0c6ea5f09 https://code.google.com/p/go #
Total comments: 6
Patch Set 15 : diff -r b91b31b57771240403a02a9257fbf8f0c6ea5f09 https://code.google.com/p/go #
MessagesTotal messages: 34
Hello golang-dev@googlegroups.com (cc: golang-codereviews@googlegroups.com, nigeltao@golang.org), I'd like you to review this change to https://code.google.com/p/go
Sign in to reply to this message.
Replacing golang-dev with golang-codereviews. To the author of this CL: If you are using 'hg mail -r golang-dev' to mail the CL, use simply 'hg mail' instead. If you did not name golang-dev explicitly and it was still added to the CL, it means your working copy of the repo has a stale codereview.cfg (or lib/codereview/codereview.cfg). Please run 'hg sync' to update your client to the most recent codereview.cfg. If the most recent codereview.cfg has defaultcc set to golang-dev instead of golang-codereviews, please send a CL correcting it. Thanks very much.
Sign in to reply to this message.
The tree is in stabilization mode until Go 1.3 is out. (We intentionally don't have a crazy branch to force everybody to work on stabilization) Please bring this up after Go 1.3 is out (June 1st-ish). But please research first related attempts like https://codereview.appspot.com/13416045/ and its history. Search the mailing list and bug trackers. On Wed, May 7, 2014 at 2:18 PM, <gobot@golang.org> wrote: > Replacing golang-dev with golang-codereviews. > > To the author of this CL: > > If you are using 'hg mail -r golang-dev' to mail the CL, use simply 'hg > mail' instead. > > If you did not name golang-dev explicitly and it was still added to the > CL, > it means your working copy of the repo has a stale codereview.cfg > (or lib/codereview/codereview.cfg). > Please run 'hg sync' to update your client to the most recent > codereview.cfg. > If the most recent codereview.cfg has defaultcc set to golang-dev > instead of > golang-codereviews, please send a CL correcting it. > > Thanks very much. > > > https://codereview.appspot.com/97140043/ > > -- > You received this message because you are subscribed to the Google Groups > "golang-codereviews" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-codereviews+unsubscribe@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. >
Sign in to reply to this message.
On 2014/05/07 22:00:37, bradfitz wrote: > > But please research first related attempts like > https://codereview.appspot.com/13416045/ and its history. Search the > mailing list and bug trackers. Thank you for the pointer - it's very helpful. https://code.google.com/p/go/issues/detail?id=6086 seems like the most relevant discussion and I totally missed that bufio.Reader had a Reset(). I'll add a Reset() to flate.Reader and see if that can be used to fix https://code.google.com/p/go/issues/detail?id=6317 as well but... > The tree is in stabilization mode until Go 1.3 is out. (We intentionally > don't have a crazy branch to force everybody to work on stabilization) > > Please bring this up after Go 1.3 is out (June 1st-ish). ... I'll do so after 1.3 is out. Cheers.
Sign in to reply to this message.
There was also some relevant discussion here https://codereview.appspot.com/13512052 Thanks for working on this. On Thu, 2014-05-08 at 00:02 +0000, jamesr.gatech@gmail.com wrote: > On 2014/05/07 22:00:37, bradfitz wrote: > > > But please research first related attempts like > > https://codereview.appspot.com/13416045/ and its history. Search the > > mailing list and bug trackers. > > Thank you for the pointer - it's very helpful. > https://code.google.com/p/go/issues/detail?id=6086 seems like the most > relevant discussion and I totally missed that bufio.Reader had a > Reset(). I'll add a Reset() to flate.Reader and see if that can be used > to fix https://code.google.com/p/go/issues/detail?id=6317 as well but... > > > The tree is in stabilization mode until Go 1.3 is out. (We > intentionally > > don't have a crazy branch to force everybody to work on stabilization) > > > Please bring this up after Go 1.3 is out (June 1st-ish). > > ... I'll do so after 1.3 is out. Cheers. > > > https://codereview.appspot.com/97140043/
Sign in to reply to this message.
Will ping again for formal review post 1.3, but since this proved promising throwing the patch up now for anyone who is curious. A Reset() does provide the speedup you might expect on this testcase and on the one in https://code.google.com/p/go/issues/detail?id=6317, but the way I've done it requires exposing lots of ugly new interfaces to expand an io.ReadCloser into something that has a Reset(). Given that bufio also has a Reset() with the same interface (except for the error, which could perhaps just always be nil for bufio.Reader.Reset()), maybe this interface could go somewhere more common like package io?
Sign in to reply to this message.
PTAL at the latest patchset. It adds a Reset() to the reader types provided by compress/flate, compress/zlib and compress/gzip. In order to do these flate.NewReader and zlib.NewReader have to return something other than io.ReadCloser. I've made them return a flate.InflateReader and *zlib.Reader, respectively, to mirror bufio.NewReader and gzip.NewReader. I think it'd be better if flate.NewReader returned a *flate.Reader, but the name flate.Reader is already taken by the interface io.Reader + io.ByteReader.
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/120001/src/pkg/archive/zip/regis... File src/pkg/archive/zip/register.go (right): https://codereview.appspot.com/97140043/diff/120001/src/pkg/archive/zip/regis... src/pkg/archive/zip/register.go:23: type Decompressor func(io.Reader) flate.InflateReader not acceptable. this is an API change. all.bash will fail the API compatibility check if you run it.
Sign in to reply to this message.
On 2014/08/04 23:56:02, bradfitz wrote: > https://codereview.appspot.com/97140043/diff/120001/src/pkg/archive/zip/regis... > File src/pkg/archive/zip/register.go (right): > > https://codereview.appspot.com/97140043/diff/120001/src/pkg/archive/zip/regis... > src/pkg/archive/zip/register.go:23: type Decompressor func(io.Reader) > flate.InflateReader > not acceptable. this is an API change. all.bash will fail the API compatibility > check if you run it. Ah yes, in all the syscall spam I missed that it was complaining legitimately about the change. I don't think I'm knowledgable about the language to know how to proceed here so I asked golang-dev@ for help. If that thread resolves with a way forward for this patch I'll do what it suggests, otherwise I'll close this out and use a fork for my project.
Sign in to reply to this message.
On 2014/08/06 18:10:30, james.robinson wrote: > Ah yes, in all the syscall spam I missed that it was complaining legitimately > about the change. I don't think I'm knowledgable about the language to know how > to proceed here so I asked golang-dev@ for help. If that thread resolves with a > way forward for this patch I'll do what it suggests, otherwise I'll close this > out and use a fork for my project. You can just return the io.ReadCloser and advertise in the comments that it satisfies interface{ Reset(r io.Reader) error }. Then it is the client's option to conditionally assert the type and reset if possible. It's not wildly satisfying, but it seems like the best that can be done given the compatibility guarantee.
Sign in to reply to this message.
PTAL and sorry for the extreme delay. This patch adds a new interface flate.Resettable with Reset() and ResetDict() functions and advertises (via comment) that the thing returned by flate.NewReader()/flate.NewReaderDict() implements the flate.Resettable interface. compress/gunzip/ and compress/zlib/ use unconditional type assertions to use reset since they control the allocation of their decompressor objects.
Sign in to reply to this message.
Hi, this will have to wait until after December 1 when the release freeze is over. Please update this thread at that time. On 28 September 2014 06:29, <jamesr.gatech@gmail.com> wrote: > PTAL and sorry for the extreme delay. This patch adds a new interface > flate.Resettable with Reset() and ResetDict() functions and advertises > (via comment) that the thing returned by > flate.NewReader()/flate.NewReaderDict() implements the flate.Resettable > interface. compress/gunzip/ and compress/zlib/ use unconditional type > assertions to use reset since they control the allocation of their > decompressor objects. > > https://codereview.appspot.com/97140043/ > > -- > You received this message because you are subscribed to the Google Groups > "golang-codereviews" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-codereviews+unsubscribe@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. >
Sign in to reply to this message.
I thought there were open bugs for this? On Sun, Sep 28, 2014 at 2:00 AM, Andrew Gerrand <adg@golang.org> wrote: > Hi, this will have to wait until after December 1 when the release freeze > is over. Please update this thread at that time. > > On 28 September 2014 06:29, <jamesr.gatech@gmail.com> wrote: > >> PTAL and sorry for the extreme delay. This patch adds a new interface >> flate.Resettable with Reset() and ResetDict() functions and advertises >> (via comment) that the thing returned by >> flate.NewReader()/flate.NewReaderDict() implements the flate.Resettable >> interface. compress/gunzip/ and compress/zlib/ use unconditional type >> assertions to use reset since they control the allocation of their >> decompressor objects. >> >> https://codereview.appspot.com/97140043/ >> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "golang-codereviews" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to golang-codereviews+unsubscribe@googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. >> > >
Sign in to reply to this message.
On 2014/09/28 16:42:25, bradfitz wrote: > I thought there were open bugs for this? > There are, https://code.google.com/p/go/issues/detail?id=6317 and https://code.google.com/p/go/issues/detail?id=7950. The former is marked Priority-Later / Release-None, the latter is marked Release-Go1.4. > > On Sun, Sep 28, 2014 at 2:00 AM, Andrew Gerrand <mailto:adg@golang.org> wrote: > > > Hi, this will have to wait until after December 1 when the release freeze > > is over. Please update this thread at that time.
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... File src/compress/flate/inflate.go (right): https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:59: // This interface allows resetting a reader to read from a new stream s/This interface/<Type>/ Comment on Go1 promise probably not necessary. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:65: type Resettable interface { Resetter. It would simplify things if you had just Reset(r io.Reader, dict []byte) or ResetDict(r io.Reader, dict []byte) (whichever and with appropriate interface name) and only set the dictionary if dict != nil. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:68: func NewReaderEx(r io.Reader) (*Reader, error) { Add doc comment. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:72: func NewReaderDictEx(r io.Reader, dict []byte) (*Reader, error) { Add doc comment. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:81: func (z *Reader) Read(p []byte) (n int, err error) { Add doc comment. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:142: if z.decompressor == nil { If Reset and ResetDict are merged into a single method, this move out of the outer conditional and you either set dict to nil when z.scratch[1]&0x20 == 0, or leave it as the reusable dictionary. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:158: func (z *Reader) Reset(r io.Reader) error { Add doc comment.
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... File src/compress/flate/inflate.go (right): https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:59: // This interface allows resetting a reader to read from a new stream On 2014/09/28 23:33:55, kortschak wrote: > s/This interface/<Type>/ > > Comment on Go1 promise probably not necessary. Yes. Perhaps: // Resetter resets a ReadCloser returned by NewReader or NewReaderDict to // to switch to a new underlying Reader while reusing internal buffers. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:65: type Resettable interface { On 2014/09/28 23:33:55, kortschak wrote: > Resetter. > > It would simplify things if you had just Reset(r io.Reader, dict []byte) or > ResetDict(r io.Reader, dict []byte) (whichever and with appropriate interface > name) and only set the dictionary if dict != nil. Yes. This. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:714: // NewReader returns a new io.ReadCloser that can be used drop the "io." addition https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:716: // responsibility to call Close on the io.ReadCloser when likewise https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:719: // The io.ReadCloser returned by NewReader also implements // The returned ReadCloser also implements Resetter. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:737: // The io.ReadCloser returned by NewReader also implements // The returned ReadCloser also implements Resetter.
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:68: func NewReaderEx(r io.Reader) (*Reader, error) { we're not adding APIs ending in "Ex".
Sign in to reply to this message.
Thanks for all the review comments, as you can tell I'm still learning things here. PTAL. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... File src/compress/flate/inflate.go (right): https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:59: // This interface allows resetting a reader to read from a new stream On 2014/09/29 18:39:49, bradfitz wrote: > On 2014/09/28 23:33:55, kortschak wrote: > > s/This interface/<Type>/ > > > > Comment on Go1 promise probably not necessary. > > Yes. > > Perhaps: > > // Resetter resets a ReadCloser returned by NewReader or NewReaderDict to > // to switch to a new underlying Reader while reusing internal buffers. Done. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:65: type Resettable interface { On 2014/09/29 18:39:49, bradfitz wrote: > On 2014/09/28 23:33:55, kortschak wrote: > > Resetter. > > > > It would simplify things if you had just Reset(r io.Reader, dict []byte) or > > ResetDict(r io.Reader, dict []byte) (whichever and with appropriate interface > > name) and only set the dictionary if dict != nil. > > Yes. This. I didn't do this because it would be inconsistent with NewReader() and NewReaderDict() and feels a little bit like default argument values, but could add it if it's preferred. How about // Reset discards any buffered data and resets the Resettable as if it was newly initialized with // the given reader and, if |dict| is not null, the given dictionary. Reset(r io.Reader, dict []byte) ? For personal knowledge, if this whole interface was being designed today do you think that there would just be one NewReader() function that accepted a possibly-null dict ? https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:714: // NewReader returns a new io.ReadCloser that can be used On 2014/09/29 18:39:49, bradfitz wrote: > drop the "io." addition Done. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:716: // responsibility to call Close on the io.ReadCloser when On 2014/09/29 18:39:49, bradfitz wrote: > likewise Done. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:719: // The io.ReadCloser returned by NewReader also implements On 2014/09/29 18:39:50, bradfitz wrote: > // The returned ReadCloser also implements Resetter. Done. https://codereview.appspot.com/97140043/diff/160001/src/compress/flate/inflat... src/compress/flate/inflate.go:737: // The io.ReadCloser returned by NewReader also implements On 2014/09/29 18:39:50, bradfitz wrote: > // The returned ReadCloser also implements Resetter. Done. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:68: func NewReaderEx(r io.Reader) (*Reader, error) { On 2014/09/29 18:41:16, bradfitz wrote: > we're not adding APIs ending in "Ex". Sorry, meant to update this to use the same pattern as compress/flate/ but forgot to do so. Will update. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:142: if z.decompressor == nil { On 2014/09/28 23:33:55, kortschak wrote: > If Reset and ResetDict are merged into a single method, this move out of the > outer conditional and you either set dict to nil when z.scratch[1]&0x20 == 0, or > leave it as the reusable dictionary. Done. https://codereview.appspot.com/97140043/diff/160001/src/compress/zlib/reader.... src/compress/zlib/reader.go:158: func (z *Reader) Reset(r io.Reader) error { On 2014/09/28 23:33:55, kortschak wrote: > Add doc comment. This function is no longer (directly) public, but the Resettable interface is documented.
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/180001/src/compress/flate/inflat... File src/compress/flate/inflate.go (right): https://codereview.appspot.com/97140043/diff/180001/src/compress/flate/inflat... src/compress/flate/inflate.go:62: type Resettable interface { s/Resettable/Resetter/g https://codereview.appspot.com/97140043/diff/180001/src/compress/gzip/gunzip.go File src/compress/gzip/gunzip.go (right): https://codereview.appspot.com/97140043/diff/180001/src/compress/gzip/gunzip.... src/compress/gzip/gunzip.go:213: z.decompressor.(flate.Resettable).Reset(z.r, nil) s/Resettable/Resetter/ https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.... src/compress/zlib/reader.go:54: // Resetter resets a ReadCloser returned by NewReader or NewReaderDict to Why is this here? You have this interface defined in compress/flate. https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.... src/compress/zlib/reader.go:156: z.decompressor.(flate.Resettable).Reset(z.r, dict) s/Resettable/Resetter/
Sign in to reply to this message.
I've updated everything to 'Resetter', PTAL. https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.... src/compress/zlib/reader.go:54: // Resetter resets a ReadCloser returned by NewReader or NewReaderDict to On 2014/09/30 12:09:44, kortschak wrote: > Why is this here? You have this interface defined in compress/flate. Actually no, the interface in compress/flate does not return an error (since there's no error on initialization for compress/flate). That aside, I think (nearly) duplicating this interface makes sense for the same reason that the compression constants are duplicated: > These constants are copied from the flate package, so that code that imports "compress/zlib" does not also have to import "compress/flate". http://golang.org/pkg/compress/zlib/#pkg-constants
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.... src/compress/zlib/reader.go:54: // Resetter resets a ReadCloser returned by NewReader or NewReaderDict to Sorry, I missed that. It seems this then is space for confusion - maybe just provide a single interface that returns an error (nil in the flate case). With the constants, they are the same, here the names are the same and the signatures are similar - this seems risky to me from a code comprehension perspective. On 2014/10/03 04:37:52, james.robinson wrote: > On 2014/09/30 12:09:44, kortschak wrote: > > Why is this here? You have this interface defined in compress/flate. > > Actually no, the interface in compress/flate does not return an error (since > there's no error on initialization for compress/flate). That aside, I think > (nearly) duplicating this interface makes sense for the same reason that the > compression constants are duplicated: > > > These constants are copied from the flate package, so that code that imports > "compress/zlib" does not also have to import "compress/flate". > > http://golang.org/pkg/compress/zlib/#pkg-constants https://codereview.appspot.com/97140043/diff/200001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/200001/src/compress/zlib/reader.... src/compress/zlib/reader.go:162: func (z *reader) Reset(r io.Reader, dict []byte) error { I missed this before, but it seems to me that this is redundant if you rename (*reader).reset to Reset.
Sign in to reply to this message.
Brad - Dan and I are starting to go back and forth about which of two reasonable designs makes more sense. What would you suggest? The debate is about whether to reuse a similar interface across compress/flate and compress/zlib or provide independent-but-similar interfaces in each. https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/180001/src/compress/zlib/reader.... src/compress/zlib/reader.go:54: // Resetter resets a ReadCloser returned by NewReader or NewReaderDict to On 2014/10/03 04:53:52, kortschak wrote: > Sorry, I missed that. It seems this then is space for confusion - maybe just > provide a single interface that returns an error (nil in the flate case). > > With the constants, they are the same, here the names are the same and the > signatures are similar - this seems risky to me from a code comprehension > perspective. > That's true for folks reading both compress/flate/ and compress/zlib/, but I suspect most developers are going to have one of a flate or zlib compressed buffers and just want to read one interface definition to figure out how to process it. The fact that compress/zlib/ uses compress/flate/ is an implementation detail, not something that's exposed in the public interface or types. Having compress/zlib/ be standalone introduces some duplication but makes life easier for users of compress/zlib/. > > > These constants are copied from the flate package, so that code that > imports > > "compress/zlib" does not also have to import "compress/flate". > > > > http://golang.org/pkg/compress/zlib/#pkg-constants > https://codereview.appspot.com/97140043/diff/200001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/200001/src/compress/zlib/reader.... src/compress/zlib/reader.go:162: func (z *reader) Reset(r io.Reader, dict []byte) error { On 2014/10/03 04:53:52, kortschak wrote: > I missed this before, but it seems to me that this is redundant if you rename > (*reader).reset to Reset. Oh, right you are. Fixed.
Sign in to reply to this message.
Ping
Sign in to reply to this message.
Sorry, I can't look at this for another couple days at least. Perhaps Nigel can in the meantime. On Thu, Oct 9, 2014 at 6:47 AM, <jamesr.gatech@gmail.com> wrote: > Ping > > https://codereview.appspot.com/97140043/ >
Sign in to reply to this message.
I haven't been following this code review too closely, as there already seemed to be a lot of reviewers, and too many cooks spoil the broth. I haven't read the previous discussions. Is there anything in particular that I should focus on? In the meantime, here's a bunch of stylistic suggestions (apologies if they contradict previous suggestions by other reviewers) and some more substantial concerns over the benchmarking. https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... File src/compress/flate/inflate_test.go (right): https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:14: d := []string{"lorem ipsum izzle fo rizzle", I'd line-break it so that the sibling elements are at equal indentation: ss := []string{ "lorem ipsum izzle fo rizzle", "the quick brown fox jumped over", } https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:18: for i := range d { I'd write it like this: for i, s := range ss { w, _ := NewWriter(&deflated[i], 1) w.Write([]byte(s)) w.Close() } https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:32: for i := range d { I'd write: for i, s := range ss { https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:34: t.Errorf("inflated[%d] was %s, want %s", i, inflated[i], d[i]) I'd write: t.Errorf("inflated[%d]:\ngot %q\nwant %q", i, inflated[i], s) https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... File src/compress/gzip/gunzip_test.go (right): https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... src/compress/gzip/gunzip_test.go:373: var file = flag.String("gz.file", "", "file to gunzip and discard") The flag means that I can't reproduce your benchmark result on my machine. I'd rather hard-code a path to ../testdata/something instead of taking a flag. Actually, I'd rather not add the benchmark code at all. It just seems broken. (See below.) https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... src/compress/gzip/gunzip_test.go:375: func BenchmarkGunzip(b *testing.B) { This isn't how benchmarks work. http://golang.org/pkg/testing/ says that "The benchmark function must run the target code b.N times". https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... src/compress/gzip/gunzip_test.go:379: f, err := os.Open(*file) I'd expect the time taken to be dominated by disk performance (and whether or not the file is cached in the kernel buffers, which changes from benchmark run to benchmark run), rather than flate performance. If you really want to benchmark flate performance, read the entire file into memory first, then call b.ResetTimer. https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... src/compress/gzip/gunzip_test.go:384: gz, err := NewReader(f) Is this even interesting to add this benchmark in this changelist, given that you don't call Reset at all?
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... File src/compress/gzip/gunzip_test.go (right): https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... src/compress/gzip/gunzip_test.go:381: fmt.Fprintln(os.Stderr, err) b.Fatal(err)
Sign in to reply to this message.
Thanks Nigel. The main issue that I think needs resolution is about whether to add a type Resetter interface { } to both compress/flate/ and compress/zlib/ (which is what this patch does) or to try to reuse an interface definition in some way. The two interfaces in this patch differ in that compress/flate/ can't fail with an error while compress/zlib/'s version can. In the go standard library, compress/gzip/ and bufio/ both expose a Reset() function for their own Reader types, but the readers in compress/flate/ and compress/zlib/ are just io.ReadCloser instances and not structs that we can add new functions for. My preference is to add the two new interfaces so that users of compress/zlib/ do not need to use anything from compress/flate/, but Dan expressed a preference for the opposite. I'm a newbie at this language and this seems like a standard library design taste question. https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... File src/compress/flate/inflate_test.go (right): https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:14: d := []string{"lorem ipsum izzle fo rizzle", On 2014/10/10 02:38:10, nigeltao wrote: > I'd line-break it so that the sibling elements are at equal indentation: > > ss := []string{ > "lorem ipsum izzle fo rizzle", > "the quick brown fox jumped over", > } Done. https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:18: for i := range d { On 2014/10/10 02:38:10, nigeltao wrote: > I'd write it like this: > > for i, s := range ss { > w, _ := NewWriter(&deflated[i], 1) > w.Write([]byte(s)) > w.Close() > } Done. https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:32: for i := range d { On 2014/10/10 02:38:10, nigeltao wrote: > I'd write: > for i, s := range ss { Done. https://codereview.appspot.com/97140043/diff/220001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:34: t.Errorf("inflated[%d] was %s, want %s", i, inflated[i], d[i]) On 2014/10/10 02:38:10, nigeltao wrote: > I'd write: > t.Errorf("inflated[%d]:\ngot %q\nwant %q", i, inflated[i], s) I see, that way the output will line up in the case of errors. Done. https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... File src/compress/gzip/gunzip_test.go (right): https://codereview.appspot.com/97140043/diff/220001/src/compress/gzip/gunzip_... src/compress/gzip/gunzip_test.go:384: gz, err := NewReader(f) On 2014/10/10 02:38:10, nigeltao wrote: > Is this even interesting to add this benchmark in this changelist, given that > you don't call Reset at all? This benchmark is derived from https://code.google.com/p/go/issues/detail?id=6317, which is about the cost of inflating an archived compressed into many separate blocks. gunzip.go internally resets itself many times while inflating the buffer: http://golang.org/src/pkg/compress/gzip/gunzip.go#L242 this patch allows reusing the underlying flate decompressor instead of reallocating it, which produces the speedup. Without this patch every new header in the archive allocates a fresh 40kb chunk of memory which means the program spends a large amount of time in the garbage collector instead of actually pushing bytes around. I found this benchmark useful when writing this patch to validate that it actually fixed the allocations observed in https://code.google.com/p/go/issues/detail?id=6317 but don't think it is necessarily of a lot of value checked in. The test data for this is really big.
Sign in to reply to this message.
I'm OK with having two separate but equivalent interfaces in two separate packages. I'd like their method set to match, though. https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... File src/compress/flate/inflate.go (right): https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... src/compress/flate/inflate.go:65: Reset(r io.Reader, dict []byte) I'd rather that the two interfaces have the same signatures, even if they're different types (in different packages), so add an "error" at the end here, and below, add a "return nil". https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... File src/compress/flate/inflate_test.go (right): https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:16: "the quick brown fox jumped over"} I'd still add a comma and a new-line after the final ". https://codereview.appspot.com/97140043/diff/260001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/260001/src/compress/zlib/reader.... src/compress/zlib/reader.go:58: // Reset discards any buffered data and attempts to reset the Resetter as s/attempts to reset/resets/ and then move the "as" to the next line to match compress/flate's Resetter.
Sign in to reply to this message.
https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... File src/compress/flate/inflate.go (right): https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... src/compress/flate/inflate.go:65: Reset(r io.Reader, dict []byte) On 2014/10/13 07:12:57, nigeltao wrote: > I'd rather that the two interfaces have the same signatures, even if they're > different types (in different packages), so add an "error" at the end here, and > below, add a "return nil". OK, added an error. The NewReader interfaces on flate and zlib are different, FWIW. https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... File src/compress/flate/inflate_test.go (right): https://codereview.appspot.com/97140043/diff/260001/src/compress/flate/inflat... src/compress/flate/inflate_test.go:16: "the quick brown fox jumped over"} On 2014/10/13 07:12:57, nigeltao wrote: > I'd still add a comma and a new-line after the final ". Done. https://codereview.appspot.com/97140043/diff/260001/src/compress/zlib/reader.go File src/compress/zlib/reader.go (right): https://codereview.appspot.com/97140043/diff/260001/src/compress/zlib/reader.... src/compress/zlib/reader.go:58: // Reset discards any buffered data and attempts to reset the Resetter as On 2014/10/13 07:12:57, nigeltao wrote: > s/attempts to reset/resets/ and then move the "as" to the next line to match > compress/flate's Resetter. Fixed and rewrapped this comment in both places
Sign in to reply to this message.
LGTM. Have you (or your organization) submitted a contributor license agreement yet? https://golang.org/doc/contribute.html#copyright
Sign in to reply to this message.
I'm a google employee, so the copyright is already google's. I started on this work while on leave, hence the personal email address.
Sign in to reply to this message.
On 2014/10/17 06:44:51, james.robinson wrote: > I'm a google employee, so the copyright is already google's. I started on this > work while on leave, hence the personal email address. ^^ is me
Sign in to reply to this message.
*** Submitted as https://code.google.com/p/go/source/detail?r=e1e2e3c3cdb0 *** compress/flate: add Reset() to allow reusing large buffers to compress multiple buffers This adds a Reset() to compress/flate's decompressor and plumbs that through to compress/zlib and compress/gzip's Readers so callers can avoid large allocations when performing many inflate operations. In particular this preserves the allocation of the decompressor.hist buffer, which is 32kb and overwritten as needed while inflating. On the benchmark described in issue 6317, produces the following speedup on my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03 15:14:59 2014 -0700 darwin/amd64: blocked.text w/out patch vs blocked.text w/ patch: benchmark old ns/op new ns/op delta BenchmarkGunzip 8371577533 7927917687 -5.30% benchmark old allocs new allocs delta BenchmarkGunzip 176818 148519 -16.00% benchmark old bytes new bytes delta BenchmarkGunzip 292184936 12739528 -95.64% flat.text vs blocked.text w/patch: benchmark old ns/op new ns/op delta BenchmarkGunzip 7939447827 7927917687 -0.15% benchmark old allocs new allocs delta BenchmarkGunzip 90702 148519 +63.74% benchmark old bytes new bytes delta BenchmarkGunzip 9959528 12739528 +27.91% Similar speedups to those bradfitz saw in https://codereview.appspot.com/13416045. Fixes issue 6317. Fixes issue 7950. LGTM=nigeltao R=golang-codereviews, bradfitz, dan.kortschak, adg, nigeltao, jamesr CC=golang-codereviews https://codereview.appspot.com/97140043 Committer: Nigel Tao <nigeltao@golang.org>
Sign in to reply to this message.
|