|
|
Descriptionio: reuse Copy buffers beween copies
Patch Set 1 #Patch Set 2 : diff -r 606430dd2eca https://go.googlecode.com/hg/ #Patch Set 3 : diff -r 606430dd2eca https://go.googlecode.com/hg/ #
Total comments: 5
MessagesTotal messages: 21
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
On 2013/01/24 00:00:09, bradfitz wrote: > Hello mailto:golang-dev@googlegroups.com, > > I'd like you to review this change to > https://go.googlecode.com/hg/ Hi Brad. Thanks your for this proposal. I would like to see the magic numbers of the buffer size and channel size made constants for easier tweaking.
Sign in to reply to this message.
> Hi Brad. Thanks your for this proposal. I would like to see the magic numbers of > the buffer size and channel size made constants for easier tweaking. Urgh, I mean, *thank you* for this proposal.
Sign in to reply to this message.
https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go File src/pkg/io/io.go (right): https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go#newcode349 src/pkg/io/io.go:349: return make([]byte, 32*1024) Would you please make this a constant just below var copyBufs https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io_test.go File src/pkg/io/io_test.go (right): https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io_test.go#newcod... src/pkg/io/io_test.go:251: var dst = struct{ Writer }{ioutil.Discard} nice
Sign in to reply to this message.
This is mostly just for discussion. I'm not even sure how I feel about it, since Readers and Writers could illegally retain references to the provided Read/Write []byte arguments, causing mysterious corruption later. (although arguably they were already broken before, since the second 32KB chunk would then be corrupt... but with this CL the corruption would last longer. Maybe that's a good thing so it goes noticed?) Static escape analysis can't help here, since Read and Write go through interfaces and who knows what impl you'll get. Maybe we should also document more on Reader and Writer that the impl must not retain references to those. In any case, I saw this on dl.google.com memory profiles a few months ago and made a note to myself to do this, but didn't get around to it until now. Thoughts welcome. On Wed, Jan 23, 2013 at 4:00 PM, <bradfitz@golang.org> wrote: > Reviewers: golang-dev_googlegroups.com, > > Message: > Hello golang-dev@googlegroups.com, > > I'd like you to review this change to > https://go.googlecode.com/hg/ > > > Description: > io: reuse Copy buffers beween copies > > Please review this at https://codereview.appspot.**com/7206048/<https://codereview.appspot.com/7206... > > Affected files: > M src/pkg/io/io.go > M src/pkg/io/io_test.go > > > Index: src/pkg/io/io.go > ==============================**==============================**======= > --- a/src/pkg/io/io.go > +++ b/src/pkg/io/io.go > @@ -338,6 +338,24 @@ > return written, err > } > > +var copyBufs = make(chan []byte, 4) > + > +func getCopyBuf() []byte { > + select { > + case b := <-copyBufs: > + return b > + default: > + } > + return make([]byte, 32*1024) > +} > + > +func putCopyBuf(b []byte) { > + select { > + case copyBufs <- b: > + default: > + } > +} > + > // Copy copies from src to dst until either EOF is reached > // on src or an error occurs. It returns the number of bytes > // copied and the first error encountered while copying, if any. > @@ -360,7 +378,8 @@ > if wt, ok := src.(WriterTo); ok { > return wt.WriteTo(dst) > } > - buf := make([]byte, 32*1024) > + buf := getCopyBuf() > + defer putCopyBuf(buf) > for { > nr, er := src.Read(buf) > if nr > 0 { > Index: src/pkg/io/io_test.go > ==============================**==============================**======= > --- a/src/pkg/io/io_test.go > +++ b/src/pkg/io/io_test.go > @@ -7,6 +7,7 @@ > import ( > "bytes" > . "io" > + "io/ioutil" > "strings" > "testing" > ) > @@ -235,3 +236,20 @@ > } > } > } > + > +type neverending byte > + > +func (b neverending) Read(p []byte) (n int, err error) { > + for i := 0; i < len(p); i++ { > + p[i] = byte(b) > + } > + return len(p), nil > +} > + > +func BenchmarkCopy(b *testing.B) { > + b.ReportAllocs() > + var dst = struct{ Writer }{ioutil.Discard} > + for i := 0; i < b.N; i++ { > + Copy(dst, LimitReader(neverending('a'), 128<<10)) > + } > +} > > >
Sign in to reply to this message.
I think it's implied that you shouldn't hold references to the []byte, by the existing Reader and Writer interface definitions. I'm not too worried about potential corruption. Just wondering if the channel operation will ever be slower than the allocation, particularly in a program with many io.Copy operations happening simultaneously. https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go File src/pkg/io/io.go (right): https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go#newcode341 src/pkg/io/io.go:341: var copyBufs = make(chan []byte, 4) from where comes the magical 4?
Sign in to reply to this message.
I understand your concerns. I wonder if it would be better to leave io.Copy as it is and dl.google copy that method and implement its own buffer caching. On Thu, Jan 24, 2013 at 11:12 AM, Brad Fitzpatrick <bradfitz@golang.org> wrote: > This is mostly just for discussion. > > I'm not even sure how I feel about it, since Readers and Writers could > illegally retain references to the provided Read/Write []byte arguments, > causing mysterious corruption later. (although arguably they were already > broken before, since the second 32KB chunk would then be corrupt... but with > this CL the corruption would last longer. Maybe that's a good thing so it > goes noticed?) > > Static escape analysis can't help here, since Read and Write go through > interfaces and who knows what impl you'll get. > > Maybe we should also document more on Reader and Writer that the impl must > not retain references to those. > > In any case, I saw this on dl.google.com memory profiles a few months ago > and made a note to myself to do this, but didn't get around to it until now. > > Thoughts welcome. > > On Wed, Jan 23, 2013 at 4:00 PM, <bradfitz@golang.org> wrote: >> >> Reviewers: golang-dev_googlegroups.com, >> >> Message: >> Hello golang-dev@googlegroups.com, >> >> I'd like you to review this change to >> https://go.googlecode.com/hg/ >> >> >> Description: >> io: reuse Copy buffers beween copies >> >> Please review this at https://codereview.appspot.com/7206048/ >> >> Affected files: >> M src/pkg/io/io.go >> M src/pkg/io/io_test.go >> >> >> Index: src/pkg/io/io.go >> =================================================================== >> --- a/src/pkg/io/io.go >> +++ b/src/pkg/io/io.go >> @@ -338,6 +338,24 @@ >> return written, err >> } >> >> +var copyBufs = make(chan []byte, 4) >> + >> +func getCopyBuf() []byte { >> + select { >> + case b := <-copyBufs: >> + return b >> + default: >> + } >> + return make([]byte, 32*1024) >> +} >> + >> +func putCopyBuf(b []byte) { >> + select { >> + case copyBufs <- b: >> + default: >> + } >> +} >> + >> // Copy copies from src to dst until either EOF is reached >> // on src or an error occurs. It returns the number of bytes >> // copied and the first error encountered while copying, if any. >> @@ -360,7 +378,8 @@ >> if wt, ok := src.(WriterTo); ok { >> return wt.WriteTo(dst) >> } >> - buf := make([]byte, 32*1024) >> + buf := getCopyBuf() >> + defer putCopyBuf(buf) >> for { >> nr, er := src.Read(buf) >> if nr > 0 { >> Index: src/pkg/io/io_test.go >> =================================================================== >> --- a/src/pkg/io/io_test.go >> +++ b/src/pkg/io/io_test.go >> @@ -7,6 +7,7 @@ >> import ( >> "bytes" >> . "io" >> + "io/ioutil" >> "strings" >> "testing" >> ) >> @@ -235,3 +236,20 @@ >> } >> } >> } >> + >> +type neverending byte >> + >> +func (b neverending) Read(p []byte) (n int, err error) { >> + for i := 0; i < len(p); i++ { >> + p[i] = byte(b) >> + } >> + return len(p), nil >> +} >> + >> +func BenchmarkCopy(b *testing.B) { >> + b.ReportAllocs() >> + var dst = struct{ Writer }{ioutil.Discard} >> + for i := 0; i < b.N; i++ { >> + Copy(dst, LimitReader(neverending('a'), 128<<10)) >> + } >> +} >> >> >
Sign in to reply to this message.
On Thu, Jan 24, 2013 at 11:17 AM, Dave Cheney <dave@cheney.net> wrote: > I understand your concerns. I wonder if it would be better to leave > io.Copy as it is and dl.google copy that method and implement its own > buffer caching. That might not be possible if the io.Copy call is in another part of the standard library (e.g. net/http).
Sign in to reply to this message.
On Wed, Jan 23, 2013 at 4:17 PM, Dave Cheney <dave@cheney.net> wrote: > I understand your concerns. I wonder if it would be better to leave > io.Copy as it is and dl.google copy that method and implement its own > buffer caching. > grep io.Copy in net/http and the rest of the standard library.
Sign in to reply to this message.
Yeah, fair call. Would implementing ReadFrom, WriteTo help, or has that path already been investigated ? I forget which types grew support for those interfaces. On Thu, Jan 24, 2013 at 11:19 AM, Brad Fitzpatrick <bradfitz@golang.org> wrote: > On Wed, Jan 23, 2013 at 4:17 PM, Dave Cheney <dave@cheney.net> wrote: >> >> I understand your concerns. I wonder if it would be better to leave >> io.Copy as it is and dl.google copy that method and implement its own >> buffer caching. > > > grep io.Copy in net/http and the rest of the standard library. >
Sign in to reply to this message.
https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go File src/pkg/io/io.go (right): https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go#newcode341 src/pkg/io/io.go:341: var copyBufs = make(chan []byte, 4) On 2013/01/24 00:17:02, adg wrote: > from where comes the magical 4? Made it up. Like all the other constants in this CL. Ideally I'd like it to be burstable somewhat (i.e. not just 1 here), but also slowly drain if idle (but I haven't profiled that cost, maintaining those timers. Reset probably helps, but still have to update that heap (the other type of heap)) https://codereview.appspot.com/7206048/diff/5001/src/pkg/io/io.go#newcode349 src/pkg/io/io.go:349: return make([]byte, 32*1024) On 2013/01/24 00:06:20, dfc wrote: > Would you please make this a constant just below var copyBufs will do, if we decide to go ahead with this.
Sign in to reply to this message.
On Wed, Jan 23, 2013 at 4:22 PM, Dave Cheney <dave@cheney.net> wrote: > Yeah, fair call. Would implementing ReadFrom, WriteTo help, or has > that path already been investigated ? Yes.
Sign in to reply to this message.
On 24 January 2013 11:23, <bradfitz@golang.org> wrote: > Ideally I'd like it to be burstable somewhat (i.e. not just 1 here), but > also slowly drain if idle (but I haven't profiled that cost, maintaining > those timers. Reset probably helps, but still have to update that heap > (the other type of heap)) > It could probably be done reasonably cheaply if you do the accounting on each get/put.
Sign in to reply to this message.
the specificity bothers me. why just here? there are reasons to keep it all private but if the goal is to reduce GC pressure maybe it should be a service more generally available. or maybe the collector should not need this sort of workaround in the first place. there is similar nonsense in fmt for instance, and it is distasteful but important. maybe the idea should be generalized or maybe rendered unnecessary. I would like to have a strategy discussion before committing more such tactical tweaks. also some benchmarks? doesn't the allocator already cache like this? what are the issues about caching that is not locality-aware? etc. in short, should this be done at all, and if so, where? -rob On Jan 24, 2013 1:28 AM, "Andrew Gerrand" <adg@golang.org> wrote: > > On 24 January 2013 11:23, <bradfitz@golang.org> wrote: > >> Ideally I'd like it to be burstable somewhat (i.e. not just 1 here), but >> also slowly drain if idle (but I haven't profiled that cost, maintaining >> those timers. Reset probably helps, but still have to update that heap >> (the other type of heap)) >> > > It could probably be done reasonably cheaply if you do the accounting on > each get/put. >
Sign in to reply to this message.
On Wed, Jan 23, 2013 at 9:48 PM, Rob Pike <r@golang.org> wrote: > the specificity bothers me. why just here? > Because this is frequently called and allocates pretty large buffers (32KB). The more bytes you allocate, the sooner that leads to a GC event. > there are reasons to keep it all private but if the goal is to reduce GC > pressure maybe it should be a service more generally available. > I have a byte allocator package inside Google with an API like: package pool func NewBytePool(...opts...) *BytePool func (*BytePool) Alloc(n int) []byte func (*BytePool) Free(buf []byte) If that's not objectionable, we could add that to the standard library somewhere. Personally I wish we could have a private packages that the standard library could use but which we didn't need to commit to being public long-term. This would be such a candidate, so fmt and io and http and everybody didn't have to re-invent this, often poorly. > or maybe the collector should not need this sort of workaround in the > first place. > The collector doesn't even run until N bytes are allocated, where N is the size of the heap at the last collection. How would the collector avoid this if the collector isn't involved until it's already too late? The goal is to avoid allocation in the first place, and make([]byte...) allocates. > there is similar nonsense in fmt for instance, and it is distasteful but > important. maybe the idea should be generalized or maybe rendered > unnecessary. I would like to have a strategy discussion before committing > more such tactical tweaks. > > also some benchmarks? > Oh, I forgot to paste them into the commit message. It went from 33200 bytes/op to 50 bytes/op. So before it could do 32K operations before it allocated 1 GB of memory. Now it can do 21 million. Which means many fewer GC events and thus fewer pauses (which is the goal) and less CPU (doing GCs). > doesn't the allocator already cache like this? > The allocator has per-size pools like tcmalloc, but it doesn't know immediately when the 32 KB buf is no longer needed and could be re-used. It doesn't discover that and re-pool the memory until a GC occurs, which is too late. > what are the issues about caching that is not locality-aware? etc. in > short, should this be done at all, and if so, where? > I'd like to see a general allocation pool package so we can move all the fmt/http/io/etc stuff into a common place. It could have a specialized API for []byte (a common case, as above) and maybe also for interface{}s (for pools of variables, e.g. *MyStruct to be reused).
Sign in to reply to this message.
Per request from rsc, filed https://code.google.com/p/go/issues/detail?id=4720 for discussion. On Wed, Jan 23, 2013 at 10:51 PM, Brad Fitzpatrick <bradfitz@golang.org>wrote: > On Wed, Jan 23, 2013 at 9:48 PM, Rob Pike <r@golang.org> wrote: > >> the specificity bothers me. why just here? >> > Because this is frequently called and allocates pretty large buffers > (32KB). The more bytes you allocate, the sooner that leads to a GC event. > >> there are reasons to keep it all private but if the goal is to reduce GC >> pressure maybe it should be a service more generally available. >> > I have a byte allocator package inside Google with an API like: > > package pool > func NewBytePool(...opts...) *BytePool > func (*BytePool) Alloc(n int) []byte > func (*BytePool) Free(buf []byte) > > If that's not objectionable, we could add that to the standard library > somewhere. > > Personally I wish we could have a private packages that the standard > library could use but which we didn't need to commit to being public > long-term. This would be such a candidate, so fmt and io and http and > everybody didn't have to re-invent this, often poorly. > >> or maybe the collector should not need this sort of workaround in the >> first place. >> > The collector doesn't even run until N bytes are allocated, where N is the > size of the heap at the last collection. > > How would the collector avoid this if the collector isn't involved until > it's already too late? The goal is to avoid allocation in the first place, > and make([]byte...) allocates. > >> there is similar nonsense in fmt for instance, and it is distasteful but >> important. maybe the idea should be generalized or maybe rendered >> unnecessary. I would like to have a strategy discussion before committing >> more such tactical tweaks. >> >> also some benchmarks? >> > Oh, I forgot to paste them into the commit message. > > It went from 33200 bytes/op to 50 bytes/op. So before it could do 32K > operations before it allocated 1 GB of memory. Now it can do 21 million. > Which means many fewer GC events and thus fewer pauses (which is the goal) > and less CPU (doing GCs). > >> doesn't the allocator already cache like this? >> > The allocator has per-size pools like tcmalloc, but it doesn't know > immediately when the 32 KB buf is no longer needed and could be re-used. > It doesn't discover that and re-pool the memory until a GC occurs, which > is too late. > >> what are the issues about caching that is not locality-aware? etc. in >> short, should this be done at all, and if so, where? >> > I'd like to see a general allocation pool package so we can move all the > fmt/http/io/etc stuff into a common place. > > It could have a specialized API for []byte (a common case, as above) and > maybe also for interface{}s (for pools of variables, e.g. *MyStruct to be > reused). > > >
Sign in to reply to this message.
*** Abandoned ***
Sign in to reply to this message.
If Issue 4720 (sync.Cache or sync.Pool whatever) is stalled, can we revisit this CL? io.Copy is used everywhere, and generates 32 KB of garbage per call. On Fri, May 17, 2013 at 1:45 PM, <bradfitz@golang.org> wrote: > *** Abandoned *** > > https://codereview.appspot.**com/7206048/<https://codereview.appspot.com/7206... >
Sign in to reply to this message.
ping. This isn't an API change. On Wed, Aug 7, 2013 at 3:20 PM, Brad Fitzpatrick <bradfitz@golang.org>wrote: > If Issue 4720 (sync.Cache or sync.Pool whatever) is stalled, can we > revisit this CL? > > io.Copy is used everywhere, and generates 32 KB of garbage per call. > > > > On Fri, May 17, 2013 at 1:45 PM, <bradfitz@golang.org> wrote: > >> *** Abandoned *** >> >> https://codereview.appspot.**com/7206048/<https://codereview.appspot.com/7206... >> > >
Sign in to reply to this message.
Message was sent while issue was closed.
I have the same concerns about this change that I do about reusing buffers in bufio.
Sign in to reply to this message.
Message was sent while issue was closed.
On 2013/08/09 01:27:11, bradfitz wrote: > ping. > > This isn't an API change. > > > > On Wed, Aug 7, 2013 at 3:20 PM, Brad Fitzpatrick <mailto:bradfitz@golang.org>wrote: > > > If Issue 4720 (sync.Cache or sync.Pool whatever) is stalled, can we > > revisit this CL? > > > > io.Copy is used everywhere, and generates 32 KB of garbage per call. > > > > > > > > On Fri, May 17, 2013 at 1:45 PM, <mailto:bradfitz@golang.org> wrote: > > > >> *** Abandoned *** > >> > >> > https://codereview.appspot.**com/7206048/%3Chttps://codereview.appspot.com/72... > >> > > > > io.CopyBuffer is cool io: add CopyBuffer, a version of Copy in which the user provides a buffer https://go-review.googlesource.com/#/c/8730/
Sign in to reply to this message.
|