|
|
Created:
13 years, 8 months ago by dvyukov Modified:
13 years, 8 months ago Reviewers:
CC:
r, rsc, golang-dev Visibility:
Public. |
Descriptionsync: add benchmark for Once.
Patch Set 1 #Patch Set 2 : diff -r 6f1145ee588d https://go.googlecode.com/hg/ #Patch Set 3 : diff -r 6f1145ee588d https://go.googlecode.com/hg/ #Patch Set 4 : diff -r 6f1145ee588d https://go.googlecode.com/hg/ #
Total comments: 2
Patch Set 5 : diff -r b295b8bda20b https://go.googlecode.com/hg/ #
Total comments: 3
Patch Set 6 : diff -r b295b8bda20b https://go.googlecode.com/hg/ #
Total comments: 2
Patch Set 7 : diff -r b295b8bda20b https://go.googlecode.com/hg/ #
Total comments: 1
Patch Set 8 : diff -r 9e53a1312e25 https://go.googlecode.com/hg/ #MessagesTotal messages: 20
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
http://codereview.appspot.com/4650054/diff/9001/src/pkg/sync/once_test.go File src/pkg/sync/once_test.go (right): http://codereview.appspot.com/4650054/diff/9001/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:14: const OnceGrainSize = 1000 move this down closer to its use. i suggest putting it right inside BenchmarkOnce also it's a cute name but not as clear as it could be. put it inside the benchmark, as suggested, and it could be CallsPerSched http://codereview.appspot.com/4650054/diff/9001/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:51: f := func() {} this could be outside the procs loop
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, r@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
On 2011/06/25 22:57:09, r wrote: > http://codereview.appspot.com/4650054/diff/9001/src/pkg/sync/once_test.go > File src/pkg/sync/once_test.go (right): > > http://codereview.appspot.com/4650054/diff/9001/src/pkg/sync/once_test.go#new... > src/pkg/sync/once_test.go:14: const OnceGrainSize = 1000 > move this down closer to its use. i suggest putting it right inside > BenchmarkOnce > > also it's a cute name but not as clear as it could be. put it inside the > benchmark, as suggested, and it could be CallsPerSched Consts can be function-local! Woohoo! :) > http://codereview.appspot.com/4650054/diff/9001/src/pkg/sync/once_test.go#new... > src/pkg/sync/once_test.go:51: f := func() {} > this could be outside the procs loop I wanted to be safe of potential false-sharing problems. But I think it's OK to move it to 'global' scope in this case.
Sign in to reply to this message.
http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go File src/pkg/sync/once_test.go (right): http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:43: b.SetBytes(1) Since this is not processing data, it seems better just to omit this call, which will make the benchmark output drop the MB/s part. http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:45: N := int32(b.N / CallsPerSched) Should this be b.N / CallsPerSched / procs? http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:46: once := new(Once) var once Once
Sign in to reply to this message.
On 2011/06/27 15:59:23, rsc wrote: > http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go > File src/pkg/sync/once_test.go (right): > > http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go#new... > src/pkg/sync/once_test.go:43: b.SetBytes(1) > Since this is not processing data, it seems > better just to omit this call, which will make > the benchmark output drop the MB/s part. The problem with ns/op is that it's a way too coarse-grained. For example, for some benchmarks (and I expect it for Once as well after some more improvements) I see numbers like 2 ns/op. That means that all improvements that make Once up to 2 times faster will be indiscernible. I especially patched gotest to provide enough precision for MB/s metric. So I see two ways: (1) leave SetBytes() or (2) patch ns/op metric to provide at least 2 digits of precision. > http://codereview.appspot.com/4650054/diff/9003/src/pkg/sync/once_test.go#new... > src/pkg/sync/once_test.go:45: N := int32(b.N / CallsPerSched) > Should this be b.N / CallsPerSched / procs? It can be either (1) b.N / CallsPerSched then linear scalability looks like: GOMAXPROCS=1 100 ns/op GOMAXPROCS=2 50 ns/op GOMAXPROCS=4 25 ns/op ... or (2) b.N / CallsPerSched * procs then linear scalability looks like: GOMAXPROCS=1 100 ns/op GOMAXPROCS=2 100 ns/op GOMAXPROCS=4 100 ns/op However since gotest automatically adjusts amount of work it's mostly irrelevant. Moreover for benchmarks like matrix multiplication it's quite difficult to scale amount of work, so I would prefer to keep amount of work independent of number of procs in all benchmarks.
Sign in to reply to this message.
Hello r@golang.org, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
> It can be either > (1) b.N / CallsPerSched > then linear scalability looks like: > GOMAXPROCS=1 100 ns/op > GOMAXPROCS=2 50 ns/op > GOMAXPROCS=4 25 ns/op > ... > or (2) b.N / CallsPerSched * procs > then linear scalability looks like: > GOMAXPROCS=1 100 ns/op > GOMAXPROCS=2 100 ns/op > GOMAXPROCS=4 100 ns/op > > However since gotest automatically adjusts amount of work it's mostly > irrelevant. Moreover for benchmarks like matrix multiplication it's > quite difficult to scale amount of work, so I would prefer to keep > amount of work independent of number of procs in all benchmarks. But you're not. In this benchmark, 4 procs do 4x as much work (collectively) as 1 proc. The amount of work is changing. Adding the / procs as I suggested will keep the *total* amount of work constant, which is what you'd want to do in the matrix multiplication case, or any case that doesn't parallelize well, too. Russ
Sign in to reply to this message.
http://codereview.appspot.com/4650054/diff/9004/src/pkg/sync/once_test.go File src/pkg/sync/once_test.go (right): http://codereview.appspot.com/4650054/diff/9004/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:43: b.SetBytes(1) Delete. http://codereview.appspot.com/4650054/diff/9004/src/pkg/sync/once_test.go#new... src/pkg/sync/once_test.go:51: for atomic.AddInt32(&N, -1) >= 0 { Aha: I missed that N was shared here. This is fine without the /procs.
Sign in to reply to this message.
Hello r@golang.org, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
On 2011/06/27 17:55:09, rsc wrote: > http://codereview.appspot.com/4650054/diff/9004/src/pkg/sync/once_test.go > File src/pkg/sync/once_test.go (right): > > http://codereview.appspot.com/4650054/diff/9004/src/pkg/sync/once_test.go#new... > src/pkg/sync/once_test.go:43: b.SetBytes(1) > Delete. I've also replaced: N := int32(b.N / CallsPerSched) with: N := int32(b.N) so now it shows ns per CallsPerSched calls. Otherwise it's going to look like: sync_test.BenchmarkOnce 500000000 5 ns/op sync_test.BenchmarkOnce-2 500000000 2 ns/op sync_test.BenchmarkOnce-4 2000000000 1 ns/op sync_test.BenchmarkOnce-8 2000000000 1 ns/op sync_test.BenchmarkOnce-16 2000000000 1 ns/op which is unacceptable.
Sign in to reply to this message.
> so now it shows ns per CallsPerSched calls. don't do that; now it's printing lies. > Otherwise it's going to look > like: > > sync_test.BenchmarkOnce 500000000 5 ns/op > sync_test.BenchmarkOnce-2 500000000 2 ns/op > sync_test.BenchmarkOnce-4 2000000000 1 ns/op > sync_test.BenchmarkOnce-8 2000000000 1 ns/op > sync_test.BenchmarkOnce-16 2000000000 1 ns/op > > which is unacceptable. i am not sure how unacceptable this is, but if more precision is necessary, let's add the precision instead of printing lies. russ
Sign in to reply to this message.
On 2011/06/27 18:32:10, rsc wrote: > > so now it shows ns per CallsPerSched calls. > > don't do that; now it's printing lies. > > > Otherwise it's going to look > > like: > > > > sync_test.BenchmarkOnce 500000000 5 ns/op > > sync_test.BenchmarkOnce-2 500000000 2 ns/op > > sync_test.BenchmarkOnce-4 2000000000 1 ns/op > > sync_test.BenchmarkOnce-8 2000000000 1 ns/op > > sync_test.BenchmarkOnce-16 2000000000 1 ns/op > > > > which is unacceptable. > > i am not sure how unacceptable this is, Of course it is. It says that 4, 8 and 16 worker threads has *exactly* the same performance, which is even worse lie. Moreover, any further improvements are indiscernible. > but if more precision is necessary, let's > add the precision instead of printing lies. Fixing gotest again? :) What about providing at least 2 digits of precision? That is: 100 ns/op 99.9 ns/op ... 10.0 ns/op 9.99 ns/op Another route is to print raw execution time, since number of iterations is already output it will be trivial to calculate any metrics with any precision.
Sign in to reply to this message.
>> > which is unacceptable. > >> i am not sure how unacceptable this is, > > Of course it is. It says that 4, 8 and 16 worker threads has *exactly* > the same performance, which is even worse lie. Moreover, any further > improvements are indiscernible. It doesn't say they have exactly the same performance. Everyone who knows how to read a benchmark result knows that 1 means something in the range 0.5 to 1.5. I'll fix gotest.
Sign in to reply to this message.
http://codereview.appspot.com/4650054/diff/16001/src/pkg/sync/once_test.go File src/pkg/sync/once_test.go (right): http://codereview.appspot.com/4650054/diff/16001/src/pkg/sync/once_test.go#ne... src/pkg/sync/once_test.go:44: N := int32(b.N) Waiting for this to get fixed and then I will submit.
Sign in to reply to this message.
Hello r@golang.org, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
LGTM Do you get significantly different results if you set CallsPerSched to 10000?
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=d4c5ffc402ba *** sync: add benchmark for Once. R=r, rsc CC=golang-dev http://codereview.appspot.com/4650054 Committer: Russ Cox <rsc@golang.org>
Sign in to reply to this message.
|