Code review - Issue 6819123: code review 6819123: cmd/gc: add division rewrite to walk pass.https://codereview.appspot.com/2012-11-26T22:45:29+00:00rietveld
Message from unknown
2012-11-11T10:40:21+00:00remyoudomphengurn:md5:5c305d9a1e4d0135bba1d91f34da03d4
Message from unknown
2012-11-11T10:40:24+00:00remyoudomphengurn:md5:f01a5d9ae57248d7fe9defe482ac8c2c
Message from unknown
2012-11-11T11:02:44+00:00remyoudomphengurn:md5:7f531c72bf3cf5e9183b3d0c4d8cd3da
Message from remyoudompheng@gmail.com
2012-11-11T11:02:49+00:00remyoudomphengurn:md5:f0c75982c4061342a4ebebf18cdbd5e3
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com),
I'd like you to review this change to
https://go.googlecode.com/hg/
Message from unknown
2012-11-11T11:20:26+00:00remyoudomphengurn:md5:045d9bdb330c7e242d6df933cc864cc1
Message from remyoudompheng@gmail.com
2012-11-11T11:20:30+00:00remyoudomphengurn:md5:8d77632750a6a5017b6102c301c05e8b
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com),
Please take another look.
Message from unknown
2012-11-11T11:24:12+00:00remyoudomphengurn:md5:871a019f4c6e74996040e25f820c7ae1
Message from remyoudompheng@gmail.com
2012-11-11T11:24:17+00:00remyoudomphengurn:md5:339d4ef62e00618f48912ce2d041f0ed
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com),
Please take another look.
Message from remyoudompheng@gmail.com
2012-11-11T11:25:19+00:00remyoudomphengurn:md5:c58bc78d12f040d282505c3f84cb08c1
The latest patch adds support for the /= %= case.
Results on benchamrks from CL 6819087
Before (arm):
BenchmarkDiv1 20000000 25.3 ns/op
BenchmarkDiv2 20000000 38.2 ns/op
BenchmarkDiv4 20000000 38.2 ns/op
BenchmarkMod1 20000000 26.5 ns/op
BenchmarkMod2 20000000 36.5 ns/op
BenchmarkMod4 20000000 36.5 ns/op
After (arm):
BenchmarkDiv1 500000000 1.77 ns/op
BenchmarkDiv2 500000000 2.65 ns/op
BenchmarkDiv4 100000000 3.53 ns/op
BenchmarkMod1 500000000 2.06 ns/op
BenchmarkMod2 100000000 5.29 ns/op
BenchmarkMod4 100000000 7.64 ns/op
Message from dave@cheney.net
2012-11-11T12:03:33+00:00dfcurn:md5:0883f65dbdac311a2ec87d638e452556
Wow.
On 11/11/2012, at 22:25, remyoudompheng@gmail.com wrote:
> The latest patch adds support for the /= %= case.
> Results on benchamrks from CL 6819087
>
> Before (arm):
> BenchmarkDiv1 20000000 25.3 ns/op
> BenchmarkDiv2 20000000 38.2 ns/op
> BenchmarkDiv4 20000000 38.2 ns/op
> BenchmarkMod1 20000000 26.5 ns/op
> BenchmarkMod2 20000000 36.5 ns/op
> BenchmarkMod4 20000000 36.5 ns/op
>
> After (arm):
> BenchmarkDiv1 500000000 1.77 ns/op
> BenchmarkDiv2 500000000 2.65 ns/op
> BenchmarkDiv4 100000000 3.53 ns/op
> BenchmarkMod1 500000000 2.06 ns/op
> BenchmarkMod2 100000000 5.29 ns/op
> BenchmarkMod4 100000000 7.64 ns/op
>
>
> http://codereview.appspot.com/6819123/
Message from remyoudompheng@gmail.com
2012-11-11T12:15:39+00:00remyoudomphengurn:md5:74430dd2f74250f12c3339f921e534b5
On 386 (actually a Core Quad with GOARCH=386)
benchmark old ns/op new ns/op delta
BenchmarkDiv1 7 0.43 -93.97%
BenchmarkDiv2 7 1.79 -75.41%
BenchmarkDiv4 7 2.71 -61.99%
BenchmarkDiv10 7 7 -2.47%
BenchmarkMod1 7 0.86 -88.79%
BenchmarkMod2 7 3.69 -51.19%
BenchmarkMod4 7 3.90 -49.28%
BenchmarkMod10 7 7 -1.33%
Message from unknown
2012-11-11T13:08:12+00:00remyoudomphengurn:md5:c808ad46e071f43fe980be26a06d4182
Message from remyoudompheng@gmail.com
2012-11-11T13:08:17+00:00remyoudomphengurn:md5:7852251afbe89ede0b8d3a9be49b962c
Hello golang-dev@googlegroups.com, dave@cheney.net (cc: golang-dev@googlegroups.com),
Please take another look.
Message from remyoudompheng@gmail.com
2012-11-11T13:17:17+00:00remyoudomphengurn:md5:e95bc37f7c40e3d82586a092f50d745d
Added unsigned magic multiply treatment:
* 32-bit case is suboptimal because it needs a 32*32->64 bit multiply for which we don't have a portable AST representation (other than uint64 product) but would be faster (64bit*64bit needs 3 MUL, 32*32->64 needs one MUL).
* 64-bit case is not implemented and left to 6g (it would need a 128-bit value).
In the benchmarks (UDiv is uint32, UDivh is uint16):
386
benchmark old ns/op new ns/op delta
BenchmarkUDiv10 6 3 -54.78%
BenchmarkUDiv123456 6 6 +0.29%
BenchmarkUDivh10 7 2 -67.00%
BenchmarkUDivh123 7 3 -50.25%
arm
benchmark old ns/op new ns/op delta
BenchmarkUDiv10 43 10 -74.65%
BenchmarkUDiv123456 37 13 -64.10%
BenchmarkUDivh10 38 9 -74.40%
BenchmarkUDivh123 38 13 -64.66%
5g generates codes like this for 16-bit division (R3 /= 10).
0572 (divmul_test.go:129) MOVHU R3,R0
0573 (divmul_test.go:129) MOVHU R0,R1
0574 (divmul_test.go:129) MOVW R1,R0
0575 (divmul_test.go:129) MOVW $52429,R1
0576 (divmul_test.go:129) MULU R1,R0,R0
0577 (divmul_test.go:129) MOVW R0>>19,R0
0578 (divmul_test.go:129) MOVHU R0,R1
0579 (divmul_test.go:129) MOVHU R1,R3
It could be simplified to
MOVHU R3, R0
MOVW $52429, R1
MULU R1, R0, R0
MOVW R0>>19, R0
MOVHU R0, R3
Message from unknown
2012-11-11T13:43:03+00:00remyoudomphengurn:md5:021029d57c7e5618bfa539a406bfc4ed
Message from remyoudompheng@gmail.com
2012-11-11T13:43:07+00:00remyoudomphengurn:md5:f62f5b499e2f90e9df9283c13a210bf4
Hello golang-dev@googlegroups.com, dave@cheney.net (cc: golang-dev@googlegroups.com),
Please take another look.
Message from remyoudompheng@gmail.com
2012-11-11T13:43:48+00:00remyoudomphengurn:md5:476b5a262ebb7e7e170156d4a7099f70
Added the signed magic multiply. Feel free to benchmark. Adding support for constants in cgen64.c for OMUL could be useful to improve further.
Message from dave@cheney.net
2012-11-11T22:51:51+00:00dfcurn:md5:1d172fca424a70f9ea264449d15392b3
Some minor suggestions. I don't feel qualified to review the gc changes, apart from observing the benchmark numbers are impressive.
pando(~/go/test/bench/go) % ~/go/misc/benchcmp {old,new}.txt
benchmark old ns/op new ns/op delta
BenchmarkDiv1 43 3 -92.68%
BenchmarkDiv2 65 4 -93.07%
BenchmarkDiv3 65 21 -67.80%
BenchmarkDiv4 65 6 -90.79%
BenchmarkDiv10 65 24 -63.17%
BenchmarkUDiv1 45 3 -93.34%
BenchmarkUDiv2 65 3 -94.61%
BenchmarkUDiv3 65 17 -73.77%
BenchmarkUDiv4 65 3 -94.61%
BenchmarkUDiv10 65 17 -73.77%
BenchmarkMod1 45 3 -92.23%
BenchmarkMod2 62 9 -85.49%
BenchmarkMod3 62 25 -59.71%
BenchmarkMod4 62 13 -79.61%
BenchmarkMod10 62 29 -53.24%
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c
File src/cmd/gc/walk.c (right):
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode894
src/cmd/gc/walk.c:894: * on 386, rewrite float ops into l = l op r.
should this be 386/arm ?
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode961
src/cmd/gc/walk.c:961: }
why not move the block into this switch then push the default down to the bottom of the switch and make goto ret; unconditional.
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2899
src/cmd/gc/walk.c:2899: // by a constant
nit: maybe move this comment to like 2908
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2910
src/cmd/gc/walk.c:2910: // just sign bit
i know this came from 6g, but can the comment be improved ? ie.
// nothing to do because divisor is larger than operand ?
Message from mtj@google.com
2012-11-11T23:19:39+00:00mtj1urn:md5:b3b74f78bc124334a2759a99f9f9601f
This is all awesome, but, don't we already know that some of these are more
than special...
div 1 means noop
div 2 means shift by 1
div 4 means shift by 2
mod 1 means 0
mod 2 means x & 1
mod 4 means x & 3
On Sun, Nov 11, 2012 at 5:51 PM, <dave@cheney.net> wrote:
> Some minor suggestions. I don't feel qualified to review the gc changes,
> apart from observing the benchmark numbers are impressive.
>
> pando(~/go/test/bench/go) % ~/go/misc/benchcmp {old,new}.txt
>
> benchmark old ns/op new ns/op delta
> BenchmarkDiv1 43 3 -92.68%
> BenchmarkDiv2 65 4 -93.07%
> BenchmarkDiv3 65 21 -67.80%
> BenchmarkDiv4 65 6 -90.79%
> BenchmarkDiv10 65 24 -63.17%
> BenchmarkUDiv1 45 3 -93.34%
> BenchmarkUDiv2 65 3 -94.61%
> BenchmarkUDiv3 65 17 -73.77%
> BenchmarkUDiv4 65 3 -94.61%
> BenchmarkUDiv10 65 17 -73.77%
> BenchmarkMod1 45 3 -92.23%
> BenchmarkMod2 62 9 -85.49%
> BenchmarkMod3 62 25 -59.71%
> BenchmarkMod4 62 13 -79.61%
> BenchmarkMod10 62 29 -53.24%
>
>
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**cmd/gc/walk.c<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c>
> File src/cmd/gc/walk.c (right):
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
> cmd/gc/walk.c#newcode894<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode894>
> src/cmd/gc/walk.c:894: * on 386, rewrite float ops into l = l op r.
> should this be 386/arm ?
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
> cmd/gc/walk.c#newcode961<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode961>
> src/cmd/gc/walk.c:961: }
> why not move the block into this switch then push the default down to
> the bottom of the switch and make goto ret; unconditional.
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
> cmd/gc/walk.c#newcode2899<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2899>
> src/cmd/gc/walk.c:2899: // by a constant
> nit: maybe move this comment to like 2908
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
> cmd/gc/walk.c#newcode2910<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2910>
> src/cmd/gc/walk.c:2910: // just sign bit
> i know this came from 6g, but can the comment be improved ? ie.
>
> // nothing to do because divisor is larger than operand ?
>
> https://codereview.appspot.**com/6819123/<https://codereview.appspot.com/6819123/>
>
--
Michael T. Jones | Chief Technology Advocate | mtj@google.com | +1
650-335-5765
Message from unknown
2012-11-12T00:37:57+00:00remyoudomphengurn:md5:d088323afe22d40d5b65f17eeed2a07d
Message from remyoudompheng@gmail.com
2012-11-12T00:38:03+00:00remyoudomphengurn:md5:d021ffdca6e129faf3afeecd23a47c22
Hello golang-dev@googlegroups.com, dave@cheney.net, mtj@google.com (cc: golang-dev@googlegroups.com),
Please take another look.
Message from dave@cheney.net
2012-11-12T04:13:21+00:00dfcurn:md5:e3dc017793dfae4b9c1c123f1218f9ff
There may be an error with patch set #8. ./all.bash fails the second phase with
# runtime/cgo
cannot load imported symbols from ELF file $WORK/runtime/cgo/_obj/_cgo_.o: length of symbol section is not a multiple of SymSize
Message from remyoudompheng@gmail.com
2012-11-12T06:55:23+00:00remyoudomphengurn:md5:1ca8522bdc31c5857a718e2045e75808
http://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c
File src/cmd/gc/walk.c (right):
http://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode894
src/cmd/gc/walk.c:894: * on 386, rewrite float ops into l = l op r.
I don't know.
http://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode961
src/cmd/gc/walk.c:961: }
It made the diff smaller. but it's weird indeed.
http://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2899
src/cmd/gc/walk.c:2899: // by a constant
Makes a good description for the function.
http://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2910
src/cmd/gc/walk.c:2910: // just sign bit
i'm not even sure what it means. We could add a special treatment here.
Message from unknown
2012-11-12T06:55:59+00:00remyoudomphengurn:md5:8a71e825a2c3992ea9db8a4d835fff8f
Message from remyoudompheng@gmail.com
2012-11-12T06:56:05+00:00remyoudomphengurn:md5:0d6bd1dc008a26ed881f2d79c463e1f8
Hello golang-dev@googlegroups.com, dave@cheney.net, mtj@google.com (cc: golang-dev@googlegroups.com),
Please take another look.
Message from dave@cheney.net
2012-11-12T07:34:14+00:00dfcurn:md5:d21845314c5644892e9cc92f361657bf
Thank you, patch set #9 works well.
linux/arm:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 45683930000 46160431000 +1.04%
BenchmarkFannkuch11 31897766000 32261871000 +1.14%
BenchmarkGobDecode 114538550 116271950 +1.51%
BenchmarkGobEncode 57059940 55352780 -2.99%
BenchmarkGzip 5586822000 5547333000 -0.71%
BenchmarkGunzip 1000793000 999066200 -0.17%
BenchmarkJSONEncode 674365200 660949600 -1.99%
BenchmarkJSONDecode 1574219000 1392120000 -11.57%
BenchmarkMandelbrot200 45697620 45659180 -0.08%
BenchmarkParse 55767820 56271380 +0.90%
BenchmarkRevcomp 118286100 129815600 +9.75%
BenchmarkTemplate 1715119000 1657440000 -3.36%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 6.70 6.60 0.99x
BenchmarkGobEncode 13.45 13.87 1.03x
BenchmarkGzip 3.47 3.50 1.01x
BenchmarkGunzip 19.39 19.42 1.00x
BenchmarkJSONEncode 2.88 2.94 1.02x
BenchmarkJSONDecode 1.23 1.39 1.13x
BenchmarkParse 1.04 1.03 0.99x
BenchmarkRevcomp 21.49 19.58 0.91x
BenchmarkTemplate 1.13 1.17 1.04x
Message from remyoudompheng@gmail.com
2012-11-13T06:51:30+00:00remyoudomphengurn:md5:f64aaefa3e2f618f5e8509374b3039fd
I'm not sure why JSONDecode moves. Does it use division or is it an effect of stack size changes?
Message from dave@cheney.net
2012-11-13T06:52:41+00:00dfcurn:md5:3b78b52251a854d7035bb1016cfacd0b
I'll add in some debugging and see.
Also, https://codereview.appspot.com/6842045/
On Tue, Nov 13, 2012 at 5:51 PM, <remyoudompheng@gmail.com> wrote:
> I'm not sure why JSONDecode moves. Does it use division or is it an
> effect of stack size changes?
>
> http://codereview.appspot.com/6819123/
Message from dave@cheney.net
2012-11-13T21:20:24+00:00dfcurn:md5:6e44b084b354493fb57b3262e75de598
here is an additional datapoint from linux/386
220887(~/go/test/bench/go1) % ~/go/misc/benchcmp {old,new}.txt
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 2147483647 2147483647 +0.24%
BenchmarkFannkuch11 2147483647 2147483647 -0.84%
BenchmarkGobDecode 108116600 108685200 +0.53%
BenchmarkGobEncode 45225340 46541140 +2.91%
BenchmarkGzip 2147483647 2147483647 +0.30%
BenchmarkGunzip 603468800 600955000 -0.42%
BenchmarkJSONEncode 421999200 765123600 +81.31%
BenchmarkJSONDecode 1043933000 1255016000 +20.22%
BenchmarkMandelbrot200 54298660 54281840 -0.03%
BenchmarkParse 37363320 37803220 +1.18%
BenchmarkRevcomp 2147483647 2147483647 +3.66%
BenchmarkTemplate 1478883000 1601209000 +8.27%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 7.10 7.06 0.99x
BenchmarkGobEncode 16.97 16.49 0.97x
BenchmarkGzip 5.83 5.81 1.00x
BenchmarkGunzip 32.16 32.29 1.00x
BenchmarkJSONEncode 4.60 2.54 0.55x
BenchmarkJSONDecode 1.86 1.55 0.83x
BenchmarkParse 1.55 1.53 0.99x
BenchmarkRevcomp 52.19 50.35 0.96x
BenchmarkTemplate 1.31 1.21 0.92x
Perhaps walkdiv should be skipped for 386 ?
Message from unknown
2012-11-15T07:27:50+00:00remyoudomphengurn:md5:0068fa322d41ab0af2f919d9ccb353c3
Message from unknown
2012-11-15T07:33:11+00:00remyoudomphengurn:md5:a12f97a2b5917569b1734521a0d0f6f1
Message from dave@cheney.net
2012-11-15T11:37:56+00:00dfcurn:md5:122ef8cd1b31e94b7b46fa61231181ed
patch set 11 fails for me on linux/386
220887(~/go/src) % go run ~/go/test/ken/modconst.go
u8 121 121 60 1
panic: fail
goroutine 1 [running]:
main.u8test(0x23c7979, 0xb0bff12)
/home/dfc/go/test/ken/modconst.go:578 +0xee
main.u8run()
/home/dfc/go/test/ken/modconst.go:613 +0x313
main.main()
/home/dfc/go/test/ken/modconst.go:630 +0x3f
goroutine 2 [syscall]:
created by runtime.main
/home/dfc/go/src/pkg/runtime/proc.c:225
exit status 2
Message from dave@cheney.net
2012-11-15T11:45:12+00:00dfcurn:md5:0d86638272508613042e44705c5ba53b
The previous comment notwithstanding, the 386 regression is gone in patchset 11
220887(~/go/test/bench/go1) % ~/go/misc/benchcmp {old,new}.txt
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 2147483647 2147483647 -0.06%
BenchmarkFannkuch11 2147483647 2147483647 -0.11%
BenchmarkGobDecode 107863800 107488150 -0.35%
BenchmarkGobEncode 46741000 45501680 -2.65%
BenchmarkGzip 2147483647 2147483647 -0.11%
BenchmarkGunzip 601875800 602400400 +0.09%
BenchmarkJSONEncode 726187800 722774600 -0.47%
BenchmarkJSONDecode 1154932000 1045558000 -9.47%
BenchmarkMandelbrot200 54279720 54285060 +0.01%
BenchmarkParse 38234860 36985280 -3.27%
BenchmarkRevcomp 2147483647 2147483647 -3.04%
BenchmarkTemplate 1498506000 1470787000 -1.85%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 7.12 7.14 1.00x
BenchmarkGobEncode 16.42 16.87 1.03x
BenchmarkGzip 5.83 5.84 1.00x
BenchmarkGunzip 32.24 32.21 1.00x
BenchmarkJSONEncode 2.67 2.68 1.00x
BenchmarkJSONDecode 1.68 1.86 1.11x
BenchmarkParse 1.51 1.57 1.04x
BenchmarkRevcomp 50.79 52.39 1.03x
BenchmarkTemplate 1.29 1.32 1.02x
Message from remyoudompheng@gmail.com
2012-11-15T15:45:25+00:00remyoudomphengurn:md5:d909273426677927b141dbd1e0902feb
Sorry, the latest patchset is a work in progress and it misses ARM support. It cannot work properly due to a bug in 8l. See CL 6846057.
I'll upload a complete patch later.
Message from unknown
2012-11-16T22:40:57+00:00remyoudomphengurn:md5:4fb347a366a45377903fac7aa4aa7179
Message from unknown
2012-11-16T23:51:10+00:00remyoudomphengurn:md5:31d048f5cc662fb99593d2ec48544d4e
Message from unknown
2012-11-17T00:06:39+00:00remyoudomphengurn:md5:eb4ab243b50b68faadab3faf6c94ea39
Message from unknown
2012-11-17T00:21:43+00:00remyoudomphengurn:md5:f12c7781117fce84de42e9c4ac9bc9a3
Message from remyoudompheng@gmail.com
2012-11-17T00:21:54+00:00remyoudomphengurn:md5:144d285e93a86a2a6f5dd5c35504af79
Hello golang-dev@googlegroups.com, dave@cheney.net, mtj@google.com (cc: golang-dev@googlegroups.com),
Please take another look.
Message from remyoudompheng@gmail.com
2012-11-17T00:27:01+00:00remyoudomphengurn:md5:195df56032a08401b78839db1af6302b
The CL now implements ARM with less instructions (using the HMUL operation).
GOARCH=amd64 (Core 2 Quad)
benchmark old ns/op new ns/op delta
BenchmarkUint16Div1 0 0 -30.65%
BenchmarkUint16Div2 0 0 +0.00%
BenchmarkUint16Div4 0 0 +0.00%
BenchmarkUint16Div13 3 3 +12.79%
BenchmarkUint32Div1 0 0 +0.00%
BenchmarkUint32Div2 0 0 +0.00%
BenchmarkUint32Div4 0 0 +0.00%
BenchmarkUint32Div13 3 2 -15.68%
BenchmarkUint64Div1 0 0 +0.00%
BenchmarkUint64Div2 0 0 +0.00%
BenchmarkUint64Div4 0 0 +0.00%
BenchmarkUint64Div13 4 4 -10.84%
BenchmarkInt32Div1 0 0 +0.00%
BenchmarkInt32Div2 2 1 -34.80%
BenchmarkInt32Div4 3 2 -11.97%
BenchmarkInt32Div13 3 4 +12.89%
BenchmarkUint16Mod1 0 0 +0.00%
BenchmarkUint16Mod2 0 0 +0.00%
BenchmarkUint16Mod4 0 0 +0.00%
BenchmarkUint16Mod13 7 7 +4.20%
BenchmarkUint32Mod1 0 0 +0.00%
BenchmarkUint32Mod2 0 0 +0.00%
BenchmarkUint32Mod4 0 0 +0.00%
BenchmarkUint32Mod13 5 4 -13.99%
BenchmarkUint64Mod1 0 1 +36.05%
BenchmarkUint64Mod2 1 0 -14.00%
BenchmarkUint64Mod4 0 0 +0.00%
BenchmarkUint64Mod13 8 7 -10.73%
BenchmarkInt32Mod1 0 0 +0.00%
BenchmarkInt32Mod2 4 2 -48.80%
BenchmarkInt32Mod4 4 2 -42.77%
BenchmarkInt32Mod13 6 5 -11.32%
GOARCH=arm (Odroid-x)
benchmark old ns/op new ns/op delta
BenchmarkUint16Div1 30 3 -88.23%
BenchmarkUint16Div2 38 3 -90.41%
BenchmarkUint16Div4 38 3 -90.41%
BenchmarkUint16Div13 38 9 -74.25%
BenchmarkUint32Div1 25 1 -93.15%
BenchmarkUint32Div2 38 2 -94.61%
BenchmarkUint32Div4 38 2 -94.61%
BenchmarkUint32Div13 38 4 -89.06%
BenchmarkUint64Div1 51 1 -96.56%
BenchmarkUint64Div2 64 3 -94.96%
BenchmarkUint64Div4 64 3 -95.01%
BenchmarkUint64Div13 64 64 +0.00%
BenchmarkInt32Div1 25 1 -93.04%
BenchmarkInt32Div2 38 2 -93.06%
BenchmarkInt32Div4 38 3 -90.76%
BenchmarkInt32Div13 38 5 -84.61%
BenchmarkUint16Mod1 30 3 -87.77%
BenchmarkUint16Mod2 38 3 -90.41%
BenchmarkUint16Mod4 38 3 -90.41%
BenchmarkUint16Mod13 38 17 -54.64%
BenchmarkUint32Mod1 25 2 -92.05%
BenchmarkUint32Mod2 35 2 -94.26%
BenchmarkUint32Mod4 35 2 -94.26%
BenchmarkUint32Mod13 35 7 -78.69%
BenchmarkUint64Mod1 51 2 -96.02%
BenchmarkUint64Mod2 64 4 -93.63%
BenchmarkUint64Mod4 64 4 -93.63%
BenchmarkUint64Mod13 64 72 +12.67%
BenchmarkInt32Mod1 26 2 -92.23%
BenchmarkInt32Mod2 36 3 -90.60%
BenchmarkInt32Mod4 36 3 -89.18%
BenchmarkInt32Mod13 36 9 -75.04%
GOARCH=386 (Core 2 Quad)
benchmark old ns/op new ns/op delta
BenchmarkUint16Div1 7 0 -94.39%
BenchmarkUint16Div2 7 0 -88.74%
BenchmarkUint16Div4 7 0 -88.74%
BenchmarkUint16Div13 7 3 -58.32%
BenchmarkUint32Div1 6 0 -93.79%
BenchmarkUint32Div2 6 0 -87.57%
BenchmarkUint32Div4 6 0 -87.57%
BenchmarkUint32Div13 6 3 -51.59%
BenchmarkUint64Div1 12 0 -93.06%
BenchmarkUint64Div2 12 1 -87.26%
BenchmarkUint64Div4 12 1 -87.26%
BenchmarkUint64Div13 12 12 +0.00%
BenchmarkInt32Div1 7 0 -93.89%
BenchmarkInt32Div2 7 1 -75.76%
BenchmarkInt32Div4 7 2 -61.31%
BenchmarkInt32Div13 7 3 -45.03%
BenchmarkUint16Mod1 7 0 -88.77%
BenchmarkUint16Mod2 7 0 -88.77%
BenchmarkUint16Mod4 7 0 -88.77%
BenchmarkUint16Mod13 7 5 -31.07%
BenchmarkUint32Mod1 6 0 -87.63%
BenchmarkUint32Mod2 6 0 -87.66%
BenchmarkUint32Mod4 6 0 -87.64%
BenchmarkUint32Mod13 6 5 -27.69%
BenchmarkUint64Mod1 12 0 -93.06%
BenchmarkUint64Mod2 12 0 -93.06%
BenchmarkUint64Mod4 12 0 -93.06%
BenchmarkUint64Mod13 12 16 +31.45%
BenchmarkInt32Mod1 7 0 -88.82%
BenchmarkInt32Mod2 7 2 -69.40%
BenchmarkInt32Mod4 7 2 -63.41%
BenchmarkInt32Mod13 7 4 -35.29%
Message from dave@cheney.net
2012-11-17T02:59:06+00:00dfcurn:md5:dc19be62257c79c2cd5b60c0782a9814
Wonderful. Any idea why
BenchmarkUint64Mod13
regresses on arm and i386 ?
Message from dave@cheney.net
2012-11-20T23:26:57+00:00dfcurn:md5:0acfab01d5ea2419d54ca65ce77f707f
gentle ping
Message from remyoudompheng@gmail.com
2012-11-21T07:28:54+00:00remyoudomphengurn:md5:dff5f4744fa3047efd2de6c48cbfe567
I did not take a look yet. I suppose there is one more instruction or something.
Does it look good to you at least?
Message from dave@cheney.net
2012-11-21T07:33:40+00:00dfcurn:md5:a3dedbe686a297ef234cbc7696392e52
Yes, LGTM with thanks. I think it is important to wait for Ian Nigel or Russ before committing.
On 21/11/2012, at 18:28, remyoudompheng@gmail.com wrote:
> I did not take a look yet. I suppose there is one more instruction or
> something.
>
> Does it look good to you at least?
>
> http://codereview.appspot.com/6819123/
Message from rsc@golang.org
2012-11-26T16:28:34+00:00rscurn:md5:413a43722ac1cda7674a5d774607a31b
LGTM
Thank you for the new comments.
For a future CL if you are looking for something fun: the unsigned magic division algorithm can be improved slightly: http://ridiculousfish.com/blog/posts/labor-of-division-episode-iii.html. Most Go code does not use unsigned integers anyway, so the benefit here is minimal.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/cgen.c
File src/cmd/5g/cgen.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/cgen.c#newcode266
src/cmd/5g/cgen.c:266: case OHMUL:
tab please
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/ggen.c
File src/cmd/5g/ggen.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/ggen.c#newcode478
src/cmd/5g/ggen.c:478: * res = (nl * nr) >> w
s/w/wordsize/
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c
File src/cmd/6g/ggen.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode612
src/cmd/6g/ggen.c:612: goto divbymul;
Can delete this goto and the label.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode614
src/cmd/6g/ggen.c:614: divbymul:
// Front end handled 32-bit division. We only need to handle 64-bit.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode621
src/cmd/6g/ggen.c:621: case TUINT8:
delete these cases, leaving only TUINT64.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode655
src/cmd/6g/ggen.c:655: case TINT8:
same
https://codereview.appspot.com/6819123/diff/16015/src/cmd/gc/walk.c
File src/cmd/gc/walk.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/gc/walk.c#newcode959
src/cmd/gc/walk.c:959: * rewrite div and mod into function calls
rewrite 64-bit div and mod...
https://codereview.appspot.com/6819123/diff/16015/src/cmd/gc/walk.c#newcode2895
src/cmd/gc/walk.c:2895: int pow; // if >= 0, nr is 1<<n
1<<pow?
Message from unknown
2012-11-26T21:18:38+00:00remyoudomphengurn:md5:a42af4e2dfa1aae28c61e9a0142d4735
Message from remyoudompheng@gmail.com
2012-11-26T21:18:55+00:00remyoudomphengurn:md5:416557ecbb42524a536d85e88716dfba
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/cgen.c
File src/cmd/5g/cgen.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/cgen.c#newcode266
src/cmd/5g/cgen.c:266: case OHMUL:
On 2012/11/26 16:28:34, rsc wrote:
> tab please
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/ggen.c
File src/cmd/5g/ggen.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/5g/ggen.c#newcode478
src/cmd/5g/ggen.c:478: * res = (nl * nr) >> w
On 2012/11/26 16:28:34, rsc wrote:
> s/w/wordsize/
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c
File src/cmd/6g/ggen.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode612
src/cmd/6g/ggen.c:612: goto divbymul;
On 2012/11/26 16:28:34, rsc wrote:
> Can delete this goto and the label.
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode614
src/cmd/6g/ggen.c:614: divbymul:
On 2012/11/26 16:28:34, rsc wrote:
> // Front end handled 32-bit division. We only need to handle 64-bit.
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode621
src/cmd/6g/ggen.c:621: case TUINT8:
On 2012/11/26 16:28:34, rsc wrote:
> delete these cases, leaving only TUINT64.
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/6g/ggen.c#newcode655
src/cmd/6g/ggen.c:655: case TINT8:
On 2012/11/26 16:28:34, rsc wrote:
> same
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/gc/walk.c
File src/cmd/gc/walk.c (right):
https://codereview.appspot.com/6819123/diff/16015/src/cmd/gc/walk.c#newcode959
src/cmd/gc/walk.c:959: * rewrite div and mod into function calls
On 2012/11/26 16:28:34, rsc wrote:
> rewrite 64-bit div and mod...
Done.
https://codereview.appspot.com/6819123/diff/16015/src/cmd/gc/walk.c#newcode2895
src/cmd/gc/walk.c:2895: int pow; // if >= 0, nr is 1<<n
On 2012/11/26 16:28:34, rsc wrote:
> 1<<pow?
Done.
Message from unknown
2012-11-26T22:45:12+00:00remyoudomphengurn:md5:7fb4b0845ab17d488710e9234688d28e
Message from remyoudompheng@gmail.com
2012-11-26T22:45:29+00:00remyoudomphengurn:md5:3310dcd283b4061e0535e015046a367b
*** Submitted as http://code.google.com/p/go/source/detail?r=8f9a26de2b20 ***
cmd/gc: add division rewrite to walk pass.
This allows 5g and 8g to benefit from the rewrite as shifts
or magic multiplies. The 64-bit arithmetic is not handled there,
and left in 6g.
Update issue 2230.
R=golang-dev, dave, mtj, iant, rsc
CC=golang-dev
http://codereview.appspot.com/6819123