cmd/gc: add division rewrite to walk pass.
This allows 5g and 8g to benefit from the rewrite as shifts
or magic multiplies. The 64-bit arithmetic is not handled there,
and left in 6g.
Update issue 2230.
Added unsigned magic multiply treatment: * 32-bit case is suboptimal because it needs a 32*32->64 ...
12 years, 2 months ago
(2012-11-11 13:17:17 UTC)
#8
Added unsigned magic multiply treatment:
* 32-bit case is suboptimal because it needs a 32*32->64 bit multiply for which
we don't have a portable AST representation (other than uint64 product) but
would be faster (64bit*64bit needs 3 MUL, 32*32->64 needs one MUL).
* 64-bit case is not implemented and left to 6g (it would need a 128-bit value).
In the benchmarks (UDiv is uint32, UDivh is uint16):
386
benchmark old ns/op new ns/op delta
BenchmarkUDiv10 6 3 -54.78%
BenchmarkUDiv123456 6 6 +0.29%
BenchmarkUDivh10 7 2 -67.00%
BenchmarkUDivh123 7 3 -50.25%
arm
benchmark old ns/op new ns/op delta
BenchmarkUDiv10 43 10 -74.65%
BenchmarkUDiv123456 37 13 -64.10%
BenchmarkUDivh10 38 9 -74.40%
BenchmarkUDivh123 38 13 -64.66%
5g generates codes like this for 16-bit division (R3 /= 10).
0572 (divmul_test.go:129) MOVHU R3,R0
0573 (divmul_test.go:129) MOVHU R0,R1
0574 (divmul_test.go:129) MOVW R1,R0
0575 (divmul_test.go:129) MOVW $52429,R1
0576 (divmul_test.go:129) MULU R1,R0,R0
0577 (divmul_test.go:129) MOVW R0>>19,R0
0578 (divmul_test.go:129) MOVHU R0,R1
0579 (divmul_test.go:129) MOVHU R1,R3
It could be simplified to
MOVHU R3, R0
MOVW $52429, R1
MULU R1, R0, R0
MOVW R0>>19, R0
MOVHU R0, R3
Some minor suggestions. I don't feel qualified to review the gc changes, apart from observing ...
12 years, 2 months ago
(2012-11-11 22:51:51 UTC)
#11
Some minor suggestions. I don't feel qualified to review the gc changes, apart
from observing the benchmark numbers are impressive.
pando(~/go/test/bench/go) % ~/go/misc/benchcmp {old,new}.txt
benchmark old ns/op new ns/op delta
BenchmarkDiv1 43 3 -92.68%
BenchmarkDiv2 65 4 -93.07%
BenchmarkDiv3 65 21 -67.80%
BenchmarkDiv4 65 6 -90.79%
BenchmarkDiv10 65 24 -63.17%
BenchmarkUDiv1 45 3 -93.34%
BenchmarkUDiv2 65 3 -94.61%
BenchmarkUDiv3 65 17 -73.77%
BenchmarkUDiv4 65 3 -94.61%
BenchmarkUDiv10 65 17 -73.77%
BenchmarkMod1 45 3 -92.23%
BenchmarkMod2 62 9 -85.49%
BenchmarkMod3 62 25 -59.71%
BenchmarkMod4 62 13 -79.61%
BenchmarkMod10 62 29 -53.24%
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c
File src/cmd/gc/walk.c (right):
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode894
src/cmd/gc/walk.c:894: * on 386, rewrite float ops into l = l op r.
should this be 386/arm ?
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode961
src/cmd/gc/walk.c:961: }
why not move the block into this switch then push the default down to the bottom
of the switch and make goto ret; unconditional.
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2899
src/cmd/gc/walk.c:2899: // by a constant
nit: maybe move this comment to like 2908
https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2910
src/cmd/gc/walk.c:2910: // just sign bit
i know this came from 6g, but can the comment be improved ? ie.
// nothing to do because divisor is larger than operand ?
This is all awesome, but, don't we already know that some of these are more ...
12 years, 2 months ago
(2012-11-11 23:19:39 UTC)
#12
This is all awesome, but, don't we already know that some of these are more
than special...
div 1 means noop
div 2 means shift by 1
div 4 means shift by 2
mod 1 means 0
mod 2 means x & 1
mod 4 means x & 3
On Sun, Nov 11, 2012 at 5:51 PM, <dave@cheney.net> wrote:
> Some minor suggestions. I don't feel qualified to review the gc changes,
> apart from observing the benchmark numbers are impressive.
>
> pando(~/go/test/bench/go) % ~/go/misc/benchcmp {old,new}.txt
>
> benchmark old ns/op new ns/op delta
> BenchmarkDiv1 43 3 -92.68%
> BenchmarkDiv2 65 4 -93.07%
> BenchmarkDiv3 65 21 -67.80%
> BenchmarkDiv4 65 6 -90.79%
> BenchmarkDiv10 65 24 -63.17%
> BenchmarkUDiv1 45 3 -93.34%
> BenchmarkUDiv2 65 3 -94.61%
> BenchmarkUDiv3 65 17 -73.77%
> BenchmarkUDiv4 65 3 -94.61%
> BenchmarkUDiv10 65 17 -73.77%
> BenchmarkMod1 45 3 -92.23%
> BenchmarkMod2 62 9 -85.49%
> BenchmarkMod3 62 25 -59.71%
> BenchmarkMod4 62 13 -79.61%
> BenchmarkMod10 62 29 -53.24%
>
>
>
>
https://codereview.appspot.**com/6819123/diff/13001/src/**cmd/gc/walk.c<https...
> File src/cmd/gc/walk.c (right):
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
>
cmd/gc/walk.c#newcode894<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode894>
> src/cmd/gc/walk.c:894: * on 386, rewrite float ops into l = l op r.
> should this be 386/arm ?
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
>
cmd/gc/walk.c#newcode961<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode961>
> src/cmd/gc/walk.c:961: }
> why not move the block into this switch then push the default down to
> the bottom of the switch and make goto ret; unconditional.
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
>
cmd/gc/walk.c#newcode2899<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2899>
> src/cmd/gc/walk.c:2899: // by a constant
> nit: maybe move this comment to like 2908
>
> https://codereview.appspot.**com/6819123/diff/13001/src/**
>
cmd/gc/walk.c#newcode2910<https://codereview.appspot.com/6819123/diff/13001/src/cmd/gc/walk.c#newcode2910>
> src/cmd/gc/walk.c:2910: // just sign bit
> i know this came from 6g, but can the comment be improved ? ie.
>
> // nothing to do because divisor is larger than operand ?
>
>
https://codereview.appspot.**com/6819123/<https://codereview.appspot.com/6819...
>
--
Michael T. Jones | Chief Technology Advocate | mtj@google.com | +1
650-335-5765
There may be an error with patch set #8. ./all.bash fails the second phase with ...
12 years, 2 months ago
(2012-11-12 04:13:21 UTC)
#14
There may be an error with patch set #8. ./all.bash fails the second phase with
# runtime/cgo
cannot load imported symbols from ELF file $WORK/runtime/cgo/_obj/_cgo_.o:
length of symbol section is not a multiple of SymSize
patch set 11 fails for me on linux/386 220887(~/go/src) % go run ~/go/test/ken/modconst.go u8 121 ...
12 years, 2 months ago
(2012-11-15 11:37:56 UTC)
#21
patch set 11 fails for me on linux/386
220887(~/go/src) % go run ~/go/test/ken/modconst.go
u8 121 121 60 1
panic: fail
goroutine 1 [running]:
main.u8test(0x23c7979, 0xb0bff12)
/home/dfc/go/test/ken/modconst.go:578 +0xee
main.u8run()
/home/dfc/go/test/ken/modconst.go:613 +0x313
main.main()
/home/dfc/go/test/ken/modconst.go:630 +0x3f
goroutine 2 [syscall]:
created by runtime.main
/home/dfc/go/src/pkg/runtime/proc.c:225
exit status 2
Sorry, the latest patchset is a work in progress and it misses ARM support. It ...
12 years, 2 months ago
(2012-11-15 15:45:25 UTC)
#23
Sorry, the latest patchset is a work in progress and it misses ARM support. It
cannot work properly due to a bug in 8l. See CL 6846057.
I'll upload a complete patch later.
Yes, LGTM with thanks. I think it is important to wait for Ian Nigel or ...
12 years, 2 months ago
(2012-11-21 07:33:40 UTC)
#29
Yes, LGTM with thanks. I think it is important to wait for Ian Nigel or Russ
before committing.
On 21/11/2012, at 18:28, remyoudompheng@gmail.com wrote:
> I did not take a look yet. I suppose there is one more instruction or
> something.
>
> Does it look good to you at least?
>
> http://codereview.appspot.com/6819123/
Issue 6819123: code review 6819123: cmd/gc: add division rewrite to walk pass.
(Closed)
Created 12 years, 2 months ago by remyoudompheng
Modified 11 years, 6 months ago
Reviewers:
Base URL:
Comments: 24