Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(88)

Issue 6446165: code review 6446165: math/big: unroll loops a bit in amd64 assembly routines. (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
13 years, 9 months ago by remyoudompheng
Modified:
13 years, 9 months ago
CC:
golang-dev, remy_archlinux.org
Visibility:
Public.

Description

math/big: unroll loops a bit in amd64 assembly routines. Processing 4 words at a time reduces the amount of instructions needed to save and restore the carry flag, among other things. Benchmarks on a Core 2 Quad Q8200@2.33GHz benchmark old ns/op new ns/op delta BenchmarkAdd_1w 50 48 -2.40% BenchmarkAdd_2w 50 52 +4.55% BenchmarkAdd_5w 55 59 +5.73% BenchmarkAdd_100kb 4285 2528 -41.00% BenchmarkAdd_1Mb 44307 24145 -45.51% BenchmarkAdd_5Mb 325697 289706 -11.05% BenchmarkAdd_10Mb 1137018 1106273 -2.70% BenchmarkMul_1w 52 52 -0.76% BenchmarkMul_2w 117 117 +0.00% BenchmarkMul_5w 241 228 -5.39% BenchmarkMul_1kb 1101 940 -14.62% BenchmarkMul_10kb 59019 47135 -20.14% BenchmarkMul_50kb 829171 643858 -22.35% BenchmarkMul_100kb 2563856 1999235 -22.02% BenchmarkMul_1Mb 105886450 83408800 -21.23% BenchmarkMul_5Mb 1285270000 1005876000 -21.74% BenchmarkMul_10Mb 3869718000 3029543000 -21.71%

Patch Set 1 #

Patch Set 2 : diff -r b855390a295f https://go.googlecode.com/hg/ #

Patch Set 3 : diff -r b855390a295f https://go.googlecode.com/hg/ #

Patch Set 4 : diff -r b855390a295f https://go.googlecode.com/hg/ #

Total comments: 20
Unified diffs Side-by-side diffs Delta from patch set Stats (+158 lines, -25 lines) Patch
M src/pkg/math/big/arith_amd64.s View 1 2 chunks +115 lines, -25 lines 13 comments Download
M src/pkg/math/big/nat_test.go View 1 2 3 1 chunk +43 lines, -0 lines 7 comments Download

Messages

Total messages: 22
remyoudompheng
Hello gri@golang.org, golang-dev@googlegroups.org (cc: golang-dev@googlegroups.com, remy@archlinux.org), I'd like you to review this change to https://go.googlecode.com/hg/
13 years, 9 months ago (2012-08-21 17:55:46 UTC) #1
gri
I will look at this a bit later (today or tomorrow). In the meantime could ...
13 years, 9 months ago (2012-08-21 18:53:59 UTC) #2
remyoudompheng
Hello gri@golang.org, golang-dev@googlegroups.org (cc: golang-dev@googlegroups.com, remy@archlinux.org), Please take another look.
13 years, 9 months ago (2012-08-21 19:03:31 UTC) #3
Christopher Swenson
On 2012/08/21 19:03:31, remyoudompheng wrote: > Hello mailto:gri@golang.org, mailto:golang-dev@googlegroups.org (cc: > mailto:golang-dev@googlegroups.com, mailto:remy@archlinux.org), > > ...
13 years, 9 months ago (2012-08-21 21:42:11 UTC) #4
Christopher Swenson
http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s File src/pkg/math/big/arith_amd64.s (right): http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s#newcode38 src/pkg/math/big/arith_amd64.s:38: MOVQ $0, BX // i = 0 This instruction ...
13 years, 9 months ago (2012-08-21 21:42:57 UTC) #5
nigeltao
http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s File src/pkg/math/big/arith_amd64.s (right): http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s#newcode38 src/pkg/math/big/arith_amd64.s:38: MOVQ $0, BX // i = 0 On 2012/08/21 ...
13 years, 9 months ago (2012-08-22 01:32:39 UTC) #6
nigeltao
http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s File src/pkg/math/big/arith_amd64.s (right): http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s#newcode37 src/pkg/math/big/arith_amd64.s:37: SHLQ $2, CX // CX = (n/4)*4 Should the ...
13 years, 9 months ago (2012-08-22 01:55:47 UTC) #7
r
The 'linkers' (6l etc.) turn the pseudo-ops into the best instruction for the job. The ...
13 years, 9 months ago (2012-08-22 02:16:17 UTC) #8
remyoudompheng
On 2012/08/21 21:42:11, Christopher Swenson wrote: > Though, when we are talking about 1Mb+ numbers ...
13 years, 9 months ago (2012-08-22 03:44:59 UTC) #9
remyoudompheng
On 2012/08/22 03:44:59, remyoudompheng wrote: > I am currently working on a FFT-based implementation. I ...
13 years, 9 months ago (2012-08-22 03:45:44 UTC) #10
gri
Thanks for working on this. I think the assembly code can be made more tight. ...
13 years, 9 months ago (2012-08-22 23:35:02 UTC) #11
nigeltao
http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s File src/pkg/math/big/arith_amd64.s (right): http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s#newcode30 src/pkg/math/big/arith_amd64.s:30: TEXT ·addVV(SB),7,$0 On 2012/08/22 23:35:02, gri wrote: > MOVL ...
13 years, 9 months ago (2012-08-23 04:36:30 UTC) #12
gri
FYI. http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s File src/pkg/math/big/arith_amd64.s (right): http://codereview.appspot.com/6446165/diff/5/src/pkg/math/big/arith_amd64.s#newcode30 src/pkg/math/big/arith_amd64.s:30: TEXT ·addVV(SB),7,$0 Here's a version which has even ...
13 years, 9 months ago (2012-08-23 05:10:17 UTC) #13
gri
Good to know (MOVLQZX). So the first instructions can be: MOVL n+8(FP), CX TESTQ CX, ...
13 years, 9 months ago (2012-08-23 05:18:39 UTC) #14
remyoudompheng
For some reason here DECQ/JNZ is 2x times slower than CMPQ/JL (for both rolled/unrolled versions), ...
13 years, 9 months ago (2012-08-23 06:58:28 UTC) #15
gri
It may well be that DECQ/JNZ is slower than CMPQ/JL. It used to be the ...
13 years, 9 months ago (2012-08-23 15:32:51 UTC) #16
Christopher Swenson
Could case b) be helped by using prefetching? I would guess that loop prediction + ...
13 years, 9 months ago (2012-08-23 15:49:59 UTC) #17
gri
FYI: I just submitted http://codereview.appspot.com/**6478055/<http://codereview.appspot.com/6478055/> which contains benchmarks for some of the core vector routines. ...
13 years, 9 months ago (2012-08-23 23:00:42 UTC) #18
iant2
On Thu, Aug 23, 2012 at 4:00 PM, Robert Griesemer <gri@golang.org> wrote: > > I ...
13 years, 9 months ago (2012-08-23 23:24:26 UTC) #19
gri
There you go! Used to be true in 1994, and it's still true :-) - ...
13 years, 9 months ago (2012-08-23 23:26:45 UTC) #20
remyoudompheng
Superseded by http://codereview.appspot.com/6482062/
13 years, 9 months ago (2012-08-24 19:26:13 UTC) #21
remyoudompheng
13 years, 9 months ago (2012-08-24 19:26:50 UTC) #22
*** Abandoned ***
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b