Issue 6261051: code review 6261051: cmd/6g: change sbop swap logic

Issue 6261051: code review 6261051: cmd/6g: change sbop swap logic (Closed)

Can't Edit
Can't Publish+Mail
Start Review

Created:
12 years, 7 months ago by rsc

Modified:
12 years, 7 months ago

Reviewers:
ken3

CC:
ken2, golang-dev

Visibility:
Public.

Description

cmd/6g: change sbop swap logic I added the nl->op == OLITERAL case during the recent performance round, and while it helps for small integer constants, it hurts for floating point constants. In the Mandelbrot benchmark it causes 2*Zr*Zi to compile like Zr*2*Zi: 0x000000000042663d <+249>: movsd %xmm6,%xmm0 0x0000000000426641 <+253>: movsd $2,%xmm1 0x000000000042664a <+262>: mulsd %xmm1,%xmm0 0x000000000042664e <+266>: mulsd %xmm5,%xmm0 instead of: 0x0000000000426835 <+276>: movsd $2,%xmm0 0x000000000042683e <+285>: mulsd %xmm6,%xmm0 0x0000000000426842 <+289>: mulsd %xmm5,%xmm0 It is unclear why that has such a dramatic performance effect in a tight loop, but it's obviously slightly better code, so go with it. benchmark old ns/op new ns/op delta BenchmarkBinaryTree17 5957470000 5973924000 +0.28% BenchmarkFannkuch11 3811295000 3869128000 +1.52% BenchmarkGobDecode 26001900 25670500 -1.27% BenchmarkGobEncode 12051430 11948590 -0.85% BenchmarkGzip 177432 174821 -1.47% BenchmarkGunzip 10967 10756 -1.92% BenchmarkJSONEncode 78924750 79746900 +1.04% BenchmarkJSONDecode 313606400 307081600 -2.08% BenchmarkMandelbrot200 13670860 8200725 -40.01% !!! BenchmarkRevcomp25M 1179194000 1206539000 +2.32% BenchmarkTemplate 447931200 443948200 -0.89% BenchmarkMD5Hash1K 2856 2873 +0.60% BenchmarkMD5Hash8K 22083 22029 -0.24% benchmark old MB/s new MB/s speedup BenchmarkGobDecode 29.52 29.90 1.01x BenchmarkGobEncode 63.69 64.24 1.01x BenchmarkJSONEncode 24.59 24.33 0.99x BenchmarkJSONDecode 6.19 6.32 1.02x BenchmarkRevcomp25M 215.54 210.66 0.98x BenchmarkTemplate 4.33 4.37 1.01x BenchmarkMD5Hash1K 358.54 356.31 0.99x BenchmarkMD5Hash8K 370.95 371.86 1.00x

Patch Set 1 #

Patch Set 2 : diff -r 1195a9780464 https://go.googlecode.com/hg/ #

Patch Set 3 : diff -r 1195a9780464 https://go.googlecode.com/hg/ #

Created: 12 years, 7 months ago

Download [raw] [tar.bz2]

		Unified diffs	Side-by-side diffs	Delta from patch set	Stats (+19 lines, -1 line)			Patch
	M	src/cmd/6g/cgen.c	View	1	1 chunk	+19 lines, -1 line	0 comments	Download

Messages

Total messages: 3

Expand All Messages | Collapse All Messages

rsc

Hello ken2 (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/

12 years, 7 months ago (2012-05-30 14:22:31 UTC) #1

rsc

*** Submitted as http://code.google.com/p/go/source/detail?r=8d29e1651d7a *** cmd/6g: change sbop swap logic I added the nl->op == ...

12 years, 7 months ago (2012-05-30 14:22:40 UTC) #2

*** Submitted as http://code.google.com/p/go/source/detail?r=8d29e1651d7a ***

cmd/6g: change sbop swap logic

I added the nl->op == OLITERAL case during the recent
performance round, and while it helps for small integer constants,
it hurts for floating point constants.  In the Mandelbrot benchmark
it causes 2*Zr*Zi to compile like Zr*2*Zi:

        0x000000000042663d <+249>:	movsd  %xmm6,%xmm0
        0x0000000000426641 <+253>:	movsd  $2,%xmm1
        0x000000000042664a <+262>:	mulsd  %xmm1,%xmm0
        0x000000000042664e <+266>:	mulsd  %xmm5,%xmm0

instead of:

        0x0000000000426835 <+276>:	movsd  $2,%xmm0
        0x000000000042683e <+285>:	mulsd  %xmm6,%xmm0
        0x0000000000426842 <+289>:	mulsd  %xmm5,%xmm0

It is unclear why that has such a dramatic performance effect
in a tight loop, but it's obviously slightly better code, so go with it.

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5957470000   5973924000   +0.28%
BenchmarkFannkuch11      3811295000   3869128000   +1.52%
BenchmarkGobDecode         26001900     25670500   -1.27%
BenchmarkGobEncode         12051430     11948590   -0.85%
BenchmarkGzip                177432       174821   -1.47%
BenchmarkGunzip               10967        10756   -1.92%
BenchmarkJSONEncode        78924750     79746900   +1.04%
BenchmarkJSONDecode       313606400    307081600   -2.08%
BenchmarkMandelbrot200     13670860      8200725  -40.01%  !!!
BenchmarkRevcomp25M      1179194000   1206539000   +2.32%
BenchmarkTemplate         447931200    443948200   -0.89%
BenchmarkMD5Hash1K             2856         2873   +0.60%
BenchmarkMD5Hash8K            22083        22029   -0.24%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            29.52        29.90    1.01x
BenchmarkGobEncode            63.69        64.24    1.01x
BenchmarkJSONEncode           24.59        24.33    0.99x
BenchmarkJSONDecode            6.19         6.32    1.02x
BenchmarkRevcomp25M          215.54       210.66    0.98x
BenchmarkTemplate              4.33         4.37    1.01x
BenchmarkMD5Hash1K           358.54       356.31    0.99x
BenchmarkMD5Hash8K           370.95       371.86    1.00x

R=ken2
CC=golang-dev
http://codereview.appspot.com/6261051

lgtm

Expand All Messages | Collapse All Messages