Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(80)

Issue 8056043: code review 8056043: runtime: Implement faster equals for strings and bytes. (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
11 years, 9 months ago by khr
Modified:
11 years, 8 months ago
Reviewers:
dave, albert.strasheim
CC:
bradfitz, r, khr1, remyoudompheng, minux1, ality, golang-dev
Visibility:
Public.

Description

runtime: Implement faster equals for strings and bytes. (amd64) benchmark old ns/op new ns/op delta BenchmarkEqual0 16 6 -63.15% BenchmarkEqual9 22 7 -65.37% BenchmarkEqual32 36 9 -74.91% BenchmarkEqual4K 2187 120 -94.51% benchmark old MB/s new MB/s speedup BenchmarkEqual9 392.22 1134.38 2.89x BenchmarkEqual32 866.72 3457.39 3.99x BenchmarkEqual4K 1872.73 33998.87 18.15x (386) benchmark old ns/op new ns/op delta BenchmarkEqual0 16 5 -63.85% BenchmarkEqual9 22 7 -67.84% BenchmarkEqual32 34 12 -64.94% BenchmarkEqual4K 2196 113 -94.85% benchmark old MB/s new MB/s speedup BenchmarkEqual9 405.81 1260.18 3.11x BenchmarkEqual32 919.55 2631.21 2.86x BenchmarkEqual4K 1864.85 36072.54 19.34x Update issue 3751

Patch Set 1 #

Patch Set 2 : diff -r ff83e39f1348 https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 3 : diff -r d040d5f08d5d https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 4 : diff -r d040d5f08d5d https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 5 : diff -r d040d5f08d5d https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 6 : diff -r d040d5f08d5d https://khr%40golang.org@code.google.com/p/go/ #

Total comments: 2

Patch Set 7 : diff -r d040d5f08d5d https://khr%40golang.org@code.google.com/p/go/ #

Total comments: 2

Patch Set 8 : diff -r 2433c3567273 https://khr%40golang.org@code.google.com/p/go/ #

Total comments: 6

Patch Set 9 : diff -r 7505bf6d8988 https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 10 : diff -r 7505bf6d8988 https://khr%40golang.org@code.google.com/p/go/ #

Total comments: 2

Patch Set 11 : diff -r 7505bf6d8988 https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 12 : diff -r 52e3407d249f https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 13 : diff -r 52e3407d249f https://khr%40golang.org@code.google.com/p/go/ #

Patch Set 14 : diff -r 52e3407d249f https://khr%40golang.org@code.google.com/p/go/ #

Unified diffs Side-by-side diffs Delta from patch set Stats (+416 lines, -62 lines) Patch
M src/pkg/bytes/asm_386.s View 1 2 3 4 5 6 7 1 chunk +0 lines, -16 lines 0 comments Download
M src/pkg/bytes/asm_amd64.s View 1 2 3 4 5 6 7 1 chunk +0 lines, -17 lines 0 comments Download
M src/pkg/bytes/bytes_decl.go View 1 2 3 4 5 6 7 8 9 10 11 1 chunk +1 line, -1 line 0 comments Download
M src/pkg/bytes/bytes_test.go View 1 2 3 4 5 6 7 8 9 10 3 chunks +76 lines, -0 lines 0 comments Download
A src/pkg/bytes/equal_test.go View 1 2 3 4 5 6 7 8 9 10 11 12 1 chunk +45 lines, -0 lines 0 comments Download
M src/pkg/runtime/alg.c View 1 3 chunks +6 lines, -17 lines 0 comments Download
M src/pkg/runtime/asm_386.s View 1 2 3 4 5 6 7 8 9 10 11 1 chunk +115 lines, -0 lines 0 comments Download
M src/pkg/runtime/asm_amd64.s View 1 2 3 4 5 6 7 8 9 10 11 1 chunk +112 lines, -0 lines 0 comments Download
M src/pkg/runtime/asm_arm.s View 1 2 3 4 5 6 7 8 9 10 2 chunks +18 lines, -1 line 0 comments Download
M src/pkg/runtime/hashmap.c View 1 2 3 4 5 6 7 8 9 10 11 1 chunk +1 line, -1 line 0 comments Download
M src/pkg/runtime/mapspeed_test.go View 1 2 3 4 5 6 7 8 9 1 chunk +11 lines, -0 lines 0 comments Download
M src/pkg/runtime/runtime.h View 1 2 3 4 5 6 7 1 chunk +2 lines, -0 lines 0 comments Download
M src/pkg/runtime/string.goc View 1 2 2 chunks +1 line, -9 lines 0 comments Download
M src/pkg/runtime/string_test.go View 1 2 3 4 5 6 7 8 1 chunk +28 lines, -0 lines 0 comments Download

Messages

Total messages: 49
khr
Hello bradfitz@golang.org (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://khr%40golang.org@code.google.com/p/go/
11 years, 9 months ago (2013-03-27 22:14:12 UTC) #1
r
run go vet. i think it will ask for a few tweaks to the assembly
11 years, 9 months ago (2013-03-27 22:34:05 UTC) #2
bradfitz
LGTM but I'm not qualified to review runtime/asm_arm.s On Wed, Mar 27, 2013 at 3:14 ...
11 years, 9 months ago (2013-03-27 22:51:25 UTC) #3
bradfitz
https://codereview.appspot.com/8056043/diff/11001/src/pkg/bytes/bytes_test.go File src/pkg/bytes/bytes_test.go (right): https://codereview.appspot.com/8056043/diff/11001/src/pkg/bytes/bytes_test.go#newcode64 src/pkg/bytes/bytes_test.go:64: {[]byte("abcdefgh"), []byte("abcdefgh"), 0}, comment above this line about what ...
11 years, 9 months ago (2013-03-27 23:35:20 UTC) #4
khr1
On Wed, Mar 27, 2013 at 4:35 PM, <bradfitz@golang.org> wrote: > > https://codereview.appspot.**com/8056043/diff/11001/src/** > pkg/bytes/bytes_test.go<https://codereview.appspot.com/8056043/diff/11001/src/pkg/bytes/bytes_test.go> ...
11 years, 9 months ago (2013-03-28 02:34:48 UTC) #5
dave_cheney.net
Thanks. I'll try this on some arm systems soon and get some benchmarks. I'd like ...
11 years, 9 months ago (2013-03-28 03:17:31 UTC) #6
dave_cheney.net
https://codereview.appspot.com/8056043/diff/18001/src/pkg/bytes/asm_386.s File src/pkg/bytes/asm_386.s (right): https://codereview.appspot.com/8056043/diff/18001/src/pkg/bytes/asm_386.s#newcode19 src/pkg/bytes/asm_386.s:19: TEXT ·Equal(SB),7,$12 I would like to suggest that this ...
11 years, 9 months ago (2013-03-28 03:26:05 UTC) #7
dave_cheney.net
Some excellent speedups on arm (chromebook) for Equals, but a regression on some other benchmarks. ...
11 years, 9 months ago (2013-03-28 03:55:42 UTC) #8
khr1
It looks like both the slow Index and Count ones are calling eq with 7-byte ...
11 years, 9 months ago (2013-03-28 04:25:42 UTC) #9
remyoudompheng
I have added the following benchmarks to runtime/string_test.go: func BenchmarkCompareStringBigUnaligned(b *testing.B) { bytes := make([]byte, ...
11 years, 9 months ago (2013-03-28 07:31:24 UTC) #10
remyoudompheng
https://codereview.appspot.com/8056043/diff/18001/src/pkg/bytes/bytes_test.go File src/pkg/bytes/bytes_test.go (right): https://codereview.appspot.com/8056043/diff/18001/src/pkg/bytes/bytes_test.go#newcode92 src/pkg/bytes/bytes_test.go:92: I would have added a test like this: func ...
11 years, 9 months ago (2013-03-28 07:43:58 UTC) #11
remyoudompheng
On 2013/03/28 07:31:24, remyoudompheng wrote: > I could squeeze a bit more (20%) on ARM ...
11 years, 9 months ago (2013-03-28 07:51:10 UTC) #12
khr1
Dave: I've moved bytes.Equal to the runtime files. Good idea, that puts all that code ...
11 years, 9 months ago (2013-03-29 07:06:48 UTC) #13
bradfitz
LGTM Assembly people should chime in, though. https://codereview.appspot.com/8056043/diff/39001/src/pkg/bytes/bytes_test.go File src/pkg/bytes/bytes_test.go (right): https://codereview.appspot.com/8056043/diff/39001/src/pkg/bytes/bytes_test.go#newcode335 src/pkg/bytes/bytes_test.go:335: buf := ...
11 years, 9 months ago (2013-03-29 07:26:09 UTC) #14
remyoudompheng
On 2013/03/29 07:06:48, khr1 wrote: > Dave: > I've moved bytes.Equal to the runtime files. ...
11 years, 9 months ago (2013-03-29 07:34:57 UTC) #15
khr1
On Fri, Mar 29, 2013 at 12:26 AM, <bradfitz@golang.org> wrote: > LGTM > > Assembly ...
11 years, 9 months ago (2013-03-29 07:38:26 UTC) #16
remyoudompheng
https://codereview.appspot.com/8056043/diff/39001/src/pkg/runtime/asm_386.s File src/pkg/runtime/asm_386.s (right): https://codereview.appspot.com/8056043/diff/39001/src/pkg/runtime/asm_386.s#newcode1027 src/pkg/runtime/asm_386.s:1027: PCMPEQB X1, X0 Do we explicitly require SSE2 on ...
11 years, 9 months ago (2013-03-29 07:38:39 UTC) #17
khr1
The SSE way is about 2x as fast, at least when the strings are intermediate ...
11 years, 9 months ago (2013-03-29 07:53:58 UTC) #18
dave_cheney.net
Thanks Keith, I'll get to benchmarking. https://codereview.appspot.com/8056043/diff/39001/src/pkg/runtime/asm_386.s File src/pkg/runtime/asm_386.s (right): https://codereview.appspot.com/8056043/diff/39001/src/pkg/runtime/asm_386.s#newcode996 src/pkg/runtime/asm_386.s:996: TEXT bytes·Equal(SB),7,$0 On ...
11 years, 9 months ago (2013-03-29 09:03:03 UTC) #19
albert.strasheim
Looks very good compared to glibc 4.7.2 on a Core i7-3720QM: BenchmarkRuntimeMemequal1K 25993.11 MB/s BenchmarkRuntimeMemequal1M ...
11 years, 8 months ago (2013-03-29 14:25:00 UTC) #20
minux1
On Fri, Mar 29, 2013 at 3:53 PM, Keith Randall <khr@google.com> wrote: > The SSE ...
11 years, 8 months ago (2013-03-29 14:44:25 UTC) #21
bradfitz
https://codereview.appspot.com/8056043/diff/39001/src/pkg/runtime/mapspeed_test.go File src/pkg/runtime/mapspeed_test.go (right): https://codereview.appspot.com/8056043/diff/39001/src/pkg/runtime/mapspeed_test.go#newcode124 src/pkg/runtime/mapspeed_test.go:124: key := strings.Repeat("X", 1<<20) On 2013/03/29 07:26:09, bradfitz wrote: ...
11 years, 8 months ago (2013-03-29 15:44:03 UTC) #22
khr1
> This was my suggestion. I believe the are precedents from the time > package. ...
11 years, 8 months ago (2013-03-29 17:16:39 UTC) #23
ality
minux <minux.ma@gmail.com> once said: > On Fri, Mar 29, 2013 at 3:53 PM, Keith Randall ...
11 years, 8 months ago (2013-03-29 17:58:09 UTC) #24
r
https://codereview.appspot.com/8056043/diff/35002/src/pkg/bytes/bytes_test.go File src/pkg/bytes/bytes_test.go (right): https://codereview.appspot.com/8056043/diff/35002/src/pkg/bytes/bytes_test.go#newcode94 src/pkg/bytes/bytes_test.go:94: b1 := make([]byte, 64) const size = 64 https://codereview.appspot.com/8056043/diff/35002/src/pkg/bytes/bytes_test.go#newcode102 ...
11 years, 8 months ago (2013-03-29 18:39:46 UTC) #25
ality
Anthony Martin <ality@pbrane.org> once said: > minux <minux.ma@gmail.com> once said: > > On Fri, Mar ...
11 years, 8 months ago (2013-03-29 19:07:09 UTC) #26
minux1
On Sat, Mar 30, 2013 at 1:58 AM, Anthony Martin <ality@pbrane.org> wrote: > According to ...
11 years, 8 months ago (2013-03-30 16:15:25 UTC) #27
ality
minux <minux.ma@gmail.com> once said: > On Sat, Mar 30, 2013 at 1:58 AM, Anthony Martin ...
11 years, 8 months ago (2013-03-30 16:51:48 UTC) #28
minux1
On Sun, Mar 31, 2013 at 12:51 AM, Anthony Martin <ality@pbrane.org> wrote: > minux <minux.ma@gmail.com> ...
11 years, 8 months ago (2013-03-30 17:08:48 UTC) #29
dave_cheney.net
I benchmarked patchset 10 on my chromebook, the results are mixed. The regressions in the ...
11 years, 8 months ago (2013-03-31 00:04:11 UTC) #30
bradfitz
Which arch is your chromebook? On Sat, Mar 30, 2013 at 5:04 PM, <dave@cheney.net> wrote: ...
11 years, 8 months ago (2013-03-31 00:54:53 UTC) #31
dave_cheney.net
This is the dual A15 Exynos 5 version. I can do some benchmarks on a ...
11 years, 8 months ago (2013-03-31 00:57:13 UTC) #32
dave_cheney.net
Here are some linux/386 from a 4 core atom 330. 220887(~) % cat bytes.txt benchmark ...
11 years, 8 months ago (2013-03-31 01:25:18 UTC) #33
dave_cheney.net
Here is a bit more data from the linxu/arm regression for Count4K before: Total: 118 ...
11 years, 8 months ago (2013-03-31 03:01:21 UTC) #34
khr1
I don't understand why it is so slow. A bit slower, maybe, but not that ...
11 years, 8 months ago (2013-03-31 04:14:03 UTC) #35
dave_cheney.net
Your new runtime.memeqbody is much faster than my old byte by byte attempt, i'm experimenting ...
11 years, 8 months ago (2013-03-31 11:15:18 UTC) #36
dave_cheney.net
Here is another arm datapoint, from a pandaboard, Cortex-A9. alarm(~/go/src/pkg/bytes) % ~/go/misc/benchcmp {old,new}.txt benchmark old ...
11 years, 8 months ago (2013-03-31 11:33:58 UTC) #37
bradfitz
On Sat, Mar 30, 2013 at 9:14 PM, Keith Randall <khr@google.com> wrote: > I don't ...
11 years, 8 months ago (2013-03-31 22:54:49 UTC) #38
dave_cheney.net
Last night I copied khr's new code back into bytes.Equals, removing all the indirection and ...
11 years, 8 months ago (2013-03-31 23:40:15 UTC) #39
khr1
If the simple implemention really is much faster for short strings, I'm in favor of ...
11 years, 8 months ago (2013-04-01 00:17:50 UTC) #40
dave_cheney.net
SGTM. Lets try that. On 01/04/2013, at 11:17, Keith Randall <khr@google.com> wrote: > If the ...
11 years, 8 months ago (2013-04-01 00:21:03 UTC) #41
khr
*** Submitted as https://code.google.com/p/go/source/detail?r=7f91fab04475 *** runtime: Implement faster equals for strings and bytes. (amd64) benchmark ...
11 years, 8 months ago (2013-04-02 23:26:20 UTC) #42
dave_cheney.net
Thanks khr, i'll try to integrate your faster word-at-a-time version into the asm_arm.s Cheers Dave
11 years, 8 months ago (2013-04-03 01:56:26 UTC) #43
albert.strasheim
Since this change has gone in, I'm seeing the following crash on linux/amd64: go test ...
11 years, 8 months ago (2013-04-03 05:43:38 UTC) #44
dave_cheney.net
Go 1.1 will be fast enough[1] without this change. If it's unstable then my vote ...
11 years, 8 months ago (2013-04-03 05:47:04 UTC) #45
albert.strasheim
On 2013/04/03 05:47:04, dfc wrote: > Go 1.1 will be fast enough[1] without this change. ...
11 years, 8 months ago (2013-04-03 05:51:22 UTC) #46
remyoudompheng
On 3 Apr 2013 07:47, "Dave Cheney" <dave@cheney.net> wrote: > > Go 1.1 will be ...
11 years, 8 months ago (2013-04-03 05:53:53 UTC) #47
remyoudompheng
On 2013/4/3 <fullung@gmail.com> wrote: > On 2013/04/03 05:47:04, dfc wrote: >> >> Go 1.1 will ...
11 years, 8 months ago (2013-04-03 06:33:05 UTC) #48
dave_cheney.net
11 years, 8 months ago (2013-04-03 07:35:51 UTC) #49
Wow, great catch!

On Wed, Apr 3, 2013 at 5:33 PM, Rémy Oudompheng
<remyoudompheng@gmail.com> wrote:
> On 2013/4/3  <fullung@gmail.com> wrote:
>> On 2013/04/03 05:47:04, dfc wrote:
>>>
>>> Go 1.1 will be fast enough[1] without this change. If it's unstable
>>> then my vote would be to revert this change so we have a chance of
>>> publishing a release candidate. Thoughts ?
>>
>>
>> Without looking past the stack trace, it seems like it might be a GC
>> issue though, in which case reverting this isn't going to fix the
>> underlying problem...
>
> Please try https://codereview.appspot.com/8300044
> I'm surprised the test could possibly work for anyone.
> [maybe the better fix is to check the error returned by syscall.Mprotect]
>
> Rémy.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b