cmd/gc: allocate select descriptor on stack
benchmark old ns/op new ns/op delta
BenchmarkSelectUncontended 220 165 -25.00%
BenchmarkSelectContended 209 161 -22.97%
BenchmarkSelectProdCons 1042 904 -13.24%
But more importantly this change will allow
to get rid of free function in runtime.
Fixes issue 6494.
Hello golang-codereviews@googlegroups.com (cc: khr@golang.org, remyoudompheng@gmail.com, rsc@golang.org), I'd like you to review this change to https://dvyukov%40google.com@code.google.com/p/go/
9 years, 9 months ago
(2014-07-09 13:51:45 UTC)
#1
https://codereview.appspot.com/107670043/diff/80001/src/pkg/runtime/chan.goc File src/pkg/runtime/chan.goc (right): https://codereview.appspot.com/107670043/diff/80001/src/pkg/runtime/chan.goc#newcode991 src/pkg/runtime/chan.goc:991: sel = runtime·mallocgc(selectsize(cases.len), 0, FlagNoScan); On 2014/07/09 17:44:38, rsc ...
9 years, 9 months ago
(2014-07-09 17:55:55 UTC)
#4
https://codereview.appspot.com/107670043/diff/80001/src/pkg/runtime/chan.goc
File src/pkg/runtime/chan.goc (right):
https://codereview.appspot.com/107670043/diff/80001/src/pkg/runtime/chan.goc#...
src/pkg/runtime/chan.goc:991: sel = runtime·mallocgc(selectsize(cases.len), 0,
FlagNoScan);
On 2014/07/09 17:44:38, rsc wrote:
> Why is it FlagNoScan? More to the point, if FlagNoScan is okay here, why did
you
> need to define a type in the compiler?
FlagNoScan works here because all objects are referenced from cases. This is not
the case with normal selects, where it's possible that a chan is referenced only
from the select descriptor.
But I've removed FlagNoScan to avoid confusion.
PTAL
https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc File src/pkg/runtime/chan.goc (right): https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc#newcode991 src/pkg/runtime/chan.goc:991: sel = runtime·mallocgc(selectsize(cases.len), 0, 0); What does 0 mean ...
9 years, 9 months ago
(2014-07-09 18:34:43 UTC)
#6
9 years, 9 months ago
(2014-07-09 18:39:18 UTC)
#7
On 2014/07/09 18:34:43, rsc wrote:
> https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc
> File src/pkg/runtime/chan.goc (right):
>
>
https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc...
> src/pkg/runtime/chan.goc:991: sel = runtime·mallocgc(selectsize(cases.len), 0,
> 0);
> What does 0 mean then? Is the select descriptor now conservatively scanned?
It's
> not clear that that's better.
Yes, it's scanned conservatively. Maybe it's not better, but that's what we have
now. Currently it's allocated with runtime.mal which is mallocgc(n, 0, 0).
Eventually we will need to give it a proper type and scan (required for any
fancy GC), which neither of the above anyway.
On 2014/07/09 18:39:18, dvyukov wrote: > On 2014/07/09 18:34:43, rsc wrote: > > https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc > ...
9 years, 9 months ago
(2014-07-16 19:37:21 UTC)
#8
On 2014/07/09 18:39:18, dvyukov wrote:
> On 2014/07/09 18:34:43, rsc wrote:
> >
https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc
> > File src/pkg/runtime/chan.goc (right):
> >
> >
>
https://codereview.appspot.com/107670043/diff/100001/src/pkg/runtime/chan.goc...
> > src/pkg/runtime/chan.goc:991: sel = runtime·mallocgc(selectsize(cases.len),
0,
> > 0);
> > What does 0 mean then? Is the select descriptor now conservatively scanned?
> It's
> > not clear that that's better.
>
> Yes, it's scanned conservatively. Maybe it's not better, but that's what we
have
> now. Currently it's allocated with runtime.mal which is mallocgc(n, 0, 0).
> Eventually we will need to give it a proper type and scan (required for any
> fancy GC), which neither of the above anyway.
ping
*** Submitted as https://code.google.com/p/go/source/detail?r=e1d0077340e8 *** cmd/gc: allocate select descriptor on stack benchmark old ns/op new ...
9 years, 9 months ago
(2014-07-20 11:07:13 UTC)
#13
*** Submitted as https://code.google.com/p/go/source/detail?r=e1d0077340e8 ***
cmd/gc: allocate select descriptor on stack
benchmark old ns/op new ns/op delta
BenchmarkSelectUncontended 220 165 -25.00%
BenchmarkSelectContended 209 161 -22.97%
BenchmarkSelectProdCons 1042 904 -13.24%
But more importantly this change will allow
to get rid of free function in runtime.
Fixes issue 6494.
LGTM=rsc, khr
R=golang-codereviews, rsc, dominik.honnef, khr
CC=golang-codereviews, remyoudompheng
https://codereview.appspot.com/107670043
> The crash looks related, but I don't understand why it happened only > on ...
9 years, 9 months ago
(2014-07-20 12:06:04 UTC)
#18
> The crash looks related, but I don't understand why it happened only
> on plan9-386. Is there anything special about stacks on plan9?
Yes it's related and reproducible. We'll look at it.
+cc aram
--
David du Colombier
On Sun, Jul 20, 2014 at 4:06 PM, David du Colombier <0intro@gmail.com> wrote: >> The ...
9 years, 9 months ago
(2014-07-20 12:12:58 UTC)
#19
On Sun, Jul 20, 2014 at 4:06 PM, David du Colombier <0intro@gmail.com> wrote:
>> The crash looks related, but I don't understand why it happened only
>> on plan9-386. Is there anything special about stacks on plan9?
>
> Yes it's related and reproducible. We'll look at it.
Let's also see if it happens on other OSes episodically. If it happens
on linux/darwin, then I will debug it.
> The crash looks related, but I don't understand why it happened only > on ...
9 years, 9 months ago
(2014-07-20 12:57:30 UTC)
#20
Message was sent while issue was closed.
> The crash looks related, but I don't understand why it happened only
> on plan9-386. Is there anything special about stacks on plan9?
I don't quite understand this change in depth, so maybe I am talking nonsense.
The most unique aspect of Plan 9 in this respect that I can think of is that the
stack check preamble is very different than that of other operating systems.
On Sun, Jul 20, 2014 at 4:57 PM, <aram@mgk.ro> wrote: >> The crash looks related, ...
9 years, 9 months ago
(2014-07-20 13:17:38 UTC)
#21
On Sun, Jul 20, 2014 at 4:57 PM, <aram@mgk.ro> wrote:
>> The crash looks related, but I don't understand why it happened only
>> on plan9-386. Is there anything special about stacks on plan9?
>
>
> I don't quite understand this change in depth, so maybe I am talking
> nonsense.
>
> The most unique aspect of Plan 9 in this respect that I can think of is
> that the stack check preamble is very different than that of other
> operating systems.
I do not see how a different stack check can directly cause this crash.
This change moves select descriptor from heap to stack of the
goroutine. It also giver the descriptor proper type description so
that GC knows how to scan.
The crash is: GC scans the select descriptor and find nonsense value
in a slot marked as pointer. GC checks that stack slots marked as
pointers has valid pointers in them. In this case it finds value
0x1164 in a pointer slot.
There are chances that the slot is not actually a select descriptor
but some other variable in the same stack frame. And the change just
shuffled other variables in the frame.
Potential causes are: (1) a pointer slot in not zeroed in function
prologue, (2) type description is incorrect, so that a scalar slot is
marked as pointer, (3) somebody corrupts memory.
The first thing to do would be probably to understand what exactly
lives in the guilty slot on stack. 0x10281f18 is address of the slot,
and the stack frames are accompanies with FP values:
ad pointer in frame main.sender at 0x10281f18: 0x1164
fatal error: bad pointer in scanbitvector
goroutine 5 [runnable]:
runtime.park(0x20770, 0x10281ef0, 0x114c5d)
/usr/local/go/src/pkg/runtime/proc.c:1396 +0x95 fp=0x10281e58 sp=0x10281e50
selectgo(0x10281ebc)
/usr/local/go/src/pkg/runtime/chan.goc:796 +0x31c fp=0x10281eac sp=0x10281e58
runtime.selectgo(0x10281ef0)
/usr/local/go/src/pkg/runtime/chan.goc:639 +0xf fp=0x10281ebc sp=0x10281eac
main.sender(0x186a0, 0x102560c0, 0x102560f0, 0x10256120, 0x10256150)
/tmp/gobuilder/plan9-386-cnielsen-e1d0077340e8/go/test/chan/doubleselect.go:29
+0x1f1 fp=0x10281fb8 sp=0x10281ebc
runtime.goexit()
/usr/local/go/src/pkg/runtime/proc.c:1465 fp=0x10281fbc sp=0x10281fb8
created by main.main
/tmp/gobuilder/plan9-386-cnielsen-e1d0077340e8/go/test/chan/doubleselect.go:70
+0x165
The pointer it complains about, 0x1164, is the return address for the call to runtime·selectsend ...
9 years, 9 months ago
(2014-07-20 13:31:18 UTC)
#22
The pointer it complains about, 0x1164, is the return address for the
call to runtime·selectsend
doubleselect.go:30 0x115f e81cf40100 CALL runtime.selectsend(SB)
doubleselect.go:30 0x1164 0fb65c240c MOVZX 0xc(SP), BX
--
Aram Hăvărneanu
On Sun, Jul 20, 2014 at 5:30 PM, Aram Hăvărneanu <aram@mgk.ro> wrote: > The pointer ...
9 years, 9 months ago
(2014-07-20 13:45:26 UTC)
#23
On Sun, Jul 20, 2014 at 5:30 PM, Aram Hăvărneanu <aram@mgk.ro> wrote:
> The pointer it complains about, 0x1164, is the return address for the
> call to runtime·selectsend
>
> doubleselect.go:30 0x115f e81cf40100 CALL runtime.selectsend(SB)
> doubleselect.go:30 0x1164 0fb65c240c MOVZX 0xc(SP), BX
Oh, so text is mapped at 0x1000?
Then I think the reason why other OSes do not crash is because text is
mapped at much higher addresses, and GC does not consider these as
invalid.
Another consequence of this is that if a user anyhow gets a PC in a
pointer variable on stack, that will crash on plan9, because GC will
consider the pointer as invalid. I don't know how it is possible,
though.
> Ah, OK, I see. We put PC into select descriptor and mark that slot ...
9 years, 9 months ago
(2014-07-20 14:14:11 UTC)
#26
> Ah, OK, I see. We put PC into select descriptor and mark that slot as
> pointer. Please try the following patch:
> https://codereview.appspot.com/113350043
Yes, this patch fixes the issue.
Thank you both.
--
David du Colombier
thanks! mailed it On Sun, Jul 20, 2014 at 6:14 PM, David du Colombier <0intro@gmail.com> ...
9 years, 9 months ago
(2014-07-20 14:31:34 UTC)
#27
thanks! mailed it
On Sun, Jul 20, 2014 at 6:14 PM, David du Colombier <0intro@gmail.com> wrote:
>> Ah, OK, I see. We put PC into select descriptor and mark that slot as
>> pointer. Please try the following patch:
>> https://codereview.appspot.com/113350043
>
> Yes, this patch fixes the issue.
>
> Thank you both.
>
> --
> David du Colombier
Issue 107670043: code review 107670043: cmd/gc: allocate select descriptor on stack
(Closed)
Created 9 years, 9 months ago by dvyukov
Modified 9 years, 9 months ago
Reviewers: gobot, 0intro, aram
Base URL:
Comments: 6