Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(432)

Issue 4437075: code review 4437075: runtime: stack split + garbage collection bug (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
14 years ago by rsc
Modified:
14 years ago
Reviewers:
CC:
iant, r, golang-dev
Visibility:
Public.

Description

runtime: stack split + garbage collection bug The g->sched.sp saved stack pointer and the g->stackbase and g->stackguard stack bounds can change even while "the world is stopped", because a goroutine has to call functions (and therefore might split its stack) when exiting a system call to check whether the world is stopped (and if so, wait until the world continues). That means the garbage collector cannot access those values safely (without a race) for goroutines executing system calls. Instead, save a consistent triple in g->gcsp, g->gcstack, g->gcguard during entersyscall and have the garbage collector refer to those. The old code was occasionally seeing (because of the race) an sp and stk that did not correspond to each other, so that stk - sp was not the number of stack bytes following sp. In that case, if sp < stk then the call scanblock(sp, stk - sp) scanned too many bytes (anything between the two pointers, which pointed into different allocation blocks). If sp > stk then stk - sp wrapped around. On 32-bit, stk - sp is a uintptr (uint32) converted to int64 in the call to scanblock, so a large (~4G) but positive number. Scanblock would try to scan that many bytes and eventually fault accessing unmapped memory. On 64-bit, stk - sp is a uintptr (uint64) promoted to int64 in the call to scanblock, so a negative number. Scanblock would not scan anything, possibly causing in-use blocks to be freed. In short, 32-bit platforms would have seen either ineffective garbage collection or crashes during garbage collection, while 64-bit platforms would have seen either ineffective or incorrect garbage collection. You can see the invalid arguments to scanblock in the stack traces in issue 1620. Fixes issue 1620. Fixes issue 1746.

Patch Set 1 #

Patch Set 2 : diff -r 7f2899bf1256 https://go.googlecode.com/hg/ #

Patch Set 3 : diff -r a4ee5538faba https://go.googlecode.com/hg #

Patch Set 4 : diff -r a4ee5538faba https://go.googlecode.com/hg #

Total comments: 1

Patch Set 5 : diff -r a560408b8164 https://go.googlecode.com/hg/ #

Unified diffs Side-by-side diffs Delta from patch set Stats (+76 lines, -13 lines) Patch
M src/pkg/runtime/386/asm.s View 1 1 chunk +1 line, -1 line 0 comments Download
M src/pkg/runtime/amd64/asm.s View 1 1 chunk +1 line, -1 line 0 comments Download
M src/pkg/runtime/arm/asm.s View 1 1 chunk +1 line, -1 line 0 comments Download
M src/pkg/runtime/mgc0.c View 1 2 3 3 chunks +37 lines, -4 lines 0 comments Download
M src/pkg/runtime/proc.c View 1 2 3 4 5 chunks +32 lines, -6 lines 0 comments Download
M src/pkg/runtime/runtime.h View 1 2 chunks +4 lines, -0 lines 0 comments Download

Messages

Total messages: 5
rsc
Hello iant (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg
14 years ago (2011-04-27 20:49:12 UTC) #1
r
LGTM but wait for ian
14 years ago (2011-04-27 23:04:18 UTC) #2
iant
LGTM http://codereview.appspot.com/4437075/diff/7001/src/pkg/runtime/proc.c File src/pkg/runtime/proc.c (right): http://codereview.appspot.com/4437075/diff/7001/src/pkg/runtime/proc.c#newcode685 src/pkg/runtime/proc.c:685: g->gcstack = nil; I think that clearing g->gcstack ...
14 years ago (2011-04-27 23:12:31 UTC) #3
rsc
> http://codereview.appspot.com/4437075/diff/7001/src/pkg/runtime/proc.c#newcode685 > src/pkg/runtime/proc.c:685: g->gcstack = nil; > I think that clearing g->gcstack could move ...
14 years ago (2011-04-28 02:37:21 UTC) #4
rsc
14 years ago (2011-04-28 03:21:15 UTC) #5
*** Submitted as http://code.google.com/p/go/source/detail?r=a74afff66b5c ***

runtime: stack split + garbage collection bug

The g->sched.sp saved stack pointer and the
g->stackbase and g->stackguard stack bounds
can change even while "the world is stopped",
because a goroutine has to call functions (and
therefore might split its stack) when exiting a
system call to check whether the world is stopped
(and if so, wait until the world continues).

That means the garbage collector cannot access
those values safely (without a race) for goroutines
executing system calls.  Instead, save a consistent
triple in g->gcsp, g->gcstack, g->gcguard during
entersyscall and have the garbage collector refer
to those.

The old code was occasionally seeing (because of
the race) an sp and stk that did not correspond to
each other, so that stk - sp was not the number of
stack bytes following sp.  In that case, if sp < stk
then the call scanblock(sp, stk - sp) scanned too
many bytes (anything between the two pointers,
which pointed into different allocation blocks).
If sp > stk then stk - sp wrapped around.
On 32-bit, stk - sp is a uintptr (uint32) converted
to int64 in the call to scanblock, so a large (~4G)
but positive number.  Scanblock would try to scan
that many bytes and eventually fault accessing
unmapped memory.  On 64-bit, stk - sp is a uintptr (uint64)
promoted to int64 in the call to scanblock, so a negative
number.  Scanblock would not scan anything, possibly
causing in-use blocks to be freed.

In short, 32-bit platforms would have seen either
ineffective garbage collection or crashes during garbage
collection, while 64-bit platforms would have seen
either ineffective or incorrect garbage collection.
You can see the invalid arguments to scanblock in the
stack traces in issue 1620.

Fixes issue 1620.
Fixes issue 1746.

R=iant, r
CC=golang-dev
http://codereview.appspot.com/4437075
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b