runtime: refactor gogo
Instead of 3 overlapping functions (gogo, gogocall, gogocallfn)
introduce 2 non-overlapping functions (gogo, returnto).
Required for preemptive scheduler, otherwise it would need
to introduce yet another function -- gogo2 (same as gogo, but restores DX instead of AX).
I think we need some good docs for gogo and returnto somewhere. https://codereview.appspot.com/9868044/diff/11001/src/pkg/runtime/asm_arm.s File src/pkg/runtime/asm_arm.s ...
10 years, 11 months ago
(2013-05-31 22:35:52 UTC)
#3
PTAL >I think we need some good docs for gogo and returnto somewhere. I am ...
10 years, 11 months ago
(2013-06-01 19:59:18 UTC)
#4
PTAL
>I think we need some good docs for gogo and returnto somewhere.
I am ready to add a comment for gogo and returnto, but I am not ready to
undertake the full docs for split stacks.
What would be a good place? runtime.h? It does not have comments for
functions...
https://codereview.appspot.com/9868044/diff/11001/src/pkg/runtime/asm_arm.s
File src/pkg/runtime/asm_arm.s (right):
https://codereview.appspot.com/9868044/diff/11001/src/pkg/runtime/asm_arm.s#n...
src/pkg/runtime/asm_arm.s:134: TEXT runtime·gogocall(SB), 7, $-4
On 2013/05/31 22:35:52, iant wrote:
> Remove these functions entirely.
Done.
https://codereview.appspot.com/9868044/diff/11001/src/pkg/runtime/proc.c
File src/pkg/runtime/proc.c (right):
https://codereview.appspot.com/9868044/diff/11001/src/pkg/runtime/proc.c#newc...
src/pkg/runtime/proc.c:994: runtime·returnto(&gp->sched, (byte*)runtime·goexit);
On 2013/05/31 22:35:52, iant wrote:
> Perhaps it would make sense to call returnto in runtime·newproc1 rather than
> here.
Agree.
But I want to just mechanically map old api onto new api in this CL. It touches
subtle parts of the runtime, so I will be happy if I get at least this
correctly.
Let's do the refactoring in the subsequent CL. Actually it will be very nice,
because then it will be possible to remove special handling of non-started
goroutines in gentraceback and GC stack scanning.
Added the TODO.
On 2013/06/01 19:59:18, dvyukov wrote: > PTAL > > >I think we need some good ...
10 years, 11 months ago
(2013-06-03 04:30:20 UTC)
#5
On 2013/06/01 19:59:18, dvyukov wrote:
> PTAL
>
> >I think we need some good docs for gogo and returnto somewhere.
> I am ready to add a comment for gogo and returnto, but I am not ready to
> undertake the full docs for split stacks.
> What would be a good place? runtime.h? It does not have comments for
> functions...
Somewhere near the top of proc.c, I suppose.
> Let's do the refactoring in the subsequent CL. Actually it will be very nice,
> because then it will be possible to remove special handling of non-started
> goroutines in gentraceback and GC stack scanning.
> Added the TODO.
OK.
On 2013/06/03 04:30:20, iant wrote: > On 2013/06/01 19:59:18, dvyukov wrote: > > PTAL > ...
10 years, 11 months ago
(2013-06-03 13:59:33 UTC)
#6
On 2013/06/03 04:30:20, iant wrote:
> On 2013/06/01 19:59:18, dvyukov wrote:
> > PTAL
> >
> > >I think we need some good docs for gogo and returnto somewhere.
> > I am ready to add a comment for gogo and returnto, but I am not ready to
> > undertake the full docs for split stacks.
> > What would be a good place? runtime.h? It does not have comments for
> > functions...
>
> Somewhere near the top of proc.c, I suppose.
Done
PTAL
https://codereview.appspot.com/9868044/diff/28001/src/pkg/runtime/proc.c File src/pkg/runtime/proc.c (right): https://codereview.appspot.com/9868044/diff/28001/src/pkg/runtime/proc.c#newcode30 src/pkg/runtime/proc.c:30: // AX holds return value from C functions and ...
10 years, 11 months ago
(2013-06-03 14:58:56 UTC)
#7
On 2013/05/30 13:18:11, dvyukov wrote: > I absolutely have no idea what I am doing ...
10 years, 11 months ago
(2013-06-03 15:06:15 UTC)
#8
On 2013/05/30 13:18:11, dvyukov wrote:
> I absolutely have no idea what I am doing wrt arm.
> Can somebody please test and fix it for arm?
Can somebody with ARM access please test this?
https://codereview.appspot.com/9868044/diff/28001/src/pkg/runtime/proc.c File src/pkg/runtime/proc.c (right): https://codereview.appspot.com/9868044/diff/28001/src/pkg/runtime/proc.c#newcode30 src/pkg/runtime/proc.c:30: // AX holds return value from C functions and ...
10 years, 11 months ago
(2013-06-03 15:09:27 UTC)
#9
NOT LGTM I don't understand what is going on here. Why is the old API ...
10 years, 11 months ago
(2013-06-03 20:33:48 UTC)
#12
NOT LGTM
I don't understand what is going on here.
Why is the old API so bad it needs to be replaced wholesale?
The old API was carefully crafted to create an abstract interface that could be
implemented on both x86 and ARM, despite the different calling conventions.
It is fine to add a context argument to runtime.gogo.
It sounds like maybe that's all you really need.
https://codereview.appspot.com/9868044/diff/29009/src/pkg/runtime/asm_386.s
File src/pkg/runtime/asm_386.s (right):
https://codereview.appspot.com/9868044/diff/29009/src/pkg/runtime/asm_386.s#n...
src/pkg/runtime/asm_386.s:156: // void returnto(Gobuf*, byte*)
Sure looks like gogocall to me.
"returnto" makes it sound like you are returning to something, but you're not.
You're restoring the Gobuf context (gogo) and then making a call.
Hence gogocall.
On 2013/06/03 20:33:48, rsc wrote: > NOT LGTM > > I don't understand what is ...
10 years, 11 months ago
(2013-06-04 10:59:51 UTC)
#14
On 2013/06/03 20:33:48, rsc wrote:
> NOT LGTM
>
> I don't understand what is going on here.
> Why is the old API so bad it needs to be replaced wholesale?
>
> The old API was carefully crafted to create an abstract interface that could
be
> implemented on both x86 and ARM, despite the different calling conventions.
>
> It is fine to add a context argument to runtime.gogo.
> It sounds like maybe that's all you really need.
The early prototype of preemptive scheduler contained the following code in
schedule():
if(gp->sched.pc == (byte*)runtime·goexit) // kickoff
runtime·gogocallfn(&gp->sched, gp->fnstart);
+ if(gp->preempt) {
+ if(gp->preempt == 1) {
+ gp->preempt = 0;
+ // preempted w/o new stack segment creation, just need to restore DX
+ runtime·gogo2(&gp->sched, gp->preemptcret);
+ } else {
+ gp->preempt = 0;
+ // preempted with new stack segment creation, need to restore DX and return
to lessstack
+ runtime·gogocall(&gp->sched, gp->preemptpc, gp->preemptcret);
+ }
+ }
runtime·gogo(&gp->sched, 0);
That's 4 (four!) different ways to switch to a goroutine.
Much more permutations of "and return to"/"just switch", "restore AX"/"do not
restore AX", "restore DX"/"do not restore DX", "use separate PC and
context"/"use Func" are possible in future.
On top of that, non-started goroutines looks different from started, so we have
special cases for that in gentraceback() and addstackroots().
On top of that, 2 cases of preempted goroutines (with new stack segment and w/o)
will require yet more special handling in gentraceback() and addstackroots().
All this stems form the non-orthogonal API. What we essentially need is: (1)
arrange to return to a PC when stack segment underflows w/o switching right now
(this is returnto()) and switch to a full goroutine context (PC, SP, AX and DX,
this is gogo()). This allows to:
- arrange to return to goexit() directly in newproc()
- arrange to return to lessstack() directly in newstack()
This is turn allows to:
- switch goroutines in schedule() with a single gogo() call
- remove special handling on non-started goroutines in gentraceback() and
addstackroots()
- do not add any special handling for preempted goroutines to gentraceback() and
addstackroots()
And I think it's just a better orthogonal API, which provides better foundation
for potential future changes.
I do not get it right for ARM at first try. But the idea is to push PC onto
stack in returnto(), then gogo() retrieves the PC from stack and restores it in
LR. I think it should work.
Okay, I see what you are getting at. But the naming is confusing, and the ...
10 years, 11 months ago
(2013-06-05 14:39:08 UTC)
#15
Okay, I see what you are getting at. But the naming is confusing, and the
new API isn't great either. Instead of tacking more arguments onto gogo,
that stuff really belongs in the Gobuf, which represents the paused
goroutine's state. What you're calling returnto is really the beginning of
a call. I understand how you got to "return" but that's not really what it
does: it's gogocall not gogoreturn. A better name would be gostartcall, so
that the current gogocall = gostartcall + gogo. And if we do this, the
whole concept of an unstarted goroutine should be removed completely.
Can you try something like CL 10036044 instead? It mostly works but some
tests fail due to garbage collection problems that I don't understand.
Russ
On Wed, Jun 5, 2013 at 6:39 PM, Russ Cox <rsc@golang.org> wrote: > Okay, I ...
10 years, 11 months ago
(2013-06-05 14:56:36 UTC)
#16
On Wed, Jun 5, 2013 at 6:39 PM, Russ Cox <rsc@golang.org> wrote:
> Okay, I see what you are getting at. But the naming is confusing, and the
> new API isn't great either. Instead of tacking more arguments onto gogo,
> that stuff really belongs in the Gobuf
Agree.
> , which represents the paused
> goroutine's state. What you're calling returnto is really the beginning of a
> call. I understand how you got to "return" but that's not really what it
> does: it's gogocall not gogoreturn. A better name would be gostartcall, so
> that the current gogocall = gostartcall + gogo. And if we do this, the whole
> concept of an unstarted goroutine should be removed completely.
Agree.
> Can you try something like CL 10036044 instead? It mostly works but some
> tests fail due to garbage collection problems that I don't understand.
I will look into making your CL work.
1547 newg->sched.pc = (byte*)runtime·goexit;
1552 newg->sched.g = newg; 1548 newg->sched.g = newg;
1553 » newg->fnstart = fn; 1549 » runtime·gostartcallfn(&newg->sched, fn);
On Wed, Jun 5, 2013 at 6:56 PM, Dmitry Vyukov <dvyukov@google.com> wrote: > On Wed, ...
10 years, 11 months ago
(2013-06-05 15:00:13 UTC)
#17
On Wed, Jun 5, 2013 at 6:56 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Jun 5, 2013 at 6:39 PM, Russ Cox <rsc@golang.org> wrote:
>> Okay, I see what you are getting at. But the naming is confusing, and the
>> new API isn't great either. Instead of tacking more arguments onto gogo,
>> that stuff really belongs in the Gobuf
>
> Agree.
>
>
>> , which represents the paused
>> goroutine's state. What you're calling returnto is really the beginning of a
>> call. I understand how you got to "return" but that's not really what it
>> does: it's gogocall not gogoreturn. A better name would be gostartcall, so
>> that the current gogocall = gostartcall + gogo. And if we do this, the whole
>> concept of an unstarted goroutine should be removed completely.
>
> Agree.
>
>> Can you try something like CL 10036044 instead? It mostly works but some
>> tests fail due to garbage collection problems that I don't understand.
>
> I will look into making your CL work.
Shouldn't it be the other way around? Start call to goexit, then switch to fn?
1547 newg->sched.pc = (byte*)runtime·goexit;
1548 newg->sched.g = newg;
1549 » runtime·gostartcallfn(&newg->sched, fn);
This can have problems with GC, because it does not understand
non-started functions yet. For the non-started goroutine fn has not
yet decremented SP to form stack frame, but the GC assumes it does.
On Wed, Jun 5, 2013 at 11:00 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > On Wed, ...
10 years, 11 months ago
(2013-06-05 15:09:02 UTC)
#18
On Wed, Jun 5, 2013 at 11:00 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Jun 5, 2013 at 6:56 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > On Wed, Jun 5, 2013 at 6:39 PM, Russ Cox <rsc@golang.org> wrote:
> >> Okay, I see what you are getting at. But the naming is confusing, and
> the
> >> new API isn't great either. Instead of tacking more arguments onto gogo,
> >> that stuff really belongs in the Gobuf
> >
> > Agree.
> >
> >
> >> , which represents the paused
> >> goroutine's state. What you're calling returnto is really the beginning
> of a
> >> call. I understand how you got to "return" but that's not really what it
> >> does: it's gogocall not gogoreturn. A better name would be gostartcall,
> so
> >> that the current gogocall = gostartcall + gogo. And if we do this, the
> whole
> >> concept of an unstarted goroutine should be removed completely.
> >
> > Agree.
> >
> >> Can you try something like CL 10036044 instead? It mostly works but some
> >> tests fail due to garbage collection problems that I don't understand.
> >
> > I will look into making your CL work.
>
> Shouldn't it be the other way around? Start call to goexit, then switch to
> fn?
>
Maybe it's the other way around in your CL (I didn't look closely), but the
way I would like to approach this is to think of
> 1547 newg->sched.pc = (byte*)runtime·goexit;
> 1548 newg->sched.g = newg;
>
as creating a goroutine that is about to run goexit.
> 1549 » runtime·gostartcallfn(&newg->sched, fn);
>
And then this pushes a call to fn ahead of whatever the goroutine was going
to do. So the net result is that the goroutine calls fn, and then when fn
returns, the goroutine calls goexit.
> This can have problems with GC, because it does not understand
> non-started functions yet. For the non-started goroutine fn has not
> yet decremented SP to form stack frame, but the GC assumes it does.
>
The stack walker that we had before all the new GC work understood that if
pc == f->entry then there are no locals, only arguments. I attempted to add
that to addframeroots (see the CL), but it is possible other fixes are
needed elsewhere.
Russ
On Wed, Jun 5, 2013 at 7:09 PM, Russ Cox <rsc@golang.org> wrote: > On Wed, ...
10 years, 11 months ago
(2013-06-05 15:14:20 UTC)
#19
On Wed, Jun 5, 2013 at 7:09 PM, Russ Cox <rsc@golang.org> wrote:
> On Wed, Jun 5, 2013 at 11:00 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> On Wed, Jun 5, 2013 at 6:56 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > On Wed, Jun 5, 2013 at 6:39 PM, Russ Cox <rsc@golang.org> wrote:
>> >> Okay, I see what you are getting at. But the naming is confusing, and
>> >> the
>> >> new API isn't great either. Instead of tacking more arguments onto
>> >> gogo,
>> >> that stuff really belongs in the Gobuf
>> >
>> > Agree.
>> >
>> >
>> >> , which represents the paused
>> >> goroutine's state. What you're calling returnto is really the beginning
>> >> of a
>> >> call. I understand how you got to "return" but that's not really what
>> >> it
>> >> does: it's gogocall not gogoreturn. A better name would be gostartcall,
>> >> so
>> >> that the current gogocall = gostartcall + gogo. And if we do this, the
>> >> whole
>> >> concept of an unstarted goroutine should be removed completely.
>> >
>> > Agree.
>> >
>> >> Can you try something like CL 10036044 instead? It mostly works but
>> >> some
>> >> tests fail due to garbage collection problems that I don't understand.
>> >
>> > I will look into making your CL work.
>>
>> Shouldn't it be the other way around? Start call to goexit, then switch to
>> fn?
>
>
> Maybe it's the other way around in your CL (I didn't look closely), but the
> way I would like to approach this is to think of
>
>>
>> 1547 newg->sched.pc = (byte*)runtime·goexit;
>> 1548 newg->sched.g = newg;
>
>
> as creating a goroutine that is about to run goexit.
>
>>
>> 1549 » runtime·gostartcallfn(&newg->sched, fn);
>
>
> And then this pushes a call to fn ahead of whatever the goroutine was going
> to do. So the net result is that the goroutine calls fn, and then when fn
> returns, the goroutine calls goexit.
OK
>> This can have problems with GC, because it does not understand
>> non-started functions yet. For the non-started goroutine fn has not
>> yet decremented SP to form stack frame, but the GC assumes it does.
>
>
> The stack walker that we had before all the new GC work understood that if
> pc == f->entry then there are no locals, only arguments. I attempted to add
> that to addframeroots (see the CL), but it is possible other fixes are
> needed elsewhere.
I will try to debug it.
I looked at this CL again after your question about setup being the other way ...
10 years, 11 months ago
(2013-06-05 18:05:24 UTC)
#20
I looked at this CL again after your question about setup being the other
way around, and now I understand the code and why it's called returnto.
With that understanding, I wanted to explain a little more why I prefer my
proposal.
In general, context switching APIs are not easy to design, but I've learned
a few lessons from others' mistakes and my own.
In early versions of Unix, the context switch switched the stack pointer
but not the program counter! Flipping just the stack pointer was completely
manageable to program against in hand-written assembly, where that kind of
thing is not uncommon. When C came along, however, things did not go a
smoothly. After a context restore inside a C function you had the program
counter of the function that invoked the context restore running in the
stack frame of the function that invoked the context save. If the old pc
was expecting to find any local variables on the stack after the call, it
wouldn't. Worse, each C function epilogue adjusts the stack pointer upward
to remove the frame allocated in the prologue. For the one function's
epilogue to execute a correct return from the other function's stack frame,
the two functions must have stack frames of equal size. Add a local
variable to one or the other and the kernel stops working, because the
context switch restore created a state expected by assembly programs but
unexpected in C programs. In essence, the context switch API had latent
assembly assumptions and broke when it had to be "ported" to work with C.
This mess is what prompted the famous "you are not expected to understand
this" comment (see swtch.com/unix for a scan). The lesson I learned from
taking the time to understand this is that context switching functions
should work with a description of an actual snapshot of the CPU. Save and
restore both the stack pointer and the program counter and any other live
registers, all together. In the Go runtime, gosave copies the current CPU's
execution state into the Gobuf, and gogo does the reverse.
When we started doing Go on ARM, we ran into places where the code had
latent x86 assumpsions, and we had to design more portable APIs to remove
those assumptions. After trying a handful of things that didn't work, we
realized that the APIs that did work were the ones that could be explained
in terms of hardware instructions. In retrospect, this is obvious: if the
contexts are valid CPU snapshots, the operations on them have to use valid
CPU snapshots as input and output, and that's what hardware instructions
do. So gogocall is a gogo (context restore) followed by the simulation of
one call instruction before letting the CPU run with the context.
If these routines need to be refactored, then the suggestion in my CL is to
have an operation gostartcall that follows these two rules. First, it takes
as input a valid CPU snapshot (a Gobuf describing, in practice, the
beginning of an execution of lessstack or goexit) and adjusts it to return
another valid CPU snapshot. Second, it can be explained in terms of
hardware instructions: it executes a single call instruction in that Gobuf
context and saves the resulting context back into the Gobuf.
In contrast, the returnto API does not follow either of these rules. The
Gobuf passed into returnto is not a working CPU snapshot: on the x86, the
input stack pointer and program counter do not correspond. And because the
input is not a well-formed execution context, returnto cannot be described
in terms of CPU execution. It is certainly possible to build a working
system using this primitive, as I'm sure you have, but because it breaks
the rules, I worry that it will turn out to be fragile for some unexpected
reason later. That's why I prefer gostartcall.
Thanks.
Russ
On Wed, Jun 5, 2013 at 10:05 PM, Russ Cox <rsc@golang.org> wrote: > I looked ...
10 years, 11 months ago
(2013-06-06 16:03:19 UTC)
#21
On Wed, Jun 5, 2013 at 10:05 PM, Russ Cox <rsc@golang.org> wrote:
> I looked at this CL again after your question about setup being the other
> way around, and now I understand the code and why it's called returnto. With
> that understanding, I wanted to explain a little more why I prefer my
> proposal.
>
> In general, context switching APIs are not easy to design, but I've learned
> a few lessons from others' mistakes and my own.
>
> In early versions of Unix, the context switch switched the stack pointer but
> not the program counter! Flipping just the stack pointer was completely
> manageable to program against in hand-written assembly, where that kind of
> thing is not uncommon. When C came along, however, things did not go a
> smoothly. After a context restore inside a C function you had the program
> counter of the function that invoked the context restore running in the
> stack frame of the function that invoked the context save. If the old pc was
> expecting to find any local variables on the stack after the call, it
> wouldn't. Worse, each C function epilogue adjusts the stack pointer upward
> to remove the frame allocated in the prologue. For the one function's
> epilogue to execute a correct return from the other function's stack frame,
> the two functions must have stack frames of equal size. Add a local variable
> to one or the other and the kernel stops working, because the context switch
> restore created a state expected by assembly programs but unexpected in C
> programs. In essence, the context switch API had latent assembly assumptions
> and broke when it had to be "ported" to work with C. This mess is what
> prompted the famous "you are not expected to understand this" comment (see
> swtch.com/unix for a scan). The lesson I learned from taking the time to
> understand this is that context switching functions should work with a
> description of an actual snapshot of the CPU. Save and restore both the
> stack pointer and the program counter and any other live registers, all
> together. In the Go runtime, gosave copies the current CPU's execution state
> into the Gobuf, and gogo does the reverse.
>
> When we started doing Go on ARM, we ran into places where the code had
> latent x86 assumpsions, and we had to design more portable APIs to remove
> those assumptions. After trying a handful of things that didn't work, we
> realized that the APIs that did work were the ones that could be explained
> in terms of hardware instructions. In retrospect, this is obvious: if the
> contexts are valid CPU snapshots, the operations on them have to use valid
> CPU snapshots as input and output, and that's what hardware instructions do.
> So gogocall is a gogo (context restore) followed by the simulation of one
> call instruction before letting the CPU run with the context.
>
> If these routines need to be refactored, then the suggestion in my CL is to
> have an operation gostartcall that follows these two rules. First, it takes
> as input a valid CPU snapshot (a Gobuf describing, in practice, the
> beginning of an execution of lessstack or goexit) and adjusts it to return
> another valid CPU snapshot. Second, it can be explained in terms of hardware
> instructions: it executes a single call instruction in that Gobuf context
> and saves the resulting context back into the Gobuf.
>
> In contrast, the returnto API does not follow either of these rules. The
> Gobuf passed into returnto is not a working CPU snapshot: on the x86, the
> input stack pointer and program counter do not correspond. And because the
> input is not a well-formed execution context, returnto cannot be described
> in terms of CPU execution. It is certainly possible to build a working
> system using this primitive, as I'm sure you have, but because it breaks the
> rules, I worry that it will turn out to be fragile for some unexpected
> reason later. That's why I prefer gostartcall.
Thanks for this insight! Now your API makes perfect sense.
My version does not have a deep idea behind, I just wanted to split
the startcall/returnto part from the actual switch (gogo).
With the small fix that I described in the CL, your version passes
all.bash. So please send it for review.
Issue 9868044: code review 9868044: runtime: refactor gogo
(Closed)
Created 10 years, 11 months ago by dvyukov
Modified 10 years, 11 months ago
Reviewers: golang-dev, iant, capnm, rsc, ality
Base URL:
Comments: 12