runtime: use custom thunks for race calls instead of cgo
Implement custom assembly thunks for hot race calls (memory accesses and function entry/exit).
The thunks extract caller pc, verify that the address is in heap or global and switch to g0 stack.
Before:
ok regexp 3.692s
ok compress/bzip2 9.461s
ok encoding/json 6.380s
After:
ok regexp 2.229s (-40%)
ok compress/bzip2 4.703s (-50%)
ok encoding/json 3.629s (-43%)
For comparison, normal non-race build:
ok regexp 0.348s
ok compress/bzip2 0.304s
ok encoding/json 0.661s
Race build:
ok regexp 2.229s (+540%)
ok compress/bzip2 4.703s (+1447%)
ok encoding/json 3.629s (+449%)
Also removes some race-related special cases from cgocall and scheduler.
In long-term it will allow to remove cyclic runtime/race dependency on cmd/cgo.
Fixes issue 4249.
Fixes issue 7460.
Update issue 6508
Update issue 6688
Hello iant@golang.org (cc: golang-codereviews@googlegroups.com), I'd like you to review this change to https://dvyukov%40google.com@code.google.com/p/go/
10 years, 10 months ago
(2014-01-28 17:42:49 UTC)
#1
Hi Ian, This does not work yet, and I need some help. I've changed Go->C ...
10 years, 10 months ago
(2014-01-28 17:49:13 UTC)
#2
Hi Ian,
This does not work yet, and I need some help.
I've changed Go->C race calls to not use cgo, but I still use cgo to call back
from C to Go (__tsan_symbolize function), this does not work as expected.
I don't know how to implement C->Go __tsan_symbolize callback w/o cgo.
1. If I comment out everything in runtime/race/race.go and rename
s/__tsan_symbolize2/__tsan_symbolize/ in runtime/race_amd64.s (completely
cgo-less build), then when I do go test -race runtime, I get:
# testmain
runtime/race(.text): undefined: __libc_malloc
runtime/race(.text): undefined: __tsan_symbolize
runtime/race(.text): undefined: getuid
runtime/race(.text): undefined: pthread_self
runtime/race(.text): undefined: madvise
runtime/race(.text): undefined: setrlimit
runtime/race(.text): undefined: sleep
runtime/race(.text): undefined: usleep
runtime/race(.text): undefined: abort
runtime/race(.text): undefined: isatty
runtime/race(.text): undefined: setrlimit
runtime/race(.text): undefined: getrlimit
runtime/race(.text): undefined: __libc_free
runtime/race(.text): undefined: setrlimit
runtime/race(.text): undefined: __errno_location
runtime/race(.text): undefined: __libc_mallinfo
runtime/race(.text): undefined: __libc_stack_end
runtime/race(.text): undefined: exit
runtime/race(.text): undefined: prctl
runtime/race(.text): undefined: prctl
runtime/race(.text): undefined: dl_iterate_phdr
too many errors
The linker does not link libc at all.
2. When I leave only the following code in runtime/race/race.go:
package race
/*
void __tsan_init(void **racectx);
*/
import "C"
//export __tsan_symbolize
func __tsan_symbolize(pc uintptr, fun, file **C.char, line, off *C.int) C.int
I get:
# testmain
__tsan_symbolize: missing section in putelfsym
__tsan_symbolize: missing section in putelfsym
Is it possible to expose __tsan_symbolize function compiled with 6a to C world,
and link with libc at the same time, but w/o using cgo?
On high-level what I want to do is: 1. Link C object file runtime/race/race_linux_amd64.syso 2. ...
10 years, 10 months ago
(2014-01-28 18:13:29 UTC)
#3
On high-level what I want to do is:
1. Link C object file runtime/race/race_linux_amd64.syso
2. Being able to call it from Go
3. Being able to callback from it into Go
All w/o using cmd/cgo.
Assembly thunks are ready and working, so this is not the problem. The problem
is merely linking it all together.
I am ready to provide you with more information and/or binaries.
On Tue, Jan 28, 2014 at 10:13 AM, <dvyukov@google.com> wrote: > On high-level what I ...
10 years, 10 months ago
(2014-01-28 18:45:06 UTC)
#4
On Tue, Jan 28, 2014 at 10:13 AM, <dvyukov@google.com> wrote:
> On high-level what I want to do is:
> 1. Link C object file runtime/race/race_linux_amd64.syso
> 2. Being able to call it from Go
> 3. Being able to callback from it into Go
> All w/o using cmd/cgo.
> Assembly thunks are ready and working, so this is not the problem. The
> problem is merely linking it all together.
What if you pass an explicit -linkmode=external to the linker?
go build -ldflags="-linkmode=external"
By default the Go linker switches to -linkmode=external when it sees
an import of runtime/cgo. It sounds like you aren't importing
runtime/cgo anywhere.
Ian
On Tue, Jan 28, 2014 at 10:45 PM, Ian Lance Taylor <iant@golang.org> wrote: > On ...
10 years, 10 months ago
(2014-01-28 18:48:46 UTC)
#5
On Tue, Jan 28, 2014 at 10:45 PM, Ian Lance Taylor <iant@golang.org> wrote:
> On Tue, Jan 28, 2014 at 10:13 AM, <dvyukov@google.com> wrote:
>> On high-level what I want to do is:
>> 1. Link C object file runtime/race/race_linux_amd64.syso
>> 2. Being able to call it from Go
>> 3. Being able to callback from it into Go
>> All w/o using cmd/cgo.
>> Assembly thunks are ready and working, so this is not the problem. The
>> problem is merely linking it all together.
>
> What if you pass an explicit -linkmode=external to the linker?
>
> go build -ldflags="-linkmode=external"
>
> By default the Go linker switches to -linkmode=external when it sees
> an import of runtime/cgo. It sounds like you aren't importing
> runtime/cgo anywhere.
No luck
$ go test -v -race -ldflags="-linkmode=external" -x
pkg/tool/linux_amd64/6l -o $WORK/runtime/_test/runtime.test -L
$WORK/runtime/_test -L $WORK -linkmode=external -race -installsuffix
race $WORK/runtime/_test/main.a
# testmain
runtime.raceinit: missing section for __tsan_init
runtime.raceinit: missing section for __tsan_map_shadow
runtime.racefini: missing section for __tsan_fini
runtime.racemapshadow: missing section for __tsan_map_shadow
runtime.racemalloc: missing section for __tsan_malloc
runtime.racegostart: missing section for __tsan_go_start
runtime.racegoend: missing section for __tsan_go_end
runtime.raceacquireg: missing section for __tsan_acquire
runtime.racereleaseg: missing section for __tsan_release
runtime.racereleasemergeg: missing section for __tsan_release_merge
runtime.racefingo: missing section for __tsan_finalizer_goroutine
runtime.raceread: missing section for __tsan_read
runtime.racereadpc: missing section for __tsan_read_pc
runtime.racewrite: missing section for __tsan_write
runtime.racewritepc: missing section for __tsan_write_pc
runtime.racereadrange: missing section for __tsan_read_range
runtime.racereadrangepc: missing section for __tsan_read_range_pc
runtime.racewriterange: missing section for __tsan_write_range
runtime.racewriterangepc: missing section for __tsan_write_range_pc
runtime.racefuncenter: missing section for __tsan_func_enter
runtime.racefuncexit: missing section for __tsan_func_exit
too many errors
On Tue, Jan 28, 2014 at 10:48 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > > $ ...
10 years, 10 months ago
(2014-01-28 18:55:34 UTC)
#6
On Tue, Jan 28, 2014 at 10:48 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> $ go test -v -race -ldflags="-linkmode=external" -x
>
> pkg/tool/linux_amd64/6l -o $WORK/runtime/_test/runtime.test -L
> $WORK/runtime/_test -L $WORK -linkmode=external -race -installsuffix
> race $WORK/runtime/_test/main.a
> # testmain
> runtime.raceinit: missing section for __tsan_init
> runtime.raceinit: missing section for __tsan_map_shadow
> runtime.racefini: missing section for __tsan_fini
> runtime.racemapshadow: missing section for __tsan_map_shadow
> runtime.racemalloc: missing section for __tsan_malloc
> runtime.racegostart: missing section for __tsan_go_start
> runtime.racegoend: missing section for __tsan_go_end
> runtime.raceacquireg: missing section for __tsan_acquire
> runtime.racereleaseg: missing section for __tsan_release
> runtime.racereleasemergeg: missing section for __tsan_release_merge
> runtime.racefingo: missing section for __tsan_finalizer_goroutine
> runtime.raceread: missing section for __tsan_read
> runtime.racereadpc: missing section for __tsan_read_pc
> runtime.racewrite: missing section for __tsan_write
> runtime.racewritepc: missing section for __tsan_write_pc
> runtime.racereadrange: missing section for __tsan_read_range
> runtime.racereadrangepc: missing section for __tsan_read_range_pc
> runtime.racewriterange: missing section for __tsan_write_range
> runtime.racewriterangepc: missing section for __tsan_write_range_pc
> runtime.racefuncenter: missing section for __tsan_func_enter
> runtime.racefuncexit: missing section for __tsan_func_exit
This seems to me to be the set of functions defined using cgo in
runtime/race/race.go in the current tree. What are you doing to
define them in your cgo-less scheme?
Ian
On Tue, Jan 28, 2014 at 10:55 PM, Ian Lance Taylor <iant@golang.org> wrote: > On ...
10 years, 10 months ago
(2014-01-28 18:58:29 UTC)
#7
On Tue, Jan 28, 2014 at 10:55 PM, Ian Lance Taylor <iant@golang.org> wrote:
> On Tue, Jan 28, 2014 at 10:48 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> $ go test -v -race -ldflags="-linkmode=external" -x
>>
>> pkg/tool/linux_amd64/6l -o $WORK/runtime/_test/runtime.test -L
>> $WORK/runtime/_test -L $WORK -linkmode=external -race -installsuffix
>> race $WORK/runtime/_test/main.a
>> # testmain
>> runtime.raceinit: missing section for __tsan_init
>> runtime.raceinit: missing section for __tsan_map_shadow
>> runtime.racefini: missing section for __tsan_fini
>> runtime.racemapshadow: missing section for __tsan_map_shadow
>> runtime.racemalloc: missing section for __tsan_malloc
>> runtime.racegostart: missing section for __tsan_go_start
>> runtime.racegoend: missing section for __tsan_go_end
>> runtime.raceacquireg: missing section for __tsan_acquire
>> runtime.racereleaseg: missing section for __tsan_release
>> runtime.racereleasemergeg: missing section for __tsan_release_merge
>> runtime.racefingo: missing section for __tsan_finalizer_goroutine
>> runtime.raceread: missing section for __tsan_read
>> runtime.racereadpc: missing section for __tsan_read_pc
>> runtime.racewrite: missing section for __tsan_write
>> runtime.racewritepc: missing section for __tsan_write_pc
>> runtime.racereadrange: missing section for __tsan_read_range
>> runtime.racereadrangepc: missing section for __tsan_read_range_pc
>> runtime.racewriterange: missing section for __tsan_write_range
>> runtime.racewriterangepc: missing section for __tsan_write_range_pc
>> runtime.racefuncenter: missing section for __tsan_func_enter
>> runtime.racefuncexit: missing section for __tsan_func_exit
>
> This seems to me to be the set of functions defined using cgo in
> runtime/race/race.go in the current tree.
yes
> What are you doing to
> define them in your cgo-less scheme?
They are defined as normal Go assembly functions here:
https://codereview.appspot.com/55100044/diff/160001/src/pkg/runtime/race_amd64.s
On Tue, Jan 28, 2014 at 10:58 PM, Dmitry Vyukov <dvyukov@google.com> wrote: > On Tue, ...
10 years, 10 months ago
(2014-01-28 18:59:37 UTC)
#8
On Tue, Jan 28, 2014 at 10:58 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Jan 28, 2014 at 10:55 PM, Ian Lance Taylor <iant@golang.org> wrote:
>> On Tue, Jan 28, 2014 at 10:48 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>
>>> $ go test -v -race -ldflags="-linkmode=external" -x
>>>
>>> pkg/tool/linux_amd64/6l -o $WORK/runtime/_test/runtime.test -L
>>> $WORK/runtime/_test -L $WORK -linkmode=external -race -installsuffix
>>> race $WORK/runtime/_test/main.a
>>> # testmain
>>> runtime.raceinit: missing section for __tsan_init
>>> runtime.raceinit: missing section for __tsan_map_shadow
>>> runtime.racefini: missing section for __tsan_fini
>>> runtime.racemapshadow: missing section for __tsan_map_shadow
>>> runtime.racemalloc: missing section for __tsan_malloc
>>> runtime.racegostart: missing section for __tsan_go_start
>>> runtime.racegoend: missing section for __tsan_go_end
>>> runtime.raceacquireg: missing section for __tsan_acquire
>>> runtime.racereleaseg: missing section for __tsan_release
>>> runtime.racereleasemergeg: missing section for __tsan_release_merge
>>> runtime.racefingo: missing section for __tsan_finalizer_goroutine
>>> runtime.raceread: missing section for __tsan_read
>>> runtime.racereadpc: missing section for __tsan_read_pc
>>> runtime.racewrite: missing section for __tsan_write
>>> runtime.racewritepc: missing section for __tsan_write_pc
>>> runtime.racereadrange: missing section for __tsan_read_range
>>> runtime.racereadrangepc: missing section for __tsan_read_range_pc
>>> runtime.racewriterange: missing section for __tsan_write_range
>>> runtime.racewriterangepc: missing section for __tsan_write_range_pc
>>> runtime.racefuncenter: missing section for __tsan_func_enter
>>> runtime.racefuncexit: missing section for __tsan_func_exit
>>
>> This seems to me to be the set of functions defined using cgo in
>> runtime/race/race.go in the current tree.
>
> yes
>
>> What are you doing to
>> define them in your cgo-less scheme?
>
> They are defined as normal Go assembly functions here:
>
https://codereview.appspot.com/55100044/diff/160001/src/pkg/runtime/race_amd64.s
Oh, sorry, I do not define them at all.
When I was using cgo for the callback, they all magically appear from
runtime/race/race_linux_amd64.syso.
The object file is still there, so I expect to be linked in and
provide the functions.
On Tue, Jan 28, 2014 at 10:59 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > On Tue, ...
10 years, 10 months ago
(2014-01-28 19:19:58 UTC)
#9
On Tue, Jan 28, 2014 at 10:59 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Jan 28, 2014 at 10:58 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Tue, Jan 28, 2014 at 10:55 PM, Ian Lance Taylor <iant@golang.org> wrote:
>>> On Tue, Jan 28, 2014 at 10:48 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>>
>>>> $ go test -v -race -ldflags="-linkmode=external" -x
>>>>
>>>> pkg/tool/linux_amd64/6l -o $WORK/runtime/_test/runtime.test -L
>>>> $WORK/runtime/_test -L $WORK -linkmode=external -race -installsuffix
>>>> race $WORK/runtime/_test/main.a
>>>> # testmain
>>>> runtime.raceinit: missing section for __tsan_init
>>>> runtime.raceinit: missing section for __tsan_map_shadow
>>>> runtime.racefini: missing section for __tsan_fini
>>>> runtime.racemapshadow: missing section for __tsan_map_shadow
>>>> runtime.racemalloc: missing section for __tsan_malloc
>>>> runtime.racegostart: missing section for __tsan_go_start
>>>> runtime.racegoend: missing section for __tsan_go_end
>>>> runtime.raceacquireg: missing section for __tsan_acquire
>>>> runtime.racereleaseg: missing section for __tsan_release
>>>> runtime.racereleasemergeg: missing section for __tsan_release_merge
>>>> runtime.racefingo: missing section for __tsan_finalizer_goroutine
>>>> runtime.raceread: missing section for __tsan_read
>>>> runtime.racereadpc: missing section for __tsan_read_pc
>>>> runtime.racewrite: missing section for __tsan_write
>>>> runtime.racewritepc: missing section for __tsan_write_pc
>>>> runtime.racereadrange: missing section for __tsan_read_range
>>>> runtime.racereadrangepc: missing section for __tsan_read_range_pc
>>>> runtime.racewriterange: missing section for __tsan_write_range
>>>> runtime.racewriterangepc: missing section for __tsan_write_range_pc
>>>> runtime.racefuncenter: missing section for __tsan_func_enter
>>>> runtime.racefuncexit: missing section for __tsan_func_exit
>>>
>>> This seems to me to be the set of functions defined using cgo in
>>> runtime/race/race.go in the current tree.
>>
>> yes
>>
>>> What are you doing to
>>> define them in your cgo-less scheme?
>>
>> They are defined as normal Go assembly functions here:
>>
https://codereview.appspot.com/55100044/diff/160001/src/pkg/runtime/race_amd64.s
>
> Oh, sorry, I do not define them at all.
> When I was using cgo for the callback, they all magically appear from
> runtime/race/race_linux_amd64.syso.
> The object file is still there, so I expect to be linked in and
> provide the functions.
If the Go code is going to refer to them, then you need to define them
in a way that the Go linker can see. When you use cgo, cgo is
producing a .c file that defines them (_cgo_import.c). When not using
cgo, you need to write a .c file that does
#pragma cgo_import_static SYMBOL
and compile that .c file with 6c. See cmd/cgo/doc.go.
Ian
On 2014/01/28 19:19:58, iant wrote: > On Tue, Jan 28, 2014 at 10:59 AM, Dmitry ...
10 years, 10 months ago
(2014-01-29 12:44:16 UTC)
#10
On 2014/01/28 19:19:58, iant wrote:
> On Tue, Jan 28, 2014 at 10:59 AM, Dmitry Vyukov <mailto:dvyukov@google.com>
wrote:
> > On Tue, Jan 28, 2014 at 10:58 PM, Dmitry Vyukov <mailto:dvyukov@google.com>
wrote:
> >> On Tue, Jan 28, 2014 at 10:55 PM, Ian Lance Taylor <mailto:iant@golang.org>
wrote:
> >>> On Tue, Jan 28, 2014 at 10:48 AM, Dmitry Vyukov
<mailto:dvyukov@google.com> wrote:
> >>>>
> >>>> $ go test -v -race -ldflags="-linkmode=external" -x
> >>>>
> >>>> pkg/tool/linux_amd64/6l -o $WORK/runtime/_test/runtime.test -L
> >>>> $WORK/runtime/_test -L $WORK -linkmode=external -race -installsuffix
> >>>> race $WORK/runtime/_test/main.a
> >>>> # testmain
> >>>> runtime.raceinit: missing section for __tsan_init
> >>>> runtime.raceinit: missing section for __tsan_map_shadow
> >>>> runtime.racefini: missing section for __tsan_fini
> >>>> runtime.racemapshadow: missing section for __tsan_map_shadow
> >>>> runtime.racemalloc: missing section for __tsan_malloc
> >>>> runtime.racegostart: missing section for __tsan_go_start
> >>>> runtime.racegoend: missing section for __tsan_go_end
> >>>> runtime.raceacquireg: missing section for __tsan_acquire
> >>>> runtime.racereleaseg: missing section for __tsan_release
> >>>> runtime.racereleasemergeg: missing section for __tsan_release_merge
> >>>> runtime.racefingo: missing section for __tsan_finalizer_goroutine
> >>>> runtime.raceread: missing section for __tsan_read
> >>>> runtime.racereadpc: missing section for __tsan_read_pc
> >>>> runtime.racewrite: missing section for __tsan_write
> >>>> runtime.racewritepc: missing section for __tsan_write_pc
> >>>> runtime.racereadrange: missing section for __tsan_read_range
> >>>> runtime.racereadrangepc: missing section for __tsan_read_range_pc
> >>>> runtime.racewriterange: missing section for __tsan_write_range
> >>>> runtime.racewriterangepc: missing section for __tsan_write_range_pc
> >>>> runtime.racefuncenter: missing section for __tsan_func_enter
> >>>> runtime.racefuncexit: missing section for __tsan_func_exit
> >>>
> >>> This seems to me to be the set of functions defined using cgo in
> >>> runtime/race/race.go in the current tree.
> >>
> >> yes
> >>
> >>> What are you doing to
> >>> define them in your cgo-less scheme?
> >>
> >> They are defined as normal Go assembly functions here:
> >>
>
https://codereview.appspot.com/55100044/diff/160001/src/pkg/runtime/race_amd64.s
> >
> > Oh, sorry, I do not define them at all.
> > When I was using cgo for the callback, they all magically appear from
> > runtime/race/race_linux_amd64.syso.
> > The object file is still there, so I expect to be linked in and
> > provide the functions.
>
> If the Go code is going to refer to them, then you need to define them
> in a way that the Go linker can see. When you use cgo, cgo is
> producing a .c file that defines them (_cgo_import.c). When not using
> cgo, you need to write a .c file that does
>
> #pragma cgo_import_static SYMBOL
>
> and compile that .c file with 6c. See cmd/cgo/doc.go.
>
> Ian
Thanks!
I've added import_static for the calls:
#pragma cgo_import_static __tsan_release_merge
export static for the callback:
#pragma cgo_export_static __tsan_symbolize
and dynamic import of libc:
#pragma cgo_import_dynamic _ _ "libc.so.6"
and also forced external linking when -race flag is provided.
Now it mostly works.
However I hit this issues (which is not -race related):
https://code.google.com/p/go/issues/detail?id=7234
and it prevents me from doing exhaustive testing and finding what else is
broken.
Ian, I've patched in your fix, and now cmd/cgo crashes as follows: Program received signal ...
10 years, 10 months ago
(2014-01-30 15:14:55 UTC)
#11
Ian, I've patched in your fix, and now cmd/cgo crashes as follows:
Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 429]
____strtoul_l_internal (nptr=0x7ffff78b8e80 "0-31\n", endptr=0x7ffff78baeb8,
base=10, group=0, loc=0x0) at ../stdlib/strtol_l.c:246
246 ../stdlib/strtol_l.c: No such file or directory.
(gdb) bt
#0 ____strtoul_l_internal (nptr=0x7ffff78b8e80 "0-31\n", endptr=0x7ffff78baeb8,
base=10, group=0, loc=0x0) at ../stdlib/strtol_l.c:246
#1 0x00007ffff7b0cbd9 in __get_nprocs () at
../sysdeps/unix/sysv/linux/getsysstats.c:164
#2 0x00007ffff7a9913e in arena_get2 (size=<optimized out>,
avoid_arena=<optimized out>, a_tsd=<optimized out>) at arena.c:840
#3 0x00007ffff7a9d0cb in arena_get2 (avoid_arena=0x0, size=132224,
a_tsd=<optimized out>) at arena.c:831
#4 __GI___libc_malloc (bytes=132224) at malloc.c:2921
#5 0x0000000000409f14 in __sanitizer::InternalAlloc(unsigned long,
__sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<0ul,
140737488355328ull, 16ul, __sanitizer::SizeClassMap<17ul, 64ul, 14ul>, 24ul,
__sanitizer::TwoLevelByteMap<2048ull, 4096ull,
__sanitizer::NoOpMapUnmapCallback>, __sanitizer::NoOpMapUnmapCallback> >*) ()
#6 0x000000000041cd30 in __tsan_go_start ()
#7 0x000000000049001d in runtime.racecall () at
src/pkg/runtime/race_amd64.s:355
#8 0x000000c208010900 in ?? ()
#9 0x0000000000490487 in runtime.clone () at
src/pkg/runtime/sys_linux_amd64.s:306
#10 0x0000000000000000 in ?? ()
(gdb) disass
Dump of assembler code for function ____strtoul_l_internal:
0x00007ffff7a57240 <+0>: push %r15
0x00007ffff7a57242 <+2>: push %r14
0x00007ffff7a57244 <+4>: mov %rsi,%r14
0x00007ffff7a57247 <+7>: push %r13
0x00007ffff7a57249 <+9>: push %r12
0x00007ffff7a5724b <+11>: push %rbp
0x00007ffff7a5724c <+12>: push %rbx
0x00007ffff7a5724d <+13>: mov %rdi,%rbx
0x00007ffff7a57250 <+16>: sub $0x38,%rsp
0x00007ffff7a57254 <+20>: test %ecx,%ecx
=> 0x00007ffff7a57256 <+22>: mov 0x8(%r8),%rax
It seems that it tries to dereference loc=0x0 parameter.
Is it possible that glibc constructors were not called?
On Thu, Jan 30, 2014 at 7:14 AM, <dvyukov@google.com> wrote: > > Is it possible ...
10 years, 10 months ago
(2014-01-30 16:55:38 UTC)
#12
On Thu, Jan 30, 2014 at 7:14 AM, <dvyukov@google.com> wrote:
>
> Is it possible that glibc constructors were not called?
I haven't looked at exactly what you are doing, but, yes, it is
entirely possible that glibc constructors are not called. In a static
link they are normally called by the glibc startup code in crt1.o,
which calls __libc_start_main. Go does not use the glibc startup
code, and so there would be nothing to call them.
I don't know if that is the issue here, though. The locale parameter
is NULL, which it should not be. The locale parameter is fetched from
a TLS variable which is initialized to &_nl_global_locale. So I
suspect that what is happening here is that TLS variables are not
being initialized correctly for newly created threads. Normally when
using cgo threads are created using pthread_create, which does
initialize TLS variables. How are threads being created in your
program?
Ian
On 2014/01/30 16:55:38, iant wrote: > On Thu, Jan 30, 2014 at 7:14 AM, <mailto:dvyukov@google.com> ...
10 years, 10 months ago
(2014-01-31 09:39:01 UTC)
#13
On 2014/01/30 16:55:38, iant wrote:
> On Thu, Jan 30, 2014 at 7:14 AM, <mailto:dvyukov@google.com> wrote:
> >
> > Is it possible that glibc constructors were not called?
>
> I haven't looked at exactly what you are doing, but, yes, it is
> entirely possible that glibc constructors are not called. In a static
> link they are normally called by the glibc startup code in crt1.o,
> which calls __libc_start_main. Go does not use the glibc startup
> code, and so there would be nothing to call them.
>
> I don't know if that is the issue here, though. The locale parameter
> is NULL, which it should not be. The locale parameter is fetched from
> a TLS variable which is initialized to &_nl_global_locale. So I
> suspect that what is happening here is that TLS variables are not
> being initialized correctly for newly created threads. Normally when
> using cgo threads are created using pthread_create, which does
> initialize TLS variables. How are threads being created in your
> program?
You are right.
runtime thinks that cgo is not linked
and indeed it seems to be not linked, I do not see _cgo_sys_thread_start symbol
in the executable
however, I've added debug prints into linker and that code that adds fake import
of runtime/cgo is executed
On 2014/01/31 09:39:01, dvyukov wrote: > On 2014/01/30 16:55:38, iant wrote: > > On Thu, ...
10 years, 10 months ago
(2014-01-31 10:06:43 UTC)
#14
On 2014/01/31 09:39:01, dvyukov wrote:
> On 2014/01/30 16:55:38, iant wrote:
> > On Thu, Jan 30, 2014 at 7:14 AM, <mailto:dvyukov@google.com> wrote:
> > >
> > > Is it possible that glibc constructors were not called?
> >
> > I haven't looked at exactly what you are doing, but, yes, it is
> > entirely possible that glibc constructors are not called. In a static
> > link they are normally called by the glibc startup code in crt1.o,
> > which calls __libc_start_main. Go does not use the glibc startup
> > code, and so there would be nothing to call them.
> >
> > I don't know if that is the issue here, though. The locale parameter
> > is NULL, which it should not be. The locale parameter is fetched from
> > a TLS variable which is initialized to &_nl_global_locale. So I
> > suspect that what is happening here is that TLS variables are not
> > being initialized correctly for newly created threads. Normally when
> > using cgo threads are created using pthread_create, which does
> > initialize TLS variables. How are threads being created in your
> > program?
>
>
> You are right.
> runtime thinks that cgo is not linked
> and indeed it seems to be not linked, I do not see _cgo_sys_thread_start
symbol
> in the executable
> however, I've added debug prints into linker and that code that adds fake
import
> of runtime/cgo is executed
The problem seems to be related to my changes in cmd/go
On 2014/01/31 10:06:43, dvyukov wrote: > On 2014/01/31 09:39:01, dvyukov wrote: > > On 2014/01/30 ...
10 years, 10 months ago
(2014-01-31 12:09:08 UTC)
#15
On 2014/01/31 10:06:43, dvyukov wrote:
> On 2014/01/31 09:39:01, dvyukov wrote:
> > On 2014/01/30 16:55:38, iant wrote:
> > > On Thu, Jan 30, 2014 at 7:14 AM, <mailto:dvyukov@google.com> wrote:
> > > >
> > > > Is it possible that glibc constructors were not called?
> > >
> > > I haven't looked at exactly what you are doing, but, yes, it is
> > > entirely possible that glibc constructors are not called. In a static
> > > link they are normally called by the glibc startup code in crt1.o,
> > > which calls __libc_start_main. Go does not use the glibc startup
> > > code, and so there would be nothing to call them.
> > >
> > > I don't know if that is the issue here, though. The locale parameter
> > > is NULL, which it should not be. The locale parameter is fetched from
> > > a TLS variable which is initialized to &_nl_global_locale. So I
> > > suspect that what is happening here is that TLS variables are not
> > > being initialized correctly for newly created threads. Normally when
> > > using cgo threads are created using pthread_create, which does
> > > initialize TLS variables. How are threads being created in your
> > > program?
> >
> >
> > You are right.
> > runtime thinks that cgo is not linked
> > and indeed it seems to be not linked, I do not see _cgo_sys_thread_start
> symbol
> > in the executable
> > however, I've added debug prints into linker and that code that adds fake
> import
> > of runtime/cgo is executed
>
>
> The problem seems to be related to my changes in cmd/go
So there is an inherent cyclic dependency even if runtime/race depends on just
runtime/cgo (which is required to properly initialize libc).
runtime/cgo depends on cmd/cgo
and cmd/cgo is a Go program which obviously must depend on runtime/race in race
build
I could resolve it by making race runtime to not depend on libc at all, but I am
not ready to do it right now.
So I will try to stick with continuing using cgo (import "C") in runtime/race
for now. But that usage will be pure nominal, i.e. to ensure correct linking and
initalization.
On 2014/01/31 13:44:22, dvyukov wrote: > OK, it seems to be working now. At least ...
10 years, 10 months ago
(2014-02-06 11:52:31 UTC)
#17
On 2014/01/31 13:44:22, dvyukov wrote:
> OK, it seems to be working now. At least on linux.
> Thanks!
> I will update and test darwin/windows and then send PTAL.
Ian,
I've done lots of refactoring and testing, and now it fully works on linux and
windows. But I see weird problems on darwin:
First, it works on all std tests except this single program:
https://code.google.com/p/go/source/browse/src/pkg/runtime/race/testdata/cgo_...
On that program:
1. Some functions gets stripped from the executable. E.g. I do not see in nm
runtime.racereadrange/racewriterange (while I do see in nm
runtime.raceread/racewrite). I've checked 6l -v -v output, and there is no
message "marktext runtime.racereadrange".
2. runtime.racefuncenter makes CALL to __tsan_func_enter, but in the compiled
binary it makes CALL into a middle of instruction in some unrelated function.
I am completely confused. Do you have any ideas what goes wrong or what else I
can do to debug it?
On 2014/02/06 11:52:31, dvyukov wrote: > > I've done lots of refactoring and testing, and ...
10 years, 10 months ago
(2014-02-06 16:41:59 UTC)
#18
On 2014/02/06 11:52:31, dvyukov wrote:
>
> I've done lots of refactoring and testing, and now it fully works on linux and
> windows. But I see weird problems on darwin:
>
> First, it works on all std tests except this single program:
>
https://code.google.com/p/go/source/browse/src/pkg/runtime/race/testdata/cgo_...
>
> On that program:
> 1. Some functions gets stripped from the executable. E.g. I do not see in nm
> runtime.racereadrange/racewriterange (while I do see in nm
> runtime.raceread/racewrite). I've checked 6l -v -v output, and there is no
> message "marktext runtime.racereadrange".
> 2. runtime.racefuncenter makes CALL to __tsan_func_enter, but in the compiled
> binary it makes CALL into a middle of instruction in some unrelated function.
>
> I am completely confused. Do you have any ideas what goes wrong or what else I
> can do to debug it?
I don't know much about Darwin.
Since this is a cgo test, I assume that the -v output shows that the Go linker
is invoking the external linker--that it, it is running clang to do the link.
In this mode the Go linker will link all the Go code together into a relocatable
object file, and pass that object file to the clang linker.
Is there a reason to think that racereadrange should be included in the output?
The Go linker will discard all unreferenced symbols. This is a simple program,
and perhaps there is nothing that calls racereadrange. What goes wrong if that
symbol is discarded?
The failure to correctly call __tsan_func_enter suggests that the relocation for
that function call is not being represented correctly in the object file
generated by the linker. Use objdump or some similar to look at the relocations
and the symbols to see if they look any different. In some cases that code in
runtime/asm_amd64.s loads the function address into a register and does an
indirect call. I don't know when or why that is necessary.
On 2014/02/06 16:41:59, iant wrote: > On 2014/02/06 11:52:31, dvyukov wrote: > > > > ...
10 years, 10 months ago
(2014-02-06 16:51:54 UTC)
#19
On 2014/02/06 16:41:59, iant wrote:
> On 2014/02/06 11:52:31, dvyukov wrote:
> >
> > I've done lots of refactoring and testing, and now it fully works on linux
and
> > windows. But I see weird problems on darwin:
> >
> > First, it works on all std tests except this single program:
> >
>
https://code.google.com/p/go/source/browse/src/pkg/runtime/race/testdata/cgo_...
> >
> > On that program:
> > 1. Some functions gets stripped from the executable. E.g. I do not see in nm
> > runtime.racereadrange/racewriterange (while I do see in nm
> > runtime.raceread/racewrite). I've checked 6l -v -v output, and there is no
> > message "marktext runtime.racereadrange".
> > 2. runtime.racefuncenter makes CALL to __tsan_func_enter, but in the
compiled
> > binary it makes CALL into a middle of instruction in some unrelated
function.
> >
> > I am completely confused. Do you have any ideas what goes wrong or what else
I
> > can do to debug it?
>
> I don't know much about Darwin.
>
> Since this is a cgo test, I assume that the -v output shows that the Go linker
> is invoking the external linker--that it, it is running clang to do the link.
> In this mode the Go linker will link all the Go code together into a
relocatable
> object file, and pass that object file to the clang linker.
>
> Is there a reason to think that racereadrange should be included in the
output?
> The Go linker will discard all unreferenced symbols. This is a simple
program,
> and perhaps there is nothing that calls racereadrange. What goes wrong if
that
> symbol is discarded?
Maybe you are right and it is just not used.
The crash that I see is caused by the CALL into middle of instruction, so let's
solve that first.
> The failure to correctly call __tsan_func_enter suggests that the relocation
for
> that function call is not being represented correctly in the object file
> generated by the linker. Use objdump or some similar to look at the
relocations
> and the symbols to see if they look any different. In some cases that code in
> runtime/asm_amd64.s loads the function address into a register and does an
> indirect call. I don't know when or why that is necessary.
There are 2 cases:
1. A function is directly called from assembly:
CALL __tsan_read(SB)
2. C code takes address of a function:
runtime·racecall(__tsan_map_shadow, start, size);
and passes it to assembly, which calls it as:
MOVQ» 8(SP), AX» » // function pointer
CALL» AX
In the debugger I see that few race functions that are called indirectly (2)
(e.g. __tsan_map_shadow) pass successfully.
But then __tsan_func_enter crashes. There are high chances that it is the first
function that is called directly (1) from assembly.
That would explain the difference between good and bad functions.
Functions that are called indirectly (2) are declared in C file as:
void __tsan_init(void);
Functions that are called directly (1) are not declared in C file.
All functions (1+2) are marked as cgo_import_static:
#pragma cgo_import_static __tsan_init
#pragma cgo_import_static __tsan_read
Does it say something to you?
Meanwhile I will try to look at the object file.
On 2014/02/06 16:51:54, dvyukov wrote: > On 2014/02/06 16:41:59, iant wrote: > > On 2014/02/06 ...
10 years, 10 months ago
(2014-02-06 17:20:32 UTC)
#20
On 2014/02/06 16:51:54, dvyukov wrote:
> On 2014/02/06 16:41:59, iant wrote:
> > On 2014/02/06 11:52:31, dvyukov wrote:
> > >
> > > I've done lots of refactoring and testing, and now it fully works on linux
> and
> > > windows. But I see weird problems on darwin:
> > >
> > > First, it works on all std tests except this single program:
> > >
> >
>
https://code.google.com/p/go/source/browse/src/pkg/runtime/race/testdata/cgo_...
> > >
> > > On that program:
> > > 1. Some functions gets stripped from the executable. E.g. I do not see in
nm
> > > runtime.racereadrange/racewriterange (while I do see in nm
> > > runtime.raceread/racewrite). I've checked 6l -v -v output, and there is no
> > > message "marktext runtime.racereadrange".
> > > 2. runtime.racefuncenter makes CALL to __tsan_func_enter, but in the
> compiled
> > > binary it makes CALL into a middle of instruction in some unrelated
> function.
> > >
> > > I am completely confused. Do you have any ideas what goes wrong or what
else
> I
> > > can do to debug it?
> >
> > I don't know much about Darwin.
> >
> > Since this is a cgo test, I assume that the -v output shows that the Go
linker
> > is invoking the external linker--that it, it is running clang to do the
link.
> > In this mode the Go linker will link all the Go code together into a
> relocatable
> > object file, and pass that object file to the clang linker.
> >
> > Is there a reason to think that racereadrange should be included in the
> output?
> > The Go linker will discard all unreferenced symbols. This is a simple
> program,
> > and perhaps there is nothing that calls racereadrange. What goes wrong if
> that
> > symbol is discarded?
>
> Maybe you are right and it is just not used.
> The crash that I see is caused by the CALL into middle of instruction, so
let's
> solve that first.
>
>
> > The failure to correctly call __tsan_func_enter suggests that the relocation
> for
> > that function call is not being represented correctly in the object file
> > generated by the linker. Use objdump or some similar to look at the
> relocations
> > and the symbols to see if they look any different. In some cases that code
in
> > runtime/asm_amd64.s loads the function address into a register and does an
> > indirect call. I don't know when or why that is necessary.
>
>
> There are 2 cases:
> 1. A function is directly called from assembly:
> CALL __tsan_read(SB)
> 2. C code takes address of a function:
> runtime·racecall(__tsan_map_shadow, start, size);
> and passes it to assembly, which calls it as:
> MOVQ» 8(SP), AX» » // function pointer
> CALL» AX
>
> In the debugger I see that few race functions that are called indirectly (2)
> (e.g. __tsan_map_shadow) pass successfully.
> But then __tsan_func_enter crashes. There are high chances that it is the
first
> function that is called directly (1) from assembly.
> That would explain the difference between good and bad functions.
>
> Functions that are called indirectly (2) are declared in C file as:
> void __tsan_init(void);
> Functions that are called directly (1) are not declared in C file.
> All functions (1+2) are marked as cgo_import_static:
> #pragma cgo_import_static __tsan_init
> #pragma cgo_import_static __tsan_read
>
> Does it say something to you?
> Meanwhile I will try to look at the object file.
What relocations do I need to look for? For/in racefuncenter/__tsan_func_enter?
I do not see any relocations inside of racefuncenter:
0000000000017a90 t runtime.raceinit
0000000000020e10 t runtime.racefuncenter
0000000000020e50 t runtime.racefuncexit
address pcrel length extern type scattered symbolnum/value
00017b74 0 2 0 0 0 7
00017bda 0 2 0 0 0 7
00017c30 0 2 0 0 0 7
00017cd1 0 2 0 0 0 7
00017cf3 0 2 0 0 0 7
address pcrel length extern type scattered symbolnum/value
00020c5e 0 2 0 0 0 2
00020e8c 0 2 0 0 0 9
00020eae 0 2 0 0 0 9
00020eca 0 2 0 0 0 9
00021268 0 2 0 0 0 6
On 2014/02/06 17:20:32, dvyukov wrote: > > What relocations do I need to look for? ...
10 years, 10 months ago
(2014-02-06 22:07:12 UTC)
#21
On 2014/02/06 17:20:32, dvyukov wrote:
>
> What relocations do I need to look for? For/in
racefuncenter/__tsan_func_enter?
> I do not see any relocations inside of racefuncenter:
Since racefuncenter is defined in the go.o file, and __tsan_func_enter is
defined in the .syso file, I would expect a relocation in the body of
racefuncenter that refers to __tsan_func_enter. The linker needs that in order
to modify the instruction in go.o to refer to the symbol defined in .syso.
If there is no relocation, then that is the problem.
Ian
On 2014/02/06 16:51:54, dvyukov wrote: > On 2014/02/06 16:41:59, iant wrote: > > > The ...
10 years, 10 months ago
(2014-02-06 22:21:19 UTC)
#22
On 2014/02/06 16:51:54, dvyukov wrote:
> On 2014/02/06 16:41:59, iant wrote:
>
> > The failure to correctly call __tsan_func_enter suggests that the relocation
> for
> > that function call is not being represented correctly in the object file
> > generated by the linker. Use objdump or some similar to look at the
> relocations
> > and the symbols to see if they look any different. In some cases that code
in
> > runtime/asm_amd64.s loads the function address into a register and does an
> > indirect call. I don't know when or why that is necessary.
>
>
> There are 2 cases:
> 1. A function is directly called from assembly:
> CALL __tsan_read(SB)
> 2. C code takes address of a function:
> runtime·racecall(__tsan_map_shadow, start, size);
> and passes it to assembly, which calls it as:
> MOVQ» 8(SP), AX» » // function pointer
> CALL» AX
There is a third case in runtime/asm_amd64.s. I don't know if it is meaningful
or should just be written differently.
MOVQ $runtime·cgocallback_gofunc(SB), AX
CALL AX
Ian
On Fri, Feb 7, 2014 at 2:21 AM, <iant@golang.org> wrote: > On 2014/02/06 16:51:54, dvyukov ...
10 years, 10 months ago
(2014-02-07 05:39:21 UTC)
#23
On Fri, Feb 7, 2014 at 2:21 AM, <iant@golang.org> wrote:
> On 2014/02/06 16:51:54, dvyukov wrote:
>>
>> On 2014/02/06 16:41:59, iant wrote:
>
>
>> > The failure to correctly call __tsan_func_enter suggests that the
>
> relocation
>>
>> for
>> > that function call is not being represented correctly in the object
>
> file
>>
>> > generated by the linker. Use objdump or some similar to look at the
>> relocations
>> > and the symbols to see if they look any different. In some cases
>
> that code in
>>
>> > runtime/asm_amd64.s loads the function address into a register and
>
> does an
>>
>> > indirect call. I don't know when or why that is necessary.
>
>
>
>> There are 2 cases:
>> 1. A function is directly called from assembly:
>> CALL __tsan_read(SB)
>> 2. C code takes address of a function:
>> runtime·racecall(__tsan_map_shadow, start, size);
>> and passes it to assembly, which calls it as:
>> MOVQ» 8(SP), AX» » // function pointer
>> CALL» AX
>
>
> There is a third case in runtime/asm_amd64.s. I don't know if it is
> meaningful or should just be written differently.
>
> MOVQ $runtime·cgocallback_gofunc(SB), AX
> CALL AX
Sometimes we do this to disable stack depth checker on NOSPLIT
functions, e.g. when you are switching stack in a function, you don't
want calls in this function to be accounted against the same NOSPLIT
chain.
On 2014/02/06 22:21:19, iant wrote: > On 2014/02/06 16:51:54, dvyukov wrote: > > On 2014/02/06 ...
10 years, 9 months ago
(2014-02-24 15:15:11 UTC)
#25
On 2014/02/06 22:21:19, iant wrote:
> On 2014/02/06 16:51:54, dvyukov wrote:
> > On 2014/02/06 16:41:59, iant wrote:
> >
> > > The failure to correctly call __tsan_func_enter suggests that the
relocation
> > for
> > > that function call is not being represented correctly in the object file
> > > generated by the linker. Use objdump or some similar to look at the
> > relocations
> > > and the symbols to see if they look any different. In some cases that
code
> in
> > > runtime/asm_amd64.s loads the function address into a register and does an
> > > indirect call. I don't know when or why that is necessary.
> >
> >
> > There are 2 cases:
> > 1. A function is directly called from assembly:
> > CALL __tsan_read(SB)
> > 2. C code takes address of a function:
> > runtime·racecall(__tsan_map_shadow, start, size);
> > and passes it to assembly, which calls it as:
> > MOVQ» 8(SP), AX» » // function pointer
> > CALL» AX
>
> There is a third case in runtime/asm_amd64.s. I don't know if it is
meaningful
> or should just be written differently.
>
> MOVQ $runtime·cgocallback_gofunc(SB), AX
> CALL AX
Ian, I've decided to use indirect CALL AX for now. This slows down execution a
bit but not radically.
Here are enough changes that I want to push before 1.3. We can always tweak
assembly later.
Ready for review.
PTAL
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c File src/pkg/runtime/race.c (right): https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#newcode62 src/pkg/runtime/race.c:62: // racecall allows to call an arbitrary function f ...
10 years, 9 months ago
(2014-02-24 17:40:18 UTC)
#26
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c File src/pkg/runtime/race.c (right): https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#newcode276 src/pkg/runtime/race.c:276: runtime·racesymbolize(SymbolizeContext *ctx) On 2014/02/24 17:40:19, iant wrote: > This ...
10 years, 9 months ago
(2014-02-24 17:59:16 UTC)
#27
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c
File src/pkg/runtime/race.c (right):
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#ne...
src/pkg/runtime/race.c:276: runtime·racesymbolize(SymbolizeContext *ctx)
On 2014/02/24 17:40:19, iant wrote:
> This code is going to be called on the g0 stack. I'm not sure what that means
> with regard to stack splitting. The function is not marked as NOSPLIT, and
nor
> are the functions that it calls. Where is the stackguard set up?
The g0 stack (aka the m stack) is the OS stack. It has something big, like 64k,
available.
On Mon, Feb 24, 2014 at 9:59 AM, <rsc@golang.org> wrote: > > https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c > File ...
10 years, 9 months ago
(2014-02-24 18:10:47 UTC)
#28
On Mon, Feb 24, 2014 at 9:59 AM, <rsc@golang.org> wrote:
>
> https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c
> File src/pkg/runtime/race.c (right):
>
>
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#ne...
> src/pkg/runtime/race.c:276: runtime·racesymbolize(SymbolizeContext *ctx)
> On 2014/02/24 17:40:19, iant wrote:
>>
>> This code is going to be called on the g0 stack. I'm not sure what
>
> that means
>>
>> with regard to stack splitting. The function is not marked as
>
> NOSPLIT, and nor
>>
>> are the functions that it calls. Where is the stackguard set up?
>
>
> The g0 stack (aka the m stack) is the OS stack. It has something big,
> like 64k, available.
Sure, but since the function is not NOSPLIT it is going to check
whether there is enough stack available, and I don't understand where
that stack guard is being set.
Ian
On Mon, Feb 24, 2014 at 10:10 AM, Ian Lance Taylor <iant@golang.org> wrote: > On ...
10 years, 9 months ago
(2014-02-24 18:11:30 UTC)
#29
On Mon, Feb 24, 2014 at 10:10 AM, Ian Lance Taylor <iant@golang.org> wrote:
> On Mon, Feb 24, 2014 at 9:59 AM, <rsc@golang.org> wrote:
>>
>> https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c
>> File src/pkg/runtime/race.c (right):
>>
>>
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#ne...
>> src/pkg/runtime/race.c:276: runtime·racesymbolize(SymbolizeContext *ctx)
>> On 2014/02/24 17:40:19, iant wrote:
>>>
>>> This code is going to be called on the g0 stack. I'm not sure what
>>
>> that means
>>>
>>> with regard to stack splitting. The function is not marked as
>>
>> NOSPLIT, and nor
>>>
>>> are the functions that it calls. Where is the stackguard set up?
>>
>>
>> The g0 stack (aka the m stack) is the OS stack. It has something big,
>> like 64k, available.
>
> Sure, but since the function is not NOSPLIT it is going to check
> whether there is enough stack available, and I don't understand where
> that stack guard is being set.
Or I guess it just comes from the g0 goroutine. Never mind. Sorry.
Ian
I am a bit worried about all the duplicated code in race_amd64.s. It seems like ...
10 years, 9 months ago
(2014-02-24 18:20:23 UTC)
#30
I am a bit worried about all the duplicated code in race_amd64.s.
It seems like you should be able to coalesce it into a single function and make
each of the existing functions a simple stub. You've started this with
racereadrange and racewriterange, but I think you can go further. Assume the
second argument is always an address to check, and do the check in the stub.
Also, please use the FP notation when reading arguments. Then you can drop the
comments since the names will be in the actual code.
#define CHECKARG R10
TEXT runtime.raceread(SB),NOSPLIT,$0-8
MOVQ $__tsan_read(SB), AX
MOVQ addr+0(FP), RARG1
MOVQ 0(SP), RARG2 // caller pc
JMP racecalladdr<>(SB)
TEXT runtime.racereadpc(SB),NOSPLIT,$0-8
MOVQ $__tsan_read(SB), AX
MOVQ addr+0(FP), RARG1
MOVQ callpc+8(FP), RARG2
MOVQ pc+16(FP), RARG3
JMP racecalladdr<>(SB)
TEXT runtime.racefuncenter(SB),NOSPLIT,$0-8
MOVQ $__tsan_func_enter(SB), AX
MOVQ pc+0(FP), RARG1
JMP racecallnoaddr<>(SB)
...
TEXT racecalladdr<>(SB),NOSPLIT,$0-0
// If addr (RARG1) is out of range, do nothing.
// Otherwise, invoke racecall. Arguments already set.
CMPQ RARG1, $noptrdata(SB)
JB racenop
CMPQ RARG1, $enoptrbss(SB)
JB raceok
CMPQ RARG1, runtime.racearenastart(SB)
JB racenop
CMPQ RARG1, runtime.racearenaend(SB)
JAE racenop
raceok:
JMP racecall<>(SB)
racenop:
RET
TEXT racecall<>(SB),NOSPLIT,$0-0
get_tls(R12)
MOVQ m(R12), R13
MOVQ g(R12), R14
CMPQ m_g0(R13), R14
JEQ racecrash
MOVQ g_racectx(R14), RARG0
// Switch to g0 stack for the call.
MOVQ SP, R12 // callee save, preserved across CALL
MOVQ m_g0(R13), R15
MOVQ (g_sched+gobuf_sp)(R15), SP
ANDQ $~15, SP
CALL AX
MOVQ R12, SP
RET
racecrash:
MOVQ $0, 0xfff1 // crash, because we are on g0 stack
There might be a bug here, but you get the idea. There's just one copy of the
address check, and just one copy of the stack switch + call.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd64.s
File src/pkg/runtime/race_amd64.s (right):
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd6...
src/pkg/runtime/race_amd64.s:65: // Switch to g0 stack.
On 2014/02/24 17:40:19, iant wrote:
> This will fail horribly if you are already on the g0 stack. I wonder if it's
> worth adding a test, as in runtime·morestack in asm_amd64.s.
>
> Same comment for the code below, of course.
Today we may not run any Go code on the g0 stack, and perhaps we never will. But
perhaps we will. If so we'd want to not annotate that code with race calls. This
is definitely worth checking. Just crash if you find yourself on g0.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd64.s File src/pkg/runtime/race_amd64.s (right): https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd64.s#newcode27 src/pkg/runtime/race_amd64.s:27: // They use callee-saved registers, because they are preserved ...
10 years, 9 months ago
(2014-02-24 18:25:49 UTC)
#31
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c File src/pkg/runtime/race.c (right): https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#newcode62 src/pkg/runtime/race.c:62: // racecall allows to call an arbitrary function f ...
10 years, 9 months ago
(2014-02-25 09:24:57 UTC)
#32
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c
File src/pkg/runtime/race.c (right):
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#ne...
src/pkg/runtime/race.c:62: // racecall allows to call an arbitrary function f
from C race runtime
On 2014/02/24 17:40:19, iant wrote:
> s/to call/calling/
Done.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race.c#ne...
src/pkg/runtime/race.c:276: runtime·racesymbolize(SymbolizeContext *ctx)
On 2014/02/24 17:59:17, rsc wrote:
> On 2014/02/24 17:40:19, iant wrote:
> > This code is going to be called on the g0 stack. I'm not sure what that
means
> > with regard to stack splitting. The function is not marked as NOSPLIT, and
> nor
> > are the functions that it calls. Where is the stackguard set up?
>
> The g0 stack (aka the m stack) is the OS stack. It has something big, like
64k,
> available.
Yes, the effect of race call and then callback through racesymbolizethunk is
equal to mcall. So this function is no special than any other C code in runtime
running on g0.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd64.s
File src/pkg/runtime/race_amd64.s (right):
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd6...
src/pkg/runtime/race_amd64.s:11: // The following thunks allow to call
gcc-compiled race runtime directly from Go code
On 2014/02/24 17:40:19, iant wrote:
> s/to call/calling the/
Done.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd6...
src/pkg/runtime/race_amd64.s:27: // They use callee-saved registers, because
they are preserved across calls into C.
On 2014/02/24 18:25:50, rsc wrote:
> This is kind of confusing. The only place where this matters is the use of R12
> to hold SP across the call. The rest of the registers can be anything as long
as
> they are not the arguments. I would delete this line in favor of a single
> comment where R12 is used (as in the sample code I sent in my previous mail).
Done.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd6...
src/pkg/runtime/race_amd64.s:65: // Switch to g0 stack.
On 2014/02/24 18:20:24, rsc wrote:
> On 2014/02/24 17:40:19, iant wrote:
> > This will fail horribly if you are already on the g0 stack. I wonder if
it's
> > worth adding a test, as in runtime·morestack in asm_amd64.s.
> >
> > Same comment for the code below, of course.
>
> Today we may not run any Go code on the g0 stack, and perhaps we never will.
But
> perhaps we will. If so we'd want to not annotate that code with race calls.
This
> is definitely worth checking. Just crash if you find yourself on g0.
In order for this to happen, somebody must explicitly enable instrumentation of
Go code in runtime package in compiler. That change will fail horribly on race
builders because this code is extremely heavy examined.
We can't simply switch from annotations to instrumentation, because when a
memory access comes from g0, which user goroutine is it attached to? We can't
communicate a memory access w/o user goroutine context.
This looks too distant and vague to me, I don't want to pay for this today.
Every additional instruction in this function costs some percents of execution
time.
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd6...
src/pkg/runtime/race_amd64.s:69: ANDQ $~15, SP // alignment for gcc ABI
On 2014/02/24 17:40:19, iant wrote:
> In runtime·asmcgocall the code also subtracts 64 from SP, with a comment about
> "room for 4 stack-based fast-call registers as per windows amd64 calling
> convention." Is that necessary here? If not, why not?
>
> Same comment for the code below.
Added the comment:
// SP must be 16-byte aligned.
+// Windows also requires "stack-backing" for the 4 register arguments:
+// http://msdn.microsoft.com/en-us/library/ms235286.aspx
+// We do not do this, because it seems to be intended for vararg/unprototyped
functions.
+// Gcc-compiled race runtime does not try to use that space.
On 2014/02/24 18:20:23, rsc wrote: > I am a bit worried about all the duplicated ...
10 years, 9 months ago
(2014-02-25 09:40:38 UTC)
#33
On 2014/02/24 18:20:23, rsc wrote:
> I am a bit worried about all the duplicated code in race_amd64.s.
>
> It seems like you should be able to coalesce it into a single function and
make
> each of the existing functions a simple stub. You've started this with
> racereadrange and racewriterange, but I think you can go further. Assume the
> second argument is always an address to check, and do the check in the stub.
>
> Also, please use the FP notation when reading arguments. Then you can drop the
> comments since the names will be in the actual code.
>
> #define CHECKARG R10
>
> TEXT runtime.raceread(SB),NOSPLIT,$0-8
> MOVQ $__tsan_read(SB), AX
> MOVQ addr+0(FP), RARG1
> MOVQ 0(SP), RARG2 // caller pc
> JMP racecalladdr<>(SB)
>
> TEXT runtime.racereadpc(SB),NOSPLIT,$0-8
> MOVQ $__tsan_read(SB), AX
> MOVQ addr+0(FP), RARG1
> MOVQ callpc+8(FP), RARG2
> MOVQ pc+16(FP), RARG3
> JMP racecalladdr<>(SB)
>
> TEXT runtime.racefuncenter(SB),NOSPLIT,$0-8
> MOVQ $__tsan_func_enter(SB), AX
> MOVQ pc+0(FP), RARG1
> JMP racecallnoaddr<>(SB)
>
> ...
>
> TEXT racecalladdr<>(SB),NOSPLIT,$0-0
> // If addr (RARG1) is out of range, do nothing.
> // Otherwise, invoke racecall. Arguments already set.
> CMPQ RARG1, $noptrdata(SB)
> JB racenop
> CMPQ RARG1, $enoptrbss(SB)
> JB raceok
> CMPQ RARG1, runtime.racearenastart(SB)
> JB racenop
> CMPQ RARG1, runtime.racearenaend(SB)
> JAE racenop
> raceok:
> JMP racecall<>(SB)
> racenop:
> RET
>
> TEXT racecall<>(SB),NOSPLIT,$0-0
> get_tls(R12)
> MOVQ m(R12), R13
> MOVQ g(R12), R14
> CMPQ m_g0(R13), R14
> JEQ racecrash
> MOVQ g_racectx(R14), RARG0
> // Switch to g0 stack for the call.
> MOVQ SP, R12 // callee save, preserved across CALL
> MOVQ m_g0(R13), R15
> MOVQ (g_sched+gobuf_sp)(R15), SP
> ANDQ $~15, SP
> CALL AX
> MOVQ R12, SP
> RET
> racecrash:
> MOVQ $0, 0xfff1 // crash, because we are on g0 stack
>
> There might be a bug here, but you get the idea. There's just one copy of the
> address check, and just one copy of the stack switch + call.
>
>
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd64.s
> File src/pkg/runtime/race_amd64.s (right):
>
>
https://codereview.appspot.com/55100044/diff/670001/src/pkg/runtime/race_amd6...
> src/pkg/runtime/race_amd64.s:65: // Switch to g0 stack.
> On 2014/02/24 17:40:19, iant wrote:
> > This will fail horribly if you are already on the g0 stack. I wonder if
it's
> > worth adding a test, as in runtime·morestack in asm_amd64.s.
> >
> > Same comment for the code below, of course.
>
> Today we may not run any Go code on the g0 stack, and perhaps we never will.
But
> perhaps we will. If so we'd want to not annotate that code with race calls.
This
> is definitely worth checking. Just crash if you find yourself on g0.
Yes, initially I had even more functions in assembly. But then transformed lots
to the common racecall, and unified the range access functions because they are
not so common and also heavier-weight by itself.
raceread/write/funcenter/exit are very intentionally copy-pasted and
specialized.
I am also ready to remove special code racereadpc/racewritepc, but I do not see
how to do it w/o affecting raceread/racewrite.
Every additional instruction in the hot functions costs some percents of
execution time.
Long-term I would to move in the opposite direction -- simplify address range
check, remove stack alignment (if we align it when we save g0 SP), replace the
indirect calls with static calls, allocate separate tls slot for race context
(so we don't need to dereference g), etc.
These thunks still consume 15% of execution time.
On 2014/03/06 15:35:17, rsc wrote: > LGTM > > Thank you. The refactored code is ...
10 years, 9 months ago
(2014-03-06 15:46:06 UTC)
#37
On 2014/03/06 15:35:17, rsc wrote:
> LGTM
>
> Thank you. The refactored code is much better. Why the :( ? Is it
significantly
> slower?
12% slowdown
it's additive, so when other parts become faster, it will account for more
Is CL description updated with new (12% slower) numbers? On Thu, Mar 6, 2014 at ...
10 years, 9 months ago
(2014-03-06 17:14:13 UTC)
#39
Is CL description updated with new (12% slower) numbers?
On Thu, Mar 6, 2014 at 7:46 AM, <dvyukov@google.com> wrote:
> On 2014/03/06 15:35:17, rsc wrote:
>
>> LGTM
>>
>
> Thank you. The refactored code is much better. Why the :( ? Is it
>>
> significantly
>
>> slower?
>>
>
> 12% slowdown
> it's additive, so when other parts become faster, it will account for
> more
>
>
>
> https://codereview.appspot.com/55100044/
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-codereviews" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-codereviews+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
no On Thu, Mar 6, 2014 at 9:14 PM, Brad Fitzpatrick <bradfitz@golang.org> wrote: > Is ...
10 years, 9 months ago
(2014-03-06 17:15:08 UTC)
#40
no
On Thu, Mar 6, 2014 at 9:14 PM, Brad Fitzpatrick <bradfitz@golang.org> wrote:
> Is CL description updated with new (12% slower) numbers?
>
>
>
> On Thu, Mar 6, 2014 at 7:46 AM, <dvyukov@google.com> wrote:
>>
>> On 2014/03/06 15:35:17, rsc wrote:
>>>
>>> LGTM
>>
>>
>>> Thank you. The refactored code is much better. Why the :( ? Is it
>>
>> significantly
>>>
>>> slower?
>>
>>
>> 12% slowdown
>> it's additive, so when other parts become faster, it will account for
>> more
>>
>>
>>
>> https://codereview.appspot.com/55100044/
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-codereviews" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-codereviews+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/groups/opt_out.
>
>
*** Submitted as https://code.google.com/p/go/source/detail?r=6a8bd83965b3 *** runtime: use custom thunks for race calls instead of cgo ...
10 years, 9 months ago
(2014-03-06 19:48:45 UTC)
#41
*** Submitted as https://code.google.com/p/go/source/detail?r=6a8bd83965b3 ***
runtime: use custom thunks for race calls instead of cgo
Implement custom assembly thunks for hot race calls (memory accesses and
function entry/exit).
The thunks extract caller pc, verify that the address is in heap or global and
switch to g0 stack.
Before:
ok regexp 3.692s
ok compress/bzip2 9.461s
ok encoding/json 6.380s
After:
ok regexp 2.229s (-40%)
ok compress/bzip2 4.703s (-50%)
ok encoding/json 3.629s (-43%)
For comparison, normal non-race build:
ok regexp 0.348s
ok compress/bzip2 0.304s
ok encoding/json 0.661s
Race build:
ok regexp 2.229s (+540%)
ok compress/bzip2 4.703s (+1447%)
ok encoding/json 3.629s (+449%)
Also removes some race-related special cases from cgocall and scheduler.
In long-term it will allow to remove cyclic runtime/race dependency on cmd/cgo.
Fixes issue 4249.
Fixes issue 7460.
Update issue 6508
Update issue 6688
R=iant, rsc, bradfitz
CC=golang-codereviews
https://codereview.appspot.com/55100044
Issue 55100044: code review 55100044: runtime: use custom thunks for race calls instead of cgo
(Closed)
Created 10 years, 10 months ago by dvyukov
Modified 10 years, 9 months ago
Reviewers: gobot
Base URL:
Comments: 15