runtime: use GetQueuedCompletionStatusEx on windows if available
GetQueuedCompletionStatusEx allows to dequeue a batch of completion
notifications, which is more efficient than dequeueing one by one.
benchmark old ns/op new ns/op delta
BenchmarkClientServerParallel4 100605 90945 -9.60%
BenchmarkClientServerParallel4-2 90225 74504 -17.42%
I've made some initial comments, but lets land https://codereview.appspot.com/12413043/ before we do proper review. Please, ...
11 years, 7 months ago
(2013-08-05 02:59:34 UTC)
#2
I've made some initial comments, but lets land
https://codereview.appspot.com/12413043/ before we do proper review. Please,
ping me when ready. Thank you.
Alex
https://codereview.appspot.com/12436044/diff/12001/src/pkg/net/fd_windows.go
File src/pkg/net/fd_windows.go (right):
https://codereview.appspot.com/12436044/diff/12001/src/pkg/net/fd_windows.go#...
src/pkg/net/fd_windows.go:205: err = syscall.WSAGetOverlappedResult(o.fd.sysfd,
&o.o, &o.qty, 0, &flags)
I am concerned that we are doing 2 syscalls (instead of 1) on systems without
GetQueuedCompletionStatusEx. Also on systems with GetQueuedCompletionStatusEx
this WSAGetOverlappedResult call would run runtime.cgocall (which does
entersyscall / exitsyscall among other things). Maybe you could move all this
error retrieval logic into runtime - you could use stdcall there.
Given all this extra calls + complications, are you sure it is worth pursuing
batch retrievals?
https://codereview.appspot.com/12436044/diff/12001/src/pkg/runtime/netpoll_wi...
File src/pkg/runtime/netpoll_windows.c (right):
https://codereview.appspot.com/12436044/diff/12001/src/pkg/runtime/netpoll_wi...
src/pkg/runtime/netpoll_windows.c:126:
Could you, please explain for me (I don't have to put in the source), your logic
here. Why wouldn't take as many completions as you can in one single call?
Also the extra NtQueryIoCompletion syscall, is it all paying for itself? In
worse case scenario (1 pending completed io), which, I think, will be common,
you will end up with 3 syscalls instead of 1.
On 2013/08/05 02:59:34, brainman wrote: > I've made some initial comments, but lets land > ...
11 years, 7 months ago
(2013-08-06 13:05:30 UTC)
#3
On 2013/08/05 02:59:34, brainman wrote:
> I've made some initial comments, but lets land
> https://codereview.appspot.com/12413043/ before we do proper review. Please,
> ping me when ready. Thank you.
>
> Alex
>
> https://codereview.appspot.com/12436044/diff/12001/src/pkg/net/fd_windows.go
> File src/pkg/net/fd_windows.go (right):
>
>
https://codereview.appspot.com/12436044/diff/12001/src/pkg/net/fd_windows.go#...
> src/pkg/net/fd_windows.go:205: err =
syscall.WSAGetOverlappedResult(o.fd.sysfd,
> &o.o, &o.qty, 0, &flags)
> I am concerned that we are doing 2 syscalls (instead of 1) on systems without
> GetQueuedCompletionStatusEx. Also on systems with GetQueuedCompletionStatusEx
> this WSAGetOverlappedResult call would run runtime.cgocall (which does
> entersyscall / exitsyscall among other things). Maybe you could move all this
> error retrieval logic into runtime - you could use stdcall there.
>
> Given all this extra calls + complications, are you sure it is worth pursuing
> batch retrievals?
>
>
https://codereview.appspot.com/12436044/diff/12001/src/pkg/runtime/netpoll_wi...
> File src/pkg/runtime/netpoll_windows.c (right):
>
>
https://codereview.appspot.com/12436044/diff/12001/src/pkg/runtime/netpoll_wi...
> src/pkg/runtime/netpoll_windows.c:126:
> Could you, please explain for me (I don't have to put in the source), your
logic
> here. Why wouldn't take as many completions as you can in one single call?
>
> Also the extra NtQueryIoCompletion syscall, is it all paying for itself? In
> worse case scenario (1 pending completed io), which, I think, will be common,
> you will end up with 3 syscalls instead of 1.
Hi Alex,
I've synched to tip and simplified this change significantly.
It does not use NtQueryIoCompletion and WSAGetOverlappedResult is called from
runtime, so no syscall overhead (and no changes to fd_windows.go).
In general I think that using GQCSEx is important because (1) it should provide
speedup for real programs, and (2) it unifies implicit interface between netpoll
and scheduler (return a batch of ready goroutines). (2) will allow to do further
scheduling improvements that will benefit all OSes.
PTAL
*** Submitted as https://code.google.com/p/go/source/detail?r=23e06fccd3b9 *** runtime: use GetQueuedCompletionStatusEx on windows if available GetQueuedCompletionStatusEx allows to ...
11 years, 7 months ago
(2013-08-08 13:42:13 UTC)
#5
*** Submitted as https://code.google.com/p/go/source/detail?r=23e06fccd3b9 ***
runtime: use GetQueuedCompletionStatusEx on windows if available
GetQueuedCompletionStatusEx allows to dequeue a batch of completion
notifications, which is more efficient than dequeueing one by one.
benchmark old ns/op new ns/op delta
BenchmarkClientServerParallel4 100605 90945 -9.60%
BenchmarkClientServerParallel4-2 90225 74504 -17.42%
R=golang-dev, alex.brainman
CC=golang-dev
https://codereview.appspot.com/12436044
Issue 12436044: code review 12436044: runtime: use GetQueuedCompletionStatusEx on windows if ...
(Closed)
Created 11 years, 7 months ago by dvyukov
Modified 11 years, 7 months ago
Reviewers:
Base URL:
Comments: 2