Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(9)

Issue 69190044: code review 69190044: runtime: fix use after close race in Solaris network poller

Can't Edit
Can't Publish+Mail
Start Review
Created:
11 years, 9 months ago by aram
Modified:
11 years, 9 months ago
Reviewers:
iant, dvyukov
CC:
golang-codereviews, bradfitz, iant, dvyukov, aram2, gobot
Visibility:
Public.

Description

runtime: fix use after close race in Solaris network poller The Solaris network poller uses event ports, which are level-triggered. As such, it has to re-arm itself after each wakeup. The arming mechanism (which runs in its own thread) raced with the closing of a file descriptor happening in a different thread. When a network file descriptor is about to be closed, the network poller is awaken to give it a chance to remove its association with the file descriptor. Because the poller always re-armed itself, it raced with code that closed the descriptor. This change makes the network poller check before re-arming if the file descriptor is about to be closed, in which case it will ignore the re-arming request. It uses the per-PollDesc lock in order to serialize access to the PollDesc. This change also adds extensive documentation describing the Solaris implementation of the network poller. Fixes issue 7410.

Patch Set 1 #

Patch Set 2 : diff -r 3512bf47df7e https://code.google.com/p/go #

Patch Set 3 : diff -r 3512bf47df7e https://code.google.com/p/go #

Total comments: 2

Patch Set 4 : diff -r 3cd546ac3a99 https://code.google.com/p/go #

Patch Set 5 : diff -r 3cd546ac3a99 https://code.google.com/p/go #

Total comments: 4

Patch Set 6 : diff -r ceb13960d05b https://code.google.com/p/go #

Patch Set 7 : diff -r ceb13960d05b https://code.google.com/p/go #

Total comments: 8

Patch Set 8 : diff -r 08dcdcdb757b https://code.google.com/p/go #

Patch Set 9 : diff -r 08dcdcdb757b https://code.google.com/p/go #

Unified diffs Side-by-side diffs Delta from patch set Stats (+131 lines, -30 lines) Patch
M src/pkg/runtime/netpoll.goc View 1 2 3 1 chunk +18 lines, -0 lines 0 comments Download
M src/pkg/runtime/netpoll_solaris.c View 1 2 3 4 5 6 7 7 chunks +110 lines, -30 lines 0 comments Download
M src/pkg/runtime/runtime.h View 1 2 3 4 5 6 7 1 chunk +3 lines, -0 lines 0 comments Download

Messages

Total messages: 51
aram
Hello golang-codereviews@googlegroups.com, I'd like you to review this change to https://code.google.com/p/go
11 years, 9 months ago (2014-02-26 18:16:08 UTC) #1
aram
This should fix both the builders.
11 years, 9 months ago (2014-02-26 18:16:50 UTC) #2
bradfitz
[+iant, dmitry] On Wed, Feb 26, 2014 at 10:16 AM, <aram@mgk.ro> wrote: > Reviewers: golang-codereviews, ...
11 years, 9 months ago (2014-02-26 18:18:11 UTC) #3
iant
I don't understand how this avoids the race. I can see that it narrows the ...
11 years, 9 months ago (2014-02-26 19:11:19 UTC) #4
aram
On Wed, Feb 26, 2014 at 8:11 PM, <iant@golang.org> wrote: > I don't understand how ...
11 years, 9 months ago (2014-02-26 19:26:53 UTC) #5
iant
On Wed, Feb 26, 2014 at 11:26 AM, Aram Hăvărneanu <aram@mgk.ro> wrote: > On Wed, ...
11 years, 9 months ago (2014-02-26 20:40:20 UTC) #6
dvyukov
Why do you need to rearm it after polling? You arm it before each netpollwait, ...
11 years, 9 months ago (2014-02-27 13:25:27 UTC) #7
aram
> Why do you need to rearm it after polling? > You arm it before ...
11 years, 9 months ago (2014-03-02 12:56:22 UTC) #8
dvyukov
On 2014/03/02 12:56:22, aram wrote: > > Why do you need to rearm it after ...
11 years, 9 months ago (2014-03-02 15:24:28 UTC) #9
dvyukov
I do not understand what invariant you want to preserve: 1. fd is always armed ...
11 years, 9 months ago (2014-03-02 15:36:41 UTC) #10
aram
> Good point, I also don't understand why that can't happen. Well this can't happen ...
11 years, 9 months ago (2014-03-07 20:48:10 UTC) #11
iant
On Fri, Mar 7, 2014 at 12:48 PM, <aram@mgk.ro> wrote: >> Good point, I also ...
11 years, 9 months ago (2014-03-08 00:01:56 UTC) #12
aram
Ian Lance Taylor <iant@golang.org> wrote: > ... Thanks for the detailed explanation. > I don't ...
11 years, 9 months ago (2014-03-08 12:28:14 UTC) #13
aram
I'd like to reference Dmitry's #5 comment in this bug report here: https://code.google.com/p/go/issues/detail?id=6955#c5 Dmitry, care ...
11 years, 9 months ago (2014-03-08 19:00:17 UTC) #14
dvyukov
On 2014/03/08 19:00:17, aram wrote: > I'd like to reference Dmitry's #5 comment in this ...
11 years, 9 months ago (2014-03-11 10:23:30 UTC) #15
dvyukov
On 2014/03/08 00:01:56, iant wrote: > On Fri, Mar 7, 2014 at 12:48 PM, <mailto:aram@mgk.ro> ...
11 years, 9 months ago (2014-03-11 10:24:29 UTC) #16
dvyukov
On 2014/03/08 12:28:14, aram wrote: > Ian Lance Taylor <mailto:iant@golang.org> wrote: > > ... > ...
11 years, 9 months ago (2014-03-11 10:27:21 UTC) #17
aram
> I still do not understand why we need to arm it in netpollopen, nobody ...
11 years, 9 months ago (2014-03-11 10:43:57 UTC) #18
aram
On Tue, Mar 11, 2014 at 11:23 AM, <dvyukov@google.com> wrote: > Nevermind. Maybe I was ...
11 years, 9 months ago (2014-03-11 10:44:15 UTC) #19
dvyukov
https://codereview.appspot.com/69190044/diff/40001/src/pkg/runtime/netpoll_solaris.c File src/pkg/runtime/netpoll_solaris.c (right): https://codereview.appspot.com/69190044/diff/40001/src/pkg/runtime/netpoll_solaris.c#newcode105 src/pkg/runtime/netpoll_solaris.c:105: } while(runtime·cas(ep, old, events) != events); cas return success/fail ...
11 years, 9 months ago (2014-03-11 10:45:07 UTC) #20
dvyukov
Here is what I would do: 1. remove port_associate from netpollopen 2. mutate the events ...
11 years, 9 months ago (2014-03-11 10:47:41 UTC) #21
aram
On Tue, Mar 11, 2014 at 11:47 AM, <dvyukov@google.com> wrote: > remove port_associate from netpollopen ...
11 years, 9 months ago (2014-03-11 11:30:48 UTC) #22
dvyukov
On Tue, Mar 11, 2014 at 3:30 PM, Aram Hăvărneanu <aram@mgk.ro> wrote: > On Tue, ...
11 years, 9 months ago (2014-03-11 11:49:20 UTC) #23
aram
> I do not see any need for the association while nobody is waiting for ...
11 years, 9 months ago (2014-03-11 13:33:27 UTC) #24
aram
Hello golang-codereviews@googlegroups.com, bradfitz@golang.org, iant@golang.org, dvyukov@google.com (cc: golang-codereviews@googlegroups.com), Please take another look.
11 years, 9 months ago (2014-03-11 13:33:39 UTC) #25
dvyukov
On Tue, Mar 11, 2014 at 5:33 PM, <aram@mgk.ro> wrote: >> I do not see ...
11 years, 9 months ago (2014-03-11 13:35:58 UTC) #26
aram2
In #8 you can see the test timeout panic trace. -- Aram Hăvărneanu
11 years, 9 months ago (2014-03-11 13:37:42 UTC) #27
dvyukov
On 2014/03/11 13:37:42, aram2 wrote: > In #8 you can see the test timeout panic ...
11 years, 9 months ago (2014-03-11 13:51:29 UTC) #28
aram
> #8 was about re-arming after polling > I am asking about arming in netpollopen ...
11 years, 9 months ago (2014-03-11 17:08:00 UTC) #29
dvyukov
But why does it hang? Maybe there is some other latent issue.
11 years, 9 months ago (2014-03-11 17:39:33 UTC) #30
gobot
R=golang-dev (assigned by bradfitz@golang.org)
11 years, 9 months ago (2014-03-11 17:53:54 UTC) #31
aram
On Tue, Mar 11, 2014 at 6:39 PM, <dvyukov@google.com> wrote: > But why does it ...
11 years, 9 months ago (2014-03-11 18:02:08 UTC) #32
dvyukov
On 2014/03/11 18:02:08, aram wrote: > On Tue, Mar 11, 2014 at 6:39 PM, <mailto:dvyukov@google.com> ...
11 years, 9 months ago (2014-03-11 18:12:02 UTC) #33
aram
On Tue, Mar 11, 2014 at 7:12 PM, <dvyukov@google.com> wrote: > What if we call ...
11 years, 9 months ago (2014-03-11 18:23:42 UTC) #34
dvyukov
On 2014/03/11 18:23:42, aram wrote: > On Tue, Mar 11, 2014 at 7:12 PM, <mailto:dvyukov@google.com> ...
11 years, 9 months ago (2014-03-11 18:34:50 UTC) #35
aram
Could we review the locking bits? If there are any doubts about the design of ...
11 years, 9 months ago (2014-03-11 19:22:21 UTC) #36
aram
> You've said: "Because runtime·netpoll is blocked in port_getn, and > calling port_associate during this ...
11 years, 9 months ago (2014-03-11 19:22:35 UTC) #37
dvyukov
OK, now I see how it fixes the closing race https://codereview.appspot.com/69190044/diff/70001/src/pkg/runtime/netpoll_solaris.c File src/pkg/runtime/netpoll_solaris.c (right): https://codereview.appspot.com/69190044/diff/70001/src/pkg/runtime/netpoll_solaris.c#newcode109 ...
11 years, 9 months ago (2014-03-12 05:58:42 UTC) #38
dvyukov
On 2014/03/11 19:22:21, aram wrote: > Could we review the locking bits? If there are ...
11 years, 9 months ago (2014-03-12 05:59:10 UTC) #39
dvyukov
On 2014/03/11 19:22:21, aram wrote: > Could we review the locking bits? If there are ...
11 years, 9 months ago (2014-03-12 05:59:10 UTC) #40
dvyukov
On 2014/03/11 19:22:35, aram wrote: > > You've said: "Because runtime·netpoll is blocked in port_getn, ...
11 years, 9 months ago (2014-03-12 06:11:00 UTC) #41
aram
> Ah, OK, now it all makes sense. So there are 2 separate things: (1) ...
11 years, 9 months ago (2014-03-13 12:46:29 UTC) #42
aram
Hello golang-codereviews@googlegroups.com, bradfitz@golang.org, iant@golang.org, dvyukov@google.com, aram.h@mgk.ro, gobot@golang.org (cc: golang-codereviews@googlegroups.com), Please take another look.
11 years, 9 months ago (2014-03-13 12:46:49 UTC) #43
dvyukov
On 2014/03/13 12:46:29, aram wrote: > > Ah, OK, now it all makes sense. So ...
11 years, 9 months ago (2014-03-14 06:54:55 UTC) #44
dvyukov
https://codereview.appspot.com/69190044/diff/110001/src/pkg/runtime/netpoll_solaris.c File src/pkg/runtime/netpoll_solaris.c (right): https://codereview.appspot.com/69190044/diff/110001/src/pkg/runtime/netpoll_solaris.c#newcode156 src/pkg/runtime/netpoll_solaris.c:156: // Updates the the association with a new set ...
11 years, 9 months ago (2014-03-14 06:55:44 UTC) #45
dvyukov
Ian, do you want to take another look? I am going to l-g-t-m it after ...
11 years, 9 months ago (2014-03-14 06:56:53 UTC) #46
aram
https://codereview.appspot.com/69190044/diff/110001/src/pkg/runtime/netpoll_solaris.c File src/pkg/runtime/netpoll_solaris.c (right): https://codereview.appspot.com/69190044/diff/110001/src/pkg/runtime/netpoll_solaris.c#newcode156 src/pkg/runtime/netpoll_solaris.c:156: // Updates the the association with a new set ...
11 years, 9 months ago (2014-03-14 12:28:32 UTC) #47
aram
Hello golang-codereviews@googlegroups.com, bradfitz@golang.org, iant@golang.org, dvyukov@google.com, aram.h@mgk.ro, gobot@golang.org (cc: golang-codereviews@googlegroups.com), Please take another look.
11 years, 9 months ago (2014-03-14 12:28:47 UTC) #48
dvyukov
LGTM let's wait for min(Ian reply, 1 day)
11 years, 9 months ago (2014-03-14 12:31:55 UTC) #49
iant
LGTM
11 years, 9 months ago (2014-03-14 13:47:58 UTC) #50
dvyukov
11 years, 9 months ago (2014-03-14 13:53:22 UTC) #51
*** Submitted as https://code.google.com/p/go/source/detail?r=b6b69b97b4b7 ***

runtime: fix use after close race in Solaris network poller

The Solaris network poller uses event ports, which are
level-triggered. As such, it has to re-arm itself after each
wakeup. The arming mechanism (which runs in its own thread) raced
with the closing of a file descriptor happening in a different
thread. When a network file descriptor is about to be closed,
the network poller is awaken to give it a chance to remove its
association with the file descriptor. Because the poller always
re-armed itself, it raced with code that closed the descriptor.

This change makes the network poller check before re-arming if
the file descriptor is about to be closed, in which case it will
ignore the re-arming request. It uses the per-PollDesc lock in
order to serialize access to the PollDesc.

This change also adds extensive documentation describing the
Solaris implementation of the network poller.

Fixes issue 7410.

LGTM=dvyukov, iant
R=golang-codereviews, bradfitz, iant, dvyukov, aram.h, gobot
CC=golang-codereviews
https://codereview.appspot.com/69190044

Committer: Dmitriy Vyukov <dvyukov@google.com>
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b