Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(1891)

Issue 9857046: tomb: fix data race on t.reason (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
10 years, 11 months ago by dave
Modified:
10 years, 11 months ago
Reviewers:
gustavo, dimitern, mp+166445, fwereade, niemeyer
Visibility:
Public.

Description

tomb: fix data race on t.reason t.m.Lock must be held to properly observe changes to t.reason. https://code.launchpad.net/~dave-cheney/tomb/001-fix-data-race-on-reason/+merge/166445 (do not edit description out of merge proposal)

Patch Set 1 #

Total comments: 1
Unified diffs Side-by-side diffs Delta from patch set Stats (+4 lines, -0 lines) Patch
A [revision details] View 1 chunk +2 lines, -0 lines 0 comments Download
M tomb.go View 1 chunk +2 lines, -0 lines 1 comment Download

Messages

Total messages: 9
dave_cheney.net
Please take a look.
10 years, 11 months ago (2013-05-30 06:41:24 UTC) #1
dimitern
LGTM, good catch!
10 years, 11 months ago (2013-05-30 06:45:08 UTC) #2
fwereade
LGTM. Yay race detector!
10 years, 11 months ago (2013-05-30 13:52:00 UTC) #3
niemeyer
https://codereview.appspot.com/9857046/diff/1/tomb.go File tomb.go (right): https://codereview.appspot.com/9857046/diff/1/tomb.go#newcode112 tomb.go:112: defer t.m.Unlock() I cannot see the race.
10 years, 11 months ago (2013-05-30 23:01:40 UTC) #4
dave_cheney.net
Without this change, tomb.Wait is not guaranteed to see the value assigned to t.reason on ...
10 years, 11 months ago (2013-05-30 23:28:11 UTC) #5
gustavo_niemeyer.net
On Thu, May 30, 2013 at 8:28 PM, Dave Cheney <dave@cheney.net> wrote: > Without this ...
10 years, 11 months ago (2013-05-30 23:54:58 UTC) #6
dave_cheney.net
Mate, i'm not going to ask you to commit this if it is wrong. If ...
10 years, 11 months ago (2013-05-30 23:57:31 UTC) #7
gustavo_niemeyer.net
On Thu, May 30, 2013 at 8:57 PM, Dave Cheney <dave@cheney.net> wrote: > Mate, i'm ...
10 years, 11 months ago (2013-05-31 00:47:34 UTC) #8
gustavo_niemeyer.net
10 years, 11 months ago (2013-05-31 01:30:07 UTC) #9
On Thu, May 30, 2013 at 9:47 PM, Gustavo Niemeyer <gustavo@niemeyer.net> wrote:
> On Thu, May 30, 2013 at 8:57 PM, Dave Cheney <dave@cheney.net> wrote:
>> Mate, i'm not going to ask you to commit this if it is wrong. If you
>> prefer I will raise a bug with Dmitry about the race detector.
>
> That seems to reflect very poorly what I said, Dave.
>
> Please leave the issue with me. Thank you.

------------------------------------------------------------
revno: 17
revision-id: gustavo@niemeyer.net-20130531003818-70ikdgklbxopn8x4
parent: gustavo@niemeyer.net-20120404032216-03l0c7zep5s1m1fd
committer: Gustavo Niemeyer <gustavo@niemeyer.net>
branch nick: tomb
timestamp: Thu 2013-05-30 21:38:18 -0300
message:
  Fix race found by race detector, and reported by Dave Cheney.

  Here is more information than you'll want to know:

  If goroutine 1 is doing:

      t.Kill(nil)
      t.Wait()

  and goroutine 2 is doing:

      t.Wait()

  and goroutine 3 is doing:

      t.Done()

  There is a race, because goroutine 3 could finish first, and then
  goroutine 1 might run, and the memory model won't guarantee that
  goroutine 2 actually sees t.reason recorded by the Kill(nil) of
  goroutine 1.

  This specific case is more of a theoretical problem than a real one,
  for the following reasons:

  - The memory model guarantees that the close(t.done) performed
    by t.Done() will Happen Before the <-t.done done by either t.Wait,
    which means both goroutine 1 and 2 will always necessarily see a
    real error coming out of the goroutine(s) being monitored by t.

  - The Kill(nil) performed by goroutine 1 won't affect the value
    of t.reason due to the logic in the Kill method (although,
    observing the memory model pedantically means goroutine 1 might
    do whatever it pleases with the memory region, due to lack of
    synchronization).

  In a different scenario, though, t.Kill in goroutine 1 might be
  performed with a non-nil error, and goroutine 2 would see either
  nil or the error, and t.reason might be in an intermediate unknown
  state, so the race is in fact real. Solving this specific race via
  a mutex, as done in this change, still won't mean that this
  behavior is sane, though: what t.Wait() in goroutine 2 observes
  will be either the prior error value, or the new error value,
  based purely on timing of the two Wait calls.

  The change introduced protects t.reason with the tomb's mutex,
  which silents the race detector, and makes the logic entirely
  memory-model friendly.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b