OLD | NEW |
---|---|
(Empty) | |
1 <!--{ | |
2 "Title": "Race Detector", | |
3 "Template": true | |
minux1
2012/12/14 09:56:09
does this document really need templates?
If you
| |
4 }--> | |
5 | |
6 <h2>Introduction</h2> | |
7 | |
8 <p> | |
9 Data races are one of the most common and hardest-to-debug types of bugs in concurrent systems. A data race occurs when two goroutines access the same variable w/o proper synchronization and at least one of the accesses is a write. See <a href="/ref/mem">The Go Memory Model</a> for details. | |
fss
2012/12/14 12:16:05
Isn't "without" better than "w/o"?
dvyukov
2012/12/14 12:29:42
Done.
| |
10 </p> | |
11 | |
12 <p> | |
13 Here is an example of a data race on a map variable that can lead to crashes and memory corruption: | |
14 </p> | |
15 | |
16 <pre> | |
17 func main() { | |
18 	c := make(chan bool) | |
19 	m := make(map[string]string) | |
20 	go func() { | |
21 		m["1"] = "a" // First conflicting access. | |
22 		c <- true | |
23 	}() | |
24 	m["2"] = "b" // Second conflicting access. | |
25 	<-c | |
26 	for k, v := range m { | |
27 		fmt.Println(k, v) | |
28 	} | |
29 } | |
30 </pre> | |
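One possible way to remove the race in the example above is sketched here, assuming the two writes do not need to run concurrently: the main goroutine waits on the channel before it touches the map. The helper name `fillMap` is hypothetical and not part of the document under review.

```go
package main

import "fmt"

// fillMap is a hypothetical helper: it writes to m from a goroutine and
// from the caller, but orders the two writes with a channel receive.
func fillMap() map[string]string {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // Write in the goroutine.
		c <- true
	}()
	<-c          // Wait until the goroutine's write has completed.
	m["2"] = "b" // This write is now ordered after the first one.
	return m
}

func main() {
	for k, v := range fillMap() {
		fmt.Println(k, v)
	}
}
```

The send on `c` happens before the corresponding receive completes, so the two map writes are no longer concurrent.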
31 | |
32 <h2>Usage</h2> | |
33 | |
34 <p> | |
35 Fortunately, Go includes built-in data race detector. The usage is very simple -- you just need to add -race flag to go command: | |
fss
2012/12/14 12:16:05
s/includes built-in/includes a built-in/
s/flag t
dvyukov
2012/12/14 12:29:42
Done.
| |
36 </p> | |
37 | |
38 <pre> | |
39 $ go test -race mypkg // to test the package | |
40 $ go run -race mysrc.go // to run the source file | |
41 $ go build -race mycmd // to build the command | |
42 $ go install -race mypkg // to install the package | |
43 </pre> | |
44 | |
45 <h2>Report Format</h2> | |
46 | |
47 <p> | |
48 When the race detector finds a data race in the program, it prints an informative report. The report contains stack traces for the conflicting accesses, as well as stacks where the involved goroutines were created. An example is shown below: | |
49 </p> | |
50 | |
51 <pre> | |
52 WARNING: DATA RACE | |
53 Read by goroutine 185: | |
54 net.(*pollServer).AddFD() | |
55 src/pkg/net/fd_unix.go:89 +0x398 | |
56 net.(*pollServer).WaitWrite() | |
57 src/pkg/net/fd_unix.go:247 +0x45 | |
58 net.(*netFD).Write() | |
59 src/pkg/net/fd_unix.go:540 +0x4d4 | |
60 net.(*conn).Write() | |
61 src/pkg/net/net.go:129 +0x101 | |
62 net.func·060() | |
63 src/pkg/net/timeout_test.go:603 +0xaf | |
64 | |
65 Previous write by goroutine 184: | |
66 net.setWriteDeadline() | |
67 src/pkg/net/sockopt_posix.go:135 +0xdf | |
68 net.setDeadline() | |
69 src/pkg/net/sockopt_posix.go:144 +0x9c | |
70 net.(*conn).SetDeadline() | |
71 src/pkg/net/net.go:161 +0xe3 | |
72 net.func·061() | |
73 src/pkg/net/timeout_test.go:616 +0x3ed | |
74 | |
75 Goroutine 185 (running) created at: | |
76 net.func·061() | |
77 src/pkg/net/timeout_test.go:609 +0x288 | |
78 | |
79 Goroutine 184 (running) created at: | |
80 net.TestProlongTimeout() | |
81 src/pkg/net/timeout_test.go:618 +0x298 | |
82 testing.tRunner() | |
83 src/pkg/testing/testing.go:301 +0xe8 | |
84 </pre> | |
85 | |
86 <h2>Options</h2> | |
87 | |
88 <p> | |
89 You can pass options to the race detector through the <code>GORACE</code> environment variable. The format is: | |
90 </p> | |
91 | |
92 <pre> | |
93 GORACE="option1=val1 option2=val2" | |
94 </pre> | |
95 | |
96 <p> | |
97 The options are: | |
98 </p> | |
99 <li> log_path: Tells the race detector to write reports to the 'log_path.pid' file. The special values are 'stdout' and 'stderr'. The default is 'stderr'.</li> | |
100 <li> exitcode: Overrides the exit status of the process if something was reported. The default value is 66.</li> | |
101 <li> strip_path_prefix: Strips the given prefix from file paths in reports to make them more concise.</li> | |
102 <li> history_size: Per-goroutine history size; controls how many previous memory accesses are remembered per goroutine. Possible values are [0..7]. history_size=0 amounts to 32K memory accesses. Each next value doubles the number of memory accesses, up to history_size=7, which amounts to 4M memory accesses. The default value is 1 (64K memory accesses). Try increasing this value when you see "failed to restore the stack" in reports. However, it can significantly increase memory consumption.</li> | |
103 | |
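The doubling rule for history_size can be checked with a little arithmetic. The sketch below only restates the numbers given in the option description (32K at 0, doubling up to 4M at 7); the function name `historyAccesses` is hypothetical and is not part of the race detector.

```go
package main

import "fmt"

// historyAccesses returns the number of memory accesses remembered per
// goroutine for a given history_size, following the doubling rule from the
// option description (0 -> 32K, 7 -> 4M). It returns 0 for values outside
// the documented range [0..7].
func historyAccesses(historySize int) int {
	if historySize < 0 || historySize > 7 {
		return 0
	}
	return (32 * 1024) << uint(historySize)
}

func main() {
	for n := 0; n <= 7; n++ {
		fmt.Printf("history_size=%d: %dK accesses\n", n, historyAccesses(n)/1024)
	}
}
```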
104 <p> | |
105 Example: | |
106 </p> | |
107 | |
108 <pre> | |
109 $ GORACE="log_path=/tmp/race/report strip_path_prefix=/my/go/sources/" go test -race | |
110 </pre> | |
111 | |
112 <h2>How To Use</h2> | |
113 | |
114 <p> | |
115 You may start by simply running your tests under the race detector (<code>go test -race</code>). However, tests sometimes have limited coverage, especially with respect to concurrency. The race detector finds only races that actually happen during execution; it can't find races in code paths that were not executed. So it may be beneficial to run the whole program built with -race under a realistic workload, which frequently uncovers many more bugs than tests do. | |
116 </p> | |
117 | |
118 <h2>Typical Data Races</h2> | |
119 | |
120 <p> | |
121 Here are some example of typical data races. All of them can be automatically detected with the race detector. | |
fss
2012/12/14 12:16:05
s/example/examples/
dvyukov
2012/12/14 12:29:42
Done.
| |
122 </p> | |
123 | |
124 <h3>Race on loop counter</h3> | |
125 | |
126 <pre> | |
127 func main() { | |
128 	var wg sync.WaitGroup | |
129 	wg.Add(5) | |
130 	for i := 0; i < 5; i++ { | |
131 		go func() { | |
132 			fmt.Println(i) // Not the 'i' you are looking for. | |
133 			wg.Done() | |
134 		}() | |
135 	} | |
136 	wg.Wait() | |
137 } | |
138 </pre> | |
139 | |
140 <p> | |
141 Closures capture variables by reference rather than by value, so the reads of the <code>i</code> variable in the goroutines race with the <code>i</code> increment in the loop statement. Such a program typically outputs 55555 instead of the expected 01234. The program can be fixed by explicitly making a copy of the loop counter: | |
142 </p> | |
143 | |
144 <pre> | |
145 func main() { | |
146 	var wg sync.WaitGroup | |
147 	wg.Add(5) | |
148 	for i := 0; i < 5; i++ { | |
149 		go func(j int) { | |
150 			fmt.Println(j) // Good. Read local copy of the loop counter. | |
151 			wg.Done() | |
152 		}(i) | |
153 	} | |
154 	wg.Wait() | |
155 } | |
156 </pre> | |
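Another common way to make the copy, shown here only as a sketch, is to shadow the loop variable inside the loop body so that each closure captures its own variable. The helper `collect` and its result slice are hypothetical; they exist only to make the behavior observable.

```go
package main

import (
	"fmt"
	"sync"
)

// collect is a hypothetical helper that starts n goroutines and returns
// the loop values they observed, in completion order.
func collect(n int) []int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []int
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		i := i // Per-iteration copy; the closure below captures this one.
		go func() {
			defer wg.Done()
			mu.Lock() // out is shared, so appends are guarded by mu.
			out = append(out, i)
			mu.Unlock()
		}()
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(collect(5)) // Some permutation of 0..4.
}
```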
157 | |
158 <h3>Accidentally shared variable</h3> | |
159 | |
160 <pre> | |
161 // ParallelWrite writes data to file1 and file2, returns the errors. | |
162 func ParallelWrite(data []byte) chan error { | |
163 	res := make(chan error, 2) | |
164 	f1, err := os.Create("file1") | |
165 	if err != nil { | |
166 		res <- err | |
167 	} else { | |
168 		go func() { | |
169 			// This err is shared with the main goroutine, | |
170 			// so the write races with the write below. | |
171 			_, err = f1.Write(data) | |
172 			res <- err | |
173 			f1.Close() | |
174 		}() | |
175 	} | |
176 	f2, err := os.Create("file2") // The second conflicting write to err. | |
177 	if err != nil { | |
178 		res <- err | |
179 	} else { | |
180 		go func() { | |
181 			_, err = f2.Write(data) | |
182 			res <- err | |
183 			f2.Close() | |
184 		}() | |
185 	} | |
186 	return res | |
187 } | |
188 </pre> | |
189 | |
190 <p> | |
191 The fix is simple: introduce new variables in the goroutines (note the <code>:=</code>): | |
192 </p> | |
193 | |
194 <pre> | |
195 _, err := f1.Write(data) | |
196 ... | |
197 _, err := f2.Write(data) | |
198 </pre> | |
199 | |
200 <h3>Unprotected global variable</h3> | |
201 | |
202 <p> | |
203 If the following code is called from several goroutines, it leads to bad races on the <code>services</code> map. | |
204 </p> | |
205 | |
206 <pre> | |
207 var services map[string]net.Addr | |
208 | |
209 func RegisterService(name string, addr net.Addr) { | |
210 	services[name] = addr | |
211 } | |
212 | |
213 func GetService(name string) net.Addr { | |
214 	return services[name] | |
215 } | |
216 </pre> | |
217 | |
218 <p> | |
219 It can be fixed by protecting the accesses with a mutex: | |
220 </p> | |
221 | |
222 <pre> | |
223 var services map[string]net.Addr | |
224 var mu sync.Mutex | |
225 | |
226 func RegisterService(name string, addr net.Addr) { | |
227 	mu.Lock() | |
228 	defer mu.Unlock() | |
229 	services[name] = addr | |
230 } | |
231 | |
232 func GetService(name string) net.Addr { | |
233 	mu.Lock() | |
234 	defer mu.Unlock() | |
235 	return services[name] | |
236 } | |
237 </pre> | |
238 | |
239 <h3>Primitive unprotected variable</h3> | |
240 | |
241 <p> | |
242 Data races can happen on variables of primitive types as well (<code>bool</code>, <code>int</code>, <code>int64</code>), as in the following example: | |
243 </p> | |
244 | |
245 <pre> | |
246 type Watchdog struct { last int64 } | |
247 | |
248 func (w *Watchdog) KeepAlive() { | |
249 	w.last = time.Now().UnixNano() // First conflicting access. | |
250 } | |
251 | |
252 func (w *Watchdog) Start() { | |
253 	go func() { | |
254 		for { | |
255 			time.Sleep(time.Second) | |
256 			// Second conflicting access. | |
257 			if w.last < time.Now().Add(-10*time.Second).UnixNano() { | |
258 				fmt.Println("No keepalives for 10 seconds. Dying.") | |
259 				os.Exit(1) | |
260 			} | |
261 		} | |
262 	}() | |
263 } | |
264 </pre> | |
265 | |
266 <p> | |
267 Even such "innocent" data races can lead to hard-to-debug problems caused by (1) non-atomicity of the memory accesses, (2) interference with compiler optimizations and (3) processor memory access reordering issues. | |
268 </p> | |
269 | |
270 <p> | |
271 To fix such a data race, one can use (aside from <code>chan</code> and <code>sync.Mutex</code>) the <a href="/pkg/sync/atomic"><code>sync/atomic</code></a> package, which provides atomic operations on primitive types. The <a href="/pkg/sync/atomic"><code>sync/atomic</code></a> functions solve all of the above issues. | |
272 </p> | |
273 | |
274 <pre> | |
275 type Watchdog struct { last int64 } | |
276 | |
277 func (w *Watchdog) KeepAlive() { | |
278 	atomic.StoreInt64(&w.last, time.Now().UnixNano()) | |
279 } | |
280 | |
281 func (w *Watchdog) Start() { | |
282 	go func() { | |
283 		for { | |
284 			time.Sleep(time.Second) | |
285 			if atomic.LoadInt64(&w.last) < time.Now().Add(-10*time.Second).UnixNano() { | |
286 				fmt.Println("No keepalives for 10 seconds. Dying.") | |
287 				os.Exit(1) | |
288 			} | |
289 		} | |
290 	}() | |
291 } | |
292 </pre> | |
293 | |
294 <h2>Supported Platforms</h2> | |
295 | |
296 <p> | |
297 Supported platforms are darwin/amd64, linux/amd64 and windows/amd64. | |
298 </p> | |
299 | |
300 <h2>Runtime Overheads</h2> | |
301 | |
302 <p> | |
303 The data race detector significantly increases both memory consumption and execution time. The concrete numbers depend heavily on the particular program, but some reference numbers are: memory consumption ~5-10x, execution time ~2-20x. | |
304 </p> | |