Issue 12352044: state: hopefully make tests more reliable

Can't Edit
Can't Publish+Mail
Start Review

Created:
11 years, 11 months ago by rog

Modified:
11 years, 11 months ago

Reviewers:
mp+178357, axw1, dimitern, gz

Visibility:
Public.

Description

state: hopefully make tests more reliable We had a sporadic test failure in MachineSuite.TestWatchMachine: the last AssertOneChange call received an extra event. My analysis of the failure goes something like this. There are two possible scenarios: scenario 1: call Machine.Watch remove machine StartSync state/watcher polls, finds that object has disappeared EntityWatcher asks to watch object EntityWatcher sends initial value on channel state/watcher produces no value, because there's no object. test passes scenario 2: call Machine.Watch remove machine StartSync EntityWatcher asks to watch object EntityWatcher sends initial value on channel state/watcher produces a value for the object state/watcher polls, finds that object has disappeared EntityWatcher receives watcher change and sends to Changes test reads unexpected value test fails The problem is that when StartSync returns we have no idea whether the state/watcher code has started to poll for new changes or not. By using Sync instead of StartSync, we ensure that at least the state/watcher code is in a known state before continuing to assert whether the watchers built on top of it it will or won't send events. The only reason for StartSync's existence, AFAIR, is to avoid a possible deadlock situation where the state/watcher code is trying to send on a watcher channel but the test code is not ready to receive on that channel. However, in all the places that use NotifyWatcherC and StringsWatcherC, the state/watcher code is actually sending to a separate goroutine which is always ready to receive any change. So it's safe to use Sync rather than StartSync. I do not understand the "hence cannot verify real-world coalescence behaviour" comment in the NewLax* functions. If there is coalescence to be verified, using StartSync rather than Sync is a poor way of verifying it, as the only allowance that StartSync makes is a few nanoseconds and one or two context switches - that's not much opportunity for coalescence to happen, and all tests pass with this CL, which leads me to suspect smoke and mirrors. If anyone sees a MachineSuite.TestWatchMachine sporadic failure with this CL applied, I'd very much like to know, as it means my analysis is flawed :-). FWIW, another possibility would be to make StartSync return when it's certain that the watcher will start a poll before doing anything else. This would involve a certain amount of restructuring of the state/watcher code. https://code.launchpad.net/~rogpeppe/juju-core/359-no-lax/+merge/178357 (do not edit description out of merge proposal)

Patch Set 1 #

Patch Set 2 : state: hopefully make tests more reliable #

Created: 11 years, 11 months ago

Download [raw] [tar.bz2]

	Unified diffs	Side-by-side diffs	Stats (+21 lines, -53 lines)			Patch
A	[revision details]	View	1 chunk	+2 lines, -0 lines	0 comments	Download
M	state/machine_test.go	View	4 chunks	+4 lines, -4 lines	0 comments	Download
M	state/service_test.go	View	2 chunks	+2 lines, -2 lines	0 comments	Download
M	state/state_test.go	View	3 chunks	+3 lines, -3 lines	0 comments	Download
M	state/testing/watcher.go	View	6 chunks	+8 lines, -42 lines	0 comments	Download
M	state/unit_test.go	View	2 chunks	+2 lines, -2 lines	0 comments	Download

Messages

Total messages: 4

Expand All Messages | Collapse All Messages

LGTM. If I understand this proposal correctly, it should also fix the bug I just ...

11 years, 11 months ago (2013-08-07 18:04:36 UTC) #3

axw1

11 years, 11 months ago (2013-08-08 01:06:37 UTC) #4

On 2013/08/02 18:37:40, rog wrote:
> Please take a look.

Unnecessary now, but I had a closer look this morning and LGTM.

Expand All Messages | Collapse All Messages