Descriptionstate: hopefully make tests more reliable
We had a sporadic test failure in MachineSuite.TestWatchMachine:
the last AssertOneChange call received an extra event.
My analysis of the failure goes something like this.
There are two possible scenarios:
scenario 1:
call Machine.Watch
remove machine
StartSync
state/watcher polls, finds that object has disappeared
EntityWatcher asks to watch object
EntityWatcher sends initial value on channel
state/watcher produces no value, because there's no object.
test passes
scenario 2:
call Machine.Watch
remove machine
StartSync
EntityWatcher asks to watch object
EntityWatcher sends initial value on channel
state/watcher produces a value for the object
state/watcher polls, finds that object has disappeared
EntityWatcher receives watcher change and sends to Changes
test reads unexpected value
test fails
The problem is that when StartSync returns we have no idea
whether the state/watcher code has started to poll for new
changes or not.
By using Sync instead of StartSync, we ensure that at least
the state/watcher code is in a known state before continuing
to assert whether the watchers built on top of it it will or won't
send events.
The only reason for StartSync's existence, AFAIR, is to avoid
a possible deadlock situation where the state/watcher code
is trying to send on a watcher channel but the test code is
not ready to receive on that channel.
However, in all the places that use NotifyWatcherC and StringsWatcherC,
the state/watcher code is actually sending to a separate goroutine
which is always ready to receive any change. So it's safe
to use Sync rather than StartSync.
I do not understand the "hence cannot verify real-world coalescence behaviour"
comment in the NewLax* functions. If there is coalescence to be verified,
using StartSync rather than Sync is a poor way of verifying it, as
the only allowance that StartSync makes is a few nanoseconds and one or two context
switches - that's not much opportunity for coalescence to happen,
and all tests pass with this CL, which leads me to suspect smoke
and mirrors.
If anyone sees a MachineSuite.TestWatchMachine sporadic failure
with this CL applied, I'd very much like to know, as it means my
analysis is flawed :-).
FWIW, another possibility would be to make StartSync return when
it's certain that the watcher will start a poll before doing anything else.
This would involve a certain amount of restructuring of the
state/watcher code.
https://code.launchpad.net/~rogpeppe/juju-core/359-no-lax/+merge/178357
(do not edit description out of merge proposal)
Patch Set 1 #Patch Set 2 : state: hopefully make tests more reliable #
MessagesTotal messages: 4
|