Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(1026)

Issue 10420045: state: aggressive removal of pending units

Can't Edit
Can't Publish+Mail
Start Review
Created:
10 years, 10 months ago by fwereade
Modified:
10 years, 10 months ago
Reviewers:
mp+170723, wallyworld, rog
Visibility:
Public.

Description

state: aggressive removal of pending units In response to lp:1190715, we make the following changes: * service destruction now adds a cleanup doc that churns through all a service's units destroying or removing them where possible. * unit destruction short-circuiting is now attempted when the unit agent has not yet set a status. ...and make the following opportunistic changes for sanity's and clarity's sakes: * clarify and make consistent the similar transaction-building loops in service.go and unit.go (in particular, Unit.Remove was noticeably simplified). * resolve races around unit destruction in which charm changes and machine assignment changes are not properly taken into account. ...supported by testing changes as follows: * added SetRetryHooks for verifying a commonly-interesting-to-test scenario (set up txn; break txn; verify effect of retried txn) * SetBeforeHook became SetBeforeHooks, because that happened while I was figuring out SetRetryHooks, and I decided it was a nice change. * preventUnitDestroyRemove is much simpler and no longer needs the state argument; this is annoyingly noisy. * assertUnitLife and assertUnitRemoved were taken off UnitSuite and used elsewhere. The macroscopic effect of this is currently limited. Units that are manually removed will be removed more quickly more often, which is nice; but the big problem in lp:1190715 is related to service destruction, and that's not directly helped by this branch. When TheMue merges the cleanup worker integration branch, though, this bug will be fixed pretty nicely for all environments running this code. If we were also to explicitly call Cleanup in the CLI after destroying a service, we could make clients with this code work correctly against environments back to 1.10 as well. Is this important enough to justify the ickiness? Or will we just recommend that everybody upgrade everything as soon as possible, rendering the issue somewhat moot? Discuss. https://code.launchpad.net/~fwereade/juju-core/state-quick-remove-units/+merge/170723 (do not edit description out of merge proposal)

Patch Set 1 #

Total comments: 14

Patch Set 2 : state: aggressive removal of pending units #

Unified diffs Side-by-side diffs Delta from patch set Stats (+419 lines, -156 lines) Patch
A [revision details] View 1 1 chunk +2 lines, -0 lines 0 comments Download
M cmd/jujud/machine_test.go View 1 2 chunks +17 lines, -1 line 0 comments Download
M state/conn_test.go View 1 chunk +1 line, -1 line 0 comments Download
M state/export_test.go View 1 3 chunks +34 lines, -5 lines 0 comments Download
M state/life_test.go View 1 chunk +1 line, -1 line 0 comments Download
M state/machine_test.go View 1 4 chunks +9 lines, -7 lines 0 comments Download
M state/relation.go View 2 chunks +2 lines, -11 lines 0 comments Download
M state/relationunit.go View 1 chunk +3 lines, -0 lines 0 comments Download
M state/relationunit_test.go View 1 chunk +1 line, -0 lines 0 comments Download
M state/service.go View 6 chunks +22 lines, -14 lines 0 comments Download
M state/service_test.go View 1 4 chunks +50 lines, -4 lines 0 comments Download
M state/state.go View 1 3 chunks +56 lines, -15 lines 0 comments Download
M state/state_test.go View 2 chunks +8 lines, -14 lines 0 comments Download
M state/tools_test.go View 1 chunk +1 line, -1 line 0 comments Download
M state/unit.go View 3 chunks +60 lines, -50 lines 0 comments Download
M state/unit_test.go View 1 12 chunks +124 lines, -31 lines 0 comments Download
M worker/deployer/deployer_test.go View 1 3 chunks +8 lines, -0 lines 0 comments Download
M worker/resumer/resumer_test.go View 1 chunk +1 line, -1 line 0 comments Download
M worker/uniter/filter_test.go View 1 3 chunks +19 lines, -0 lines 0 comments Download

Messages

Total messages: 4
fwereade
Please take a look.
10 years, 10 months ago (2013-06-21 00:57:31 UTC) #1
wallyworld
I like the new txn hooks. Very nice way to test concurrent modification issues. I ...
10 years, 10 months ago (2013-06-21 02:37:40 UTC) #2
rog
LGTM although I'm sure I don't get all the subtlety. A few minor suggestions below. ...
10 years, 10 months ago (2013-06-21 12:03:28 UTC) #3
fwereade
10 years, 10 months ago (2013-06-21 13:36:17 UTC) #4
Please take a look.

https://codereview.appspot.com/10420045/diff/1/state/service_test.go
File state/service_test.go (right):

https://codereview.appspot.com/10420045/diff/1/state/service_test.go#newcode1058
state/service_test.go:1058: for i := 0; i < 5; i++ {
On 2013/06/21 12:03:28, rog wrote:
> for i := range units {

Nice, thanks.

https://codereview.appspot.com/10420045/diff/1/state/state.go
File state/state.go (right):

https://codereview.appspot.com/10420045/diff/1/state/state.go#newcode1164
state/state.go:1164: Insert: doc,
On 2013/06/21 12:03:28, rog wrote:
> On 2013/06/21 02:37:40, wallyworld wrote:
> > Not really required I guess, but should we add an Assert: txn.DocMissing ?
> 
> -1. no need to have more round-trips than necessary, and NewObjectId
> will ensure that the object does not exist.

Agreed. Added doc comment.

https://codereview.appspot.com/10420045/diff/1/state/unit.go
File state/unit.go (right):

https://codereview.appspot.com/10420045/diff/1/state/unit.go#newcode257
state/unit.go:257: // Where possible, we'd like to be able to short-circuit unit
destruction
On 2013/06/21 12:03:28, rog wrote:
> this is a very useful comment, thanks.
> i can't quite see how it directly relates
> to the code below though. perhaps it
> might be better placed somewhere else?

Hmm. I *think* it's explaining and justifying precisely the code below.

I tried putting the explanation inline with the code, but that put the concepts
out-of-order; doing it this way seemed to make the whole business much clearer.

https://codereview.appspot.com/10420045/diff/1/state/unit_test.go
File state/unit_test.go (right):

https://codereview.appspot.com/10420045/diff/1/state/unit_test.go#newcode377
state/unit_test.go:377: }, func() {
On 2013/06/21 02:37:40, wallyworld wrote:
> Do we need a check here to test the status/state of the unit prior to
destroying
> the machine? After all, we need to ensure that the unit.Destroy() didn't mess
> unexpectedly with the unit.

I'm depending on the machine.Destroy() failing due to the assigned unit. But
that's not guaranteed for all time, so, yeah, probably best to swap the order of
the checks.

https://codereview.appspot.com/10420045/diff/1/state/unit_test.go#newcode383
state/unit_test.go:383: )()
On 2013/06/21 12:03:28, rog wrote:
> i just had to think quite hard about this code.
> 
> i wonder if it might be nicer if these functions
> might return a more specific type than just a
> func(), making its intended function and use
> more obvious.
> 
> type TransactionChecker func()
> func (c TransactionChecker) Check() {
>     return c()
> }
> 
> Then you'd change SetTransactionHooks to:
> 
> func SetTransactionHooks(c *C, st *State, transactionHooks ...TransactionHook)
> TransactionChecker
> 
> (the body wouldn't need to change)
> 
> and the above code would look like:
> 
> defer state.SetRetryHooks(....).Check()
> 
> This way it's easy to see in the tests which things
> are checking transactions and which are simply
> reverting some change (we could use a similar "Reverter"
> idiom for that kind of thing).

Very nice, tyvm.

https://codereview.appspot.com/10420045/diff/1/state/unit_test.go#newcode456
state/unit_test.go:456: // A unit assigned to a provisioned machine is still
removed directly.
On 2013/06/21 02:37:40, wallyworld wrote:
> I think we need to enhance the comment to say "A unit which is still Pending
and
> assigned to a provisioned machine is removed directly."
> 
> Or something like that. 

Done.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b