|
|
Descriptionstate: lifecycle implementation started
The branch contains the type definition and some utility
functions for lifecycles in general. A first integration
into relation is done and the tests changed (very naive by
just setting the lifecycle state before removing). So the
tests of state are working while any outer relying code
may fail. This has to be done in a follow-up CL.
https://code.launchpad.net/~themue/juju-core/go-state-lifecycle-relation/+merge/117382
(do not edit description out of merge proposal)
Patch Set 1 #
Total comments: 46
Patch Set 2 : state: lifecycle implementation started #
Total comments: 13
Patch Set 3 : state: lifecycle implementation started #
Total comments: 13
Patch Set 4 : state: lifecycle implementation started #Patch Set 5 : state: lifecycle implementation started #
MessagesTotal messages: 23
Please take a look.
Sign in to reply to this message.
looks plausible. a few minor comments below. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode18 state/lifecycle.go:18: func (l Life) checkNext(next Life) error { this seems a little more involved than it has to be. perhaps something like this might be better? func (l Life) checkNext(next Life) error { switch { case l == next: return nil case l == Alive && next == Dying: return nil case l == Dying && next == Dead: return nil } return fmt.Errorf("illegal lifecycle state change from %q to %q", l, next) } https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode38 state/lifecycle.go:38: func createLifeNode(zk *zookeeper.Conn, basepath string) (err error) { does the life state need to be stored in a separate node? most entities already have an associated config node, and the life state could be stored there. maybe there's a good reason for the separation though. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode40 state/lifecycle.go:40: defer errorContextf(&err, "cannot create lifecycle state node %q", lifepath) i wonder if the error context is necessary here, given that the zookeeper error mentions the path and the create operation. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode48 state/lifecycle.go:48: defer errorContextf(&err, "cannot write lifecycle state node %q", lifepath) ditto https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode56 state/lifecycle.go:56: defer errorContextf(&err, "cannot read lifecycle state node %q", lifepath) ditto https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go File state/lifecycle_test.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go#newcode21 state/lifecycle_test.go:21: c.Assert(Alive.checkNext(Dying), IsNil) this seems like a good place for a table-driven test. https://codereview.appspot.com/6455065/diff/1/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode101 state/relation.go:101: func (r *Relation) readLife() (err error) { why is this a different method from Life?
Sign in to reply to this message.
No changes to watchers in this CL? https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode18 state/lifecycle.go:18: func (l Life) checkNext(next Life) error { s/checkNext/isNextValid/ perhaps? https://codereview.appspot.com/6455065/diff/1/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode101 state/relation.go:101: func (r *Relation) readLife() (err error) { On 2012/07/31 07:48:58, rog wrote: > why is this a different method from Life? Because of potential caching in the future, perhaps.
Sign in to reply to this message.
Good start, couple of issues https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode18 state/lifecycle.go:18: func (l Life) checkNext(next Life) error { validChange? https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode38 state/lifecycle.go:38: func createLifeNode(zk *zookeeper.Conn, basepath string) (err error) { On 2012/07/31 07:48:58, rog wrote: > does the life state need to be stored in a separate node? most entities already > have an associated config node, and the life state could be stored there. maybe > there's a good reason for the separation though. ISTM that we'll be watching a *lot* of these, and handling events for every unrelated change will be somewhat tedious and unnecessary so, or reflection, I'm +1 on separate nodes. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode40 state/lifecycle.go:40: defer errorContextf(&err, "cannot create lifecycle state node %q", lifepath) On 2012/07/31 07:48:58, rog wrote: > i wonder if the error context is necessary here, given that the zookeeper error > mentions the path and the create operation. I reckon the extra information has some value, and is more readable besides. https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go File state/lifecycle_test.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go#newcode21 state/lifecycle_test.go:21: c.Assert(Alive.checkNext(Dying), IsNil) On 2012/07/31 07:48:58, rog wrote: > this seems like a good place for a table-driven test. +1 https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go#newcode50 state/lifecycle_test.go:50: defer testing.ZkRemoveTree(s.ZkConn, basepath) If ZkConnSuite doesn't nuke ZK at the end of each test, surely it should? https://codereview.appspot.com/6455065/diff/1/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode79 state/relation.go:79: life Life I'm not sure this pulls its weight. https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode119 state/relation.go:119: return writeLifeNode(r.st.zk, r.zkPath(), r.life) We probably shouldn't alter r.life until we know the write succeeded. https://codereview.appspot.com/6455065/diff/1/state/service.go File state/service.go (right): https://codereview.appspot.com/6455065/diff/1/state/service.go#newcode211 state/service.go:211: r := &Relation{s.st, key, Alive, make([]RelationEndpoint, len(tr.Endpoints))} Shouldn't you get the actual life value rather than just assume it's alive? TBH this life field seems to be causing more problems than it solves... https://codereview.appspot.com/6455065/diff/1/state/service_test.go File state/service_test.go (left): https://codereview.appspot.com/6455065/diff/1/state/service_test.go#oldcode443 state/service_test.go:443: err = s.State.RemoveRelation(rel) I haven't seen any new tests for RemoveRelation -- surely it should panic if the relation is not Dead? https://codereview.appspot.com/6455065/diff/1/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode422 state/state.go:422: t, err := readTopology(s.zk) Why isn't this all happening inside retryTopologyChange? https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode427 state/state.go:427: return fmt.Errorf("not found") I suspect this shouldn't be an error any more... we're just cleaning up, we might not be the only concurrent caller, and if it's already gone then we're happy. https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode434 state/state.go:434: return fmt.Errorf("lifecycle is not dead") I think this is a panic. https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode443 state/state.go:443: // The effects the reusage of keys. Drop the commented code I think
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/1/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode79 state/relation.go:79: life Life On 2012/07/31 09:12:33, fwereade wrote: > I'm not sure this pulls its weight. Not sure what weight you are referring to, but this is the way it's done in mstate after a few previous heavier iterations. https://codereview.appspot.com/6455065/diff/1/state/service.go File state/service.go (right): https://codereview.appspot.com/6455065/diff/1/state/service.go#newcode211 state/service.go:211: r := &Relation{s.st, key, Alive, make([]RelationEndpoint, len(tr.Endpoints))} > Shouldn't you get the actual life value rather than just assume it's > alive? TBH this life field seems to be causing more problems than it > solves... I think this is just part of the initial lifecycle bootstrap, after we modify state client this would change in a subsequent CL. https://codereview.appspot.com/6455065/diff/1/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode427 state/state.go:427: return fmt.Errorf("not found") > I suspect this shouldn't be an error any more... we're just cleaning up, > we might not be the only concurrent caller, and if it's already gone > then we're happy. I thought there's one one agent responsible for calling Remove*, the single one receiving the Dead event. I think this is an error to be called more than once. The corrective agent might call this independently, but the CE should deal with it failing anyway.
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode9 state/lifecycle.go:9: type Life string Hmm, this simplifies returning errors, but the integer approach in mstate makes state change checking trivial because all you need to do is if (newLife < life) { ...
Sign in to reply to this message.
On 2012/07/31 09:36:47, aram wrote: > https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go > File state/lifecycle.go (right): > > https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode9 > state/lifecycle.go:9: type Life string > Hmm, this simplifies returning errors, but the integer approach in mstate makes > state change checking trivial because all you need to do is > > if (newLife < life) { ... Does a string cost us so much as to not be worth the clarity?
Sign in to reply to this message.
On 2012/07/31 09:29:17, aram wrote: > https://codereview.appspot.com/6455065/diff/1/state/relation.go > File state/relation.go (right): > > https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode79 > state/relation.go:79: life Life > On 2012/07/31 09:12:33, fwereade wrote: > > I'm not sure this pulls its weight. > > Not sure what weight you are referring to, but this is the way it's done in > mstate after a few previous heavier iterations. > > https://codereview.appspot.com/6455065/diff/1/state/service.go > File state/service.go (right): > > https://codereview.appspot.com/6455065/diff/1/state/service.go#newcode211 > state/service.go:211: r := &Relation{s.st, key, Alive, make([]RelationEndpoint, > len(tr.Endpoints))} > > Shouldn't you get the actual life value rather than just assume it's > > alive? TBH this life field seems to be causing more problems than it > > solves... > > I think this is just part of the initial lifecycle bootstrap, after we > modify state client this would change in a subsequent CL. If we just dropped the field and called Life everywhere we needed it, then we could have code that wasn't actively inconsistent. We could add the caching as required in a subsequent CL. > > https://codereview.appspot.com/6455065/diff/1/state/state.go > File state/state.go (right): > > https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode427 > state/state.go:427: return fmt.Errorf("not found") > > I suspect this shouldn't be an error any more... we're just cleaning up, > > we might not be the only concurrent caller, and if it's already gone > > then we're happy. > > I thought there's one one agent responsible for calling Remove*, the > single one receiving the Dead event. I think this is an error to be > called more than once. The corrective agent might call this independently, > but the CE should deal with it failing anyway. My point is just: there are concurrent callers of this. The end goal is that the relation no longer exist. At least one of the callers will see the relation already gone from the topology; if we treat this as an error, but the other client failed before it could delete the ZK nodes, how will we ever be able to collect the relation nodes?
Sign in to reply to this message.
> Does a string cost us so much as to not be worth the clarity? Actually it was about improving clarity. Here's rog's suggestion: http://play.golang.org/p/ZLVcLY96Oy
Sign in to reply to this message.
First round: https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode18 state/lifecycle.go:18: func (l Life) checkNext(next Life) error { checkNext seems like a bad approach altogether. We need to know more than whether a change is allowed to make a decision of what to do (e.g. Alive => Dying when Dead is fine, but does nothing). I suggest inlining logic into SetLife. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode38 state/lifecycle.go:38: func createLifeNode(zk *zookeeper.Conn, basepath string) (err error) { On 2012/07/31 09:12:33, fwereade wrote: > On 2012/07/31 07:48:58, rog wrote: > > does the life state need to be stored in a separate node? most entities > already > > have an associated config node, and the life state could be stored there. > maybe > > there's a good reason for the separation though. > > ISTM that we'll be watching a *lot* of these, and handling events for every > unrelated change will be somewhat tedious and unnecessary so, or reflection, I'm > +1 on separate nodes. I agree with rog here. We have to watch a lot of things in general. Maybe I just haven't seen the issue clearly yet, but it sounds to me like separating the nodes will actually create *more* things for watching, rather than less. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode56 state/lifecycle.go:56: defer errorContextf(&err, "cannot read lifecycle state node %q", lifepath) This is all changing given we don't need independent nodes, but FWIW let's please not sprawl errorContextf more than strictly necessary. fmt.Errorf works well. https://codereview.appspot.com/6455065/diff/1/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode79 state/relation.go:79: life Life On 2012/07/31 09:29:17, aram wrote: > On 2012/07/31 09:12:33, fwereade wrote: > > I'm not sure this pulls its weight. > > Not sure what weight you are referring to, but this is the way it's done in > mstate after a few previous heavier iterations. I believe mstate is different in that the value is mapped straight from the database. The issue is related to the caching topic we debated at the end of last week. https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode79 state/relation.go:79: life Life Frank, have you seen the work by rog on agent tooling? It sounds like we could have a single implementation of resourceLife that could be used everywhere. https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode114 state/relation.go:114: if err := r.life.checkNext(life); err != nil { This doesn't work, as it's testing a memory value. The database could have shifted several times while we're sitting on our couch evaluating this. It must be put in a retry change block. Also, checkNext is a red-herring. You must do a local test here that takes into account what you want the end result to be. Alive is *never* acceptable. Dying is *always* fine, but only does anything if it is currenlty Alive, as someone else could have shifted from Alive to Dying and to Dead concurrently, and the Dying effect is guaranteed even then. Setting to Dead is fine unless it's Alive, but only does anything if not already Dead.
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode9 state/lifecycle.go:9: type Life string On 2012/07/31 09:36:48, aram wrote: > Hmm, this simplifies returning errors, but the integer approach in mstate makes > state change checking trivial because all you need to do is > > if (newLife < life) { ... Changed to an int based implementation. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode18 state/lifecycle.go:18: func (l Life) checkNext(next Life) error { On 2012/07/31 19:35:55, niemeyer wrote: > checkNext seems like a bad approach altogether. We need to know more than > whether a change is allowed to make a decision of what to do (e.g. Alive => > Dying when Dead is fine, but does nothing). I suggest inlining logic into > SetLife. Kept a simpler implementation suggested by rog. IMHO state transitions should be controlled for more safety. But it could be moved to writeZkNode. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode38 state/lifecycle.go:38: func createLifeNode(zk *zookeeper.Conn, basepath string) (err error) { On 2012/07/31 19:35:55, niemeyer wrote: > On 2012/07/31 09:12:33, fwereade wrote: > > On 2012/07/31 07:48:58, rog wrote: > > > does the life state need to be stored in a separate node? most entities > > already > > > have an associated config node, and the life state could be stored there. > > maybe > > > there's a good reason for the separation though. > > > > ISTM that we'll be watching a *lot* of these, and handling events for every > > unrelated change will be somewhat tedious and unnecessary so, or reflection, > I'm > > +1 on separate nodes. > > I agree with rog here. We have to watch a lot of things in general. Maybe I just > haven't seen the issue clearly yet, but it sounds to me like separating the > nodes will actually create *more* things for watching, rather than less. Done. https://codereview.appspot.com/6455065/diff/1/state/lifecycle.go#newcode56 state/lifecycle.go:56: defer errorContextf(&err, "cannot read lifecycle state node %q", lifepath) On 2012/07/31 19:35:55, niemeyer wrote: > This is all changing given we don't need independent nodes, but FWIW let's > please not sprawl errorContextf more than strictly necessary. fmt.Errorf works > well. Done. https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go File state/lifecycle_test.go (right): https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go#newcode21 state/lifecycle_test.go:21: c.Assert(Alive.checkNext(Dying), IsNil) On 2012/07/31 09:12:33, fwereade wrote: > On 2012/07/31 07:48:58, rog wrote: > > this seems like a good place for a table-driven test. > +1 Done. https://codereview.appspot.com/6455065/diff/1/state/lifecycle_test.go#newcode50 state/lifecycle_test.go:50: defer testing.ZkRemoveTree(s.ZkConn, basepath) On 2012/07/31 09:12:33, fwereade wrote: > If ZkConnSuite doesn't nuke ZK at the end of each test, surely it should? Done. https://codereview.appspot.com/6455065/diff/1/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode101 state/relation.go:101: func (r *Relation) readLife() (err error) { On 2012/07/31 08:19:12, aram wrote: > On 2012/07/31 07:48:58, rog wrote: > > why is this a different method from Life? > > Because of potential caching in the future, perhaps. Yes, but changed it now due to new lifecycle type. https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode114 state/relation.go:114: if err := r.life.checkNext(life); err != nil { On 2012/07/31 19:35:55, niemeyer wrote: > This doesn't work, as it's testing a memory value. The database could have > shifted several times while we're sitting on our couch evaluating this. It must > be put in a retry change block. > > Also, checkNext is a red-herring. You must do a local test here that takes into > account what you want the end result to be. Alive is *never* acceptable. Dying > is *always* fine, but only does anything if it is currenlty Alive, as someone > else could have shifted from Alive to Dying and to Dead concurrently, and the > Dying effect is guaranteed even then. Setting to Dead is fine unless it's Alive, > but only does anything if not already Dead. Done. https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode119 state/relation.go:119: return writeLifeNode(r.st.zk, r.zkPath(), r.life) On 2012/07/31 09:12:33, fwereade wrote: > We probably shouldn't alter r.life until we know the write succeeded. Done. https://codereview.appspot.com/6455065/diff/1/state/service.go File state/service.go (right): https://codereview.appspot.com/6455065/diff/1/state/service.go#newcode211 state/service.go:211: r := &Relation{s.st, key, Alive, make([]RelationEndpoint, len(tr.Endpoints))} On 2012/07/31 09:29:17, aram wrote: > > Shouldn't you get the actual life value rather than just assume it's > > alive? TBH this life field seems to be causing more problems than it > > solves... > > I think this is just part of the initial lifecycle bootstrap, after we > modify state client this would change in a subsequent CL. The current lifecycle state is read with the the following statement. An alternative would be the creation of different constructor functions for creating or reading a relation. https://codereview.appspot.com/6455065/diff/1/state/service_test.go File state/service_test.go (left): https://codereview.appspot.com/6455065/diff/1/state/service_test.go#oldcode443 state/service_test.go:443: err = s.State.RemoveRelation(rel) On 2012/07/31 09:12:33, fwereade wrote: > I haven't seen any new tests for RemoveRelation -- surely it should panic if the > relation is not Dead? Done. https://codereview.appspot.com/6455065/diff/1/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode422 state/state.go:422: t, err := readTopology(s.zk) On 2012/07/31 09:12:33, fwereade wrote: > Why isn't this all happening inside retryTopologyChange? Done. https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode434 state/state.go:434: return fmt.Errorf("lifecycle is not dead") On 2012/07/31 09:12:33, fwereade wrote: > I think this is a panic. Done. https://codereview.appspot.com/6455065/diff/1/state/state.go#newcode443 state/state.go:443: // The effects the reusage of keys. On 2012/07/31 09:12:33, fwereade wrote: > Drop the commented code I think Done.
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go#newcode96 state/lifecycle.go:96: value, found := config.Get(lifeKey) Isn't there a race between config.Get() and config.{Set,Write}()? https://codereview.appspot.com/6455065/diff/11001/state/lifecycle_test.go File state/lifecycle_test.go (right): https://codereview.appspot.com/6455065/diff/11001/state/lifecycle_test.go#new... state/lifecycle_test.go:90: c.Assert(func() { lc.SetLife(test.panicLife) }, PanicMatches, panicMessage) Did not know about PanicMatches, thanks! https://codereview.appspot.com/6455065/diff/11001/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/11001/state/state.go#newcode419 state/state.go:419: return fmt.Errorf("cannot retrieve lifecacle state of relation %q: %s", r, err) s/lifecacle/lifecycle/
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go#newcode26 state/lifecycle.go:26: var transitions = [nLife][nLife]bool{ I love shorter names in general, but perhaps validTransitions is a better name here?
Sign in to reply to this message.
Thx Aram, missed to check the real changes in the config. Will add it. https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go#newcode96 state/lifecycle.go:96: value, found := config.Get(lifeKey) On 2012/08/07 10:20:26, aram wrote: > Isn't there a race between config.Get() and config.{Set,Write}()? Oh, yes, missed to cross-check the returned changes here.
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go#newcode91 state/lifecycle.go:91: func (lc *lifecycle) SetLife(next Life) error { The issues described last week are still not handled: https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode114 state/relation.go:114: if err := r.life.checkNext(life); err != nil { This doesn't work, as it's testing a memory value. The database could have shifted several times while we're sitting on our couch evaluating this. It must be put in a retry change block. Also, checkNext is a red-herring. You must do a local test here that takes into account what you want the end result to be. Alive is *never* acceptable. Dying is *always* fine, but only does anything if it is currenlty Alive, as someone else could have shifted from Alive to Dying and to Dead concurrently, and the Dying effect is guaranteed even then. Setting to Dead is fine unless it's Alive, but only does anything if not already Dead. https://codereview.appspot.com/6455065/diff/11001/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/11001/state/relation.go#newcode111 state/relation.go:111: func (r *Relation) Life() Life { We shouldn't have to declare those methods for every type. Embedding a reasonable type should be enough. You can use Roger's work as a reference. https://codereview.appspot.com/6455065/diff/11001/state/relation.go#newcode125 state/relation.go:125: // Refresh reloads the cached data of the relation. // Refresh reloads the state of the lifecycle. This is somewhat unfortunate since it's caching a specific setting and ignoring the rest, but I guess it's a good compromise to maintain compatibility with mstate for the moment.
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go#newcode26 state/lifecycle.go:26: var transitions = [nLife][nLife]bool{ On 2012/08/07 10:29:07, aram wrote: > I love shorter names in general, but perhaps validTransitions is a better name > here? Done. https://codereview.appspot.com/6455065/diff/11001/state/lifecycle.go#newcode91 state/lifecycle.go:91: func (lc *lifecycle) SetLife(next Life) error { On 2012/08/07 15:09:36, niemeyer wrote: > The issues described last week are still not handled: > > https://codereview.appspot.com/6455065/diff/1/state/relation.go#newcode114 > state/relation.go:114: if err := r.life.checkNext(life); err != nil { > This doesn't work, as it's testing a memory value. The database could have > shifted several times while we're sitting on our couch evaluating this. It must > be put in a retry change block. > > Also, checkNext is a red-herring. You must do a local test here that takes into > account what you want the end result to be. Alive is *never* acceptable. Dying > is *always* fine, but only does anything if it is currenlty Alive, as someone > else could have shifted from Alive to Dying and to Dead concurrently, and the > Dying effect is guaranteed even then. Setting to Dead is fine unless it's Alive, > but only does anything if not already Dead. Done. https://codereview.appspot.com/6455065/diff/11001/state/relation.go File state/relation.go (right): https://codereview.appspot.com/6455065/diff/11001/state/relation.go#newcode111 state/relation.go:111: func (r *Relation) Life() Life { On 2012/08/07 15:09:36, niemeyer wrote: > We shouldn't have to declare those methods for every type. Embedding a > reasonable type should be enough. You can use Roger's work as a reference. Done. https://codereview.appspot.com/6455065/diff/11001/state/relation.go#newcode125 state/relation.go:125: // Refresh reloads the cached data of the relation. On 2012/08/07 15:09:36, niemeyer wrote: > // Refresh reloads the state of the lifecycle. > > This is somewhat unfortunate since it's caching a specific setting and ignoring > the rest, but I guess it's a good compromise to maintain compatibility with > mstate for the moment. Done, but change to not caching life and reload it every time in Life() would be simple. https://codereview.appspot.com/6455065/diff/11001/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/11001/state/state.go#newcode419 state/state.go:419: return fmt.Errorf("cannot retrieve lifecacle state of relation %q: %s", r, err) On 2012/08/07 10:20:26, aram wrote: > s/lifecacle/lifecycle/ Done.
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode45 state/lifecycle.go:45: // init initializes the lifecycle with the state Alive . Why do we need this? We should be creating the actual nodes with life set to Alive in the first place, rather than changing them to Alive after the fact via this method. https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode88 state/lifecycle.go:88: // Attention! The config map is defined by the ConfigNode. As we debated ad infinitum, there are other configuration nodes that we manipulate without ConfigNode, so this isn't really a relevant note. What might be more useful is: // Note that there are unknown keys in the node. https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode98 state/lifecycle.go:98: if !validTransitions[current][next] { Please add a test verifying that SetLife(Dying) when the current state is Dead works and does nothing. This was mentioned in the very first review I made and still seems unhandled. https://codereview.appspot.com/6455065/diff/17001/state/service.go File state/service.go (right): https://codereview.appspot.com/6455065/diff/17001/state/service.go#newcode212 state/service.go:212: if err := r.Refresh(); err != nil { This workflow doesn't make sense: 1) r := newRelation 2) r.Refresh 3) r.endpoint = ... If we're going to be adding Refresh now, the sites that call it should be sane, otherwise we're corrupting the whole logic in a way that we won't be able to get out of in the near future. Please ping me online for us to sort this out. https://codereview.appspot.com/6455065/diff/17001/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/17001/state/state.go#newcode348 state/state.go:348: err = newRelation(s, relationKey).lifecycle.init() Why writing a second time to the node just for setting life=alive? https://codereview.appspot.com/6455065/diff/17001/state/state.go#newcode401 state/state.go:401: if err = r.Refresh(); err != nil { Same case as before.
Sign in to reply to this message.
Added some clearing comments for the next step. https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode45 state/lifecycle.go:45: // init initializes the lifecycle with the state Alive . On 2012/08/13 15:48:49, niemeyer wrote: > Why do we need this? We should be creating the actual nodes with life set to > Alive in the first place, rather than changing them to Alive after the fact via > this method. Maybe this comment is misleading. init() is to write the initial Alive into the config node. init() is called immediately after the creation of the instance, so the config node should be empty so far. But I preferred to take care that it's really so. https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode88 state/lifecycle.go:88: // Attention! The config map is defined by the ConfigNode. On 2012/08/13 15:48:49, niemeyer wrote: > As we debated ad infinitum, there are other configuration nodes that we > manipulate without ConfigNode, so this isn't really a relevant note. > > What might be more useful is: > > // Note that there are unknown keys in the node. Ack, will add the comment. But even if other places also have direct access to the content of a config node it doesn't has to be the right way. My feeling for types and encapsulation is that the best way to access this node by the type and its methods (maybe done OOP for too long *smile*). So a compromise would be to extend ConfigNode with a method like RetryChange(). What do you think? https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode98 state/lifecycle.go:98: if !validTransitions[current][next] { On 2012/08/13 15:48:49, niemeyer wrote: > Please add a test verifying that SetLife(Dying) when the current state is Dead > works and does nothing. This was mentioned in the very first review I made and > still seems unhandled. > It's a result of some iterations with Aram. The change (and test) is simple. Which are the use cases for changing a state from Dead to Dying? https://codereview.appspot.com/6455065/diff/17001/state/service.go File state/service.go (right): https://codereview.appspot.com/6455065/diff/17001/state/service.go#newcode212 state/service.go:212: if err := r.Refresh(); err != nil { On 2012/08/13 15:48:49, niemeyer wrote: > This workflow doesn't make sense: > > 1) r := newRelation > 2) r.Refresh > 3) r.endpoint = ... > > If we're going to be adding Refresh now, the sites that call it should be sane, > otherwise we're corrupting the whole logic in a way that we won't be able to get > out of in the near future. Please ping me online for us to sort this out. OK, have to admit that I've been a bit lazy here. Will create two different constructors now, one creating a real new instance (with initializing the config node) and one reading it from ZK. That is split into the pais new/init and new/Refresh today. https://codereview.appspot.com/6455065/diff/17001/state/state.go File state/state.go (right): https://codereview.appspot.com/6455065/diff/17001/state/state.go#newcode348 state/state.go:348: err = newRelation(s, relationKey).lifecycle.init() On 2012/08/13 15:48:49, niemeyer wrote: > Why writing a second time to the node just for setting life=alive? The config node is written here for the first time, initialized with life=Alive. Once again, will split it into to constructors (createRelation and readReleation) to make it more clear. https://codereview.appspot.com/6455065/diff/17001/state/state.go#newcode401 state/state.go:401: if err = r.Refresh(); err != nil { On 2012/08/13 15:48:49, niemeyer wrote: > Same case as before. Ditto.
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go File state/lifecycle.go (right): https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode45 state/lifecycle.go:45: // init initializes the lifecycle with the state Alive . On 2012/08/14 12:36:06, TheMue wrote: > On 2012/08/13 15:48:49, niemeyer wrote: > > Why do we need this? We should be creating the actual nodes with life set to > > Alive in the first place, rather than changing them to Alive after the fact > via > > this method. > > Maybe this comment is misleading. init() is to write the initial Alive into the > config node. init() is called immediately after the creation of the instance, so > the config node should be empty so far. But I preferred to take care that it's > really so. The comment isn't misleading. The logic itself is wrong. You can't create a node that has no lifecycle state and then initialize it after the fact. Meanwhile, someone else will pick it up, and will see wrong state. This, specifically, is a relevant race condition: 98 r := newRelation(st, key) 99 if err := r.lifecycle.init(); err != nil { We've been going back and forth on this, and I'm feeling unproductive reviewing this over and over about the same problems. I suggest we stop work on state, put that aside, and focus entirely on mstate. Let's implement watches and presence, and move forward.
Sign in to reply to this message.
On 2012/08/28 17:15:06, niemeyer wrote: > https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go > File state/lifecycle.go (right): > > https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode45 > state/lifecycle.go:45: // init initializes the lifecycle with the state Alive . > On 2012/08/14 12:36:06, TheMue wrote: > > On 2012/08/13 15:48:49, niemeyer wrote: > > > Why do we need this? We should be creating the actual nodes with life set to > > > Alive in the first place, rather than changing them to Alive after the fact > > via > > > this method. > > > > Maybe this comment is misleading. init() is to write the initial Alive into > the > > config node. init() is called immediately after the creation of the instance, > so > > the config node should be empty so far. But I preferred to take care that it's > > really so. > > The comment isn't misleading. The logic itself is wrong. You can't create a node > that has no lifecycle state and then initialize it after the fact. Meanwhile, > someone else will pick it up, and will see wrong state. > > This, specifically, is a relevant race condition: > > 98 r := newRelation(st, key) > 99 if err := r.lifecycle.init(); err != nil { > > We've been going back and forth on this, and I'm feeling unproductive reviewing > this over and over about the same problems. > > I suggest we stop work on state, put that aside, and focus entirely on mstate. > Let's implement watches and presence, and move forward. Take a look at newRelation(), it doesn't create a node, only the struct with lifecycle initialized: func newRelation(st *State, key string) *Relation { r := &Relation{ st: st, key: key, endpoints: []RelationEndpoint{}, } r.lifecycle = lifecycle{ zk: st.zk, path: r.zkConfigPath(), } return r } We've done it before with &Relation{}. newRelation() is only because of the internal initialization of the embedded lifecycle struct.
Sign in to reply to this message.
On 2012/08/28 17:24:11, TheMue wrote: > On 2012/08/28 17:15:06, niemeyer wrote: > > https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go > > File state/lifecycle.go (right): > > > > https://codereview.appspot.com/6455065/diff/17001/state/lifecycle.go#newcode45 > > state/lifecycle.go:45: // init initializes the lifecycle with the state Alive > . > > On 2012/08/14 12:36:06, TheMue wrote: > > > On 2012/08/13 15:48:49, niemeyer wrote: > > > > Why do we need this? We should be creating the actual nodes with life set > to > > > > Alive in the first place, rather than changing them to Alive after the > fact > > > via > > > > this method. > > > > > > Maybe this comment is misleading. init() is to write the initial Alive into > > the > > > config node. init() is called immediately after the creation of the > instance, > > so > > > the config node should be empty so far. But I preferred to take care that > it's > > > really so. > > > > The comment isn't misleading. The logic itself is wrong. You can't create a > node > > that has no lifecycle state and then initialize it after the fact. Meanwhile, > > someone else will pick it up, and will see wrong state. > > > > This, specifically, is a relevant race condition: > > > > 98 r := newRelation(st, key) > > 99 if err := r.lifecycle.init(); err != nil { > > > > We've been going back and forth on this, and I'm feeling unproductive > reviewing > > this over and over about the same problems. > > > > I suggest we stop work on state, put that aside, and focus entirely on mstate. > > Let's implement watches and presence, and move forward. > > Take a look at newRelation(), it doesn't create a node, only the struct with > lifecycle initialized: What is this logic doing Frank: 46 func (lc *lifecycle) init() error { 47 config, err := readConfigNode(lc.zk, lc.path) 48 if err != nil { 49 return err 50 } 51 config.Set(lifeKey, Alive) 52 changes, err := config.Write() 53 if err != nil { 54 return err 55 } Either: 1) The node was created before without a lifecycle or 2) An empty node is being read, and is being written down just with lifecycle information and nothing else Both are wrong. This is feeling very unproductive to me, Frank. This is in review for several weeks, and the issues that are brought up are repeatedly receiving replies that indicate lack of care instead of being fixed. I'm marking the branch as rejected.
Sign in to reply to this message.
|