|
|
Created:
12 years, 6 months ago by dave Modified:
12 years, 6 months ago Reviewers:
mp+107714 Visibility:
Public. |
Descriptioncmd/jujud: strawman provisioning agent
This is a strawman provisioning agent proposal.
Following Gustavo's suggestion, this revision does not include
any functionality to respond to machines changes.
https://code.launchpad.net/~dave-cheney/juju/go-provisioning-strawman/+merge/107714
(do not edit description out of merge proposal)
Patch Set 1 #Patch Set 2 : cmd/jujud: strawman provisioning agent #
Total comments: 11
Patch Set 3 : cmd/jujud: strawman provisioning agent #Patch Set 4 : cmd/jujud: strawman provisioning agent #
Total comments: 11
Patch Set 5 : cmd/jujud: strawman provisioning agent #Patch Set 6 : cmd/jujud: strawman provisioning agent #Patch Set 7 : cmd/jujud: strawman provisioning agent #Patch Set 8 : cmd/jujud: strawman provisioning agent #Patch Set 9 : cmd/jujud: strawman provisioning agent #
Total comments: 18
Patch Set 10 : cmd/jujud: strawman provisioning agent #Patch Set 11 : cmd/jujud: strawman provisioning agent #Patch Set 12 : cmd/jujud: strawman provisioning agent #
Total comments: 28
Patch Set 13 : cmd/jujud: strawman provisioning agent #Patch Set 14 : cmd/jujud: strawman provisioning agent #
MessagesTotal messages: 16
this is looking nice. a few minor remarks below. https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:17: const PROVIDER_MACHINE_ID = "provider-machine-id" s/PROVIDER_MACHINE_ID/providerMachineId/ UNDERSCORED_CAPS aren't conventional. https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:120: return fmt.Errorf("machine-%010d already reports a provider id %q, skipping", m.Id(), id) s/machine-%010d/machine %d/ https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:208: return "", fmt.Errorf("findProviderId: machine-%010d key not found: %q", m.Id(), PROVIDER_MACHINE_ID) i think 'machine %d' would work better - the %010 stuff is detail internal to state. https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:210: if _, ok := id.(string); !ok { rather than doing the type conversion twice, perhaps: if id, ok := id.(string); ok { return id, nil } return "", fmt.Errorf("machine %d has invalid value for %s: %#v", m.Id(), PROVIDER_MACHINE_ID, id) https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:221: if len(insts) < 1 { this can't happen.
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:17: const PROVIDER_MACHINE_ID = "provider-machine-id" Beleted https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:120: return fmt.Errorf("machine-%010d already reports a provider id %q, skipping", m.Id(), id) On 2012/05/29 08:41:18, rog wrote: > s/machine-%010d/machine %d/ Done, maybe *state.Machine should grow a .String() method https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:208: return "", fmt.Errorf("findProviderId: machine-%010d key not found: %q", m.Id(), PROVIDER_MACHINE_ID) On 2012/05/29 08:41:18, rog wrote: > i think 'machine %d' would work better - the %010 stuff is detail internal to > state. Done. https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:210: if _, ok := id.(string); !ok { Done. Ta On 2012/05/29 08:41:18, rog wrote: > rather than doing the type conversion twice, perhaps: > > if id, ok := id.(string); ok { > return id, nil > } > return "", fmt.Errorf("machine %d has invalid value for %s: %#v", m.Id(), > PROVIDER_MACHINE_ID, id) https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:210: if _, ok := id.(string); !ok { On 2012/05/29 08:41:18, rog wrote: > rather than doing the type conversion twice, perhaps: > > if id, ok := id.(string); ok { > return id, nil > } > return "", fmt.Errorf("machine %d has invalid value for %s: %#v", m.Id(), > PROVIDER_MACHINE_ID, id) Done. https://codereview.appspot.com/6250068/diff/2001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:221: if len(insts) < 1 { I don't think it hurts to check. On 2012/05/29 08:41:18, rog wrote: > this can't happen.
Sign in to reply to this message.
This is going in a great direction Dave, and I'm excited to see this. At the same time, I'm concerned about the approach. I'm not comfortable with us knowingly getting in critical chunks of the functionality while knowing that it "lacks robust error checking and retry logic", as I can't help you fixing issues and catching up bugs that you tell me are known but you don't care about right now. I'd prefer if this spike, which is a great test bed, was now broken down into smaller chunks that you are actually *sure* about, so that we and the other developers can jointly review and agree is a good step forward. I don't mind if each of those steps are not entirely functional, for example. We can have a first one that is just about connecting and reconnecting to the environment reliably, for instance, ignoring any machine management, and so on. What follows is a few additional ideas on it: https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:17: const PROVIDER_MACHINE_ID = "provider-machine-id" providerMachineId would be the convention, I think https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning_test.go File cmd/jujud/provisioning_test.go (right): https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning_test.... cmd/jujud/provisioning_test.go:18: log.Debug = true Shouldn't the suite be using the testing package's log helper instead? https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning_test.... cmd/jujud/provisioning_test.go:126: <-time.After(2 * time.Second) That's not great. That's the kind of reason that we need a proper Stop method on the provisioning agent, so that we can be sure that it is happy to terminate, rather than waiting a random amount of time for it to finish running. Otherwise, we're getting into exactly the same kind of issue we got into in the Python implementation. Please have a look at the underlying infrastructure of watchers for inspiration. They reliably terminate, both erroring or not. They also do that synchronously so that it is possible to wait, and to check for errors from them. We need something like that for the provisioning agent.
Sign in to reply to this message.
Some more comments. Also, it just occurred to me that, as a very first step getting just the testing environment in place, with zero functionality, in a way that reliably starts and stops, is already a fantastic task on itself. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:53: a.State, err = state.Open(&a.Conf.StateInfo) This is a rather extensive function that might be broken down further in more manageable bits for clarity. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:75: // step 2. listen for changes to the environment or the machine topology and action both. Step 1 needs to be able to be more resilient, and step 2 must be able to fallback to step 1 in case of issues with the connection. It must also be able to deal with Stop of the watcher and re-watching, but that should be somewhat natural. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:122: return fmt.Errorf("machine-%010d already reports a provider id %q, skipping", m.Id(), id) We should never talk about internal ZooKeeper keys to the user. This should be saying "machine %d" instead. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:134: // store the reference from the provider in ZK Let's please not reference ZooKeeper all the time. It is the environment state that we have at hand, with a nice abstraction that we created precisely so we don't think in terms of ZooKeeper all the time. ZooKeeper is hopefully going away soon, but the state will remain as-is.
Sign in to reply to this message.
Hi Gustavo, Thank you for your feedback. > At the same time, I'm concerned about the approach. I'm not comfortable > with us knowingly getting in critical chunks of the functionality while > knowing that it "lacks robust > error checking and retry logic", as I can't help you fixing issues and > catching up bugs that you tell me are known but you don't care about > right now. To explain myself (as I wrote that description a few days ago). The PA does check all errors, but at the moment does not attempt to retry any operations -- they all bubble up to the Run method and cause the program to exit. > I'd prefer if this spike, which is a great test bed, was now broken down > into smaller chunks that you are actually *sure* about, so that we and > the other developers can jointly review and agree is a good step > forward. So would I, this proposal was an attempt to provide a minimum viable provisioning agent. > I don't mind if each of those steps are not entirely functional, for > example. We can have a first one that is just about connecting and > reconnecting to the environment reliably, for instance, ignoring any > machine management, and so on. Yes, I can certainly do that. > What follows is a few additional ideas on it: > > > https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go > File cmd/jujud/provisioning.go (right): > > https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... > > cmd/jujud/provisioning.go:17: const PROVIDER_MACHINE_ID = > "provider-machine-id" > providerMachineId would be the convention, I think That is a relic, the constant is not used anymore now. > https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning_test.go > File cmd/jujud/provisioning_test.go (right): > > https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning_test.... > cmd/jujud/provisioning_test.go:18: log.Debug = true > Shouldn't the suite be using the testing package's log helper instead? I'll figure out how to use that properly, this was actually an attempt to simulate the --debug flag to the PA, which didn't work. > https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning_test.... > cmd/jujud/provisioning_test.go:126: <-time.After(2 * time.Second) > That's not great. That's the kind of reason that we need a proper Stop > method on the provisioning agent, so that we can be sure that it is > happy to terminate, rather than waiting a random amount of time for it > to finish running. Otherwise, we're getting into exactly the same kind > of issue we got into in the Python implementation. Understood. This timer is a currently required because the PA expects to be running from main, via ProvisioningAgent.Run. I will have a think about how to add a Tomb that all agents in that package can use to exit reliably. > Please have a look at the underlying infrastructure of watchers for > inspiration. They reliably terminate, both erroring or not. They also do > that synchronously so that it is possible to wait, and to check for > errors from them. > > We need something like that for the provisioning agent.
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:17: const PROVIDER_MACHINE_ID = "provider-machine-id" On 2012/05/30 21:09:21, niemeyer wrote: > providerMachineId would be the convention, I think Done. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:75: // step 2. listen for changes to the environment or the machine topology and action both. Understood, will redo. On 2012/05/30 21:27:20, niemeyer wrote: > Step 1 needs to be able to be more resilient, and step 2 must be able to > fallback to step 1 in case of issues with the connection. It must also be able > to deal with Stop of the watcher and re-watching, but that should be somewhat > natural. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:122: return fmt.Errorf("machine-%010d already reports a provider id %q, skipping", m.Id(), id) I think it is time that state.Machine grows a String method, then discussions like these go away. On 2012/05/30 21:27:20, niemeyer wrote: > We should never talk about internal ZooKeeper keys to the user. This should be > saying "machine %d" instead. https://codereview.appspot.com/6250068/diff/6001/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:134: // store the reference from the provider in ZK On 2012/05/30 21:27:20, niemeyer wrote: > Let's please not reference ZooKeeper all the time. It is the environment state > that we have at hand, with a nice abstraction that we created precisely so we > don't think in terms of ZooKeeper all the time. ZooKeeper is hopefully going > away soon, but the state will remain as-is. Done.
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
Thanks Dave. That is now looking like a great step forward that we can polish for integration easily. Some comments: https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:37: st, err := state.Open(&a.Conf.StateInfo) What happens if the connection is broken? We'll need to reestablish it and restart from the beginning, since all the assumptions are now wrong. Maybe this has to be moved into the outer loop or something. Will leave details with you at this point. That said, I suggest keeping this for the branch coming right after this one. It feels like what we have here is a solid step forward, and we can comfortably iterate over the state reestablishment in a new branch before continuing with everything else. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:56: type environment struct { Those two types feel unnecessary. It looks like your model would work equally well if we had environWatcher and machinesWatcher both within the Provisioner struct in place of the environment and machines fields, plus stopMachinesWatcher and stopEnvironWatcher methods (the two invalidate ones), with no further changes. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:71: func (e *environment) invalidate() { This should return the error it finds. Call sites can selectively ignore it, but methods such as Provisioner.Stop should return the problems it finds, so we can have a handle on them if we want (testing should definitely check, for instance). https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:73: log.Printf("provisioner: environment watcher exited: %v", e.watcher.Stop()) That Stop call is the most critical task done by this method. It shouldn't be disguised at the end of a log call. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:136: func (p *Provisioner) innerLoop() { Nice organization. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:152: log.Printf("provisioning: new configuartion applied") s/configuartion/environment configuration/ (note typo) https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:159: } I'm wondering if rather than "continue" in the problems above, we should "break", and here have something that verifies if the state is still sane? Otherwise this will be an infinite loop, won't it? https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:167: p.environment.invalidate() I suggest logic like this here: p.tomb.Kill(nil) err1 := p.stopEnvironWatcher() err2 := p.stopMachinesWatcher() err3 := p.tomb.Wait() for err := []error{err3, err2, err1} { if err != nil { return err } } return nil
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:37: st, err := state.Open(&a.Conf.StateInfo) On 2012/05/31 11:46:59, niemeyer wrote: > What happens if the connection is broken? We'll need to reestablish it and > restart from the beginning, since all the assumptions are now wrong. Maybe this > has to be moved into the outer loop or something. Will leave details with you at > this point. I have been operating under the assumption (I think it's correct based on what William and Roger were talking about yesterday) that all the agents will run under upstart so can assume that if they exit or SEGV they will be restarted. > That said, I suggest keeping this for the branch coming right after this one. It > feels like what we have here is a solid step forward, and we can comfortably > iterate over the state reestablishment in a new branch before continuing with > everything else. I agree, i'd like to see that as a future enhancement, and lean on upstart for the moment. See also comments about state.IsClosed/IsValid below. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:56: type environment struct { On 2012/05/31 11:46:59, niemeyer wrote: > Those two types feel unnecessary. It looks like your model would work equally > well if we had environWatcher and machinesWatcher both within the Provisioner > struct in place of the environment and machines fields, plus stopMachinesWatcher > and stopEnvironWatcher methods (the two invalidate ones), with no further > changes. This was an indulgence. I am besotted with the ability to bind a few methods to a structure and then embed the structure as a way of composing more complex types. I've redone it, but retained a wrapper methods. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:71: func (e *environment) invalidate() { On 2012/05/31 11:46:59, niemeyer wrote: > This should return the error it finds. Call sites can selectively ignore it, but > methods such as Provisioner.Stop should return the problems it finds, so we can > have a handle on them if we want (testing should definitely check, for > instance). Done. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:73: log.Printf("provisioner: environment watcher exited: %v", e.watcher.Stop()) On 2012/05/31 11:46:59, niemeyer wrote: > That Stop call is the most critical task done by this method. It shouldn't be > disguised at the end of a log call. Done. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:136: func (p *Provisioner) innerLoop() { On 2012/05/31 11:46:59, niemeyer wrote: > Nice organization. Thank you. As much as I would like to have only one loop, there is a hard requirement to have a valid Environ before reading anything off the machines channel. To remove that would mean that every machine change has to be recorded in some structure which is serviced by a goroutine. That means more tomb.Tombs to check, locks around the Envrion, etc. This is most of what would be required for a concurrent PA. I do want to make the PA concurrent in the future, but not just yet. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:152: log.Printf("provisioning: new configuartion applied") On 2012/05/31 11:46:59, niemeyer wrote: > s/configuartion/environment configuration/ (note typo) Done. https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:159: } On 2012/05/31 11:46:59, niemeyer wrote: > I'm wondering if rather than "continue" in the problems above, we should > "break", and here have something that verifies if the state is still sane? > Otherwise this will be an infinite loop, won't it? At the moment, by my reading, a watcher is always returned from any of the calls to state.New*Watcher, but if the underlying state is broken, they will immediately go into Dead state and close their channel. What I think is needed here is the state/watcher.go methods to be able to return an error when you ask for a watcher. Or, a method like state.IsClosed() (or similar) if it existed could be put a the top of the loop, ie. for !p.st.IsClosed() { select { ... } } https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:167: p.environment.invalidate() > p.tomb.Kill(nil) > err1 := p.stopEnvironWatcher() > err2 := p.stopMachinesWatcher() > err3 := p.tomb.Wait() > for err := []error{err3, err2, err1} { > if err != nil { > return err > } > } > return nil SGTM. I'm never sure how to merge several error values into one.
Sign in to reply to this message.
https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:37: st, err := state.Open(&a.Conf.StateInfo) > What happens if the connection is broken? We'll need to reestablish it and > restart from the beginning, since all the assumptions are now wrong. Maybe this > has to be moved into the outer loop or something. Will leave details with you at > this point. I looked into this more this morning and it looks like if the underlying connection to ZK is interrupted then the c client reconnects behind the scenes. I tested this by running the PA, then shutting down zk, then starting it again a minute later. None of the watchers exited. I'm going to keep playing with this, but at this point, if gozk never reports an error when it cant talk to the zk server, I can't think of a way to detect connection failure. Cheers Dave
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/5004/cmd/jujud/provisioning.go#ne... cmd/jujud/provisioning.go:37: st, err := state.Open(&a.Conf.StateInfo) On 2012/06/01 01:58:27, dfc wrote: > > What happens if the connection is broken? We'll need to reestablish it and > > restart from the beginning, since all the assumptions are now wrong. Maybe > this > > has to be moved into the outer loop or something. Will leave details with you > at > > this point. > > I looked into this more this morning and it looks like if the underlying > connection to ZK is interrupted then the c client reconnects behind the scenes. > I tested this by running the PA, then shutting down zk, then starting it again a > minute later. None of the watchers exited. > > I'm going to keep playing with this, but at this point, if gozk never reports an > error when it cant talk to the zk server, I can't think of a way to detect > connection failure. We've covered this on juju-dev@. A TODO would be good here as well. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:56: // environChanges returns a channel that will receive the new *ConfigNode As we discussed, that logic can be simplified as we can bubble up onto the top from any hiccups, rather than retrying in place. I'm happy both for this branch to be changed to cover that, or for it to be integrated with these and then refactored. I'll comment in-place and let the decision with you. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:57: // when a change is detected. // when a change in the environment configuration is detected. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:65: // stopEnvironWatcher stops and invalidates the current environWatcher. s/and invalidates// https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:69: log.Printf("provisioning: environWatcher reported error on Stop: %v", err) This is Printf rather than Debugf, so we need to be a bit nicer to the user (no method and field names). Something like this might do: "environment configuration watcher: %v" https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:76: // changes returns a channel that will receive the new *ConfigNode when a Needs updating. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:89: log.Printf("provisioning: machinesWatcehr reported error on Stop: %v", err) "machines watcher: %v" https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:107: // TODO(dfc) we need a method like state.IsValid() here to exit cleanly if IsConnected? https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:116: continue This should return an error.. it won't help to continue here. The state will be broken. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:121: log.Printf("provisioner: unable to create environment from supplied configuration: %v", err) "provisioner loaded invalid environment configuration: %v" https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:124: log.Printf("provisioner: valid environment configured") "provisioner loaded new environment configuration" We spent a good while in London fiddling with messages and trying to imagine how the actual error and log messages look like with prefixes, and the conclusion we got to is that it doesn't really improve much the understanding, and that we've been terribly inconsistent so far. The agreement we're trying to get to (which still accepts input) is being a bit more clear in the messages themselves. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:133: // TODO(dfc) we need a method like state.IsValid() here to exit cleanly if IsConnected? https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:142: return fmt.Errorf("environWatcher closed") "environment configuration watcher was closed" https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:146: log.Printf("provisioner: new configuration received, but was not valid: %v", err) "provisioner loaded invalid environment configuration: %v" https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:150: log.Printf("provisioner: new environment configuartion applied") "provisioner loaded new environment configuration"
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go File cmd/jujud/provisioning.go (right): https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:56: // environChanges returns a channel that will receive the new *ConfigNode On 2012/06/04 23:22:23, niemeyer wrote: > As we discussed, that logic can be simplified as we can bubble up onto the top > from any hiccups, rather than retrying in place. I'm happy both for this branch > to be changed to cover that, or for it to be integrated with these and then > refactored. I'll comment in-place and let the decision with you. PTAL. This has been incorporated into the loop/innerloop https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:57: // when a change is detected. On 2012/06/04 23:22:23, niemeyer wrote: > // when a change in the environment configuration is detected. Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:65: // stopEnvironWatcher stops and invalidates the current environWatcher. On 2012/06/04 23:22:23, niemeyer wrote: > s/and invalidates// Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:69: log.Printf("provisioning: environWatcher reported error on Stop: %v", err) On 2012/06/04 23:22:23, niemeyer wrote: > This is Printf rather than Debugf, so we need to be a bit nicer to the user (no > method and field names). Something like this might do: > > "environment configuration watcher: %v" Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:76: // changes returns a channel that will receive the new *ConfigNode when a On 2012/06/04 23:22:23, niemeyer wrote: > Needs updating. Deleted. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:89: log.Printf("provisioning: machinesWatcehr reported error on Stop: %v", err) On 2012/06/04 23:22:23, niemeyer wrote: > "machines watcher: %v" Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:107: // TODO(dfc) we need a method like state.IsValid() here to exit cleanly if On 2012/06/04 23:22:23, niemeyer wrote: > IsConnected? Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:116: continue On 2012/06/04 23:22:23, niemeyer wrote: > This should return an error.. it won't help to continue here. The state will be > broken. PTAL, the collection of the error (if present) is handled by the defer now. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:121: log.Printf("provisioner: unable to create environment from supplied configuration: %v", err) On 2012/06/04 23:22:23, niemeyer wrote: > "provisioner loaded invalid environment configuration: %v" Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:124: log.Printf("provisioner: valid environment configured") Fair enough. I have no preference apart from consistency, so +1 On 2012/06/04 23:22:23, niemeyer wrote: > "provisioner loaded new environment configuration" > > We spent a good while in London fiddling with messages and trying to imagine how > the actual error and log messages look like with prefixes, and the conclusion we > got to is that it doesn't really improve much the understanding, and that we've > been terribly inconsistent so far. The agreement we're trying to get to (which > still accepts input) is being a bit more clear in the messages themselves. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:133: // TODO(dfc) we need a method like state.IsValid() here to exit cleanly if On 2012/06/04 23:22:23, niemeyer wrote: > IsConnected? Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:142: return fmt.Errorf("environWatcher closed") On 2012/06/04 23:22:23, niemeyer wrote: > "environment configuration watcher was closed" Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:146: log.Printf("provisioner: new configuration received, but was not valid: %v", err) On 2012/06/04 23:22:23, niemeyer wrote: > "provisioner loaded invalid environment configuration: %v" Done. https://codereview.appspot.com/6250068/diff/16001/cmd/jujud/provisioning.go#n... cmd/jujud/provisioning.go:150: log.Printf("provisioner: new environment configuartion applied") On 2012/06/04 23:22:23, niemeyer wrote: > "provisioner loaded new environment configuration" Done.
Sign in to reply to this message.
That last set of changes are awesome. LGTM.
Sign in to reply to this message.
*** Submitted: cmd/jujud: strawman provisioning agent This is a strawman provisioning agent proposal. Following Gustavo's suggestion, this revision does not include any functionality to respond to machines changes. R=rog, niemeyer CC= https://codereview.appspot.com/6250068
Sign in to reply to this message.
|