|
|
DescriptionSerialize the uniter hook executions.
Implement a file system lock, and use that in the uniter to serialize the hook
execution across different uniters running on a single machine.
https://code.launchpad.net/~thumper/juju-core/fslock-mashup/+merge/159543
(do not edit description out of merge proposal)
Patch Set 1 #
Total comments: 61
Patch Set 2 : Serialize the uniter hook executions. #
Total comments: 3
Patch Set 3 : Serialize the uniter hook executions. #
Total comments: 2
MessagesTotal messages: 16
Please take a look.
Sign in to reply to this message.
Comment changes have been made in the branch and pushed. Visible in the merge proposal, but not resubmitted here. I do have a question about the last uniter test. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode101 utils/fslock/fslock.go:101: // Create a temporary directory (in the temp dir), and then move it to the right name. Isn't created in the temp dir any more. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode197 utils/fslock/fslock.go:197: // Now move the temp directory to the lock directory. Gah, this is the wrong way around. https://codereview.appspot.com/8849043/diff/1/worker/uniter/uniter_test.go File worker/uniter/uniter_test.go (right): https://codereview.appspot.com/8849043/diff/1/worker/uniter/uniter_test.go#ne... worker/uniter/uniter_test.go:420: verifyHookSyncLockLocked, Can someone check this please? I want the uniter's init method to have been run, but not any hooks. If we have something that is going to wait or check on a hook, it'll block.
Sign in to reply to this message.
a few comments on fslock https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode103 utils/fslock/fslock.go:103: tempLockName := fmt.Sprintf(".%s", hex.EncodeToString(lock.nonce)) or "." + hex.EncodeToString(lock.nonce) or fmt.Sprintf(".%x", lock.nonce) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode109 utils/fslock/fslock.go:109: err = ioutil.WriteFile(path.Join(tempDirName, heldFilename), lock.nonce, 0755) i think this should write the message too into the same file. that way we've only got 7 fs operations per lock rather and 10. i'm still concerned about disk churn with this implementation - we seem to be using directories from of some vague notion of portability rather than actually knowing what our platform constraints are. one other thing that occurs to me is that we're creating the temp directory each time we attempt to acquire the lock, where actually we need only create it once per Lock; then "acquire" is simply a rename and the Lock method can clean up the temp directory if it fails. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) this is a Debugf at best. in fact, i don't think it even justifies that - we don't want this printing 3 or 4 times a second while several charms wait for a long-lived hook. perhaps Lock could print it after every n iterations. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { i'd be tempted to merge this implementation with LockWithTimeout. perhaps something like: // lockWithTimeout tries to acquire the lock. If the deadline is // non-zero and it cannot acquire the lock before then, // it returns ErrTimeout. func (lock *Lock) lockWithDeadline(deadline time.Time, message string) error { for { acquired, err := lock.acquire(message) if err != nil { return err } if acquired { return nil } if !deadline.IsZero() && time.Now().After(deadline) { return ErrTimeout } time.Sleep(lockWaitDelay) } panic("unreachable") } then Lock can call lock.lockWithDeadline(time.Time{}, message) and LockWithTimeout can call lock.lockWithDeadline(time.Now().Add(timeout), message) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode150 utils/fslock/fslock.go:150: log.Infof("Attempt Lock failed %q, %s, currently held: %s", lock.name, message, currMessage) as with the log message above, i don't think it's a good thing (and isn't it printing essentially the same info anyway?) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode195 utils/fslock/fslock.go:195: tempLockName := fmt.Sprintf(".%s.%s", lock.name, hex.EncodeToString(lock.nonce)) fmt.Sprintf(".%s.%x", lock.name, lock.nonce) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { do we need this? https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go File utils/fslock/fslock_test.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:292: var lockState = new(int32) i think i'd do: lockState := int32(1) and use &lockState below. but YMMV https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:306: c.Check(err, IsNil) return if this fails? (and the checks below too, probably - we don't want to carry on for many attempts after a failure) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:314: _ = atomic.AddInt32(lockState, -1) or just ignore the return https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:318: _ = atomic.AddInt64(counter, 1) or just ignore the return
Sign in to reply to this message.
Generally, LGTM, with some trivials. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode47 utils/fslock/fslock.go:47: func generateNonce() ([]byte, error) { please, use trivial(utils?)/uuid: NewUUID() - we have way too much such functions scattered all over the place. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode57 utils/fslock/fslock.go:57: // expression `^[a-z]+[a-z0-9.-]*`. s/`^[a-z]+[a-z0-9.-]*`/defined by nameRegexp/ ? https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) On 2013/04/18 08:20:28, rog wrote: > this is a Debugf at best. > in fact, i don't think it even justifies that - we don't want this printing 3 or > 4 times a second while several charms wait for a long-lived hook. perhaps Lock > could print it after every n iterations. Also, please add "utils: " badge as prefix and make the message start with a lowercase letter. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode150 utils/fslock/fslock.go:150: log.Infof("Attempt Lock failed %q, %s, currently held: %s", lock.name, message, currMessage) please: lowercase first letter, add "utils: " badge
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode47 utils/fslock/fslock.go:47: func generateNonce() ([]byte, error) { On 2013/04/18 08:37:05, dimitern wrote: > please, use trivial(utils?)/uuid: NewUUID() - we have way too much such > functions scattered all over the place. Sorry, I tried, but not simple at this stage. It seems overkill to write the string representation of the uuid to the file, but also there is no way to create a uuid from a byte slice, no equality operator, no way to easily write the bytes out in one line: uuid.Raw()[:] doesn't work. I'd be open to approving a patch later that changed this, but not right now. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode57 utils/fslock/fslock.go:57: // expression `^[a-z]+[a-z0-9.-]*`. On 2013/04/18 08:37:05, dimitern wrote: > s/`^[a-z]+[a-z0-9.-]*`/defined by nameRegexp/ ? This is documentation on the public method. The package private nameRegexp isn't documented, so we'd be pointing people to something they can't see. I'd prefer to keep this explicit. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode103 utils/fslock/fslock.go:103: tempLockName := fmt.Sprintf(".%s", hex.EncodeToString(lock.nonce)) On 2013/04/18 08:20:28, rog wrote: > or "." + hex.EncodeToString(lock.nonce) > > or fmt.Sprintf(".%x", lock.nonce) fmt.Sprintf(".%x", lock.nonce) is much nicer, used. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode109 utils/fslock/fslock.go:109: err = ioutil.WriteFile(path.Join(tempDirName, heldFilename), lock.nonce, 0755) On 2013/04/18 08:20:28, rog wrote: > i think this should write the message too into the same file. > that way we've only got 7 fs operations per lock rather and 10. > > i'm still concerned about disk churn with this implementation - we seem to be > using directories from of some vague notion of portability rather than actually > knowing what our platform constraints are. > > one other thing that occurs to me is that we're creating the temp directory each > time we attempt to acquire the lock, where actually we need only create it once > per Lock; then "acquire" is simply a rename > and the Lock method can clean up the temp directory if it fails. I think you are being needlessly concerned. acquire is only called if the lock is unlocked, and fails generally if only someone else beat them to it. Given that we can acquire locks in sub-millisecond time (as shown by the tests), I think merging the held and message files isn't worth it. held is binary, and message is text. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) On 2013/04/18 08:37:05, dimitern wrote: > On 2013/04/18 08:20:28, rog wrote: > > this is a Debugf at best. > > in fact, i don't think it even justifies that - we don't want this printing 3 > or > > 4 times a second while several charms wait for a long-lived hook. perhaps Lock > > could print it after every n iterations. > > Also, please add "utils: " badge as prefix and make the message start with a > lowercase letter. Just removed it. Don't think it added too much value. The logs will show which hooks are running. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { On 2013/04/18 08:20:28, rog wrote: > i'd be tempted to merge this implementation with LockWithTimeout. > > perhaps something like: > > // lockWithTimeout tries to acquire the lock. If the deadline is > // non-zero and it cannot acquire the lock before then, > // it returns ErrTimeout. > func (lock *Lock) lockWithDeadline(deadline time.Time, message string) error { > for { > acquired, err := lock.acquire(message) > if err != nil { > return err > } > if acquired { > return nil > } > if !deadline.IsZero() && time.Now().After(deadline) { > return ErrTimeout > } > time.Sleep(lockWaitDelay) > } > panic("unreachable") > } > > then Lock can call lock.lockWithDeadline(time.Time{}, message) > and LockWithTimeout can call lock.lockWithDeadline(time.Now().Add(timeout), > message) No. Names are all about intent. If I want to lock and not have a deadline, I don't want to call a method called lockWithDeadline. "Clean Code" tells us so. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode150 utils/fslock/fslock.go:150: log.Infof("Attempt Lock failed %q, %s, currently held: %s", lock.name, message, currMessage) On 2013/04/18 08:37:05, dimitern wrote: > please: lowercase first letter, add "utils: " badge why? https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode150 utils/fslock/fslock.go:150: log.Infof("Attempt Lock failed %q, %s, currently held: %s", lock.name, message, currMessage) On 2013/04/18 08:20:28, rog wrote: > as with the log message above, i don't think it's a good thing (and isn't it > printing essentially the same info anyway?) It shows us the lock contention, and we only write out the message if it changes, that is the lock gets grabbed by someone else. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode195 utils/fslock/fslock.go:195: tempLockName := fmt.Sprintf(".%s.%s", lock.name, hex.EncodeToString(lock.nonce)) On 2013/04/18 08:20:28, rog wrote: > fmt.Sprintf(".%s.%x", lock.name, lock.nonce) Done. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { On 2013/04/18 08:20:28, rog wrote: > do we need this? Possibly not, but now it is written and tested, I'd prefer to see some use of this before we remove it as YAGNI. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go File utils/fslock/fslock_test.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:306: c.Check(err, IsNil) On 2013/04/18 08:20:28, rog wrote: > return if this fails? (and the checks below too, probably - we don't want to > carry on for many attempts after a failure) Changed Checks to Asserts. That will exit the goroutine. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:314: _ = atomic.AddInt32(lockState, -1) On 2013/04/18 08:20:28, rog wrote: > or just ignore the return Done. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:318: _ = atomic.AddInt64(counter, 1) On 2013/04/18 08:20:28, rog wrote: > or just ignore the return For some reason, I thought I had to deal with all the return values.
Sign in to reply to this message.
LGTM with some comments -- an an exhortation not to actually *land* anything until we understand the packaging/releases situation. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode57 utils/fslock/fslock.go:57: // expression `^[a-z]+[a-z0-9.-]*`. On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:37:05, dimitern wrote: > > s/`^[a-z]+[a-z0-9.-]*`/defined by nameRegexp/ ? > > This is documentation on the public method. The package private nameRegexp > isn't documented, so we'd be pointing people to something they can't see. I'd > prefer to keep this explicit. Maybe best to make it explicit as an exported const, and reference that, than to enable drift by describing it twice. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode109 utils/fslock/fslock.go:109: err = ioutil.WriteFile(path.Join(tempDirName, heldFilename), lock.nonce, 0755) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > i think this should write the message too into the same file. > > that way we've only got 7 fs operations per lock rather and 10. > > > > i'm still concerned about disk churn with this implementation - we seem to be > > using directories from of some vague notion of portability rather than > actually > > knowing what our platform constraints are. > > > > one other thing that occurs to me is that we're creating the temp directory > each > > time we attempt to acquire the lock, where actually we need only create it > once > > per Lock; then "acquire" is simply a rename > > and the Lock method can clean up the temp directory if it fails. > > I think you are being needlessly concerned. > > acquire is only called if the lock is unlocked, and fails generally if only > someone else beat them to it. Given that we can acquire locks in > sub-millisecond time (as shown by the tests), I think merging the held and > message files isn't worth it. held is binary, and message is text. Sounds sane to me: if we're suffering serious contention we'll see it in the logs anyway. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:37:05, dimitern wrote: > > On 2013/04/18 08:20:28, rog wrote: > > > this is a Debugf at best. > > > in fact, i don't think it even justifies that - we don't want this printing > 3 > > or > > > 4 times a second while several charms wait for a long-lived hook. perhaps > Lock > > > could print it after every n iterations. > > > > Also, please add "utils: " badge as prefix and make the message start with a > > lowercase letter. > > Just removed it. Don't think it added too much value. > The logs will show which hooks are running. I think explicit logging of contention is a good thing and deserves an Infof: as stated, this case will only ever be hit during a very narrow window, so I think that we can comfortably diagnose 3-4 messages/sec as indicating a Problem. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode150 utils/fslock/fslock.go:150: log.Infof("Attempt Lock failed %q, %s, currently held: %s", lock.name, message, currMessage) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:37:05, dimitern wrote: > > please: lowercase first letter, add "utils: " badge > > why? Convention. Mixes of upper and lower case have crept in here and there, but I think it's good to be consistent. Type names, etc, should ofc be accurate, but a log message is generally not really enough of a sentence to deserve an initial capital. IMO :) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode150 utils/fslock/fslock.go:150: log.Infof("Attempt Lock failed %q, %s, currently held: %s", lock.name, message, currMessage) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > as with the log message above, i don't think it's a good thing (and isn't it > > printing essentially the same info anyway?) > > It shows us the lock contention, and we only write out the message if it > changes, that is the lock gets grabbed by someone else. sgtm https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > do we need this? > > Possibly not, but now it is written and tested, I'd prefer to see some use of > this before we remove it as YAGNI. That's somewhat contradictory ;p. However I am myself not sure what the eventual requirements for this sort of locking will be, so I'm loath to consign this functionality to a forgotten personal branch forever. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go File utils/fslock/fslock_test.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:306: c.Check(err, IsNil) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > return if this fails? (and the checks below too, probably - we don't want to > > carry on for many attempts after a failure) > > Changed Checks to Asserts. That will exit the goroutine. I forget the precise failure mode, but IIRC Asserts in goroutines don't end well. Check returns failure, though. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:318: _ = atomic.AddInt64(counter, 1) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > or just ignore the return > > For some reason, I thought I had to deal with all the return values. That does feel consistent with the general principles of the language, indeed. https://codereview.appspot.com/8849043/diff/1/worker/uniter/uniter_test.go File worker/uniter/uniter_test.go (right): https://codereview.appspot.com/8849043/diff/1/worker/uniter/uniter_test.go#ne... worker/uniter/uniter_test.go:420: verifyHookSyncLockLocked, On 2013/04/18 04:16:52, thumper wrote: > Can someone check this please? I want the uniter's init method to have been > run, but not any hooks. If we have something that is going to wait or check on > a hook, it'll block. I think it's easier to do a quickStart, and then stop the uniter, grab the lock, start it up again and waitHooks{}. (startUniter *might* waitHooks{"config-changed"}, I forget, that's not what we want, keep an eye out) Good to also test what happens when something else grabs the lock while the uniter's still running but before a (new, cause-by-test) config-changed or similar is induced.
Sign in to reply to this message.
On 2013/04/19 00:24:49, fwereade wrote: > LGTM with some comments -- an an exhortation not to actually *land* anything > until we understand the packaging/releases situation. Holding off until the freeze is lifted. > https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go > File utils/fslock/fslock.go (right): > > https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode57 > utils/fslock/fslock.go:57: // expression `^[a-z]+[a-z0-9.-]*`. > On 2013/04/18 23:19:37, thumper wrote: > > On 2013/04/18 08:37:05, dimitern wrote: > > > s/`^[a-z]+[a-z0-9.-]*`/defined by nameRegexp/ ? > > > > This is documentation on the public method. The package private nameRegexp > > isn't documented, so we'd be pointing people to something they can't see. I'd > > prefer to keep this explicit. > > Maybe best to make it explicit as an exported const, and reference that, than to > enable drift by describing it twice. Sounds like a plan. > https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go > File utils/fslock/fslock_test.go (right): > > https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... > utils/fslock/fslock_test.go:306: c.Check(err, IsNil) > On 2013/04/18 23:19:37, thumper wrote: > > On 2013/04/18 08:20:28, rog wrote: > > > return if this fails? (and the checks below too, probably - we don't want to > > > carry on for many attempts after a failure) > > > > Changed Checks to Asserts. That will exit the goroutine. > > I forget the precise failure mode, but IIRC Asserts in goroutines don't end > well. Check returns failure, though. I read the code, and Assert just calls runtime.Goexit, which terminates the calling goroutine. Since the calling goroutine in this instance is just the testing function, it's fine. Other goroutines are still running, so don't expect all the test to stop. This is one reason why there is the check at the end that we had the expected number of locks taken. > https://codereview.appspot.com/8849043/diff/1/worker/uniter/uniter_test.go > File worker/uniter/uniter_test.go (right): > > https://codereview.appspot.com/8849043/diff/1/worker/uniter/uniter_test.go#ne... > worker/uniter/uniter_test.go:420: verifyHookSyncLockLocked, > On 2013/04/18 04:16:52, thumper wrote: > > Can someone check this please? I want the uniter's init method to have been > > run, but not any hooks. If we have something that is going to wait or check > on > > a hook, it'll block. > > I think it's easier to do a quickStart, and then stop the uniter, grab the lock, > start it up again and waitHooks{}. (startUniter *might* > waitHooks{"config-changed"}, I forget, that's not what we want, keep an eye out) > > Good to also test what happens when something else grabs the lock while the > uniter's still running but before a (new, cause-by-test) config-changed or > similar is induced. But that isn't what this one is testing... The first test case does exactly what you say. It is the tests that have an initial lock that is owned by someone else, and we want to start a uniter, have the init method called (which is where it would break its own lock) and then confirm that the lock owned by someone else is still there.
Sign in to reply to this message.
https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode109 utils/fslock/fslock.go:109: err = ioutil.WriteFile(path.Join(tempDirName, heldFilename), lock.nonce, 0755) On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > i think this should write the message too into the same file. > > that way we've only got 7 fs operations per lock rather and 10. > > > > i'm still concerned about disk churn with this implementation - we seem to be > > using directories from of some vague notion of portability rather than > actually > > knowing what our platform constraints are. > > > > one other thing that occurs to me is that we're creating the temp directory > each > > time we attempt to acquire the lock, where actually we need only create it > once > > per Lock; then "acquire" is simply a rename > > and the Lock method can clean up the temp directory if it fails. > > I think you are being needlessly concerned. > > acquire is only called if the lock is unlocked, and fails generally if only > someone else beat them to it. Given that we can acquire locks in > sub-millisecond time (as shown by the tests), I think merging the held and > message files isn't worth it. held is binary, and message is text. yeah, sorry, i'd missed the Stat at the start of this function. that said, you wouldn't need the Stat if you'd already created the directory and just tried the rename. BTW have you tried the tests on an ec2 instance? i'd be interested to know how well they performed there. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) On 2013/04/19 00:24:49, fwereade wrote: > On 2013/04/18 23:19:37, thumper wrote: > > On 2013/04/18 08:37:05, dimitern wrote: > > > On 2013/04/18 08:20:28, rog wrote: > > > > this is a Debugf at best. > > > > in fact, i don't think it even justifies that - we don't want this > printing > > 3 > > > or > > > > 4 times a second while several charms wait for a long-lived hook. perhaps > > Lock > > > > could print it after every n iterations. > > > > > > Also, please add "utils: " badge as prefix and make the message start with a > > > lowercase letter. > > > > Just removed it. Don't think it added too much value. > > The logs will show which hooks are running. > > I think explicit logging of contention is a good thing and deserves an Infof: as > stated, this case will only ever be hit during a very narrow window, so I think > that we can comfortably diagnose 3-4 messages/sec as indicating a Problem. if we have a charm with three subordinates and its install hook takes a few minutes (not uncommon), this will surely be the expected case, no? https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:20:28, rog wrote: > > i'd be tempted to merge this implementation with LockWithTimeout. > > > > perhaps something like: > > > > // lockWithTimeout tries to acquire the lock. If the deadline is > > // non-zero and it cannot acquire the lock before then, > > // it returns ErrTimeout. > > func (lock *Lock) lockWithDeadline(deadline time.Time, message string) error { > > for { > > acquired, err := lock.acquire(message) > > if err != nil { > > return err > > } > > if acquired { > > return nil > > } > > if !deadline.IsZero() && time.Now().After(deadline) { > > return ErrTimeout > > } > > time.Sleep(lockWaitDelay) > > } > > panic("unreachable") > > } > > > > then Lock can call lock.lockWithDeadline(time.Time{}, message) > > and LockWithTimeout can call lock.lockWithDeadline(time.Now().Add(timeout), > > message) > > No. Names are all about intent. If I want to lock and not have a deadline, I > don't want to call a method called lockWithDeadline. "Clean Code" tells us so. and the fact that both functions are actually identical but for one if statement doesn't raise your DRY hackles? i find that a little odd, a feeling made more acute by the fact that they are now subtly different for no obvious reason. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { On 2013/04/19 00:24:49, fwereade wrote: > On 2013/04/18 23:19:37, thumper wrote: > > On 2013/04/18 08:20:28, rog wrote: > > > do we need this? > > > > Possibly not, but now it is written and tested, I'd prefer to see some use of > > this before we remove it as YAGNI. > > That's somewhat contradictory ;p. > > However I am myself not sure what the eventual requirements for this sort of > locking will be, so I'm loath to consign this functionality to a forgotten > personal branch forever. i'd prefer to lose it. AFAICS this functionality is the only reason that it's convenient to use an extra message file. please let's keep things minimal unto we *do* know what the eventual requirements are, otherwise things will come to depend on this functionality for no good reason. every unnecessary line of code in the source tree is a part of the burden we have to carry.
Sign in to reply to this message.
On 2013/04/19 04:43:59, thumper wrote: > But that isn't what this one is testing... The first test case does exactly > what you say. It is the tests that have an initial lock that is owned by > someone else, and we want to start a uniter, have the init method called (which > is where it would break its own lock) and then confirm that the lock owned by > someone else is still there. Yep, I'm on crack there. Sorry. How about waitAddresses{}, waitHooks{}, verifylocked{}, unlock{}, waitHooks{"install"...}?
Sign in to reply to this message.
A few more thoughts: https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) On 2013/04/19 06:26:01, rog wrote: > if we have a charm with three subordinates and its install hook takes a few > minutes (not uncommon), this will surely be the expected case, no? I don't think so, because acquire will only get this far if the lock was *not* held, so that's actually 3 or 4 minutes in which this is guaranteed *not* to happen. When all the subordinates fight for the lock, that's another matter. (FWIW the specific situation you described is impossible, because the uniter won't start creating its own subordinates until it's in "started", but various distinct subordinates' install hooks could absolutely contend.) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { On 2013/04/19 06:26:01, rog wrote: > On 2013/04/18 23:19:37, thumper wrote: > > No. Names are all about intent. If I want to lock and not have a deadline, I > > don't want to call a method called lockWithDeadline. "Clean Code" tells us > so. > > and the fact that both functions are actually identical but for one if statement > doesn't raise your DRY hackles? > i find that a little odd, a feeling made more acute by the fact that they are > now subtly different for no obvious reason. Consider DRY hackles raised. But the following feels like a reasonably clean separation of responsibilities... func (lock *Lock) lock(message string, onFailure func() error) error { for { if acquired, err := lock.acquire(message); err != nil { return err } else if acquired { return nil } if err := onFailure(); err != nil { return err } time.Sleep(lockDelay) } panic("unreachable") } func (lock *Lock) Lock(message string) error { heldMessage := "" return lock.lock(message, func() error { if currMessage := lock.Message(); currMessage != heldMessage { log.Infof("blah") heldMessage = currMessage } return nil }) } https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { On 2013/04/19 06:26:01, rog wrote: > i'd prefer to lose it. AFAICS this functionality is the only reason that it's > convenient to use an extra message file. I don't think there's a connection. It's not because it's "convenient", it's because the two pieces of information have different meanings and uses. > please let's keep things minimal unto we *do* know what the eventual > requirements are, otherwise things will come to depend on this functionality for > no good reason. every unnecessary line of code in the source tree is a part of > the burden we have to carry. A coherent and well-tested abstraction is I think worth a few extra lines of code that don't currently happen to be used in anger. If it's used, it's used, and that's great (I really don't think "someone might use it" is a reason not to implement it...); but if it's not we can drop it at our leisure *without* the sort of tedious reanalysis of the whole type that'd be necessary to safely add code like this (as we'd have to do if we deleted it now and wanted it later). https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode226 utils/fslock/fslock.go:226: tempFilename := tempFile.Name() I think a file's name is still available after it's closed. Maybe we shouldn't depend on that though. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go File utils/fslock/fslock_test.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock_test.go#new... utils/fslock/fslock_test.go:30: fslock.SetLockWaitDelay(1 * time.Millisecond) Can we make this return the original value, and store that on the suite, so we can set to back to what it definitely was before we started? https://codereview.appspot.com/8849043/diff/9001/worker/uniter/uniter.go File worker/uniter/uniter.go (right): https://codereview.appspot.com/8849043/diff/9001/worker/uniter/uniter.go#newc... worker/uniter/uniter.go:291: if err = u.hookLock.Lock(fmt.Sprintf("%s: running hook %q", u.unit.Name(), hookName)); err != nil { Hey, wait. This will make the task unkillable... we should be able to do something like: u.hookLock.Acquire(message) select { case <-u.tomb.Dying(): return tomb.ErrDying case <-u.hookLock.Acquired(): defer u.hookLock.Release() } ...but I think this bug is strictly less bad than the one it's trying to solve, so I'm ok to land this if there'll be a followup to resolve it soon.
Sign in to reply to this message.
https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode47 utils/fslock/fslock.go:47: func generateNonce() ([]byte, error) { On 2013/04/18 23:19:37, thumper wrote: > On 2013/04/18 08:37:05, dimitern wrote: > > please, use trivial(utils?)/uuid: NewUUID() - we have way too much such > > functions scattered all over the place. > > Sorry, I tried, but not simple at this stage. > > It seems overkill to write the string representation of the uuid to the file, > but also there is no way to create a uuid from a byte slice, no equality > operator, no way to easily write the bytes out in one line: uuid.Raw()[:] > doesn't work. > > I'd be open to approving a patch later that changed this, but not right now. nonce, err := utils.UUID() ... lock := &Lock{ nonce: nonce[:], } will work just fine. you can still store the uuid as a []byte to make it easy to read and compare. BTW you might find it worth your while to read some of the language spec again. In particular http://golang.org/ref/spec#Conversions and http://golang.org/ref/spec#Comparison_operators . all that said, i don't mind this code much. the string representation of a uuid seems wilfully obscure to me and this code is trivial. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode123 utils/fslock/fslock.go:123: log.Infof("Lock %q beaten to the dir rename, %s, currently held: %s", lock.name, message, lock.Message()) On 2013/04/19 08:36:33, fwereade wrote: > On 2013/04/19 06:26:01, rog wrote: > > if we have a charm with three subordinates and its install hook takes a few > > minutes (not uncommon), this will surely be the expected case, no? > > I don't think so, because acquire will only get this far if the lock was *not* > held, so that's actually 3 or 4 minutes in which this is guaranteed *not* to > happen. When all the subordinates fight for the lock, that's another matter. > > (FWIW the specific situation you described is impossible, because the uniter > won't start creating its own subordinates until it's in "started", but various > distinct subordinates' install hooks could absolutely contend.) sorry, i thought you were referring to both log messages. this one will indeed not be shown many times and can probably remain (although i still think it's just noise) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { On 2013/04/19 08:36:33, fwereade wrote: > On 2013/04/19 06:26:01, rog wrote: > > On 2013/04/18 23:19:37, thumper wrote: > > > No. Names are all about intent. If I want to lock and not have a deadline, > I > > > don't want to call a method called lockWithDeadline. "Clean Code" tells us > > so. > > > > and the fact that both functions are actually identical but for one if > statement > > doesn't raise your DRY hackles? > > i find that a little odd, a feeling made more acute by the fact that they are > > now subtly different for no obvious reason. > > Consider DRY hackles raised. But the following feels like a reasonably clean > separation of responsibilities... > > func (lock *Lock) lock(message string, onFailure func() error) error { > for { > if acquired, err := lock.acquire(message); err != nil { > return err > } else if acquired { > return nil > } > if err := onFailure(); err != nil { > return err > } > time.Sleep(lockDelay) > } > panic("unreachable") > } > > func (lock *Lock) Lock(message string) error { > heldMessage := "" > return lock.lock(message, func() error { > if currMessage := lock.Message(); currMessage != heldMessage { > log.Infof("blah") > heldMessage = currMessage > } > return nil > }) > } to me that seems more complex and harder to follow than my suggestion. we only need one degree of freedom - a general onFailure function argument seems like overkill here. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { > No. Names are all about intent. If I want to lock and not have a deadline, I > don't want to call a method called lockWithDeadline. "Clean Code" tells us so. ok, call it something different then. lock0, generalLock, lockWithPossibleDeadline, sharedLockImplementation, i don't really mind, but duplicating all that code for one if seems wrong. FWIW passing a zero time.Time to signify no timeout is not an uncommon idiom (see net.Conn.SetDeadline for example) https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { On 2013/04/19 08:36:33, fwereade wrote: > On 2013/04/19 06:26:01, rog wrote: > > i'd prefer to lose it. AFAICS this functionality is the only reason that it's > > convenient to use an extra message file. > > I don't think there's a connection. It's not because it's "convenient", it's > because the two pieces of information have different meanings and uses. it also has significant runtime implications. this is a piece of core infrastructure - let's make it work efficiently. > > please let's keep things minimal unto we *do* know what the eventual > > requirements are, otherwise things will come to depend on this functionality > for > > no good reason. every unnecessary line of code in the source tree is a part of > > the burden we have to carry. > > A coherent and well-tested abstraction is I think worth a few extra lines of > code that don't currently happen to be used in anger. If it's used, it's used, > and that's great (I really don't think "someone might use it" is a reason not to > implement it...); but if it's not we can drop it at our leisure *without* the > sort of tedious reanalysis of the whole type that'd be necessary to safely add > code like this (as we'd have to do if we deleted it now and wanted it later). YAGNI https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode226 utils/fslock/fslock.go:226: tempFilename := tempFile.Name() On 2013/04/19 08:36:33, fwereade wrote: > I think a file's name is still available after it's closed. Maybe we shouldn't > depend on that though. it's fine to do that. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode227 utils/fslock/fslock.go:227: fmt.Fprint(tempFile, message) i think this should probably check for error - we might actually care if we can't set the message because the filesystem is full or something.
Sign in to reply to this message.
This branch has languished long enough, and I think Tim's more than capable of addressing the remaining comments according to his own taste and judgment now the issues have been raised. I favour merging it as soon as possible so we can see how it handles in action (but let's followup with the channel interface ASAP, because I'm not 100% sure it is impossible for the current implementation to deadlock somehow). https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode138 utils/fslock/fslock.go:138: func (lock *Lock) Lock(message string) error { On 2013/04/19 09:56:14, rog wrote: > to me that seems more complex and harder to follow than my suggestion. > we only need one degree of freedom - a general onFailure function > argument seems like overkill here. To me it seems ideal, at least in theory, because it actually separates the common bits from the implementation-specific bits. But this has become a ludicrous bikeshed; so Tim, please do as you wish. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode216 utils/fslock/fslock.go:216: func (lock *Lock) SetMessage(message string) error { On 2013/04/19 09:56:14, rog wrote: > it also has significant runtime implications. this is a piece of core > infrastructure - let's make it work efficiently. Let's stick with clarity, for now; when the performance problems apparently inherent in this approach manifest, we will surely address them. > YAGNI If Tim thinks it'll be needed, I'm willing to give him the benefit of the doubt. If it becomes clearly useless, the cost of deleting it is trivial; and if it gets used, hey, nice foresight. It's not as though this code is forcing the implementation strategy for a large chunk of the codebase... https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode227 utils/fslock/fslock.go:227: fmt.Fprint(tempFile, message) On 2013/04/19 09:56:14, rog wrote: > i think this should probably check for error - we might actually care if we > can't set the message because the filesystem is full or something. +1 https://codereview.appspot.com/8849043/diff/9001/worker/uniter/uniter.go File worker/uniter/uniter.go (right): https://codereview.appspot.com/8849043/diff/9001/worker/uniter/uniter.go#newc... worker/uniter/uniter.go:291: if err = u.hookLock.Lock(fmt.Sprintf("%s: running hook %q", u.unit.Name(), hookName)); err != nil { Or even just: select { case <-u.tomb.Dying(): return tomb.ErrDying case err := <-u.hookLock.Acquire(): if err != nil { return err } defer u.hookLock.Release() } ...but Acquire's clearly not quite the right name. Anyway, ponder it, and please do a followup that unblocks the uniter if it dies when waiting for the lock.
Sign in to reply to this message.
I managed to completely misunderstand rog's earlier comment about a single internal lock function that was shared between the two public Lock methods. Sorry about that. Yes rog was right in that the two functions being almost identical did rub my DRY principles. I now have one common shared internal lock function. I also got rid of SetMessage. If we find that we need it later, we could bring it back. Did I mention before that I am stubborn? https://codereview.appspot.com/8849043/diff/9001/worker/uniter/uniter.go File worker/uniter/uniter.go (right): https://codereview.appspot.com/8849043/diff/9001/worker/uniter/uniter.go#newc... worker/uniter/uniter.go:291: if err = u.hookLock.Lock(fmt.Sprintf("%s: running hook %q", u.unit.Name(), hookName)); err != nil { On 2013/04/24 20:38:50, fwereade wrote: > Or even just: > > select { > case <-u.tomb.Dying(): > return tomb.ErrDying > case err := <-u.hookLock.Acquire(): > if err != nil { > return err > } > defer u.hookLock.Release() > } > > ...but Acquire's clearly not quite the right name. Anyway, ponder it, and please > do a followup that unblocks the uniter if it dies when waiting for the lock. I've tweaked this somewhat, so instead of having the lock have to care about channels, there is a LockWithFunc that has a callback to decide if to continue waiting. This callback checks the tomb.
Sign in to reply to this message.
https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode47 utils/fslock/fslock.go:47: func generateNonce() ([]byte, error) { On 2013/04/19 09:56:14, rog wrote: > On 2013/04/18 23:19:37, thumper wrote: > > On 2013/04/18 08:37:05, dimitern wrote: > > > please, use trivial(utils?)/uuid: NewUUID() - we have way too much such > > > functions scattered all over the place. > > > > Sorry, I tried, but not simple at this stage. > > > > It seems overkill to write the string representation of the uuid to the file, > > but also there is no way to create a uuid from a byte slice, no equality > > operator, no way to easily write the bytes out in one line: uuid.Raw()[:] > > doesn't work. > > > > I'd be open to approving a patch later that changed this, but not right now. > > nonce, err := utils.UUID() > ... > lock := &Lock{ > nonce: nonce[:], > } > > will work just fine. you can still store the uuid as a []byte > to make it easy to read and compare. > > BTW you might find it worth your while to read some of the language > spec again. In particular http://golang.org/ref/spec#Conversions > and http://golang.org/ref/spec#Comparison_operators . > > all that said, i don't mind this code much. the string representation of a > uuid seems wilfully obscure to me and this code is trivial. Done. https://codereview.appspot.com/8849043/diff/1/utils/fslock/fslock.go#newcode57 utils/fslock/fslock.go:57: // expression `^[a-z]+[a-z0-9.-]*`. On 2013/04/19 00:24:49, fwereade wrote: > On 2013/04/18 23:19:37, thumper wrote: > > On 2013/04/18 08:37:05, dimitern wrote: > > > s/`^[a-z]+[a-z0-9.-]*`/defined by nameRegexp/ ? > > > > This is documentation on the public method. The package private nameRegexp > > isn't documented, so we'd be pointing people to something they can't see. I'd > > prefer to keep this explicit. > > Maybe best to make it explicit as an exported const, and reference that, than to > enable drift by describing it twice. Done.
Sign in to reply to this message.
*** Submitted: Serialize the uniter hook executions. Implement a file system lock, and use that in the uniter to serialize the hook execution across different uniters running on a single machine. R=rog, dimitern, fwereade CC= https://codereview.appspot.com/8849043
Sign in to reply to this message.
LGTM and thanks for listening :-) a few minor comments below that you might want to address in another branch if you feel like it. it's nice to have a genuine use for LockWithFunc. https://codereview.appspot.com/8849043/diff/21001/utils/fslock/fslock.go File utils/fslock/fslock.go (right): https://codereview.appspot.com/8849043/diff/21001/utils/fslock/fslock.go#newc... utils/fslock/fslock.go:179: return lock.lockLoop(message, continueFunc) i know you've already submitted, but one minor suggestion: given they're exactly the same thing, why not just rename lockLoop to LockWithFunc and have the other lock functions call that? https://codereview.appspot.com/8849043/diff/21001/utils/fslock/fslock.go#newc... utils/fslock/fslock.go:214: // BreakLock forcably breaks the lock that is currently being held. "forcibly breaks a lock if it is currently being held" i think might be more accurate, as there's nothing that checks if the lock is held, and no error is returned if it is not.
Sign in to reply to this message.
|