|
|
Description

utils: added Tailer for tailing of logs in API

The Tailer is the initial component of the debug logging command
of the API. It allows the filtered tailing of any ReadSeeker.
If no filter is passed, all lines are written to the passed
Writer; otherwise only those for which the filter function returns
true. The initial number of lines can also be specified, and the
filter applies there as well. So if a File (which is a ReadSeeker)
contains 100 lines, 10 lines are requested, and only 5 match the
filter, only those 5 lines are returned.
https://code.launchpad.net/~themue/juju-core/057-tailer/+merge/197522
(do not edit description out of merge proposal)
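The filtering behaviour described above can be sketched in isolation. This is a minimal, hypothetical illustration, not code from the branch: the []byte filter signature follows the later review discussion, and errorsOnly is an invented example filter.

```go
package main

import (
	"bytes"
	"fmt"
)

// TailerFilterFunc mirrors the filter idea described above: it reports
// whether a line should be passed through to the writer. (Hypothetical
// sketch; the exact type in the branch may differ.)
type TailerFilterFunc func(line []byte) bool

// errorsOnly is a hypothetical example filter keeping only ERROR lines.
func errorsOnly(line []byte) bool {
	return bytes.Contains(line, []byte("ERROR"))
}

func main() {
	input := [][]byte{
		[]byte("INFO starting unit"),
		[]byte("ERROR connection lost"),
		[]byte("INFO retrying"),
	}
	// With a filter, only matching lines reach the writer;
	// with no filter, every line would be written.
	for _, line := range input {
		if errorsOnly(line) {
			fmt.Printf("%s\n", line)
		}
	}
}
```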
Patch Set 1 #
Patch Set 2 : utils: added Tailer for tailing of logs in API #
Total comments: 30
Patch Set 3 : utils: added Tailer for tailing of logs in API #
Total comments: 20
Patch Set 4 : utils: added Tailer for tailing of logs in API #
Total comments: 13
Patch Set 5 : utils: added Tailer for tailing of logs in API #
Total comments: 13
Patch Set 6 : utils: added Tailer for tailing of logs in API #
Total comments: 4
Patch Set 7 : utils: added Tailer for tailing of logs in API #
Messages

Total messages: 12
Please take a look.
Please take a look.
This is a good start, but I have a few comments and suggestions. I wonder if it might sit well inside its own package rather than being added to the grab-bag of stuff in utils.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go
File utils/tailer.go (right):

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode30
utils/tailer.go:30: func StartFileTailer(filename string, lines int, filter TailerFilterFunc,
I think this function is unnecessary. It's trivial for other code to open a file.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode40
utils/tailer.go:40: func StartTailer(readSeeker io.ReadSeeker, lines int, filter TailerFilterFunc,
Please document this function properly.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode87
utils/tailer.go:87: println("> error:", err.Error())
d

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode98
utils/tailer.go:98: if buffer[i] == '\n' {
This isn't strictly accurate (what happens when we have an unterminated line at the end of the file?) but it's probably ok if we never filter unterminated lines. A comment about why it's ok might prevent future puzzling over the correctness of the code.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode100
utils/tailer.go:100: if foundNewlines-1 == t.lines {
s/==/>=/ defensively.

Also, if we move the if statement before the increment, then you can lose the "-1".

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode127
utils/tailer.go:127: line, err := reader.ReadString('\n')
I think I'd be tempted to use ReadSlice here (changing the type of filter to func([]byte) bool). That way the tailer won't need to generate any garbage at all AFAICS.

Unfortunately that limits the line size to the size of the bufio buffer, which may be acceptable. An alternative is to use ReadLine and allocate only if the line is too long.

Here's some code I wrote a while ago to do that, in case it might be useful:

    // readLine reads a line from r.
    // The returned byte slice is only valid until the
    // next read call on r.
    func readLine(r *bufio.Reader) ([]byte, error) {
        line, isPrefix, err := r.ReadLine()
        if !isPrefix {
            return line, err
        }
        buf := append([]byte(nil), line...)
        for isPrefix && err == nil {
            line, isPrefix, err = r.ReadLine()
            buf = append(buf, line...)
        }
        return buf, err
    }

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode147
utils/tailer.go:147: for {
This code is exactly the same as the code above. Can't we do with just this? (If we just delete the loop above and use NewTimer(0), I think it might just work.)

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode149
utils/tailer.go:149: if len(line) > 0 {
I think we should ignore the line at this point if it is unterminated, something that's quite possible if the file is being written in arbitrary chunks. If we find an unterminated line, we should seek back to the start of the line before we start reading again.

That means we could pass the line to filter without the trailing \n, meaning we can be more resilient if something starts producing lines ending in \r\n.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode151
utils/tailer.go:151: writer.WriteString(line)
We should definitely check the error here and in Flush below - I don't think we want to carry on trying to write to a dead network connection or full filesystem.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode162
utils/tailer.go:162: t.readSeeker.Seek(0, os.SEEK_END)
This is wrong, I think - we could miss data if some has been written in between seeing the EOF and this seek. I think the only time we want to seek is if we get an unterminated line.

That said, there's another tricky case - what happens if the file gets truncated? I wonder if that should be done with separate logic to detect whether the file has suddenly reduced in size. We'll need separate logic anyway to decide when to reopen the file when log files roll over, so that might work out ok.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go
File utils/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:25: buffer := bytes.NewBuffer([]byte{})
var buffer bytes.Buffer

is simpler and does almost the same thing.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:124: func assertCollected(c *gc.C, buffer *bytes.Buffer, collected []string, addon func([]string), timeout bool) {
Please could we have a comment describing what this function is doing?

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:162: var rs *readSeeker = new(readSeeker)
var rs readSeeker

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:206: if r.pos == len(r.buffer) {
I think we can usefully use the result of copy here:

    if r.pos >= len(r.buffer) {
        return 0, io.EOF
    }

    n := copy(p, r.buffer[r.pos:])
    r.pos += n
    return n, nil

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:244: func signal(c *gc.C, sigc chan struct{}) {
Given that the signal channel is buffered, can't we just send on the channel?
Please take a look.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go
File utils/tailer.go (right):

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode30
utils/tailer.go:30: func StartFileTailer(filename string, lines int, filter TailerFilterFunc,
On 2013/12/05 16:20:37, rog wrote:
> I think this function is unnecessary. It's trivial for other code to open a
> file.

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode40
utils/tailer.go:40: func StartTailer(readSeeker io.ReadSeeker, lines int, filter TailerFilterFunc,
On 2013/12/05 16:20:37, rog wrote:
> Please document this function properly.

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode87
utils/tailer.go:87: println("> error:", err.Error())
On 2013/12/05 16:20:37, rog wrote:
> d

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode98
utils/tailer.go:98: if buffer[i] == '\n' {
On 2013/12/05 16:20:37, rog wrote:
> This isn't strictly accurate (what happens when we have an unterminated line at
> the end of the file?) but it's probably ok if we never filter unterminated
> lines. A comment about why it's ok might prevent future puzzling over the
> correctness of the code.

Added a comment at the type declaration.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode100
utils/tailer.go:100: if foundNewlines-1 == t.lines {
On 2013/12/05 16:20:37, rog wrote:
> s/==/>=/
>
> defensively.
>
> Also, if we move the if statement before the increment,
> then you can lose the "-1".

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode127
utils/tailer.go:127: line, err := reader.ReadString('\n')
On 2013/12/05 16:20:37, rog wrote:
> I think I'd be tempted to use ReadSlice here (changing the type of filter to
> func([]byte) bool). That way the tailer won't need to generate any garbage at
> all AFAICS.
>
> Unfortunately that limits the line size to the size of the bufio buffer, which
> may be acceptable. An alternative is to use ReadLine and allocate only if the
> line is too long.
>
> Here's some code I wrote a while ago to do that, in case it might be useful:
>
>     // readLine reads a line from r.
>     // The returned byte slice is only valid until the
>     // next read call on r.
>     func readLine(r *bufio.Reader) ([]byte, error) {
>         line, isPrefix, err := r.ReadLine()
>         if !isPrefix {
>             return line, err
>         }
>         buf := append([]byte(nil), line...)
>         for isPrefix && err == nil {
>             line, isPrefix, err = r.ReadLine()
>             buf = append(buf, line...)
>         }
>         return buf, err
>     }

Good hint, using it.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode147
utils/tailer.go:147: for {
On 2013/12/05 16:20:37, rog wrote:
> This code is exactly the same as the code above.
> Can't we do with just this? (if we just delete the
> loop above and use NewTimer(0), I think it might
> just work).

Yep, the split has been due to the initial approach of scanning all up to the end first. Removed. Thanks.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode149
utils/tailer.go:149: if len(line) > 0 {
On 2013/12/05 16:20:37, rog wrote:
> I think we should ignore the line at this point if it is unterminated, something
> that's quite possible if the file is being written in arbitrary chunks. If we
> find an unterminated line, we should seek back to the start of the line before
> we start reading again.
>
> That means we could pass the line to filter without the trailing \n, meaning we
> can be more resilient if something starts producing lines ending in \r\n.

Found a solution for unterminated lines, but currently ignoring possible \r\n and passing the termination to the filter too. For working with delimiters other than \n, more changes are needed.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode151
utils/tailer.go:151: writer.WriteString(line)
On 2013/12/05 16:20:37, rog wrote:
> We should definitely check the error here and in Flush below - I don't think we
> want to carry on trying to write to a dead network connection or full
> filesystem.

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode162
utils/tailer.go:162: t.readSeeker.Seek(0, os.SEEK_END)
On 2013/12/05 16:20:37, rog wrote:
> This is wrong, I think - we could miss data if some has been
> written in between seeing the EOF and this seek.
> I think the only time we want to seek is if we get an unterminated line.
>
> That said, there's another tricky case - what happens if the file gets
> truncated? I wonder if that should be done with separate logic to detect whether
> the file has suddenly reduced in size. We'll need separate logic anyway to
> decide when to reopen the file when log files roll over, so that might work out
> ok.

Removed the seek and keeping the truncation in mind.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go
File utils/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:25: buffer := bytes.NewBuffer([]byte{})
On 2013/12/05 16:20:37, rog wrote:
> var buffer bytes.Buffer
>
> is simpler and does almost the same thing.

Almost, but I need the pointer (for io.Writer in NewTailer() and in the assertion).

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:124: func assertCollected(c *gc.C, buffer *bytes.Buffer, collected []string, addon func([]string), timeout bool) {
On 2013/12/05 16:20:37, rog wrote:
> Please could we have a comment describing what this function is doing, please?

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:162: var rs *readSeeker = new(readSeeker)
On 2013/12/05 16:20:37, rog wrote:
> var rs readSeeker

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:206: if r.pos == len(r.buffer) {
On 2013/12/05 16:20:37, rog wrote:
> I think we can usefully use the result of copy here:
>
>     if r.pos >= len(r.buffer) {
>         return 0, io.EOF
>     }
>
>     n := copy(p, r.buffer[r.pos:])
>     r.pos += n
>     return n, nil

Awesome, totally forgot the return value of copy. *sigh* Have to get better regarding slice operations.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newco...
utils/tailer_test.go:244: func signal(c *gc.C, sigc chan struct{}) {
On 2013/12/05 16:20:37, rog wrote:
> Given that the signal channel is buffered, can't we just send on the channel?

Oh, yes, that's left from an older approach.
Getting there! A few more thoughts and suggestions below.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:41: // Writer. The reading beginns the specified number of matching lines
s/beginns/begins/

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:43: func NewStandardTailer(readSeeker io.ReadSeeker, writer io.Writer, lines int, filter TailerFilterFunc) *Tailer {
Given that this function is going to be the one that everyone calls, I think I'd name it NewTailer, but do we actually need to expose buffer size and poll interval publicly except to the tailer tests?

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:126: readBuffer := make([]byte, t.bufsize)
This seems rather complex to me - I don't fully trust myself to vet the logic.

Also, I don't think we need to have a separate line buffer and read buffer - we can use a single buffer and read directly into it. That way we can avoid allocating a new buffer every time round the loop too, although it is necessary to copy something from the start to the end.

I started trying to explain a better possibility, but got carried away... Here's the kind of thing I mean. Still somewhat complex, but I've tried to keep the invariants simple. It needs testing (I've left in one deliberate mistake that I hope to see a test for). I suggest some internal tests that test this function in isolation with quite a few different inputs.

    // seekLastLines sets the read position of the ReadSeeker to the
    // wanted number of filtered lines before the end.
    func (t *Tailer) seekLastLines() error {
        offset, err := t.readSeeker.Seek(0, os.SEEK_END)
        if err != nil {
            return err
        }
        seekPos := int64(0)
        found := 0
        buf := make([]byte, minRead)
    SeekLoop:
        for offset > 0 {
            // buf contains the data left over from the
            // previous iteration.
            space := cap(buf) - len(buf)
            if space < minRead {
                // grow buffer
                newBuf := make([]byte, len(buf), cap(buf)*2)
                copy(newBuf, buf)
                buf = newBuf
                space = cap(buf) - len(buf)
            }
            if int64(space) > offset {
                // Use exactly the right amount of space if there's
                // only a small amount remaining.
                space = int(offset)
            }
            // copy data remaining from last time to the end of the buffer,
            // so we can read into the right place.
            copy(buf[space:cap(buf)], buf)
            buf = buf[0 : len(buf)+space]
            offset -= int64(space)
            _, err := t.readSeeker.Seek(offset, os.SEEK_SET)
            if err != nil {
                return err
            }
            _, err = io.ReadFull(t.readSeeker, buf[0:space])
            if err != nil {
                return err
            }
            // Find the end of the last line in the buffer.
            // This will discard any unterminated line at the end
            // of the file.
            end := bytes.LastIndex(buf, delim)
            if end == -1 {
                // No end of line found - discard incomplete
                // line and continue looking. If this happens
                // at the beginning of the file, we don't care
                // because we're going to stop anyway.
                buf = buf[:0]
                continue
            }
            end++
            for {
                start := bytes.LastIndex(buf[0:end-1], delim)
                if start == -1 {
                    break
                }
                start++
                if t.isValid(buf[start:end]) {
                    found++
                    if found >= t.lines {
                        seekPos = offset + int64(start)
                        break SeekLoop
                    }
                }
                end = start - 1
            }
            // Leave the last line in buf, as we don't know whether
            // it's complete or not.
            buf = buf[0:end]
        }
        // Final positioning.
        t.readSeeker.Seek(seekPos, os.SEEK_SET)
        return nil
    }

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:198: // Reached beginnig of data.
s/beginnig/beginning/

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:212: buffer, err := t.reader.ReadBytes(delimiter)
ReadBytes can't return ErrBufferFull.

I think you want to use ReadSlice. Also, if the first call succeeds, we should just return the slice that we got from ReadSlice - that way in the usual case that the line is shorter than the bufio buffer we can avoid any allocation.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:226: if len(line) == 0 || line[len(line)-1] == delimiter {
This can't happen. (From the docs: "returns err != nil if and only if the returned data does not end in delim".)

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:234: t.readSeeker.Seek(offset, os.SEEK_END)
This is racy. We can't seek from the end because the end might be constantly changing. I think we need to keep track of where we're reading in the file and seek back to the absolute offset of the start of the partial line.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:247: return false
Why?

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:252: return t.filter(line)
Consider trimming \r?\n from the end of the line before calling filter?

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:4: package tailer_test
I'd like to see some unit tests of seekLastLines in here. There's a lot of logic in there that is untested here (with respect to long lines, line boundaries, read errors etc).
Please take a look.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:41: // Writer. The reading beginns the specified number of matching lines
On 2013/12/10 15:14:25, rog wrote:
> s/beginns/begins/

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:43: func NewStandardTailer(readSeeker io.ReadSeeker, writer io.Writer, lines int, filter TailerFilterFunc) *Tailer {
On 2013/12/10 15:14:25, rog wrote:
> Given that this function is going to be the one that everyone calls, I think I'd
> name it NewTailer, but do we actually need to expose buffer size and poll
> interval publicly except to the tailer tests?

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:126: readBuffer := make([]byte, t.bufsize)
On 2013/12/10 15:14:25, rog wrote:
> This seems rather complex to me - I don't fully trust myself to vet the logic.

Used it, even if I have the same troubles following your array logic and the needed debugging. Thankfully the comments helped to follow the idea. And at least it has fewer lines. ;) Thx.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:198: // Reached beginnig of data.
On 2013/12/10 15:14:25, rog wrote:
> s/beginnig/beginning/

Not needed anymore due to new function.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:212: buffer, err := t.reader.ReadBytes(delimiter)
On 2013/12/10 15:14:25, rog wrote:
> ReadBytes can't return ErrBufferFull.
>
> I think you want to use ReadSlice.
> Also, if the first call succeeds, we should just return the slice that we got
> from ReadSlice - that way in the usual case that the line is shorter than the
> bufio buffer we can avoid any allocation.

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:226: if len(line) == 0 || line[len(line)-1] == delimiter {
On 2013/12/10 15:14:25, rog wrote:
> This can't happen. (from the docs: "returns err != nil if and only if the
> returned data does not end in delim")

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:234: t.readSeeker.Seek(offset, os.SEEK_END)
On 2013/12/10 15:14:25, rog wrote:
> This is racy. We can't seek from the end because the end might be constantly
> changing. I think we need to keep track of where we're reading in the file and
> seek back to the absolute offset of the start of the partial line.

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:247: return false
On 2013/12/10 15:14:25, rog wrote:
> Why?

Yeah, too much automatic filtering here. The receiver of the data should decide. Removed it.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:252: return t.filter(line)
On 2013/12/10 15:14:25, rog wrote:
> Consider trimming \r?\n from the end of the line before calling filter?

Why?
Another round of suggestions, with one or two things still to fix. Thanks for bearing with me!

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:252: return t.filter(line)
On 2013/12/11 17:13:05, mue wrote:
> On 2013/12/10 15:14:25, rog wrote:
> > Consider trimming \r?\n from the end of the line before calling filter?
>
> Why?

Then the filter functions don't need to worry about line termination characters at all and we know that we're safe even when we have windows stuff generating \r\n lines.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:131: buffer := make([]byte, t.bufsize)
If we've got t.bufsize, "buf" would seem logical as a name. Not that it matters much though.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:211: line = append(line, slice...)
Unfortunately I think this is wrong, as we may be appending to the internal bufio.Reader buffer, which may be overwritten by the ReadSlice call.

How about separating the first-slice case from the appending logic? Something like this, perhaps?

    // readLine reads the next valid line from the reader, even if it is
    // larger than the reader buffer.
    func (t *Tailer) readLine() ([]byte, error) {
        for {
            slice, err := t.reader.ReadSlice(delimiter)
            if err == nil {
                if t.isValid(slice) {
                    return slice, nil
                }
                continue
            }
            line := append([]byte(nil), slice...)
            for err == bufio.ErrBufferFull {
                slice, err = t.reader.ReadSlice(delimiter)
                line = append(line, slice...)
            }
            switch err {
            case nil:
                if t.isValid(line) {
                    return line, nil
                }
            case io.EOF:
                // EOF without delimiter, step back.
                t.readSeeker.Seek(-int64(len(line)), os.SEEK_CUR)
                return nil, err
            default:
                return nil, err
            }
        }
    }

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:224: t.readSeeker.Seek(-offset, os.SEEK_CUR)
Ah, that's nicer than my suggestion, thanks.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:30: buffer := bytes.NewBuffer(nil)
I'm afraid the way you're using this is racy (and in all the other tests too) because you're reading from the buffer (with assertCollected) concurrently with the tailer writing to it.

You could use io.Pipe instead of bytes.Buffer to avoid the problem.

BTW whenever you've got goroutine-based code, it's worth running go test -race to check this kind of stuff.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:47: func (tailerSuite) TestLaggedTermination(c *gc.C) {
I see 10 of these tests are almost identical. This suggests to me that a table-based test might work well here, and make it easier to see what's actually being tested.

Then we could easily add quite a few more tests (for example, I'd like to see tests with blank lines in various places, and probably some more too).

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:264: disturber := func(lines []string) {
This should probably be defined just before it's used.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:331: sigc <- struct{}{}
What's the point of this? We know the data has been written when this function returns.
Please take a look.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:131: buffer := make([]byte, t.bufsize)
On 2013/12/11 18:27:41, rog wrote:
> If we've got t.bufsize, "buf" would seem logical as a name.
> Not that it matters much though.

Ack, but in the other direction. ;)

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:211: line = append(line, slice...)
On 2013/12/11 18:27:41, rog wrote:
> Unfortunately I think this is wrong, as we may be appending to the internal
> bufio.Reader buffer, which may be overwritten by the ReadSlice call.
>
> How about separating the first-slice case from the appending logic?
>
> Something like this, perhaps?
>
>     // readLine reads the next valid line from the reader, even if it is
>     // larger than the reader buffer.
>     func (t *Tailer) readLine() ([]byte, error) {
>         for {
>             slice, err := t.reader.ReadSlice(delimiter)
>             if err == nil {
>                 if t.isValid(slice) {
>                     return slice, nil
>                 }
>                 continue
>             }
>             line := append([]byte(nil), slice...)
>             for err == bufio.ErrBufferFull {
>                 slice, err = t.reader.ReadSlice(delimiter)
>                 line = append(line, slice...)
>             }
>             switch err {
>             case nil:
>                 if t.isValid(line) {
>                     return line, nil
>                 }
>             case io.EOF:
>                 // EOF without delimiter, step back.
>                 t.readSeeker.Seek(-int64(len(line)), os.SEEK_CUR)
>                 return nil, err
>             default:
>                 return nil, err
>             }
>         }
>     }

Yeah, that's better, thx.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:30: buffer := bytes.NewBuffer(nil)
On 2013/12/11 18:27:41, rog wrote:
> I'm afraid the way you're using this is racy (and in all the other tests too)
> because you're reading from the buffer (with assertCollected) concurrently with
> the tailer writing to it.
>
> You could use io.Pipe instead of bytes.Buffer to avoid the problem.
>
> BTW whenever you've got goroutine-based code, it's worth running go test -race
> to check this kind of stuff.

Done.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:47: func (tailerSuite) TestLaggedTermination(c *gc.C) {
On 2013/12/11 18:27:41, rog wrote:
> I see 10 of these tests are almost identical.
> This suggests to me that a table-based test might work well here, and make it
> easier to see what's actually being tested.
>
> Then we could easily add quite a few more tests (for example, I'd like to see
> tests with blank lines in various places, and probably some more too).

Done.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:264: disturber := func(lines []string) {
On 2013/12/11 18:27:41, rog wrote:
> This should probably be defined just before it's used.

Now a different approach with table-driven tests.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:331: sigc <- struct{}{}
On 2013/12/11 18:27:41, rog wrote:
> What's the point of this?
> We know the data has been written when this function returns.

Ouch, yes.
Another round. Hopefully done after this one!

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:49: func NewTailer(readSeeker io.ReadSeeker, writeCloser io.WriteCloser, lines int, filter TailerFilterFunc) *Tailer {
I don't think there's any particular reason we want to give this code the responsibility for closing the writer. It's quite possible we might have several tailers writing to the same writer, for example.

Better would be a Wait method (and perhaps a Dead method) to wait until it's finished. Then the caller has the option of closing the writer.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:40: {
If we move this brace onto the previous line, we can save a level of indentation in all these tests.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:42: data: data[26:29],
I like the new tests much better now. One thought though:

I'm not convinced that referring to "data" in all these tests makes them more readable - I have no idea what's in data[26:29] without counting or using grep -n.

Something like the below shows me exactly what's going on without me needing to refer to multiple places, and is independent of any changes elsewhere in the code. It would make it considerably easier for me to scan down the tests to ensure that each one represents reasonable behaviour.

    {
        description: "lines are longer than buffer size",
        data: []string{
            "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
            "0123456789012345678901234567890123456789012345678901\n",
            "the quick brown fox ",
        },
        initialLinesWritten:   1,
        initialLinesRequested: 1,
        bufferSize:            5,
        initialCollectedData: []string{
            "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
        },
        appendedCollectedData: []string{
            "0123456789012345678901234567890123456789012345678901\n",
        },
    },

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:215: line, err := buffer.ReadString('\n')
If the tailer doesn't produce enough data, we'll block here forever I think, regardless of the timeout.

I think you probably want something more like the below:

    func assertCollected(c *gc.C, reader io.Reader, compare []string, injection func([]string)) {
        lineChan := make(chan string)
        go func() {
            defer close(lineChan)
            reader := bufio.NewReader(reader)
            for {
                line, err := reader.ReadString('\n')
                if !c.Check(err, gc.IsNil) {
                    return
                }
                lineChan <- line
            }
        }()
        timeout := time.After(testing.LongWait)
        for {
            select {
            case line, ok := <-lineChan:
                c.Assert(ok, jc.IsTrue)
                // deal with line as currently
            case <-timeout:
                // timed out
            }
        }
    }

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:222: for i := 0; i < len(lines); i++ {
c.Assert(lines, gc.DeepEquals, compare) ?

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:228: if err == io.EOF {
Once we've got EOF once on a pipe, I don't think we can get any more data, can we?
Please take a look.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go#new...
utils/tailer/tailer.go:49: func NewTailer(readSeeker io.ReadSeeker, writeCloser io.WriteCloser, lines int, filter TailerFilterFunc) *Tailer {
On 2013/12/12 14:46:39, rog wrote:
> I don't think there's any particular reason we want to give this code the
> responsibility for closing the writer. It's quite possible we might have
> several tailers writing to the same writer, for example.
>
> Better would be a Wait method (and perhaps a Dead method) to wait until
> it's finished. Then the caller has the option of closing the writer.

Done as discussed.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:40: {
On 2013/12/12 14:46:39, rog wrote:
> If we move this brace onto the previous line, we can save a level of
> indentation in all these tests.

Done.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:42: data: data[26:29],
On 2013/12/12 14:46:39, rog wrote:
> I like the new tests much better now. One thought though:
>
> I'm not convinced that referring to "data" in all these tests makes them more
> readable - I have no idea what's in data[26:29] without counting or using grep
> -n.
>
> Something like the below shows me exactly what's going
> on without me needing to refer to multiple places,
> and is independent of any changes else in the code.
>
> It would make it considerably easier for me to scan down
> the tests to ensure that each one represents reasonable
> behaviour.
>
> {
>     description: "lines are longer than buffer size",
>     data: []string{
>         "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
>         "0123456789012345678901234567890123456789012345678901\n",
>         "the quick brown fox ",
>     },
>     initialLinesWritten: 1,
>     initialLinesRequested: 1,
>     bufferSize: 5,
>     initialCollectedData: []string{
>         "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
>     },
>     appendedCollectedData: []string{
>         "0123456789012345678901234567890123456789012345678901\n",
>     },
> },

In this case the test table would blow up again and would also be inconsistent (e.g. when using the standard larger group of lines). So I'm now using three data variables with more descriptive names as a compromise, and have located them near the table.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:215: line, err := buffer.ReadString('\n')
On 2013/12/12 14:46:39, rog wrote:
> If the tailer doesn't produce enough data, we'll block here forever I think,
> regardless of the timeout.
>
> I think you probably want something more like the below:
>
> func assertCollected(c *gc.C, reader io.Reader, compare []string, injection func([]string)) {
>     lineChan := make(chan string)
>     go func() {
>         defer close(lineChan)
>         reader := bufio.NewReader(reader)
>         for {
>             line, err := reader.ReadString('\n')
>             if !c.Check(err, gc.IsNil) {
>                 return
>             }
>             lineChan <- line
>         }
>     }()
>     timeout := time.After(testing.LongWait)
>     for {
>         select {
>         case line, ok := <-lineChan:
>             c.Assert(ok, jc.IsTrue)
>             // deal with line as currently
>         case <-timeout:
>             // timed out
>         }
>     }
> }

Done with an external reading goroutine.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:222: for i := 0; i < len(lines); i++ {
On 2013/12/12 14:46:39, rog wrote:
> c.Assert(lines, gc.DeepEquals, compare) ?

Done.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:228: if err == io.EOF {
On 2013/12/12 14:46:39, rog wrote:
> Once we've got EOF once on a pipe, I don't think we can get any more data,
> can we?

Yeah, I missed that when moving from the buffer to the Pipe.
LGTM with the tests fixed, thanks!

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.g...
utils/tailer/tailer_test.go:42: data: data[26:29],
On 2013/12/13 11:23:12, mue wrote:
> On 2013/12/12 14:46:39, rog wrote:
> > I like the new tests much better now. One thought though:
> >
> > I'm not convinced that referring to "data" in all these tests makes them
> > more readable - I have no idea what's in data[26:29] without counting or
> > using grep -n.
> >
> > Something like the below shows me exactly what's going
> > on without me needing to refer to multiple places,
> > and is independent of any changes else in the code.
> >
> > It would make it considerably easier for me to scan down
> > the tests to ensure that each one represents reasonable
> > behaviour.
> >
> > {
> >     description: "lines are longer than buffer size",
> >     data: []string{
> >         "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
> >         "0123456789012345678901234567890123456789012345678901\n",
> >         "the quick brown fox ",
> >     },
> >     initialLinesWritten: 1,
> >     initialLinesRequested: 1,
> >     bufferSize: 5,
> >     initialCollectedData: []string{
> >         "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
> >     },
> >     appendedCollectedData: []string{
> >         "0123456789012345678901234567890123456789012345678901\n",
> >     },
> > },
>
> In this case the test table would blow up again and would also be
> inconsistent (e.g. when using the standard larger group of lines). So I'm
> now using three data variables with more descriptive names as a compromise,
> and have located them near the table.

I'm with you about alphabetData (which is also nicely memorable, which helps), but most of them really do benefit from having the data visible in the test.

I processed my local copy to move much of the test data into the tests, and two issues became immediately obvious, though I'd missed them when cross-referencing - which I think is a reasonable indication that the tests are not currently that clear.

Here's my modified version of the test. Only alphabetData remains. I'd prefer it if we could use this form.

http://paste.ubuntu.com/6566524/

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test....
utils/tailer/tailer_test.go:100: description: "lines are longer than buffer size, missing termination of last line",
How is the last line missing termination here?

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test....
utils/tailer/tailer_test.go:117: data: unterminatedData[0:2],
The last line doesn't seem to be missing termination here.
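[Editor's note: a minimal, hypothetical sketch of the table style being argued for - `lastLines` is an illustrative stand-in, not the real Tailer code - where each case carries its data inline so the expected behaviour can be checked at a glance:]

```go
package main

import (
	"fmt"
	"strings"
)

// lastLines returns at most n trailing newline-terminated lines of s,
// loosely mirroring what the Tailer's initial read does. An
// unterminated final line is ignored.
func lastLines(s string, n int) []string {
	lines := strings.SplitAfter(s, "\n")
	// SplitAfter leaves a trailing element that is either empty or an
	// unterminated line; drop it in both cases.
	if len(lines) > 0 && !strings.HasSuffix(lines[len(lines)-1], "\n") {
		lines = lines[:len(lines)-1]
	}
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return lines
}

func main() {
	// Inline data keeps each case self-describing: the reader sees
	// exactly what goes in and what must come out, with no need to
	// cross-reference a shared data variable.
	tests := []struct {
		description string
		data        string
		requested   int
		expected    []string
	}{{
		description: "fewer lines than requested",
		data:        "alpha\n",
		requested:   3,
		expected:    []string{"alpha\n"},
	}, {
		description: "unterminated last line is ignored",
		data:        "alpha\nbeta\nthe quick brown fox ",
		requested:   2,
		expected:    []string{"alpha\n", "beta\n"},
	}}
	for i, t := range tests {
		got := lastLines(t.data, t.requested)
		fmt.Printf("test %d (%s): %q\n", i, t.description, got)
	}
}
```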
Please take a look.

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test....
utils/tailer/tailer_test.go:100: description: "lines are longer than buffer size, missing termination of last line",
On 2013/12/13 12:52:14, rog wrote:
> How is the last line missing termination here?

Done.

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test....
utils/tailer/tailer_test.go:117: data: unterminatedData[0:2],
On 2013/12/13 12:52:14, rog wrote:
> The last line doesn't seem to be missing termination here.

Done.