|
|
Created:
13 years, 8 months ago by borman Modified:
13 years, 7 months ago Reviewers:
CC:
rsc, mattn, r2, dchest, golang-dev Visibility:
Public. |
Descriptioncsv: new package
csv reader/writer based on RFC 4180
Patch Set 1 #Patch Set 2 : diff -r e93aca7ce587 https://go.googlecode.com/hg/ #Patch Set 3 : diff -r e93aca7ce587 https://go.googlecode.com/hg/ #Patch Set 4 : diff -r 916057d73e34 https://go.googlecode.com/hg/ #Patch Set 5 : diff -r 916057d73e34 https://go.googlecode.com/hg/ #
Total comments: 72
Patch Set 6 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #Patch Set 7 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #
Total comments: 23
Patch Set 8 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #Patch Set 9 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #Patch Set 10 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #
Total comments: 12
Patch Set 11 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #Patch Set 12 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #
Total comments: 6
Patch Set 13 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #Patch Set 14 : diff -r dd31a3b56536 https://go.googlecode.com/hg/ #
MessagesTotal messages: 27
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Lots of little things but very good code. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go File src/pkg/csv/csv.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode1 src/pkg/csv/csv.go:1: // Copyright 2011 The Go Authors. All rights reserved. (not about this line) maybe rename files to read.go and read_test.go? leaves open the possibility of writing later. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode83 src/pkg/csv/csv.go:83: // A Reader reads records from a CSV encoded file based on RFC4180. can tighten this up // A Reader reads records from a CSV-encoded file. // // As returned by NewReader, a Reader expects input conforming to RFC 4180. // The exported fields can be changed to customize the details before the // first call to Read or ReadAll. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode85 src/pkg/csv/csv.go:85: // A Reader is created using NewReader. Individual records are read from a can drop http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode88 src/pkg/csv/csv.go:88: // The behavior of the Reader can be altered through its public elements. can drop http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode90 src/pkg/csv/csv.go:90: // The Comma element determines the field delimiter. It defaults to ','. s/The Comma element determines/Comma is/ http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode92 src/pkg/csv/csv.go:92: // The Comment element, if not 0, identifies the comment character. Any s/The Comment element/Comment/ s/identifies/is/ s/Any line/Lines/ s/ up to the end of the line// http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode95 src/pkg/csv/csv.go:95: // If the FieldsPerRecord is greater than zero, Read requires each record to s/the // s/greater than zero/positive/ http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode97 src/pkg/csv/csv.go:97: // the number of fields in the first record and then requires future records to s/and then.*/, so that future records must http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode100 src/pkg/csv/csv.go:100: // If LazyQuotes is true then a quote may appear in an unquoted field and a s/ then/,/ http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode103 src/pkg/csv/csv.go:103: // If TrailingComma is true then the last field may be a unquoted empty field. s/ then/,/ http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode105 src/pkg/csv/csv.go:105: // If TrimLeadingSpace is true then leading white space in a field is ignored. s/ then/,/ s/a/each/ trailing too? if so s/Leading// ? http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode141 src/pkg/csv/csv.go:141: if r.eof { typically you can store err os.Error instead and then this gets simpler for record == nil && r.err == nil { record, r.err = r.parseRecord() } if record == nil { return nil, r.err } http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode168 src/pkg/csv/csv.go:168: // ReadAll reads all the remaining records from f as a slice of records. s/f as a slice of records/r/ http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode169 src/pkg/csv/csv.go:169: // Each record in the slice is represented as a slice of strings with each // Each record is a slice of fields. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode185 src/pkg/csv/csv.go:185: func ReadFile(path string) (records [][]string, err os.Error) { // ReadFile reads and returns all the records in the named // file using an RFC 4180 Reader. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode198 src/pkg/csv/csv.go:198: r.column += r.lrc I think (but am not sure) that column is usually number of UTF-8 sequences, so lrc should be 1 or 2, not the sum of the ReadRune byte counts. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode220 src/pkg/csv/csv.go:220: // skip reads runes up to include the rune delim or until error. s/include/and including/ http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode253 src/pkg/csv/csv.go:253: if r.Comment != 0 { r.Comment != 0 && rune == r.Comment http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode262 src/pkg/csv/csv.go:262: delim, err := r.parseField() If you return a bool saying whether there was a field, you can simplify the code here. haveField, delim, err := r.parseField() if haveField { fields = append(fields, r.field.String()) } http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode280 src/pkg/csv/csv.go:280: // (normally r.Comma or '\n'). s/normally // http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode283 src/pkg/csv/csv.go:283: I think some of the logic gets simpler if you check for trailing comma when you see the comma. // field rune, err := r.readRune() switch rune { case r.Comma, '\n': // ignore for now case '"': // quoted field ... leaves rune just past quote default: // unquoted field ... leaves rune just past quote } // delim switch rune { case r.Comma: ... peek for trailing comma... case '\n': ... line number fiddling ... } return rune, nil Also unifies the many returns. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode315 src/pkg/csv/csv.go:315: r.column -= r.lrc // report the comma, not the newline Isn't lrc the size of the \n not the size of the Comma? Gets easier if column is a rune count. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode321 src/pkg/csv/csv.go:321: for { // quoted field http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode343 src/pkg/csv/csv.go:343: r.column -= r.lrc Same issue. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode358 src/pkg/csv/csv.go:358: for { // unquoted field http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go File src/pkg/csv/csv_test.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newcode5 src/pkg/csv/csv_test.go:5: package csv_test Unless there's a circular dependency (unless testing depends on csv), tests should be in the same package they are testing, so that they can test unexported things. package csv then drop the import and s/csv.//g http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:35: testfile = path.Join(os.Getenv("TEST_TMPDIR"), "testfile.csv") unused except in error messages http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:56: []string{"line1word1", "line1word2", "line1word3"}, s/[]string// In a homogeneous type like []T or map[T]V you can drop the T if it matches the outer type. Here T = []string so you can say {"line1word1", ...}. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:62: []string{"The \"word\""}, `` strings are your friend here. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:212: testCSV(t, &Test{ It's more common in Go to just define some test cases as data and then write a TestRead that loops over each case. var readTests = []struct{ Input string Comment int TrimLeadingSpace bool ... etc ... Output [][]string Error string }{ {Input: "a,b,c\n", Output: [][]string{{"a", "b", "c"}}}, ... } func TestRead(t *testing.T) { for i, tt := range readTests { r := strings.NewReader(tt.Input) r.Comment = tt.Comment r.TrimLeadingSpace = tt.TrimLeadingSpace if tt.Comma != 0 { r.Comma = tt.Comma } ... out, err := r.ReadAll() if tt.Error != "" { if err == nil || !strings.Contains(err.String(), tt.Error) { t.Errorf("#%d: error %v, want error %q", i, err, tt.Error) } } else if err != nil { t.Errorf("#%d: unexpected error %v", i, err) } else if !reflect.DeepEqual(out, tt.Output) { t.Errorf("#%d: out=%q want %q", i, out, tt.Output) } } } With that you only have to add one line to the file to make a new test, which encourages more shorter tests. If #%d isn't enough, you can add a Name string to the struct and print that instead. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:218: func TestNoEOL(t *testing.T) { For example this becomes {Name: "NoEOL", In: "w1,w2\nw3,w4", Out: [][]string{{"w1", "w2"}, {"w3", "w4"}}} http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:246: func TestQuotes(t *testing.T) { more quote tests, in lax mode: {In: `a,",b`, Out: [][]string{{"a", `"`, "b"}}} {In: `a,"",b`, ... {In: `a,""",b`, ... {In: `a,"""",b`, ... {In: `a,""""",b`, ...
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go File src/pkg/csv/csv.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode48 src/pkg/csv/csv.go:48: // ·Multi-line please don't use non-ascii character for comment. i'm not count for commenting to this CL, you should use "*" or other character for the leading mark.
Sign in to reply to this message.
On Jun 30, 2011, at 3:04 PM, mattn.jp@gmail.com wrote: > > http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go > File src/pkg/csv/csv.go (right): > > http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode48 > src/pkg/csv/csv.go:48: // ·Multi-line > please don't use non-ascii character for comment. i'm not count for > commenting to this CL, you should use "*" or other character for the > leading mark. although i agree with you that this instance is unnecessary, there's nothing wrong with using non-ASCII in comments when it makes things clearer. -rob
Sign in to reply to this message.
Hello rsc@golang.org, mattn.jp@gmail.com, r@google.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go File src/pkg/csv/csv.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode1 src/pkg/csv/csv.go:1: // Copyright 2011 The Go Authors. All rights reserved. On 2011/06/30 04:32:38, rsc wrote: > (not about this line) > > maybe rename files to read.go and read_test.go? > leaves open the possibility of writing later. Funny you should mention that. I am adding a writer with this upload :-) I agree I should rename these but I am not sure what that will do with the comments in the CL. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode48 src/pkg/csv/csv.go:48: // ·Multi-line On 2011/06/30 05:04:46, mattn wrote: > please don't use non-ascii character for comment. i'm not count for commenting > to this CL, you should use "*" or other character for the leading mark. Russ, What is your call on this? http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode83 src/pkg/csv/csv.go:83: // A Reader reads records from a CSV encoded file based on RFC4180. On 2011/06/30 04:32:38, rsc wrote: > can tighten this up > > // A Reader reads records from a CSV-encoded file. > // > // As returned by NewReader, a Reader expects input conforming to RFC 4180. > // The exported fields can be changed to customize the details before the > // first call to Read or ReadAll. Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode90 src/pkg/csv/csv.go:90: // The Comma element determines the field delimiter. It defaults to ','. On 2011/06/30 04:32:38, rsc wrote: > s/The Comma element determines/Comma is/ Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode95 src/pkg/csv/csv.go:95: // If the FieldsPerRecord is greater than zero, Read requires each record to On 2011/06/30 04:32:38, rsc wrote: > s/the // > s/greater than zero/positive/ Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode97 src/pkg/csv/csv.go:97: // the number of fields in the first record and then requires future records to On 2011/06/30 04:32:38, rsc wrote: > s/and then.*/, so that future records must Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode100 src/pkg/csv/csv.go:100: // If LazyQuotes is true then a quote may appear in an unquoted field and a On 2011/06/30 04:32:38, rsc wrote: > s/ then/,/ Maybe no , either? http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode103 src/pkg/csv/csv.go:103: // If TrailingComma is true then the last field may be a unquoted empty field. On 2011/06/30 04:32:38, rsc wrote: > s/ then/,/ maybe no , either? http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode105 src/pkg/csv/csv.go:105: // If TrimLeadingSpace is true then leading white space in a field is ignored. On 2011/06/30 04:32:38, rsc wrote: > s/ then/,/ > s/a/each/ > > trailing too? > if so s/Leading// ? Only leading white space is ignored. If we want to ignore trailing white space I would make that a separate option (it is more difficult) http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode141 src/pkg/csv/csv.go:141: if r.eof { On 2011/06/30 04:32:38, rsc wrote: > typically you can store err os.Error instead > and then this gets simpler > > for record == nil && r.err == nil { > record, r.err = r.parseRecord() > } > if record == nil { > return nil, r.err > } Better yet, I can just get rid of r.eof! It turns out it was left over from when parseRecord was not doing quite the right thing. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode168 src/pkg/csv/csv.go:168: // ReadAll reads all the remaining records from f as a slice of records. On 2011/06/30 04:32:38, rsc wrote: > s/f as a slice of records/r/ Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode185 src/pkg/csv/csv.go:185: func ReadFile(path string) (records [][]string, err os.Error) { On 2011/06/30 04:32:38, rsc wrote: > // ReadFile reads and returns all the records in the named > // file using an RFC 4180 Reader. I was actually thinking of removing this. What do you think? http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode198 src/pkg/csv/csv.go:198: r.column += r.lrc On 2011/06/30 04:32:38, rsc wrote: > I think (but am not sure) that column is usually > number of UTF-8 sequences, so lrc should be 1 or 2, > not the sum of the ReadRune byte counts. Okay, I was wondering about that. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode220 src/pkg/csv/csv.go:220: // skip reads runes up to include the rune delim or until error. On 2011/06/30 04:32:38, rsc wrote: > s/include/and including/ Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode253 src/pkg/csv/csv.go:253: if r.Comment != 0 { On 2011/06/30 04:32:38, rsc wrote: > r.Comment != 0 && rune == r.Comment Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode280 src/pkg/csv/csv.go:280: // (normally r.Comma or '\n'). On 2011/06/30 04:32:38, rsc wrote: > s/normally // Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode283 src/pkg/csv/csv.go:283: On 2011/06/30 04:32:38, rsc wrote: > I think some of the logic gets simpler if you > check for trailing comma when you see the comma. > > // field > rune, err := r.readRune() > switch rune { > case r.Comma, '\n': > // ignore for now > case '"': > // quoted field > ... leaves rune just past quote > default: > // unquoted field > ... leaves rune just past quote > } > > // delim > switch rune { > case r.Comma: > ... peek for trailing comma... > case '\n': > ... line number fiddling ... > } > > return rune, nil > > Also unifies the many returns. Okay, I put the jist of this in. For '\n' I don't need to do anything special, just need to drop through on a comma. this did make the code easier, thanks. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode315 src/pkg/csv/csv.go:315: r.column -= r.lrc // report the comma, not the newline On 2011/06/30 04:32:38, rsc wrote: > Isn't lrc the size of the \n not the size of the Comma? > Gets easier if column is a rune count. Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode321 src/pkg/csv/csv.go:321: for { On 2011/06/30 04:32:38, rsc wrote: > // quoted field Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode343 src/pkg/csv/csv.go:343: r.column -= r.lrc On 2011/06/30 04:32:38, rsc wrote: > Same issue. Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode358 src/pkg/csv/csv.go:358: for { On 2011/06/30 04:32:38, rsc wrote: > // unquoted field Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go File src/pkg/csv/csv_test.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newcode5 src/pkg/csv/csv_test.go:5: package csv_test On 2011/06/30 04:32:38, rsc wrote: > Unless there's a circular dependency > (unless testing depends on csv), tests should > be in the same package they are testing, so that > they can test unexported things. > > package csv > > then drop the import and s/csv.//g > Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:35: testfile = path.Join(os.Getenv("TEST_TMPDIR"), "testfile.csv") On 2011/06/30 04:32:38, rsc wrote: > unused except in error messages Yeah, this was from my very first tests before I remembered I didn't have to create files to test :-) http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:56: []string{"line1word1", "line1word2", "line1word3"}, On 2011/06/30 04:32:38, rsc wrote: > s/[]string// > In a homogeneous type like []T or map[T]V > you can drop the T if it matches the outer type. > Here T = []string so you can say {"line1word1", ...}. > Oh, that helps! http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:62: []string{"The \"word\""}, On 2011/06/30 04:32:38, rsc wrote: > `` strings are your friend here. Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:62: []string{"The \"word\""}, On 2011/06/30 04:32:38, rsc wrote: > `` strings are your friend here. Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:212: testCSV(t, &Test{ On 2011/06/30 04:32:38, rsc wrote: > It's more common in Go to just define some test cases > as data and then write a TestRead that loops over each case. > > var readTests = []struct{ > Input string > Comment int > TrimLeadingSpace bool > ... etc ... > Output [][]string > Error string > }{ > {Input: "a,b,c\n", Output: [][]string{{"a", "b", "c"}}}, > ... > } > > func TestRead(t *testing.T) { > for i, tt := range readTests { > r := strings.NewReader(tt.Input) > r.Comment = tt.Comment > r.TrimLeadingSpace = tt.TrimLeadingSpace > if tt.Comma != 0 { > r.Comma = tt.Comma > } > ... > out, err := r.ReadAll() > if tt.Error != "" { > if err == nil || !strings.Contains(err.String(), tt.Error) { > t.Errorf("#%d: error %v, want error %q", i, err, tt.Error) > } > } else if err != nil { > t.Errorf("#%d: unexpected error %v", i, err) > } else if !reflect.DeepEqual(out, tt.Output) { > t.Errorf("#%d: out=%q want %q", i, out, tt.Output) > } > } > } > > With that you only have to add one line to the file to make a new > test, which encourages more shorter tests. > If #%d isn't enough, you can add a Name string to the struct and > print that instead. > Okay, I will need more time than I have tonight to grok and rewrite with this. I will upload my other changes for now, however.
Sign in to reply to this message.
> I agree I should rename these but I am not sure what that will do with > the comments in the CL. They won't be on the new file, but so be it.
Sign in to reply to this message.
On Thu, Jun 30, 2011 at 4:48 AM, Russ Cox <rsc@golang.org> wrote: > > I agree I should rename these but I am not sure what that will do with > > the comments in the CL. > > They won't be on the new file, but so be it. > Okay. I will rename them at the end of the review then.
Sign in to reply to this message.
writer http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go File src/pkg/csv/writer.go (right): http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode23 src/pkg/csv/writer.go:23: // If UseCRLF is true then \r\n is written after each record otherwise If UseCRLF is true, the Writer ends each record with \r\n instead of \n. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode44 src/pkg/csv/writer.go:44: _, err = w.w.WriteRune(w.Comma) Where these are right next to each other if _, err = w.w.WriteRune(w.Comma); err != nil { return } three more below http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode120 src/pkg/csv/writer.go:120: switch { might just be me but it would be easier for me if this just used an if. if field == "" || strings.IndexRune(field, w.Comma) >= 0 || strings.IndexAny(field, "\"\r\n") >= 0 { return true } rune, _, _ ... return unicode.IsSpace(rune) http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode126 src/pkg/csv/writer.go:126: rune, _, _ := strings.NewReader(field).ReadRune() rune, _, _ := utf8.DecodeRuneInString(field) avoids creating the new strings.Reader.
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go File src/pkg/csv/csv.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode48 src/pkg/csv/csv.go:48: // ·Multi-line On 2011/06/30 05:47:16, borman wrote: > On 2011/06/30 05:04:46, mattn wrote: > > please don't use non-ascii character for comment. i'm not count for commenting > > to this CL, you should use "*" or other character for the leading mark. > > Russ, What is your call on this? I don't care so much about the · vs *. The notation was not immediately clear though. It might be clearer to use Go syntax: {`the "word" is true`, `a "quoted-field"`} {`Multi-line field`, `comma is ,`} http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode100 src/pkg/csv/csv.go:100: // If LazyQuotes is true then a quote may appear in an unquoted field and a On 2011/06/30 05:47:16, borman wrote: > On 2011/06/30 04:32:38, rsc wrote: > > s/ then/,/ > > Maybe no , either? I think the , is needed to set off the if condition from the implication. With "if", the comma is combining two sentences ("LazyQuotes is true" and "a quote may appear..."). Same below. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode105 src/pkg/csv/csv.go:105: // If TrimLeadingSpace is true then leading white space in a field is ignored. On 2011/06/30 05:47:16, borman wrote: > On 2011/06/30 04:32:38, rsc wrote: > > s/ then/,/ > > s/a/each/ > > > > trailing too? > > if so s/Leading// ? > > Only leading white space is ignored. If we want to ignore trailing white space > I would make that a separate option (it is more difficult) Why do we care about either one? http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode185 src/pkg/csv/csv.go:185: func ReadFile(path string) (records [][]string, err os.Error) { On 2011/06/30 05:47:16, borman wrote: > On 2011/06/30 04:32:38, rsc wrote: > > // ReadFile reads and returns all the records in the named > > // file using an RFC 4180 Reader. > > I was actually thinking of removing this. What do you think? sgtm http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode251 src/pkg/csv/csv.go:251: } can still be simpler if haveField { append } if delim == '\n' || err != nil { return fields, err } http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode297 src/pkg/csv/csv.go:297: return 0, err Maybe I don't understand, but I thought that the input x,y,z,w x,y,z, x,y,, x,,, ,,, are all 4-field lines. If so, the code here should be the same as for r.Comma: drop through so that the empty field is recorded. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode301 src/pkg/csv/csv.go:301: for unicode.IsSpace(rune) { Quoted: for { http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode325 src/pkg/csv/csv.go:325: return 0, r.error(ErrQuote) break Quoted http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode348 src/pkg/csv/csv.go:348: } If there is a second switch to check the final delimiter, then this can just be break http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode353 src/pkg/csv/csv.go:353: } and this too http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode356 src/pkg/csv/csv.go:356: this is already break
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go File src/pkg/csv/csv.go (right): http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode104 src/pkg/csv/csv.go:104: Comma int // Field delimiter (set to to ',' by NewReader) Double "to" here.
Sign in to reply to this message.
I still need to rework the tests to the new form and rename the files. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go File src/pkg/csv/csv.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode48 src/pkg/csv/csv.go:48: // ·Multi-line On 2011/06/30 15:13:02, rsc wrote: > On 2011/06/30 05:47:16, borman wrote: > > On 2011/06/30 05:04:46, mattn wrote: > > > please don't use non-ascii character for comment. i'm not count for > commenting > > > to this CL, you should use "*" or other character for the leading mark. > > > > Russ, What is your call on this? > > I don't care so much about the · vs *. > The notation was not immediately clear though. > It might be clearer to use Go syntax: > > {`the "word" is true`, `a "quoted-field"`} > > {`Multi-line > field`, `comma is ,`} Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode48 src/pkg/csv/csv.go:48: // ·Multi-line On 2011/06/30 15:13:02, rsc wrote: > On 2011/06/30 05:47:16, borman wrote: > > On 2011/06/30 05:04:46, mattn wrote: > > > please don't use non-ascii character for comment. i'm not count for > commenting > > > to this CL, you should use "*" or other character for the leading mark. > > > > Russ, What is your call on this? > > I don't care so much about the · vs *. > The notation was not immediately clear though. > It might be clearer to use Go syntax: > > {`the "word" is true`, `a "quoted-field"`} > > {`Multi-line > field`, `comma is ,`} Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode85 src/pkg/csv/csv.go:85: // A Reader is created using NewReader. Individual records are read from a On 2011/06/30 04:32:38, rsc wrote: > can drop Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode88 src/pkg/csv/csv.go:88: // The behavior of the Reader can be altered through its public elements. On 2011/06/30 04:32:38, rsc wrote: > can drop Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode100 src/pkg/csv/csv.go:100: // If LazyQuotes is true then a quote may appear in an unquoted field and a On 2011/06/30 15:13:02, rsc wrote: > On 2011/06/30 05:47:16, borman wrote: > > On 2011/06/30 04:32:38, rsc wrote: > > > s/ then/,/ > > > > Maybe no , either? > > I think the , is needed to set off the if condition > from the implication. With "if", the comma is > combining two sentences > ("LazyQuotes is true" and "a quote may appear..."). > > Same below. Done. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode105 src/pkg/csv/csv.go:105: // If TrimLeadingSpace is true then leading white space in a field is ignored. On 2011/06/30 15:13:02, rsc wrote: > On 2011/06/30 05:47:16, borman wrote: > > On 2011/06/30 04:32:38, rsc wrote: > > > s/ then/,/ > > > s/a/each/ > > > > > > trailing too? > > > if so s/Leading// ? > > > > Only leading white space is ignored. If we want to ignore trailing white > space > > I would make that a separate option (it is more difficult) > > Why do we care about either one? Trimming leading white space is very common. I think it will be necessary to have. http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv.go#newcode169 src/pkg/csv/csv.go:169: // Each record in the slice is represented as a slice of strings with each On 2011/06/30 04:32:38, rsc wrote: > // Each record is a slice of fields. Done. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode104 src/pkg/csv/csv.go:104: // On 2011/06/30 15:44:46, dchest wrote: > Double "to" here. Done. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode251 src/pkg/csv/csv.go:251: } On 2011/06/30 15:13:02, rsc wrote: > can still be simpler > > if haveField { > append > } > if delim == '\n' || err != nil { > return fields, err > } I still need to return something different on EOF vs other errors, but for the most part this simplification works and is simpler http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode297 src/pkg/csv/csv.go:297: return 0, err On 2011/06/30 15:13:02, rsc wrote: > Maybe I don't understand, but I thought that the input > x,y,z,w > x,y,z, > x,y,, > x,,, > ,,, > > > are all 4-field lines. If so, the code > here should be the same as for r.Comma: > drop through so that the empty field is recorded. I added this test to csv_test and it works as you would expect, every record has 4 fields. For the line that consists of just ",\n" I will initially read the , and then hit the first case of this switch. This drops us below where we peek ahead to the newline and then put the new line back on the stream, and return the first empty field. When we come back through we will have \n as our rune. We know it is an empty field (and that we are allowing trailing commas) so we can return. I should add a comment about trailing commas. I did realize I had broken blank lines. I have fixed that (and added a test). http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode301 src/pkg/csv/csv.go:301: for unicode.IsSpace(rune) { On 2011/06/30 15:13:02, rsc wrote: > Quoted: > for { Done. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode325 src/pkg/csv/csv.go:325: return 0, r.error(ErrQuote) On 2011/06/30 15:13:02, rsc wrote: > break Quoted Done. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode348 src/pkg/csv/csv.go:348: } On 2011/06/30 15:13:02, rsc wrote: > If there is a second switch to check the > final delimiter, then this can just be > > break I moved the common err checks down below the switch http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/csv.go#newcode356 src/pkg/csv/csv.go:356: On 2011/06/30 15:13:02, rsc wrote: > this is already > break Done. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go File src/pkg/csv/writer.go (right): http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode23 src/pkg/csv/writer.go:23: // If UseCRLF is true then \r\n is written after each record otherwise On 2011/06/30 14:56:20, rsc wrote: > If UseCRLF is true, the Writer ends each record with \r\n instead of \n. Thanks. I need to take the succinctness course... http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode44 src/pkg/csv/writer.go:44: _, err = w.w.WriteRune(w.Comma) On 2011/06/30 14:56:20, rsc wrote: > Where these are right next to each other > > if _, err = w.w.WriteRune(w.Comma); err != nil { > return > } > > three more below Done. http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode120 src/pkg/csv/writer.go:120: switch { On 2011/06/30 14:56:20, rsc wrote: > might just be me but it would be easier for me if this > just used an if. > > if field == "" || strings.IndexRune(field, w.Comma) >= 0 || > strings.IndexAny(field, "\"\r\n") >= 0 { > return true > } > rune, _, _ ... > return unicode.IsSpace(rune) Sure, why not :-) http://codereview.appspot.com/4629085/diff/6006/src/pkg/csv/writer.go#newcode126 src/pkg/csv/writer.go:126: rune, _, _ := strings.NewReader(field).ReadRune() On 2011/06/30 14:56:20, rsc wrote: > rune, _, _ := utf8.DecodeRuneInString(field) > avoids creating the new strings.Reader. Great! That was just what I was looking for!
Sign in to reply to this message.
Hello rsc@golang.org, mattn.jp@gmail.com, r@google.com, dchest@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
>> rune, _, _ := utf8.DecodeRuneInString(field) >> avoids creating the new strings.Reader. > > Great! That was just what I was looking for! FWIW, a way to look, given that you were using strings.Reader's ReadRune: godoc -src strings ReadRune This is my favorite way to 'look up how to do something'. Russ
Sign in to reply to this message.
Hello rsc@golang.org, mattn.jp@gmail.com, r@google.com, dchest@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go File src/pkg/csv/csv_test.go (right): http://codereview.appspot.com/4629085/diff/6002/src/pkg/csv/csv_test.go#newco... src/pkg/csv/csv_test.go:212: testCSV(t, &Test{ On 2011/06/30 04:32:38, rsc wrote: > It's more common in Go to just define some test cases > as data and then write a TestRead that loops over each case. > > var readTests = []struct{ > Input string > Comment int > TrimLeadingSpace bool > ... etc ... > Output [][]string > Error string > }{ > {Input: "a,b,c\n", Output: [][]string{{"a", "b", "c"}}}, > ... > } > > func TestRead(t *testing.T) { > for i, tt := range readTests { > r := strings.NewReader(tt.Input) > r.Comment = tt.Comment > r.TrimLeadingSpace = tt.TrimLeadingSpace > if tt.Comma != 0 { > r.Comma = tt.Comma > } > ... > out, err := r.ReadAll() > if tt.Error != "" { > if err == nil || !strings.Contains(err.String(), tt.Error) { > t.Errorf("#%d: error %v, want error %q", i, err, tt.Error) > } > } else if err != nil { > t.Errorf("#%d: unexpected error %v", i, err) > } else if !reflect.DeepEqual(out, tt.Output) { > t.Errorf("#%d: out=%q want %q", i, out, tt.Output) > } > } > } > > With that you only have to add one line to the file to make a new > test, which encourages more shorter tests. > If #%d isn't enough, you can add a Name string to the struct and > print that instead. > Done!
Sign in to reply to this message.
LGTM http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/csv_test.go File src/pkg/csv/csv_test.go (right): http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/csv_test.go#newc... src/pkg/csv/csv_test.go:9: //"os" please remove. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go File src/pkg/csv/writer.go (right): http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode58 src/pkg/csv/writer.go:58: if _, err = w.w.WriteRune('"'); err != nil { I guess using WriteByte() is better. bits fast? http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode68 src/pkg/csv/writer.go:68: _, err = w.w.WriteRune('\r') same above. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode74 src/pkg/csv/writer.go:74: _, err = w.w.WriteRune('\n') same above. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode84 src/pkg/csv/writer.go:84: if _, err = w.w.WriteRune('"'); err != nil { same above. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode91 src/pkg/csv/writer.go:91: _, err = w.w.WriteRune('\n') same above.
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/csv_test.go File src/pkg/csv/csv_test.go (right): http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/csv_test.go#newc... src/pkg/csv/csv_test.go:9: //"os" On 2011/07/01 04:08:43, mattn wrote: > please remove. Done. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go File src/pkg/csv/writer.go (right): http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode58 src/pkg/csv/writer.go:58: if _, err = w.w.WriteRune('"'); err != nil { On 2011/07/01 04:08:43, mattn wrote: > I guess using WriteByte() is better. bits fast? Yes, WriteRune calls WriteByte for these ascii characters so we can just call it straight away. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode68 src/pkg/csv/writer.go:68: _, err = w.w.WriteRune('\r') On 2011/07/01 04:08:43, mattn wrote: > same above. Done. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode74 src/pkg/csv/writer.go:74: _, err = w.w.WriteRune('\n') On 2011/07/01 04:08:43, mattn wrote: > same above. Done. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode84 src/pkg/csv/writer.go:84: if _, err = w.w.WriteRune('"'); err != nil { On 2011/07/01 04:08:43, mattn wrote: > same above. Done. http://codereview.appspot.com/4629085/diff/12003/src/pkg/csv/writer.go#newcode91 src/pkg/csv/writer.go:91: _, err = w.w.WriteRune('\n') On 2011/07/01 04:08:43, mattn wrote: > same above. Done.
Sign in to reply to this message.
Hello rsc@golang.org, mattn.jp@gmail.com, r@google.com, dchest@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
Looks great except for the duplication in the writer test. http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/Makefile File src/pkg/csv/Makefile (right): http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/Makefile#newcode1 src/pkg/csv/Makefile:1: # Copyright 2009 The Go Authors. All rights reserved. 2011 for new files http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/Makefile#newcode9 src/pkg/csv/Makefile:9: csv.go \ no space before \ please http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/writer_test.go File src/pkg/csv/writer_test.go (right): http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/writer_test.go#n... src/pkg/csv/writer_test.go:14: records = [][]string{ The reader test turned out very nice. A similar table might be good here.
Sign in to reply to this message.
http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/Makefile File src/pkg/csv/Makefile (right): http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/Makefile#newcode1 src/pkg/csv/Makefile:1: # Copyright 2009 The Go Authors. All rights reserved. On 2011/07/01 14:40:40, rsc wrote: > 2011 for new files Done. http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/Makefile#newcode9 src/pkg/csv/Makefile:9: csv.go \ On 2011/07/01 14:40:40, rsc wrote: > no space before \ please Done. http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/writer_test.go File src/pkg/csv/writer_test.go (right): http://codereview.appspot.com/4629085/diff/14008/src/pkg/csv/writer_test.go#n... src/pkg/csv/writer_test.go:14: records = [][]string{ On 2011/07/01 14:40:40, rsc wrote: > The reader test turned out very nice. > A similar table might be good here. My thought was "Oh, I can do this all in one test" and then "but need it to do CRLF, too". Using the table encouraged me to break it up into many little tests. Thanks.
Sign in to reply to this message.
Hello rsc@golang.org, mattn.jp@gmail.com, r@google.com, dchest@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
LGTM Please mv csv.go reader.go mv csv_test.go reader_test.go hg change 4629085 ... edit file list ... hg rm csv.go hg rm csv_test.go hg mail 4629085 Thanks.
Sign in to reply to this message.
Hello rsc@golang.org, mattn.jp@gmail.com, r@google.com, dchest@gmail.com (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=620afc45f351 *** csv: new package csv reader/writer based on RFC 4180 R=rsc, mattn.jp, r, dchest CC=golang-dev http://codereview.appspot.com/4629085 Committer: Russ Cox <rsc@golang.org>
Sign in to reply to this message.
|