|
|
|
Created:
14 years, 9 months ago by crawshaw2 Modified:
12 years, 5 months ago Reviewers:
CC:
golang-dev Visibility:
Public. |
Descriptionpath/filepath: Use a channel to return matches from Glob() instead of a slice.
For when a glob can match millions of files.
Patch Set 1 #Patch Set 2 : diff -r 65a786a78417 https://go.googlecode.com/hg/ #Patch Set 3 : diff -r 65a786a78417 https://go.googlecode.com/hg/ #
Total comments: 4
Patch Set 4 : diff -r 65a786a78417 https://go.googlecode.com/hg/ #
Total comments: 5
Patch Set 5 : diff -r 65a786a78417 https://go.googlecode.com/hg/ #Patch Set 6 : diff -r f782663275a7 https://go.googlecode.com/hg/ #MessagesTotal messages: 18
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
On 2011/03/25 17:08:36, david.crawshaw wrote: > Hello mailto:golang-dev@googlegroups.com (cc: mailto:golang-dev@googlegroups.com), > > I'd like you to review this change to > https://go.googlecode.com/hg/ This is almost a question phrased as a CL. Does this make sense? If so, another question: does a chan string copy the contents of the string? It's not clear to me. If so, I should probably be using chan *string. P.s. I was unable to sign in with crawshaw@google.com, so you get my personal email address.
Sign in to reply to this message.
I'm not sure whether this should be the default. On the one hand most of the time it is overkill. On the other hand, when it's not, it's in code that you didn't write, so this would help make that code just work. In this form, Glob should return a <-chan string and do the closing itself. That will make the usage a little less awkward. Russ
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, rsc (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
P.s. with generics I could write (predicating over T):
func collect(ch chan <-T) []T {
sl := make([]T)
for elem := range(ch) {
sl = append(sl, elem)
}
return sl
}
And then you could recover the old API via:
matches := collect(filepath.Glob("*/*/*"))
Sign in to reply to this message.
On Mar 25, 2011, at 10:52 AM, david.crawshaw@zentus.com wrote: > P.s. with generics I could write (predicating over T): > > func collect(ch chan <-T) []T { > sl := make([]T) > for elem := range(ch) { > sl = append(sl, elem) > } > return sl > } > > And then you could recover the old API via: > > matches := collect(filepath.Glob("*/*/*")) > > http://codereview.appspot.com/4276078/ True, but if f() returns a slice and then is changed to return a channel, the only change to the clients is that the the for statement's range clause goes from for _, x := range to for x := range -rob
Sign in to reply to this message.
http://codereview.appspot.com/4276078/diff/6001/src/pkg/path/filepath/match.go File src/pkg/path/filepath/match.go (right): http://codereview.appspot.com/4276078/diff/6001/src/pkg/path/filepath/match.g... src/pkg/path/filepath/match.go:210: // Glob returns the names of all files matching pattern or nil update comment http://codereview.appspot.com/4276078/diff/6001/src/pkg/path/filepath/match_t... File src/pkg/path/filepath/match_test.go (right): http://codereview.appspot.com/4276078/diff/6001/src/pkg/path/filepath/match_t... src/pkg/path/filepath/match_test.go:86: func contains(vector []string, s string) bool { this function is no longer used... http://codereview.appspot.com/4276078/diff/6001/src/pkg/path/filepath/match_t... src/pkg/path/filepath/match_test.go:97: func containsChan(ch <-chan string, s string) bool { ...so this could just be "contains".... http://codereview.appspot.com/4276078/diff/6001/src/pkg/path/filepath/match_t... src/pkg/path/filepath/match_test.go:122: if !containsChan(matches, tt.result) { ... but then it's hardly worth it. you could put the range here, but it's fine if you want a separate function
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, rsc, r2, r (cc: golang-dev@googlegroups.com), Please take another look.
Sign in to reply to this message.
On 2011/03/25 17:55:13, r2 wrote: > True, but if f() returns a slice and then is changed to return a channel, the > only change to the clients is that the the for statement's range clause goes > from > for _, x := range > to > for x := range > > -rob > I'm sure this must have been answered elsewhere, but again my search has failed: Why not have range be consistent over slices and channels, and not return an index? Then there could be an rangeIndex (or maybe, 'ranged') that gives you an index on both slices and channels.
Sign in to reply to this message.
channels are the inconsistency, not slices
Sign in to reply to this message.
http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match_... File src/pkg/path/filepath/match_test.go (right): http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match_... src/pkg/path/filepath/match_test.go:98: func collect(ch <-chan string) []string { i thought you were just going to rename containsChan to contains. that's a preferable design - no need to copy all the values just to search for them
Sign in to reply to this message.
http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.go File src/pkg/path/filepath/match.go (right): http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.... src/pkg/path/filepath/match.go:216: matches := make(chan string) Should this be a buffered channel? I often use unbuffered channels and channels of size 1, but unless I'm using a channel for rate control ("n tasks running at a time"), I'm never sure how large to make my "optimized-for-throughput" buffered channels. I end up picking some arbitrary number like 64. I'm curious what others do in cases like this.
Sign in to reply to this message.
http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.go File src/pkg/path/filepath/match.go (right): http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.... src/pkg/path/filepath/match.go:216: matches := make(chan string) Some buffering would be fine, since it allows the two halves to run more in parallel. There's no need for fine synchronization. An arbitrary number is probably fine. Any smallish value will get whatever benefit there is.
Sign in to reply to this message.
Cheers. (I just found hg upload.) http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.go File src/pkg/path/filepath/match.go (right): http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.... src/pkg/path/filepath/match.go:216: matches := make(chan string) On 2011/03/25 20:17:14, r wrote: > Some buffering would be fine, since it allows the two halves to run more in > parallel. There's no need for fine synchronization. > > An arbitrary number is probably fine. Any smallish value will get whatever > benefit there is. Done. Though it would be nice if I could pull the arbitrary number from a constant in some package like os or sync. http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match_... File src/pkg/path/filepath/match_test.go (right): http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match_... src/pkg/path/filepath/match_test.go:98: func collect(ch <-chan string) []string { On 2011/03/25 19:55:21, r wrote: > i thought you were just going to rename containsChan to contains. that's a > preferable design - no need to copy all the values just to search for them Then I can't use the output of the channel in the error message. I could call Glob() again for the error message, but that could hide an indeterminism bug.
Sign in to reply to this message.
Ping. Any interest in this? On 2011/03/25 20:27:32, david.crawshaw wrote: > Cheers. (I just found hg upload.) > > http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.go > File src/pkg/path/filepath/match.go (right): > > http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match.... > src/pkg/path/filepath/match.go:216: matches := make(chan string) > On 2011/03/25 20:17:14, r wrote: > > Some buffering would be fine, since it allows the two halves to run more in > > parallel. There's no need for fine synchronization. > > > > An arbitrary number is probably fine. Any smallish value will get whatever > > benefit there is. > > Done. Though it would be nice if I could pull the arbitrary number from a > constant in some package like os or sync. > > http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match_... > File src/pkg/path/filepath/match_test.go (right): > > http://codereview.appspot.com/4276078/diff/10002/src/pkg/path/filepath/match_... > src/pkg/path/filepath/match_test.go:98: func collect(ch <-chan string) []string > { > On 2011/03/25 19:55:21, r wrote: > > i thought you were just going to rename containsChan to contains. that's a > > preferable design - no need to copy all the values just to search for them > > Then I can't use the output of the channel in the error message. I could call > Glob() again for the error message, but that could hide an indeterminism bug.
Sign in to reply to this message.
apparently not
Sign in to reply to this message.
R=close
Sign in to reply to this message.
|
