|
|
Descriptionpath/filepath: implement Match and Glob on windows
As discussed on golang-dev, windows will use
"\" as path separator. No escaping allowed.
Patch Set 1 #Patch Set 2 : diff -r ea1ec77f14c7 https://go.googlecode.com/hg/ #Patch Set 3 : diff -r ea1ec77f14c7 https://go.googlecode.com/hg/ #Patch Set 4 : diff -r 087c6e15702e https://go.googlecode.com/hg/ #
Total comments: 2
Patch Set 5 : diff -r 360c1d325ed4 https://go.googlecode.com/hg/ #MessagesTotal messages: 28
Hello golang-dev@googlegroups.com (cc: r@golang.org), I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
It looks to me that filepath.Glob is not working on Windows. I also noticed some potential problems in unix implementation (see my new TODO). Perhaps I am misguided here, but before I venture into implementing windows version of Glob, I need to settle my understanding. Should TODO tests pass? I also welcome any suggestion about how to implement windows version. Biggest concern I have is that \ is escape char, as well as path separator on windows. Patterns containing both will be confusing to both path/filepath internal functions that work on patterns and on paths. Thank you. Alex
Sign in to reply to this message.
On Mar 14, 2012, at 3:22 PM, alex.brainman@gmail.com wrote: > It looks to me that filepath.Glob is not working on Windows. I also > noticed some potential problems in unix implementation (see my new > TODO). Perhaps I am misguided here, but before I venture into > implementing windows version of Glob, I need to settle my understanding. > Should TODO tests pass? > > I also welcome any suggestion about how to implement windows version. > Biggest concern I have is that \ is escape char, as well as path > separator on windows. Patterns containing both will be confusing to both > path/filepath internal functions that work on patterns and on paths. The use of backslash has the property that on Windows a simple multielement path will not match itself. Perhaps on Windows we need a different escape character, just as we have a different path separator. The file name syntax is so different I'm not worried about incompatibilities; for complex patterns they won't apply anyway, while for simple ones like *.go they'll be fine. If we were to choose a different one for windows (only), what would make sense? Let me think about this. -rob
Sign in to reply to this message.
On Wednesday, March 14, 2012 3:36:21 PM UTC+11, r wrote: > > > ... Perhaps on Windows we need a different escape character, ... > This would confuse me. I am use to \ in regex and others. > If we were to choose a different one for windows (only), what would make > sense? > > As far as I know (and I could be mistaken here) Windows file pattern matching is very limited: * and ?, and only in the last (file) portion of path. I wouldn't be overly concerned if we made our implementation similar to that. But what do I know. Perhaps, it is best we wait for others to comment. The only problem I see with that is unix people will expect their code to be portable ... As to "escape character on Windows" - I really do not know. These are "reserved" characters (http://goo.gl/N2OOw) as far as file names are concerned on Windows: - < (less than) - > (greater than) - : (colon) - " (double quote) - / (forward slash) - \ (backslash) - | (vertical bar or pipe) - ? (question mark) - * (asterisk) We can't use : - it can be part of path (c:\a.txt). " will be too confusing. \ and / both can be part of path.? and * are for pattern matching. So, that leaves <, > and |. None of them looks good to me. Perhaps either \ or / is not such a bad ideas after all. Alex
Sign in to reply to this message.
I won't pass for windows. Windows can't escape characters in the path. And as far as I memory, Glob() is designed for slash path separator to use Glob(). https://groups.google.com/d/topic/golang-dev/CuRp83hrpQg/discussion If working Glob() on windows, we must call FromSlash() before call Glob(). I wrote grep application few month ago. it call FromSlash() before calling Glob(). https://github.com/mattn/jvgrep It work fine for me. :) This test should be skipped on windows. On 2012/03/14 05:10:18, brainman wrote: > On Wednesday, March 14, 2012 3:36:21 PM UTC+11, r wrote: > > > > > > ... Perhaps on Windows we need a different escape character, ... > > > This would confuse me. I am use to \ in regex and others. > > > > If we were to choose a different one for windows (only), what would make > > sense? > > > > As far as I know (and I could be mistaken here) Windows file pattern > matching is very limited: * and ?, and only in the last (file) portion of > path. I wouldn't be overly concerned if we made our implementation similar > to that. But what do I know. Perhaps, it is best we wait for others to > comment. The only problem I see with that is unix people will expect their > code to be portable ... > > As to "escape character on Windows" - I really do not know. These are > "reserved" characters (http://goo.gl/N2OOw) as far as file names are > concerned on Windows: > > - < (less than) > - > (greater than) > - : (colon) > - " (double quote) > - / (forward slash) > - \ (backslash) > - | (vertical bar or pipe) > - ? (question mark) > - * (asterisk) > > We can't use : - it can be part of path (c:\a.txt). " will be too > confusing. \ and / both can be part of path.? and * are for pattern > matching. So, that leaves <, > and |. None of them looks good to me. > Perhaps either \ or / is not such a bad ideas after all. > > Alex
Sign in to reply to this message.
typo: s/I won't pass/It won't pass/ On 2012/03/14 07:01:24, mattn wrote: > I won't pass for windows. Windows can't escape characters in the path. And as > far as I memory, Glob() is designed for slash path separator to use Glob(). > > https://groups.google.com/d/topic/golang-dev/CuRp83hrpQg/discussion > > If working Glob() on windows, we must call FromSlash() before call Glob(). I > wrote grep application few month ago. it call FromSlash() before calling Glob(). > > https://github.com/mattn/jvgrep > > It work fine for me. :) > This test should be skipped on windows. > > On 2012/03/14 05:10:18, brainman wrote: > > On Wednesday, March 14, 2012 3:36:21 PM UTC+11, r wrote: > > > > > > > > > ... Perhaps on Windows we need a different escape character, ... > > > > > This would confuse me. I am use to \ in regex and others. > > > > > > > If we were to choose a different one for windows (only), what would make > > > sense? > > > > > > As far as I know (and I could be mistaken here) Windows file pattern > > matching is very limited: * and ?, and only in the last (file) portion of > > path. I wouldn't be overly concerned if we made our implementation similar > > to that. But what do I know. Perhaps, it is best we wait for others to > > comment. The only problem I see with that is unix people will expect their > > code to be portable ... > > > > As to "escape character on Windows" - I really do not know. These are > > "reserved" characters (http://goo.gl/N2OOw) as far as file names are > > concerned on Windows: > > > > - < (less than) > > - > (greater than) > > - : (colon) > > - " (double quote) > > - / (forward slash) > > - \ (backslash) > > - | (vertical bar or pipe) > > - ? (question mark) > > - * (asterisk) > > > > We can't use : - it can be part of path (c:\a.txt). " will be too > > confusing. \ and / both can be part of path.? and * are for pattern > > matching. So, that leaves <, > and |. None of them looks good to me. > > Perhaps either \ or / is not such a bad ideas after all. > > > > Alex
Sign in to reply to this message.
On Wed, Mar 14, 2012 at 00:36, Rob 'Commander' Pike <r@google.com> wrote: > The use of backslash has the property that on Windows a simple multielement path will not match itself. Perhaps on Windows we need a different escape character, just as we have a different path separator. The file name syntax is so different I'm not worried about incompatibilities; for complex patterns they won't apply anyway, while for simple ones like *.go they'll be fine. > > If we were to choose a different one for windows (only), what would make sense? We could define that the \ means \ unless it is followed by *?[]- That would apply on all systems, not just Windows. Russ
Sign in to reply to this message.
On 14 March 2012 18:57, Russ Cox <rsc@golang.org> wrote: > On Wed, Mar 14, 2012 at 00:36, Rob 'Commander' Pike <r@google.com> wrote: >> The use of backslash has the property that on Windows a simple multielement path will not match itself. Perhaps on Windows we need a different escape character, just as we have a different path separator. The file name syntax is so different I'm not worried about incompatibilities; for complex patterns they won't apply anyway, while for simple ones like *.go they'll be fine. >> >> If we were to choose a different one for windows (only), what would make sense? > > We could define that the \ means \ unless it is followed by *?[]- > That would apply on all systems, not just Windows. > > Russ how would you quote a \ ?
Sign in to reply to this message.
On 15/03/2012, at 5:57 AM, Russ Cox wrote: > On Wed, Mar 14, 2012 at 00:36, Rob 'Commander' Pike <r@google.com> wrote: >> The use of backslash has the property that on Windows a simple multielement path will not match itself. Perhaps on Windows we need a different escape character, just as we have a different path separator. The file name syntax is so different I'm not worried about incompatibilities; for complex patterns they won't apply anyway, while for simple ones like *.go they'll be fine. >> >> If we were to choose a different one for windows (only), what would make sense? > > We could define that the \ means \ unless it is followed by *?[]- > That would apply on all systems, not just Windows. i had the same idea last night. you need to add \ to the list. it might be the best compromise. -rob
Sign in to reply to this message.
On 2012/03/14 20:58:09, r2 wrote: > > ... you need to add \ to the list. ... > And why would we need that extra rule? Alex
Sign in to reply to this message.
On Mar 15, 2012, at 10:04 AM, alex.brainman@gmail.com wrote: > On 2012/03/14 20:58:09, r2 wrote: > >> ... you need to add \ to the list. ... > > > And why would we need that extra rule? "Need" may be too strong, but it's conceivable we'd want to discriminate between \- and \\- for instance. -rob
Sign in to reply to this message.
On Mar 15, 2012, at 10:43 AM, Rob 'Commander' Pike wrote: > > On Mar 15, 2012, at 10:04 AM, alex.brainman@gmail.com wrote: > >> On 2012/03/14 20:58:09, r2 wrote: >> >>> ... you need to add \ to the list. ... >> >> >> And why would we need that extra rule? > > "Need" may be too strong, but it's conceivable we'd want to discriminate between \- and \\- for instance. And now I see \\- looks like an SMB name. Not sure if that's an argument for or against. -rob
Sign in to reply to this message.
On 2012/03/14 23:45:34, r2 wrote: > > > > "Need" may be too strong, but it's conceivable we'd want to discriminate > between \- and \\- for instance. > > And now I see \\- looks like an SMB name. Not sure if that's an argument for or > against. > It is not just that. I think that forcing us, Windows users, to double every path delimiter is bit too much. Doubly so, considering that "strings" would need to double \ as well. So nice unix pattern like "a/b" would look like "a\\\\b" on windows. I do not see any problems with \- and \\-. First would be - character that needs to match - in the path. Second would be \ path delimiter followed immediately by - character - both need to match in the path. Perhaps, I am mistaken. Also, I am bit concerned that match patterns will be non-portable from unix to windows. I know, we are trying to fit "local" path rules as best as we could. I do not want to complicate things more then they need to be, but perhaps we could provide some "compiler" that translate any path pattern into "OS-independant" form. Alex
Sign in to reply to this message.
On Mar 15, 2012, at 11:01 AM, alex.brainman@gmail.com wrote: > On 2012/03/14 23:45:34, r2 wrote: >> > >> > "Need" may be too strong, but it's conceivable we'd want to > discriminate >> between \- and \\- for instance. > >> And now I see \\- looks like an SMB name. Not sure if that's an > argument for or >> against. > > > It is not just that. I think that forcing us, Windows users, to double > every path delimiter is bit too much. Doubly so, considering that > "strings" would need to double \ as well. So nice unix pattern like > "a/b" would look like "a\\\\b" on windows. > > I do not see any problems with \- and \\-. First would be - character > that needs to match - in the path. Second would be \ path delimiter > followed immediately by - character - both need to match in the path. > Perhaps, I am mistaken. > > Also, I am bit concerned that match patterns will be non-portable from > unix to windows. I know, we are trying to fit "local" path rules as best > as we could. I do not want to complicate things more then they need to > be, but perhaps we could provide some "compiler" that translate any path > pattern into "OS-independant" form. I formally withdraw from this conversation. I am out of my depth. Do what works, as best it can, in Windows. One day I'd like to meet the person who chose \ as the separator for DOS, or perhaps that was CP/M, just to chat about the unfortunate accidents of technological history. -rob
Sign in to reply to this message.
On 2012/03/15 00:05:14, r2 wrote: > What about we start with rsc rule: ... the \ means \ unless it is followed by *?[]- ... We also say that both / and \ will match correspondent os path delimiter. So that "a/b" will work on windows too. Same "a\\b" (or `a\b`) will work on unix. That should allow users to use whatever pattern they wish and it will match on either os. No duplicating \ inside pattern needed. Where is teh downside? Alex
Sign in to reply to this message.
On Thu, Mar 15, 2012 at 01:36, <alex.brainman@gmail.com> wrote: > ... the \ means \ unless it is followed by *?[]- ... > > Where is teh downside? There are paths like \\?\C:\foo, if I remember correctly. --Benny. -- The first essential in chemistry is that you should perform practical work and conduct experiments, for he who performs not practical work nor makes experiments will never attain the least degree of mastery. -- Abu Musa Jabir ibn Hayyan (721-815)
Sign in to reply to this message.
> There are paths like \\?\C:\foo, ... This file name will match \\\?\C:\foo. And these are rarely used. Alex
Sign in to reply to this message.
What if we say that on Windows there is no escape character?
Sign in to reply to this message.
On 15 March 2012 17:21, Russ Cox <rsc@golang.org> wrote: > What if we say that on Windows there is no escape character? after all, paths in Windows can't contain ? or *, right?
Sign in to reply to this message.
On Thu, Mar 15, 2012 at 13:35, roger peppe <rogpeppe@gmail.com> wrote: > after all, paths in Windows can't contain ? or *, right? the disallowed character set is <>:"/\|?* even if not, i would not be worried about this. i can't remember the last time i needed to escape ? or * in a glob pattern.
Sign in to reply to this message.
Here is my thinking: 1) We have decided that our path separators will be different on different os. This is consistent with what os expects: / on unix, \ on windows. We should continue the story in match patterns. Windows patterns should use \ to match path delimiter, like unix / patterns do. 2) I want all (if not all, then most) match patterns to work on both unix and windows. If I have `a/b` pattern, I want it to match on windows despite of windows path really look like `a\b`. If we continue that line of thinking, then pattern of `a\b` (supposedly written by windows programmer) should match `a/b` file on unix. So I propose that both / and \ will match path separator regardless of os. 3) I want match patterns to be familiar to what all of us expect: * and ? and [] ranges. We need to escape (will continue to use \) * and ? and [ in patterns, and escape - and ] in ranges. 4) I want simple patterns to look simple. If we use \ as path separator, then we do not want to have to escape it every time I use it. So lets not escape \. Lets not escape anything, but all the ones listed in 3). 5) I would like to drop escaping altogether, but I do not think it is possible. We still have some chars, like -, where escaping is surely unavoidable. I also would like rules to be the same on both unix and windows (see 2.), so if one platform has it, all others should too. So perhaps something like this: // The pattern syntax is: // // pattern: // { term } // term: // '/' or '\\' matches any single Separator character // '*' matches any sequence of non-Separator characters // '?' matches any single non-Separator character // '[' [ '^' ] { character-range } ']' // character class (must be non-empty) // c matches character c (c != '*', '?', '\\', '/', '[') // '\\' '*' matches character '*' // '\\' '?' matches character '?' // // character-range: // '/' or '\\' matches any single Separator character // c matches character c (c != '\\', '/', '-', ']') // '\\' '-' matches character '-' // '\\' ']' matches character ']' // lo '-' hi matches character c for lo <= c <= hi // If no objection, I will try to implement this. Alex
Sign in to reply to this message.
On 15 March 2012 18:05, Russ Cox <rsc@golang.org> wrote: > On Thu, Mar 15, 2012 at 13:35, roger peppe <rogpeppe@gmail.com> wrote: >> after all, paths in Windows can't contain ? or *, right? > > the disallowed character set is <>:"/\|?* > even if not, i would not be worried about this. > i can't remember the last time i needed to > escape ? or * in a glob pattern. i'm not concerned about using them literally, but it would be nice if we could implement the following function portably and correctly, even if we've constructed dir by reading a file system and it happens to contain metacharacters. func ArchivesTwoLevelsDeep(dir string) ([]string, error) { return filepath.Glob(filepath.Join(quoteMeta(dir), "*", "*.tgz")) }
Sign in to reply to this message.
I object to interpreting \ on Unix as meaning /, just as I object to interpreting \ on Windows as being an escape character. I don't believe we can change anything about Unix right now. We can make a very targeted fix for Windows, and I think the right fix is to make \ be a separator instead of an escape character, but only in the Windows version. Russ
Sign in to reply to this message.
Hello golang-dev@googlegroups.com, r@google.com, mattn.jp@gmail.com, rsc@golang.org, rogpeppe@gmail.com, bsiegert@gmail.com (cc: golang-dev@googlegroups.com, r@golang.org), Please take another look.
Sign in to reply to this message.
LGTM http://codereview.appspot.com/5825044/diff/13001/src/pkg/path/filepath/match.go File src/pkg/path/filepath/match.go (right): http://codereview.appspot.com/5825044/diff/13001/src/pkg/path/filepath/match.... src/pkg/path/filepath/match.go:41: // Escaping is not allowed on windows. Instead, '\\' is treated as s/w/W/ maybe "allowed" id the wrong word. also put it first On Windows, escaping is disabled. Instead, ...
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=a216dfd16073 *** path/filepath: implement Match and Glob on windows As discussed on golang-dev, windows will use "\" as path separator. No escaping allowed. R=golang-dev, r, mattn.jp, rsc, rogpeppe, bsiegert, r CC=golang-dev http://codereview.appspot.com/5825044 http://codereview.appspot.com/5825044/diff/13001/src/pkg/path/filepath/match.go File src/pkg/path/filepath/match.go (right): http://codereview.appspot.com/5825044/diff/13001/src/pkg/path/filepath/match.... src/pkg/path/filepath/match.go:41: // Escaping is not allowed on windows. Instead, '\\' is treated as On 2012/03/19 03:39:11, r wrote: > s/w/W/ > maybe "allowed" id the wrong word. > also put it first > > On Windows, escaping is disabled. Instead, ... Done.
Sign in to reply to this message.
> > One day I'd like to meet the person who chose \ as the separator for DOS, > or perhaps that was CP/M, just to chat about the unfortunate accidents of > technological history. > Why is the DOS path character "\"?<http://blogs.msdn.com/b/larryosterman/archive/2005/06/24/432386.aspx>
Sign in to reply to this message.
> > Why is the DOS path character "\"?<http://blogs.msdn.com/b/larryosterman/archive/2005/06/24/432386.aspx> > Seems to be significantly older than that: http://news.ycombinator.com/item?id=3723529
Sign in to reply to this message.
|