|
|
Created:
12 years, 8 months ago by Ross Light Modified:
12 years, 8 months ago Reviewers:
CC:
m.n.summerfield, adg, kevlar, rsc, gustavo_niemeyer.net, niemeyer, golang-dev Visibility:
Public. |
Descriptionxml: Marshal "parent>child" tags correctly
Fixes issue 2119
Patch Set 1 #Patch Set 2 : diff -r 443be59de1ba https://go.googlecode.com/hg/ #Patch Set 3 : diff -r 443be59de1ba https://go.googlecode.com/hg/ #
Total comments: 6
Patch Set 4 : diff -r 443be59de1ba8e30507b567e9f74a37e815c2e8c https://go.googlecode.com/hg/ #Patch Set 5 : diff -r 443be59de1ba8e30507b567e9f74a37e815c2e8c https://go.googlecode.com/hg/ #
Total comments: 8
Patch Set 6 : diff -r 0d64f70fe250cb1158633fae8fb7e2cd351adaba https://go.googlecode.com/hg/ #
Total comments: 11
Patch Set 7 : diff -r 0d64f70fe250cb1158633fae8fb7e2cd351adaba https://go.googlecode.com/hg/ #
Total comments: 2
Patch Set 8 : diff -r 0d64f70fe250cb1158633fae8fb7e2cd351adaba https://go.googlecode.com/hg/ #
Total comments: 3
Patch Set 9 : diff -r 0d64f70fe250cb1158633fae8fb7e2cd351adaba https://go.googlecode.com/hg/ #MessagesTotal messages: 31
I'm not convinced that this change is worthwhile; I haven't tried it, but it seems like it wouldn't be difficult to contrive a case which would break this. I'm also not convinced that this marshals parent>child "correctly" at all, because the following could have two interpretations: var group = struct{ Value []string `xml:"values>value"` }{[]string{"foo", "bar"}} <Group> <values> <value>foo</value> <value>bar</value> </values> </Group> or <Group> <values><value>foo</value></values> <values><value>bar</value></values> </Group> Both of these will unmarshal properly. http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go#newcode69 src/pkg/xml/marshal.go:69: type mgroup struct { This type name is somewhat opaque. Please use a more descriptive name. http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:248: func (p *printer) marshalGroup(group mgroup, name string) os.Error { it might be more idiomatic for this to be a method on mgroup, which can still have an embedded io.Writer. http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal_test.go File src/pkg/xml/marshal_test.go (right): http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal_test.go#n... src/pkg/xml/marshal_test.go:189: }, I really don't think this is an exhaustive enough test to convince me that this code works correctly in all cases.
Sign in to reply to this message.
On 23 August 2011 02:19, <kevlar@google.com> wrote: > I'm not convinced that this change is worthwhile; I haven't tried it, > but it seems like it wouldn't be difficult to contrive a case which > would break this. > > I'm also not convinced that this marshals parent>child "correctly" at > all, because the following could have two interpretations: > > var group = struct{ > Value []string `xml:"values>value"` > }{[]string{"foo", "bar"}} > > <Group> > <values> > <value>foo</value> > <value>bar</value> > </values> > </Group> > > or > > <Group> > <values><value>foo</value></values> > <values><value>bar</value></values> > </Group> > > Both of these will unmarshal properly. That's a good point. Maybe we can't/shouldn't do this?
Sign in to reply to this message.
On 23 August 2011 09:38, Andrew Gerrand <adg@golang.org> wrote: >> <Group> >> <values> >> <value>foo</value> >> <value>bar</value> >> </values> >> </Group> >> >> or >> >> <Group> >> <values><value>foo</value></values> >> <values><value>bar</value></values> >> </Group> Or is the former the "sane" default? Should we add syntax specify which? I don't use the xml package so I don't have an opinion.
Sign in to reply to this message.
> > Or is the former the "sane" default? Should we add syntax specify which? > The latter case is the one that might actually convince me to use the "a>b" syntax due to the difficulty of going to-and-from that sort of Go data structure and a sensible representation (like a slice), and so I would say that (*if* we do this at all) it should be the default. The other can be easily emulated by a list of objects. Adding an extra syntax (e.g. "a+b" to add a "b" to the current "a" and "a>b" for a new "a" with a "b" child) would be just that... more syntax and more complexity, and I really like the current simplicity of Marshal. Adding an alternate syntax would also still be divergent from xml.Unmarshal. > I don't use the xml package so I don't have an opinion. > In my (obviously minority) opinion, we should remove "a>b" from xml.Unmarshal instead of adding it to xml.Marshal so that they are (closer to) one another's inverse. JSON seems to do fine with importing/exporting data without structural shortcuts like this, though I completely understand that XML can often look like a caffeinated squirrel made the schema. ~K
Sign in to reply to this message.
Throwing away the Unmarshal support is silly. This ambiguity only comes up when the annotation is on a slice type. Either interpretation is fine: pick one and test for it. Russ
Sign in to reply to this message.
On 2011/08/23 00:07:45, rsc wrote: > Throwing away the Unmarshal support is silly. > This ambiguity only comes up when the annotation > is on a slice type. Either interpretation is fine: > pick one and test for it. > > Russ +1 to first case. The second case is easier to implement, but most of the examples and test cases show the first case (and it is the one that I needed).
Sign in to reply to this message.
http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go#newcode69 src/pkg/xml/marshal.go:69: type mgroup struct { On 2011/08/22 16:19:37, kevlar wrote: > This type name is somewhat opaque. Please use a more descriptive name. Any ideas for what to name mgroup? I wasn't entirely satisfied with the name either. There isn't a direct DOM or XML analogue to what this does. http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal_test.go File src/pkg/xml/marshal_test.go (right): http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal_test.go#n... src/pkg/xml/marshal_test.go:189: }, On 2011/08/22 16:19:37, kevlar wrote: > I really don't think this is an exhaustive enough test to convince me that this > code works correctly in all cases. It doesn't convince me either. More tests coming.
Sign in to reply to this message.
ftr, the first case can much more easily be represented simply in a Go data structure by simply making a struct type with the list of elements in it. The second requires a loop every time you want to convert between a slice and the XML representation. Thus, I think that if "a>b" is supposed to be there to simplify the representation of awkward XML structures in "sensible" Go structures, the second case should win out. http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/6001/src/pkg/xml/marshal.go#newcode69 src/pkg/xml/marshal.go:69: type mgroup struct { On 2011/08/23 03:49:59, Ross Light wrote: > On 2011/08/22 16:19:37, kevlar wrote: > > This type name is somewhat opaque. Please use a more descriptive name. > > Any ideas for what to name mgroup? I wasn't entirely satisfied with the name > either. There isn't a direct DOM or XML analogue to what this does. well, for starters, spell out whatever 'm' is. nestedFieldGroup or something descriptive like that is probably preferable.
Sign in to reply to this message.
> ftr, the first case can much more easily be represented simply in a Go IMO it's not just about simplicity, but about expected behavior. I actually vote for the first one as well, FWIW, and that this: type T struct { A "v>a" B "v>b" } Be represented as: <v> <a>...</a> <b>...</b> </v> Rather than: <v><a>...</a></v> <v><b>...</b></v> That behavior seems to be what most people in this thread expect. Whatever happens, we can also support the alternative down the road. -- Gustavo Niemeyer http://niemeyer.net http://niemeyer.net/plus http://niemeyer.net/twitter http://niemeyer.net/blog -- I never filed a patent.
Sign in to reply to this message.
> IMO it's not just about simplicity, but about expected behavior. I > actually vote for the first one as well, FWIW, and that this: Some less subjective reasons for the suggested behavior: Format: http://en.wikipedia.org/wiki/RSS_(file_format) Tag: "channel>item" Format: http://en.wikipedia.org/wiki/XML_Shareable_Playlist_Format Tag: "trackslist>track" Format: http://goo.gl/Mcm3b Tag: "groupset>item" I can't think of many cases for the other behavior. Do you have some examples? -- Gustavo Niemeyer http://niemeyer.net http://niemeyer.net/plus http://niemeyer.net/twitter http://niemeyer.net/blog -- I never filed a patent.
Sign in to reply to this message.
On Tue, Aug 23, 2011 at 10:28 AM, Gustavo Niemeyer <gustavo@niemeyer.net>wrote: > > IMO it's not just about simplicity, but about expected behavior. I > > actually vote for the first one as well, FWIW, and that this: > > Some less subjective reasons for the suggested behavior: > > Format: http://en.wikipedia.org/wiki/RSS_(file_format) > Tag: "channel>item" > The XML package actually uses RSS as a test case. It works out much better, for this format, to not use any shortcuts at all. Item is not a single element for which there is a one-to-one relationship with a channel, and so is not even applicable anyway. > Format: http://en.wikipedia.org/wiki/XML_Shareable_Playlist_Format > Tag: "trackslist>track" > This is one example where there is a silly level of nesting that should be skipped, yes (at least from that example). I'm not actually sure that this changelist would be able to handle that, though. > Format: http://goo.gl/Mcm3b > Tag: "groupset>item" > Indeed. Though I have to wonder if we should be using as an example an API specification that didn't even take the time to make its example valid XML... I think the better example for your case in this API would be: StateCode int "instanceState>code" StateName string "instanceState>name" I can't think of many cases for the other behavior. Do you have some > examples? To reuse one of your examples, if you're using the playlist format and only want to store the locations (e.g. for a list of radio streaming stations which provide their own title) you could trivially use something like Locations []string `xml:"track>location"`. This discussion just reinforces my belief that this shouldn't be a part of xml.Marshal though. I think skipping nesting levels on input is a very valid thing to do, especially since it can be used to great effect to skip over the data that you don't care about, but if you're actually trying to speak an XML language with another piece of software (e.g. using xml.Marshal) I think it's reasonable to expect that your data structure maps directly with the one you're producing. ~K
Sign in to reply to this message.
> To reuse one of your examples, if you're using the playlist format and only > want to store the locations (e.g. for a list of radio streaming stations > which provide their own title) you could trivially use something like > Locations []string `xml:"track>location"`. That's the thing.. this is not very realistic.. I can't see people marshalling a broken document including just the locations in a useful way. The > syntax is most useful for skipping silly (and common) nesting levels, as you pointed out, or for obtaining a location list without the goal of marshalling it back. When people do have the intention of marshalling it back, they will produce types that read more of the document content to make the output meaningful as well. > This discussion just reinforces my belief that this shouldn't be a part of > xml.Marshal though. I think skipping nesting levels on input is a very > valid thing to do, especially since it can be used to great effect to skip Indeed, and if you handle the marshalling as we've been suggesting, the skipping will work for marshalling as well in several cases. -- Gustavo Niemeyer http://niemeyer.net http://niemeyer.net/plus http://niemeyer.net/twitter http://niemeyer.net/blog -- I never filed a patent.
Sign in to reply to this message.
Hello again, I realized there is a simpler stack-based approach to this problem that respects field order. It still uses the first form, but I'm much happier with it now (hopefully you will agree). Cheers, Ross Light
Sign in to reply to this message.
I'm happy with this code, it just needs a bit more documentation for users and future maintainers. http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:50: // - the name '???'. Add a line here about the name of the element being the last element of a >-chain http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:60: // Describe here (or somewhere in this comment at least) how the a>b works. You might even give an example about how One a>b>c Two a>d Three a>e>f would be marshaled. http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:231: type parentStack struct { I think this is a lot more clear. It's very easy to understand what the stack represents and how the open/close tags are generated. http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:236: func (s *parentStack) move(parents []string) { Please comment this. I think you'll find that // move moves... is not how the comment starts, so in the process of writing that comment a better name might surface. perhaps: // setParents updates the parents of the current XML context to match the stack given as its argument, generating close tags up to the highest common level and open tags back to the specified context. or something like that. http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal_test.go File src/pkg/xml/marshal_test.go (right): http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal_test.go#... src/pkg/xml/marshal_test.go:85: type MixedNested struct { Good test cases. These will be really helpful to anyone who's looking at the code to help figure out how this struct tag format works.
Sign in to reply to this message.
http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:50: // - the name '???'. On 2011/08/24 17:26:09, kevlar wrote: > Add a line here about the name of the element being the last element of a > >-chain Done. http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:60: // On 2011/08/24 17:26:09, kevlar wrote: > Describe here (or somewhere in this comment at least) how the a>b works. You > might even give an example about how > One a>b>c > Two a>d > Three a>e>f > would be marshaled. Done. http://codereview.appspot.com/4941042/diff/13002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:236: func (s *parentStack) move(parents []string) { On 2011/08/24 17:26:09, kevlar wrote: > Please comment this. I think you'll find that // move moves... is not how the > comment starts, so in the process of writing that comment a better name might > surface. > > perhaps: > // setParents updates the parents of the current XML context to match the stack > given as its argument, generating close tags up to the highest common level and > open tags back to the specified context. > or something like that. Done.
Sign in to reply to this message.
I realized this didn't handle nils as well as I liked, so here's a new patch.
Sign in to reply to this message.
Some initial comments re. docs: http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:43: // element containing the data. s/a single XML element/one or more XML elements/ http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:45: // The name of that XML element is taken from, in order of preference: s/name of that XML element/name for the XML elements/ http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:49: // used to obtain the data This looks incorrect. All of the elements in the chain are used now. It also overlaps with the bottom point, so I suggest just removing this point from the list. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:63: // If a field uses a tag a>b>c, then the element c will be nested inside Tags seem to be consistently referenced within quotes elsewhere. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:65: // same parent will be enclosed in one tag. For example: s/one tag/one XML element/? http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:86: // </result> Nice to have an example, thanks.
Sign in to reply to this message.
looks good so far
Sign in to reply to this message.
Nice, thanks for working on this Ross. I've only got a minor suggestion. LGTM either way. http://codereview.appspot.com/4941042/diff/25002/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/25002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:241: } Rather than computing parents/name and discarding in case it's not useful, the vf test here might be inverted and done upfront with break.
Sign in to reply to this message.
> I've only got a minor suggestion. LGTM either way. Plus the doc suggestions, that is.
Sign in to reply to this message.
I realized there was another edge case with nils that wasn't being handled correctly. See the new unit tests for details. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:43: // element containing the data. On 2011/08/25 20:38:58, niemeyer wrote: > s/a single XML element/one or more XML elements/ Done. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:45: // The name of that XML element is taken from, in order of preference: On 2011/08/25 20:38:58, niemeyer wrote: > s/name of that XML element/name for the XML elements/ Done. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:49: // used to obtain the data On 2011/08/25 20:38:58, niemeyer wrote: > This looks incorrect. All of the elements in the chain are used now. It also > overlaps with the bottom point, so I suggest just removing this point from the > list. Done. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:63: // If a field uses a tag a>b>c, then the element c will be nested inside On 2011/08/25 20:38:58, niemeyer wrote: > Tags seem to be consistently referenced within quotes elsewhere. Done. http://codereview.appspot.com/4941042/diff/22001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:65: // same parent will be enclosed in one tag. For example: On 2011/08/25 20:38:58, niemeyer wrote: > s/one tag/one XML element/? Done. http://codereview.appspot.com/4941042/diff/25002/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/25002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:241: } On 2011/08/25 21:46:58, niemeyer wrote: > Rather than computing parents/name and discarding in case it's not useful, the > vf test here might be inverted and done upfront with break. Except I realized there's still one more edge case with nils here. I added new unit tests to demonstrate the problem.
Sign in to reply to this message.
Thanks, LGTM. Just a minor suggestion: http://codereview.appspot.com/4941042/diff/32001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/32001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:238: s.pop(parents) Can we call this trim instead of pop? It's not necessarily popping anything, and it's asymmetric with push. http://codereview.appspot.com/4941042/diff/32001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:240: s.push(parents[len(s.stack):]) Makes sense now, thanks.
Sign in to reply to this message.
http://codereview.appspot.com/4941042/diff/32001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4941042/diff/32001/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:238: s.pop(parents) On 2011/08/26 13:52:12, niemeyer wrote: > Can we call this trim instead of pop? It's not necessarily popping anything, and > it's asymmetric with push. Done.
Sign in to reply to this message.
Thanks Ross. I'm happy with it. Russ?
Sign in to reply to this message.
LGTM Very nice.
Sign in to reply to this message.
Ross, can you please s/Marshal/marshal/ the description?
Sign in to reply to this message.
On Fri, Aug 26, 2011 at 11:20, <n13m3y3r@gmail.com> wrote: > Ross, can you please s/Marshal/marshal/ the description? gustavo: feel free to do that during submit. hg clpatch 123 hg change 123 # edit description hg submit 123
Sign in to reply to this message.
> gustavo: feel free to do that during submit. > > hg clpatch 123 > hg change 123 # edit description > hg submit 123 Got it, thanks. Didn't know I was able to tweak the description in flight. -- Gustavo Niemeyer http://niemeyer.net http://niemeyer.net/plus http://niemeyer.net/twitter http://niemeyer.net/blog -- I never filed a patent.
Sign in to reply to this message.
On Fri, Aug 26, 2011 at 11:26, Gustavo Niemeyer <gustavo@niemeyer.net> wrote: >> gustavo: feel free to do that during submit. >> >> hg clpatch 123 >> hg change 123 # edit description >> hg submit 123 > > Got it, thanks. Didn't know I was able to tweak the description in flight. it was magic i added. :-)
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=7697a84ee19f *** xml: marshal "parent>child" tags correctly Fixes issue 2119 R=m.n.summerfield, adg, kevlar, rsc, gustavo, n13m3y3r CC=golang-dev http://codereview.appspot.com/4941042 Committer: Gustavo Niemeyer <gustavo@niemeyer.net>
Sign in to reply to this message.
|