|
|
Description xml: add Marshal and MarshalIndent
I have written up a Marshal and MarshalIndent pair that should
closely reflect the way that Unmarshal works. I would love feedback
on making this code more accessible and efficient... I haven't used
reflecton on this scale before, so there is probably a lot of work
that can be done on that.
Some potentially controversial things:
- All tag names are lower-cased by default.
- Zero-valued struct values are skipped.
- No namespace prefix (o:tag, etc) mechanism is supplied.
- You are allowed to marshal non-struct values (even though unmarshal
cannot handle them).
- A tag for a non-XMLName struct field that isn't "attr", "chardata",
or "innerxml" is used as the name of the tag. This could wreak
havoc if you try to marshal a protobuf struct.
- The "innerxml" and "chardata" are inserted verbatim. If you try to
marshal something straight from unmarshal, the results could be
unexpected (remove "innerxml" support from Marshal would be one
possible solution).
Patch Set 1 #Patch Set 2 : diff -r 2e64054a4cb2 https://go.googlecode.com/hg/ #Patch Set 3 : diff -r 2e64054a4cb2 https://go.googlecode.com/hg/ #
Total comments: 30
Patch Set 4 : diff -r 2e64054a4cb2 https://go.googlecode.com/hg/ #
Total comments: 18
Patch Set 5 : diff -r 3418f22c39eb https://go.googlecode.com/hg/ #
Total comments: 12
Patch Set 6 : diff -r 3418f22c39eb https://go.googlecode.com/hg/ #
Total comments: 19
Patch Set 7 : diff -r 3418f22c39eb https://go.googlecode.com/hg/ #Patch Set 8 : diff -r 3418f22c39eb https://go.googlecode.com/hg/ #Patch Set 9 : diff -r 3418f22c39eb https://go.googlecode.com/hg/ #
Total comments: 14
Patch Set 10 : diff -r cf0c2b126968 https://go.googlecode.com/hg/ #
MessagesTotal messages: 21
Hello rsc@golang.org, golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/atom_test.go File src/pkg/xml/atom_test.go (right): http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/atom_test.go#newcode6 src/pkg/xml/atom_test.go:6: var atom_value = &Feed{ no underscores; mixedCaps throughout see http://golang.org/doc/effective_go.html#names http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode1 src/pkg/xml/marshal.go:1: // Copyright 2011 Google Inc. All Rights Reserved. wrong copyright tags; see http://golang.org/doc/contribute.html http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode54 src/pkg/xml/marshal.go:54: func Marshal(v interface{}) (data []byte, err os.Error) { I'd make this func Marshal(w io.Writer, val interface{}) os.Error by analogy with Unmarshal. You'll probably want to start by doing bw := bufio.NewWriter(w) and then pass the *bufio.Writer throughout. i think you want a type printer struct { w *bufio.Writer ... other fields probably ... } and then all the helpers are methods on printer. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode59 src/pkg/xml/marshal.go:59: // MarshalIndent formats the given value in the same way as Marshal, except the resulting XML let's leave this for a different CL. would be better as a filter, like json.Indent is, so you can apply it to any xml. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode130 src/pkg/xml/marshal.go:130: func marshalValue(buf *bytes.Buffer, val reflect.Value, name string, I think this is more complicated than it needs to be. Look at package json's encodeState.reflectValue. It manages to get by with a single place that switches on kind. There are three or four places below, and I think they can probably be reduced. The general structure of the body of function should be ptr check method check print "<tag" print attributes if any print ">" print body (main switch on val.Kind()) print "</tag>" return As written there are many special cases and I think you can avoid most of them. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode132 src/pkg/xml/marshal.go:132: for { You do have to do something with pointers. I'd do switch val.Kind() { case reflect.Ptr, reflect.Interface: if val.IsNil() { return } p.reflectValue(val.Elem()) return } http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode145 src/pkg/xml/marshal.go:145: name = val.Kind().String() Seems wrong to have a tag named <slice> just because this is a slice. Require the caller to pass this in. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode163 src/pkg/xml/marshal.go:163: openTag(buf, prefix, indent, name) In the structure suggested above, the tag lines all go away. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode190 src/pkg/xml/marshal.go:190: case reflect.Map: I'd say just leave maps out for now. default: return &UnsupportedTypeError{val.Type()} (see same type in json) http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode210 src/pkg/xml/marshal.go:210: case reflect.Interface, reflect.Struct: s/reflect.Interface, // Interface was handled above. This code only works on struct. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode218 src/pkg/xml/marshal.go:218: var attrIdx, childIdx, rawIdx []int These slices are probably not worth their weight. Since the open tag stuff was handled above, just process each field one at a time and handle it when it happens. for i := 0; i < typ.NumField(); i++ { f := typ.Field(i) if f.PkgPath != "" || f.Tag == "attr" { continue } if f.Tag == "chardata" { ... continue } if f.Tag == "innerxml" { ... continue } p.reflectValue(val.Field(i)) } The handling of the open tag above would be a similar loop: fmt.Fprint(b.w, "<%s", tag) if typ.Kind() == reflect.Struct { v = val.FieldByName("XMLName") if v.IsValid() { ... name space } for i := 0; i < typ.NumField(); i++ { f := typ.Field(i) if f.PkgPath != "" || f.Tag != "attr" { continue } ... attribute } } fmt.Fprint(b.w, ">") http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode221 src/pkg/xml/marshal.go:221: if !unicode.IsUpper(int(field.Name[0])) { // TODO(kevlar): utf8? Look at what json does. (Good advice in general.) http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode358 src/pkg/xml/marshal.go:358: // Skip zero values Surprising to me. I'd say leave them in. Json doesn't skip them. Just expensive for no reason. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode415 src/pkg/xml/marshal.go:415: // A MarshalXMLError is returned when Marshal or MarshalIndent encounter a type already in package xml. MarshalError is fine probably don't need it anyway. the only error is unsupported type. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode427 src/pkg/xml/marshal.go:427: func (mxe *MarshalXMLError) String() string { s/mxe/e/
Sign in to reply to this message.
PTAL Down to ~300 lines from ~450. If you do indeed think I should print <tag></tag> instead of <tag/>, then it'll drop further. Thanks for the review! http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/atom_test.go File src/pkg/xml/atom_test.go (right): http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/atom_test.go#newcode6 src/pkg/xml/atom_test.go:6: var atom_value = &Feed{ On 2011/05/25 03:12:58, rsc wrote: > no underscores; mixedCaps > throughout > see http://golang.org/doc/effective_go.html#names Done. I'm so bad at that game. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode1 src/pkg/xml/marshal.go:1: // Copyright 2011 Google Inc. All Rights Reserved. On 2011/05/25 03:12:58, rsc wrote: > wrong copyright tags; see http://golang.org/doc/contribute.html Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode54 src/pkg/xml/marshal.go:54: func Marshal(v interface{}) (data []byte, err os.Error) { On 2011/05/25 03:12:58, rsc wrote: > I'd make this > > func Marshal(w io.Writer, val interface{}) os.Error > > by analogy with Unmarshal. You'll probably want to > start by doing > > bw := bufio.NewWriter(w) > > and then pass the *bufio.Writer throughout. > > i think you want a > > type printer struct { > w *bufio.Writer > ... other fields probably ... > } > > and then all the helpers are methods on printer. Done. I had the old signature to emulate json.Marshal(v interface{}) ([]bytes, os.Error). http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode59 src/pkg/xml/marshal.go:59: // MarshalIndent formats the given value in the same way as Marshal, except the resulting XML On 2011/05/25 03:12:58, rsc wrote: > let's leave this for a different CL. > would be better as a filter, like json.Indent is, > so you can apply it to any xml. Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode130 src/pkg/xml/marshal.go:130: func marshalValue(buf *bytes.Buffer, val reflect.Value, name string, On 2011/05/25 03:12:58, rsc wrote: > I think this is more complicated than it needs to be. > Look at package json's encodeState.reflectValue. > It manages to get by with a single place that switches > on kind. There are three or four places below, and > I think they can probably be reduced. > > The general structure of the body of function should be > > ptr check > method check > print "<tag" > print attributes if any > print ">" > print body (main switch on val.Kind()) > print "</tag>" > return > > As written there are many special cases and I think you > can avoid most of them. Simplified quite a bit, but I still have some switches. Did you actually want me to drop the empty tag shorthand notation? With the zero checks, it isn't used a whole lot, but in the atom_test case the link turns into <link href=""></link> when your example output had it as <link href=""/>. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode132 src/pkg/xml/marshal.go:132: for { On 2011/05/25 03:12:58, rsc wrote: > You do have to do something with pointers. I'd do > > switch val.Kind() { > case reflect.Ptr, reflect.Interface: > if val.IsNil() { > return > } > p.reflectValue(val.Elem()) > return > } Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode145 src/pkg/xml/marshal.go:145: name = val.Kind().String() On 2011/05/25 03:12:58, rsc wrote: > Seems wrong to have a tag named <slice> just because > this is a slice. Require the caller to pass this in. Done; the top level is now the only one that guesses its own name (in case it doesn't have an XMLName). http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode163 src/pkg/xml/marshal.go:163: openTag(buf, prefix, indent, name) On 2011/05/25 03:12:58, rsc wrote: > In the structure suggested above, the tag lines all go away. Done. Though it turns out that purely because of the empty tag shorthand, you end up with logic in every case anyway. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode190 src/pkg/xml/marshal.go:190: case reflect.Map: On 2011/05/25 03:12:58, rsc wrote: > I'd say just leave maps out for now. > > default: > return &UnsupportedTypeError{val.Type()} > > (see same type in json) Ah, I thought maps were handled this way by unmarshal. Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode210 src/pkg/xml/marshal.go:210: case reflect.Interface, reflect.Struct: On 2011/05/25 03:12:58, rsc wrote: > s/reflect.Interface, // > > Interface was handled above. This code only works on struct. Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode218 src/pkg/xml/marshal.go:218: var attrIdx, childIdx, rawIdx []int On 2011/05/25 03:12:58, rsc wrote: > These slices are probably not worth their weight. > Since the open tag stuff was handled above, just > process each field one at a time and handle it when > it happens. > > for i := 0; i < typ.NumField(); i++ { > f := typ.Field(i) > if f.PkgPath != "" || f.Tag == "attr" { > continue > } > if f.Tag == "chardata" { > ... > continue > } > if f.Tag == "innerxml" { > ... > continue > } > p.reflectValue(val.Field(i)) > } > > The handling of the open tag above would be a similar loop: > > fmt.Fprint(b.w, "<%s", tag) > if typ.Kind() == reflect.Struct { > v = val.FieldByName("XMLName") > if v.IsValid() { > ... name space > } > for i := 0; i < typ.NumField(); i++ { > f := typ.Field(i) > if f.PkgPath != "" || f.Tag != "attr" { > continue > } > ... attribute > } > } > fmt.Fprint(b.w, ">") Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode221 src/pkg/xml/marshal.go:221: if !unicode.IsUpper(int(field.Name[0])) { // TODO(kevlar): utf8? On 2011/05/25 03:12:58, rsc wrote: > Look at what json does. (Good advice in general.) Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode358 src/pkg/xml/marshal.go:358: // Skip zero values On 2011/05/25 03:12:58, rsc wrote: > Surprising to me. I'd say leave them in. > Json doesn't skip them. Just expensive for no reason. > In the test case you gave me via email, you omitted zero values in the XML output. For things like the atom test, there are quite a few fields that get omitted. I did a benchmark, and omitting them actually makes marshaling a large structure 10% faster even though the zero check itself is somewhat expensive. In terms of the surprise factor, it shouldn't be too surprising, because we expect XML to ignore things that have no data at all. Also, the values skipped here are only those that will be left alone (zero, if you start with an empty structure) by unmarshal. For values that you want to always print, I propose adding an "explicit" tag that will force them to be printed even if they are zero, for instance if you have indices and zero is valid, a human might want to see that zero value (though a computer would probably assume zero). http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode415 src/pkg/xml/marshal.go:415: // A MarshalXMLError is returned when Marshal or MarshalIndent encounter a type On 2011/05/25 03:12:58, rsc wrote: > already in package xml. MarshalError is fine > probably don't need it anyway. the only error > is unsupported type. Done. http://codereview.appspot.com/4539082/diff/7/src/pkg/xml/marshal.go#newcode427 src/pkg/xml/marshal.go:427: func (mxe *MarshalXMLError) String() string { On 2011/05/25 03:12:58, rsc wrote: > s/mxe/e/ Done.
Sign in to reply to this message.
Ping
Sign in to reply to this message.
http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:227: // Skip zero values I still think this is wrong. If the value is nil then the recursive call will skip it, so that doesn't have to be handled here. Skipping zero numbers and structs is strange. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:272: xp.WriteByte('/') I think it's fine to skip this optimization for now.
Sign in to reply to this message.
http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode35 src/pkg/xml/marshal.go:35: type xmlPrinter struct { In package xml, it's redundant to name things xmlFoo. s/xmlPrinter/printer/ s/xp/p/ http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode42 src/pkg/xml/marshal.go:42: // - If the type is named, that is the name of the top-level tag This comment is, I think, out of date. I hope that making it describe the new simpler code will also shorten it. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode71 src/pkg/xml/marshal.go:71: // TODO(kevlar): Would an exported MarshalInto appreciably faster? Useful? delete http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode93 src/pkg/xml/marshal.go:93: var ( use := instead of a var block kind := val.Kind() typ := val.Type() leave attr and serial until closer to where they are set/used http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:109: name = strings.ToLower(typ.Name()) Why does this override the name that was passed in? Let the top-level call pass in a name and then I think the topLevel bool and this code can go away. Also var attr []string does not allocated; attr := []string{} does. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:129: buf := bytes.NewBuffer(nil) should not need the buffer. ... slice and array code (below) ... // A struct with an XMLName field specifies the tag name. if kind == reflect.Struct { ... find, record name, xmlns } emit <tag // "attr" field tags become attributes if kind == reflect.Struct { ... attr loop ... } emit > switch on kind for data http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:180: // Slices are printed serially, so no open or close tag is used // Slices and arrays iterate over the elements. // They do not have an enclosing tag. if kind == reflect.Slice || kind == reflect.Array { .. code from swtich below return }
Sign in to reply to this message.
PTAL http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode35 src/pkg/xml/marshal.go:35: type xmlPrinter struct { On 2011/06/03 17:43:54, rsc wrote: > In package xml, it's redundant to name things xmlFoo. > s/xmlPrinter/printer/ > s/xp/p/ Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode42 src/pkg/xml/marshal.go:42: // - If the type is named, that is the name of the top-level tag On 2011/06/03 17:43:54, rsc wrote: > This comment is, I think, out of date. > I hope that making it describe the new simpler code > will also shorten it. Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode71 src/pkg/xml/marshal.go:71: // TODO(kevlar): Would an exported MarshalInto appreciably faster? Useful? On 2011/06/03 17:43:54, rsc wrote: > delete Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcode93 src/pkg/xml/marshal.go:93: var ( On 2011/06/03 17:43:54, rsc wrote: > use := instead of a var block > > kind := val.Kind() > typ := val.Type() > > leave attr and serial until closer to where they are set/used Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:109: name = strings.ToLower(typ.Name()) On 2011/06/03 17:43:54, rsc wrote: > Why does this override the name that was passed in? > Let the top-level call pass in a name and then I think > the topLevel bool and this code can go away. > Also var attr []string does not allocated; attr := []string{} does. > Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:129: buf := bytes.NewBuffer(nil) On 2011/06/03 17:43:54, rsc wrote: > should not need the buffer. > > ... slice and array code (below) ... > > // A struct with an XMLName field specifies the tag name. > if kind == reflect.Struct { > ... find, record name, xmlns > } > > emit <tag > > // "attr" field tags become attributes > if kind == reflect.Struct { > ... attr loop ... > } > > emit > > > switch on kind for data Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:180: // Slices are printed serially, so no open or close tag is used On 2011/06/03 17:43:54, rsc wrote: > // Slices and arrays iterate over the elements. > // They do not have an enclosing tag. > if kind == reflect.Slice || kind == reflect.Array { > .. code from swtich below > return > } Done. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:227: // Skip zero values On 2011/06/03 17:33:22, rsc wrote: > I still think this is wrong. > If the value is nil then the recursive call will skip it, > so that doesn't have to be handled here. > Skipping zero numbers and structs is strange. Fair enough. http://codereview.appspot.com/4539082/diff/8001/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:272: xp.WriteByte('/') On 2011/06/03 17:33:22, rsc wrote: > I think it's fine to skip this optimization for now. Done.
Sign in to reply to this message.
This is getting much simpler. Nice. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:64: func (p *printer) openTag(tag string, params []string) { now that these are only in one place you could just inline them. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:108: // Slices and arrays iterate over the elements. They do not have an enclosing tag. []byte is an exception here, no? http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:119: attr := []string{} Please just write the attributes out directly instead of making a slice. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:164: case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64, Bool, Uintptr. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:169: fmt.Fprint(p, val.Interface()) Please handle each case separately, so that print doesn't have to redo the effort: case reflect.Int, reflect.Int8, ...: p.Write([]byte(strconv.Itoa64(val.Int())) etc When you get to reflect.String remember that you need to escape it. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:179: name := strings.ToLower(ftyp.Name) name := ftyp.Name the atom example may need fixing.
Sign in to reply to this message.
PTAL http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:64: func (p *printer) openTag(tag string, params []string) { On 2011/06/03 18:53:53, rsc wrote: > now that these are only in one place you could just inline them. Done. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:108: // Slices and arrays iterate over the elements. They do not have an enclosing tag. On 2011/06/03 18:53:53, rsc wrote: > []byte is an exception here, no? Indeed. Done and test added. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:119: attr := []string{} On 2011/06/03 18:53:53, rsc wrote: > Please just write the attributes out directly instead of making a slice. Oops, I meant to do that already. Done. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:164: case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64, On 2011/06/03 18:53:53, rsc wrote: > Bool, Uintptr. Done. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:169: fmt.Fprint(p, val.Interface()) On 2011/06/03 18:53:53, rsc wrote: > Please handle each case separately, so that print doesn't have to redo the > effort: > > case reflect.Int, reflect.Int8, ...: > p.Write([]byte(strconv.Itoa64(val.Int())) > > etc > > When you get to reflect.String remember that you need to escape it. Done. http://codereview.appspot.com/4539082/diff/11002/src/pkg/xml/marshal.go#newco... src/pkg/xml/marshal.go:179: name := strings.ToLower(ftyp.Name) On 2011/06/03 18:53:53, rsc wrote: > name := ftyp.Name > > the atom example may need fixing. I added the relevant empty fields already, and the tests pass. Is that what you mean? http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:146: continue Does this also need to be escapped? Before, or after it's quoted? If so, intuitively I think it should be: Value: "3.14 < 42" // turns into: value="3.14 < 42" which would mean quote(escape(str))? http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:163: switch k := val.Kind(); k { Is there an easier way than defering this to fmt? The fmt logic seems more complicated than I wanted to inline here. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:170: case reflect.Struct: Is this the right thing to do for Uintptr? JSON just prints it out.
Sign in to reply to this message.
http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:146: p.WriteString(strconv.Quote(str)) On 2011/06/03 21:00:26, kevlar wrote: > Does this also need to be escapped? Before, or after it's quoted? > > If so, intuitively I think it should be: > Value: "3.14 < 42" // turns into: > value="3.14 < 42" > which would mean quote(escape(str))? you shouldn't use strconv.Quote. That uses Go quoting rules. p.WriteString(`="`) Escape(p, str) p.WriteString(`"`) http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:149: // "innerxml": write XML and return? delete (can't return here, haven't finished the tag. the right place for innerxml is below.) http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:163: fmt.Fprintf(p, "%g", val.Complex()) On 2011/06/03 21:00:26, kevlar wrote: > Is there an easier way than defering this to fmt? The fmt logic seems more > complicated than I wanted to inline here. You could just not allow complex. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:170: p.WriteString(strconv.Uitob64(val.Uint(), 16)) On 2011/06/03 21:00:26, kevlar wrote: > Is this the right thing to do for Uintptr? JSON just prints it out. I think it's fine to put Uintptr up above with the other Uint cases. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:191: name := strings.ToLower(ftyp.Name) name := ftyp.Name no ToLower http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:200: p.Write(elem) chardata needs the same escaping http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:204: case "innerxml", "attr": attr is a continue innerxml should be p.Write(val.Field(i).Interface().([]byte))
Sign in to reply to this message.
http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:191: name := strings.ToLower(ftyp.Name) On 2011/06/03 21:21:12, rsc wrote: > name := ftyp.Name > > no ToLower You want capitalized tag names? http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:204: case "innerxml", "attr": On 2011/06/03 21:21:12, rsc wrote: > attr is a continue > > innerxml should be > p.Write(val.Field(i).Interface().([]byte)) So you have to explicitly clear out innerxml if you don't want to duplicate the data?
Sign in to reply to this message.
PTAL http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:146: p.WriteString(strconv.Quote(str)) On 2011/06/03 21:21:12, rsc wrote: > On 2011/06/03 21:00:26, kevlar wrote: > > Does this also need to be escapped? Before, or after it's quoted? > > > > If so, intuitively I think it should be: > > Value: "3.14 < 42" // turns into: > > value="3.14 < 42" > > which would mean quote(escape(str))? > > you shouldn't use strconv.Quote. > That uses Go quoting rules. > > p.WriteString(`="`) > Escape(p, str) > p.WriteString(`"`) Done. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:149: // "innerxml": write XML and return? On 2011/06/03 21:21:12, rsc wrote: > delete > (can't return here, haven't finished the tag. > the right place for innerxml is below.) > Done. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:163: fmt.Fprintf(p, "%g", val.Complex()) On 2011/06/03 21:21:12, rsc wrote: > On 2011/06/03 21:00:26, kevlar wrote: > > Is there an easier way than defering this to fmt? The fmt logic seems more > > complicated than I wanted to inline here. > > You could just not allow complex. Done. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:170: p.WriteString(strconv.Uitob64(val.Uint(), 16)) On 2011/06/03 21:21:12, rsc wrote: > On 2011/06/03 21:00:26, kevlar wrote: > > Is this the right thing to do for Uintptr? JSON just prints it out. > > I think it's fine to put Uintptr up above with the other Uint cases. Done. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:200: p.Write(elem) On 2011/06/03 21:21:12, rsc wrote: > chardata needs the same escaping Done.
Sign in to reply to this message.
> So you have to explicitly clear out innerxml if you don't want to > duplicate the data? People who use innerxml are presumed to know what they are doing.
Sign in to reply to this message.
PTAL Sorry this took so long; I made a silly mistake in fixing the case in the atom test. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:191: name := strings.ToLower(ftyp.Name) On 2011/06/03 21:37:13, kevlar wrote: > On 2011/06/03 21:21:12, rsc wrote: > > name := ftyp.Name > > > > no ToLower > > You want capitalized tag names? Done. http://codereview.appspot.com/4539082/diff/4003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:204: case "innerxml", "attr": On 2011/06/03 21:37:13, kevlar wrote: > On 2011/06/03 21:21:12, rsc wrote: > > attr is a continue > > > > innerxml should be > > p.Write(val.Field(i).Interface().([]byte)) > > So you have to explicitly clear out innerxml if you don't want to duplicate the > data? Done.
Sign in to reply to this message.
I realized that I didn't have an innerxml test case (other than atom, but it's cleared out and not tested for marshaling), and realized that the current bahvior is a bit odd. If you want to be able to provide the raw XML yourself, you have two options: implement the Marshaler interface, or have all of the fields be pointers so that they can be nil'd out and ignored. If you have any integers, for instance, they will be marshaled along with the raw innerxml. Thus, should the first innerxml field be remembered during inspection of the arguments and printed in lieu of all of the structure's fields? This would seem to be the proper inverse of Unmarshal, if I read the docs correctly.
Sign in to reply to this message.
ping
Sign in to reply to this message.
Getting very close. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode33 src/pkg/xml/marshal.go:33: // Marshal formats the given value as a top-level XML tag whose contents represent // Marshal writes an XML-formatted representation of v to w. // // If v implements Marshaler, then Marshal calls its MarshalXML method. // Otherwise, Marshal uses the following procedure to create the XML. // // Marshal handles an array or slice by marshalling each of the elements. // Marshal handles a pointer by marshalling the value it points at or, // if the pointer is nil, by writing nothing. // Marshal handles an interface value by marshalling the value it contains or, // if the interface value is nil, by writing nothing. // Marshal handles all other data by writing a single XML element containing the data. // The name of that XML element is taken from, in order of preference: // - the tag on an XMLName field, if the data is a struct // - the value of an XMLName field of type xml.Name // - the tag of the struct field used to obtain the data // - the name of the struct field used to obtain the data // - the name '???'. // The XML element for a struct contains marshalled elements for each of the // exported fields of the struct, with these exceptions: // - the XMLName field, described above, is omitted. // - a field with tag "attr" becomes an attribute in the XML element. // - a field with tag "chardata" is written as character data, // not as an XML element. // - a field with tag "innerxml" is written verbatim, // not subject to the usual marshalling procedure. // Marshal will return an error if asked to marshal a channel, function, or map. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode52 src/pkg/xml/marshal.go:52: func Marshal(w io.Writer, tagName string, v interface{}) (err os.Error) { I think I'd drop tagName. Structs should have XMLName elements. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode54 src/pkg/xml/marshal.go:54: err = p.marshalValue(reflect.ValueOf(v), tagName) s/tagName/"???" http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode74 src/pkg/xml/marshal.go:74: // Try Marshaler Should be above the pointer check. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode96 src/pkg/xml/marshal.go:96: // XMLname Find XML name. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode99 src/pkg/xml/marshal.go:99: if xn := val.FieldByName("XMLName"); xn.IsValid() { if f, ok := typ.FieldByName("XMLName"); ok { if tag := f.Tag; tag != "" { if i := strings.Index(tag, " "); i >= 0 { xmlns, name = tag[:i], tag[i+1:] } else { name = tag } } else if v, ok := val.FieldByIndex(f.Index).Interface().(Name); ok && v.Local != "" { xmlns, name = v.Space, v.Local } } http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:134: field := typ.Field(i) if f := typ.Field(i); f.PkgPath != "" && f.Tag == "attr" { ... } http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:179: fval := val.Field(i) save until you need it
Sign in to reply to this message.
PTAL http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go File src/pkg/xml/marshal.go (right): http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode33 src/pkg/xml/marshal.go:33: // Marshal formats the given value as a top-level XML tag whose contents represent On 2011/06/14 16:49:11, rsc wrote: > // Marshal writes an XML-formatted representation of v to w. > // > // If v implements Marshaler, then Marshal calls its MarshalXML method. > // Otherwise, Marshal uses the following procedure to create the XML. > // > // Marshal handles an array or slice by marshalling each of the elements. > // Marshal handles a pointer by marshalling the value it points at or, > // if the pointer is nil, by writing nothing. > // Marshal handles an interface value by marshalling the value it contains or, > // if the interface value is nil, by writing nothing. > // Marshal handles all other data by writing a single XML element containing the > data. > // The name of that XML element is taken from, in order of preference: > // - the tag on an XMLName field, if the data is a struct > // - the value of an XMLName field of type xml.Name > // - the tag of the struct field used to obtain the data > // - the name of the struct field used to obtain the data > // - the name '???'. > // The XML element for a struct contains marshalled elements for each of the > // exported fields of the struct, with these exceptions: > // - the XMLName field, described above, is omitted. > // - a field with tag "attr" becomes an attribute in the XML element. > // - a field with tag "chardata" is written as character data, > // not as an XML element. > // - a field with tag "innerxml" is written verbatim, > // not subject to the usual marshalling procedure. > // Marshal will return an error if asked to marshal a channel, function, or map. Thanks! http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode52 src/pkg/xml/marshal.go:52: func Marshal(w io.Writer, tagName string, v interface{}) (err os.Error) { On 2011/06/14 16:49:11, rsc wrote: > I think I'd drop tagName. Structs should have XMLName elements. Done. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode54 src/pkg/xml/marshal.go:54: err = p.marshalValue(reflect.ValueOf(v), tagName) On 2011/06/14 16:49:11, rsc wrote: > s/tagName/"???" Done. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcode99 src/pkg/xml/marshal.go:99: if xn := val.FieldByName("XMLName"); xn.IsValid() { On 2011/06/14 16:49:11, rsc wrote: > if f, ok := typ.FieldByName("XMLName"); ok { > if tag := f.Tag; tag != "" { > if i := strings.Index(tag, " "); i >= 0 { > xmlns, name = tag[:i], tag[i+1:] > } else { > name = tag > } > } else if v, ok := val.FieldByIndex(f.Index).Interface().(Name); ok && > v.Local != "" { > xmlns, name = v.Space, v.Local > } > } Done. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:134: field := typ.Field(i) On 2011/06/14 16:49:11, rsc wrote: > if f := typ.Field(i); f.PkgPath != "" && f.Tag == "attr" { > ... > } > Done. http://codereview.appspot.com/4539082/diff/2003/src/pkg/xml/marshal.go#newcod... src/pkg/xml/marshal.go:179: fval := val.Field(i) On 2011/06/14 16:49:11, rsc wrote: > save until you need it Inlined at its one use.
Sign in to reply to this message.
Ping
Sign in to reply to this message.
LGTM
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=85fe786bbd63 *** xml: add Marshal and MarshalIndent I have written up a Marshal and MarshalIndent pair that should closely reflect the way that Unmarshal works. I would love feedback on making this code more accessible and efficient... I haven't used reflecton on this scale before, so there is probably a lot of work that can be done on that. Some potentially controversial things: - All tag names are lower-cased by default. - Zero-valued struct values are skipped. - No namespace prefix (o:tag, etc) mechanism is supplied. - You are allowed to marshal non-struct values (even though unmarshal cannot handle them). - A tag for a non-XMLName struct field that isn't "attr", "chardata", or "innerxml" is used as the name of the tag. This could wreak havoc if you try to marshal a protobuf struct. - The "innerxml" and "chardata" are inserted verbatim. If you try to marshal something straight from unmarshal, the results could be unexpected (remove "innerxml" support from Marshal would be one possible solution). R=rsc CC=golang-dev http://codereview.appspot.com/4539082 Committer: Russ Cox <rsc@golang.org>
Sign in to reply to this message.
|