https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.go File unicode/encoding/encoding.go (right): https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.go#newcode39 unicode/encoding/encoding.go:39: // dst should always be large enough to write ...
12 years, 4 months ago
(2013-06-07 12:35:27 UTC)
#1
https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.go File unicode/encoding/encoding.go (right): https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.go#newcode58 unicode/encoding/encoding.go:58: // encoding; the rune returned in this case should ...
12 years, 4 months ago
(2013-06-07 17:47:34 UTC)
#2
12 years, 4 months ago
(2013-06-12 15:51:40 UTC)
#3
some high-level comments only.
https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.go
File unicode/encoding/encoding.go (right):
https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.g...
unicode/encoding/encoding.go:42: // there is not enough source bytes to decode
the next rune. For example,
This assumption does not hold for all encodings. For example, Regional
Indicator Symbols are represented as multiple code points in Unicode, whereas
they are represented as a single character in Shift-JIS, IIUC. At the very
least, they should be encoded as a single unit. There is a proposal to augment
this standard so that encoders interject a Zero-Width Joiner for consecutive
RISs. If this were to happen, the minimum character size would be 11 bytes in
UTF-8! There could also be encodings where accents need to be translated into
modifiers and the like. In these cases, the minimum buffer size might be larger.
I can't think of an example, though.
https://codereview.appspot.com/10085049/diff/5001/unicode/encoding/encoding.g...
unicode/encoding/encoding.go:68: DecodeRune(p []byte) (r rune, n int, enc
Encoding)
Not all characters from a source encoding can necessarily be encoded in a single
destination rune.
https://codereview.appspot.com/10085049/diff/10001/unicode/transform/transfor...
File unicode/transform/transform.go (right):
https://codereview.appspot.com/10085049/diff/10001/unicode/transform/transfor...
unicode/transform/transform.go:5: package transform
I would put package transform directly under go.text. Transform will likely be
used throughout go.text and it is not specific to unicode.
Issue 10085049: go.net/unicode/encoding: new package.
(Closed)
Created 12 years, 4 months ago by nigeltao
Modified 12 years, 3 months ago
Reviewers: rog, andybalholm, mpvl
Comments: 8