|
|
Created:
8 years ago by sffc Modified:
7 years, 10 months ago Base URL:
http://source.icu-project.org/repos/icu/icu4j/trunk/ Visibility:
Public. |
DescriptionJava version of DecimalFormatSymbols data loading
Patch Set 1 #
Total comments: 4
Patch Set 2 : Updating DecimalFormatSymbols to use the new sink iteration style. #
Total comments: 22
Patch Set 3 : Updating DecimalFormatSymbols with Markus's feedback. #
Total comments: 14
Patch Set 4 : Changing logic to avoid loading Latin fallback if all values are populated #
Total comments: 6
Patch Set 5 : A few more changes recommended by Markus #Patch Set 6 : Fixing MissingResourceException found by unit tests #Patch Set 7 : Changing logic to expose MissingResourceException for Latin numbering systems #MessagesTotal messages: 25
Here is the Java code review. I designed the logic of the two versions (C++ and Java) to be mostly the same.
Sign in to reply to this message.
Adding explanations for some of the changes. https://codereview.appspot.com/295950043/diff/1/main/classes/core/src/com/ibm... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (left): https://codereview.appspot.com/295950043/diff/1/main/classes/core/src/com/ibm... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:924: setLocale(uloc, uloc); I moved this section of the code up so that we create only one instance of ICUResourceBundle rather than two. https://codereview.appspot.com/295950043/diff/1/main/classes/core/src/com/ibm... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:939: I moved the monetarySeparator logic into the sink class. https://codereview.appspot.com/295950043/diff/1/main/classes/core/src/com/ibm... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:956: } The "\u00D7" default is now directly specified along with the other default array elements (now line 847). https://codereview.appspot.com/295950043/diff/1/main/classes/core/src/com/ibm... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/1/main/classes/core/src/com/ibm... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:1027: // TODO: Determine actual and valid locale correctly. The second todo was there from before (formerly line 922).
Sign in to reply to this message.
Updating DecimalFormatSymbols to use the new sink iteration style.
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:844: * The indices of each name into the array correspond to the positoin of that item in the numberElements array. mispelt position Please try to keep line lengths to about 100. Sometimes 102 or so is fine, use your judgment. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:846: private static String[] SYMBOL_FIELDS = { I would call them SYMBOL_KEYS like in the old code. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:861: Please avoid adding new trailing whitespace. Set Eclipse to show it. You could even set it to remove all trailing whitespace on save (we talked about it a few weeks ago), but that will cause spurious file changes. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:865: private static final String LATIN_NUMBERING_SYSTEM = "latn"; I don't think that defining a constant where both the name and the value say "latn" buys anything. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:872: private static String getSymbolsPath(String nsName) { Just use prefix + nsName + suffix directly. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:901: // Look up the field index. (Java 7 doesn't have a nice way of doing this) Please remove the parenthesis. FYI, ICU4J must not go beyond Java 6 currently. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:909: // If we didn't find the field name, leave now We don't want to "leave" if we see an unknown key, we want to ignore it. Simpler code: for (int i; i < SYMBOL_KEYS.length; ++i) { if (key.contentEquals(SYMBOL_KEYS[i])) { if (numberElements[i] == null) { numberElements[i] = value.toString(); } break; } } https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:991: DecFmtDataSink sink = new DecFmtDataSink(data[0]); I don't think it works to create the sink with an already-filled-in array of strings. The sink does "put if absent" so it looks like it won't ever put any real locale data. Instead, after loading nsName and latn data, then replace any remaining null values with the ones from fallbackElements which could be either local as before or could be a private static final in the class (probably better). Also please look for whether there are multiple places in the code with this list of strings. I am pretty sure that there is code that makes a trivial version of the DecimalSymbols, with these ones here. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:1002: String[] numberElements = data[0]; Why is data an array of arrays? It looks like the outer dimension is always 1. True?
Sign in to reply to this message.
Replying to Markus. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:844: * The indices of each name into the array correspond to the positoin of that item in the numberElements array. On 2016/04/26 20:29:37, markus.icu wrote: > mispelt position > > Please try to keep line lengths to about 100. Sometimes 102 or so is fine, use > your judgment. Done. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:846: private static String[] SYMBOL_FIELDS = { On 2016/04/26 20:29:37, markus.icu wrote: > I would call them SYMBOL_KEYS like in the old code. Done. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:861: On 2016/04/26 20:29:37, markus.icu wrote: > Please avoid adding new trailing whitespace. Set Eclipse to show it. You could > even set it to remove all trailing whitespace on save (we talked about it a few > weeks ago), but that will cause spurious file changes. Done. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:865: private static final String LATIN_NUMBERING_SYSTEM = "latn"; On 2016/04/26 20:29:37, markus.icu wrote: > I don't think that defining a constant where both the name and the value say > "latn" buys anything. I made this change to be more consistent with C++ and other Java classes, which set it as a constant rather than an in-line string. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:872: private static String getSymbolsPath(String nsName) { On 2016/04/26 20:29:37, markus.icu wrote: > Just use prefix + nsName + suffix directly. Ok. I inlined this function down where I call it in initialize(). https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:901: // Look up the field index. (Java 7 doesn't have a nice way of doing this) On 2016/04/26 20:29:37, markus.icu wrote: > Please remove the parenthesis. > > FYI, ICU4J must not go beyond Java 6 currently. Done. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:909: // If we didn't find the field name, leave now On 2016/04/26 20:29:37, markus.icu wrote: > We don't want to "leave" if we see an unknown key, we want to ignore it. > > Simpler code: > > for (int i; i < SYMBOL_KEYS.length; ++i) { > if (key.contentEquals(SYMBOL_KEYS[i])) { > if (numberElements[i] == null) { > numberElements[i] = value.toString(); > } > break; > } > } You're right, it's supposed to be "continue", not "return". That was a remnant from when I was copying code over from the old sink system. Good catch. I prefer the style with less nesting, but I can change it if you want. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:991: DecFmtDataSink sink = new DecFmtDataSink(data[0]); On 2016/04/26 20:29:37, markus.icu wrote: > I don't think it works to create the sink with an already-filled-in array of > strings. The sink does "put if absent" so it looks like it won't ever put any > real locale data. > > Instead, after loading nsName and latn data, then replace any remaining null > values with the ones from fallbackElements which could be either local as before > or could be a private static final in the class (probably better). > > Also please look for whether there are multiple places in the code with this > list of strings. I am pretty sure that there is code that makes a trivial > version of the DecimalSymbols, with these ones here. You're right, very sloppy on my part. At one point I had that line correct but it looks like I changed it for some reason. It appears that parts of this list of strings appear in DecimalFormat also as default values, but they are overwritten there by DecimalFormatSymbols if DecimalFormatSymbols is available. For example, DecimalSymbols.toPattern(). https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:1002: String[] numberElements = data[0]; On 2016/04/26 20:29:37, markus.icu wrote: > Why is data an array of arrays? It looks like the outer dimension is always 1. > True? Yes. I was wondering that myself. I'm just being consistent with what was already there.
Sign in to reply to this message.
Updating DecimalFormatSymbols with Markus's feedback.
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:909: // If we didn't find the field name, leave now On 2016/04/26 21:50:14, sffc wrote: > I prefer the style with less nesting, but I can change it if you want. Either is ok, but I prefer the more-nested version because a) it keeps the loop variable scoped to just the loop b) it has to test for i==length only once If you were to move the map-key-to-index into a separate function, then the call site would look much like what you have here, but it does not seem worth moving it out. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:1002: String[] numberElements = data[0]; Please add a TODO comment about this. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:898: // Properties: I would remove this line. I don't see any value. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:901: // Constructor ditto https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:981: // Cache miss! Seems unnecessary. You already added "Try loading from the cache", it seems obvious what "data == null" means after that. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:982: // Start with default values Remove this line. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:989: // Load the Latin fallback Optional: You could add a counter to the data sink, increment it for each item you add, and don't even try the Latin fallback or the defaults if you already have all items (count == KEYS.length). https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:991: rb.getAllItemsWithFallback(NUMBER_ELEMENTS + "/" + LATIN_NUMBERING_SYSTEM + "/" + SYMBOLS, sink); This is ok but looks clunky. I would write "NumberElements/" + nsName + "/symbols" above, and simply "NumberElements/latn/symbols" here.
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:898: // Properties: On 2016/04/28 18:27:56, markus.icu wrote: > I would remove this line. I don't see any value. Done. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:901: // Constructor On 2016/04/28 18:27:55, markus.icu wrote: > ditto Done. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:981: // Cache miss! On 2016/04/28 18:27:56, markus.icu wrote: > Seems unnecessary. You already added "Try loading from the cache", it seems > obvious what "data == null" means after that. I like to err on the side of too many comments over too few. In this particular case, these are the same two comments that were there before ("try the cache first" and "cache miss"). https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:982: // Start with default values On 2016/04/28 18:27:56, markus.icu wrote: > Remove this line. Done. https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:989: // Load the Latin fallback On 2016/04/28 18:27:56, markus.icu wrote: > Optional: You could add a counter to the data sink, increment it for each item > you add, and don't even try the Latin fallback or the defaults if you already > have all items (count == KEYS.length). Hmm... it's an interesting idea. The only issue I see is that the code has inline fallbacks for some of the symbols (monetarySeparator and monetaryGroupingSeparator) that are okay for it to not load (see line 1018-1030). However, it might even be the case that it would be better behavior to trigger those fallbacks than replace those two symbols with Latin (which is a change from the old logic). Thoughts? https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:991: rb.getAllItemsWithFallback(NUMBER_ELEMENTS + "/" + LATIN_NUMBERING_SYSTEM + "/" + SYMBOLS, sink); On 2016/04/28 18:27:55, markus.icu wrote: > This is ok but looks clunky. > > I would write "NumberElements/" + nsName + "/symbols" above, and simply > "NumberElements/latn/symbols" here. I did it this way to keep DRY. The string "latn", for example, is used in multiple places throughout the function. The C++ version of this class, as well as the Java and C++ versions of the other class I'm working on, all use constants rather than explicitly inlined strings. I think these should compile to inlined strings in bytecode (except for nsName). If you'd rather do the style with the explicit inline strings, do you want me to change it in the other classes as well?
Sign in to reply to this message.
Removing a few unnecessary comments according to feedback.
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:989: // Load the Latin fallback On 2016/04/28 18:46:13, sffc wrote: > On 2016/04/28 18:27:56, markus.icu wrote: > > Optional: You could add a counter to the data sink, increment it for each item > > you add, and don't even try the Latin fallback or the defaults if you already > > have all items (count == KEYS.length). > > Hmm... it's an interesting idea. The only issue I see is that the code has > inline fallbacks for some of the symbols (monetarySeparator and > monetaryGroupingSeparator) that are okay for it to not load (see line > 1018-1030). However, it might even be the case that it would be better behavior > to trigger those fallbacks than replace those two symbols with Latin (which is a > change from the old logic). Thoughts? Also, since the missing values are filled in before the 1018-1030 monetary fallback, the fallback code will never run. Maybe I could move the monetary fallback up here before Latin runs, and then run Latin only if there are still "null" values in the array?
Sign in to reply to this message.
Changing logic to avoid loading Latin fallback if all values are populated
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/40001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:989: // Load the Latin fallback On 2016/04/28 18:59:10, sffc wrote: > On 2016/04/28 18:46:13, sffc wrote: > > On 2016/04/28 18:27:56, markus.icu wrote: > > > Optional: You could add a counter to the data sink, increment it for each > item > > > you add, and don't even try the Latin fallback or the defaults if you > already > > > have all items (count == KEYS.length). > > > > Hmm... it's an interesting idea. The only issue I see is that the code has > > inline fallbacks for some of the symbols (monetarySeparator and > > monetaryGroupingSeparator) that are okay for it to not load (see line > > 1018-1030). However, it might even be the case that it would be better > behavior > > to trigger those fallbacks than replace those two symbols with Latin (which is > a > > change from the old logic). Thoughts? > > Also, since the missing values are filled in before the 1018-1030 monetary > fallback, the fallback code will never run. Maybe I could move the monetary > fallback up here before Latin runs, and then run Latin only if there are still > "null" values in the array? I resolved this in the latest code iteration (version 3).
Sign in to reply to this message.
Please also check for unresolved comments. https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:30: * Why did you add this line? https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:986: resolveMissingMonetarySeparators(data[0]); I think you should do this exactly once, unconditionally after the Latin fallback but before the ultimate defaults. You then won't need a function.
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:30: * On 2016/04/28 20:47:48, markus.icu wrote: > Why did you add this line? I didn't. Must have been Eclipse. I'll fix it. https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:986: resolveMissingMonetarySeparators(data[0]); On 2016/04/28 20:47:48, markus.icu wrote: > I think you should do this exactly once, unconditionally after the Latin > fallback but before the ultimate defaults. You then won't need a function. I did it this way for the case when the locale NS is missing the monetary separators but has everything else. If we can populate the monetary separators here, then we can avoid consuming Latin. (This is a slight change from the previous logic; is it ok?)
Sign in to reply to this message.
Resetting comment lines at top of file
Sign in to reply to this message.
Adding if{} braces as oer style guidelines
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:986: resolveMissingMonetarySeparators(data[0]); On 2016/04/28 21:03:33, sffc wrote: > I did it this way for the case when the locale NS is missing the monetary > separators but has everything else. If we can populate the monetary separators > here, then we can avoid consuming Latin. (This is a slight change from the > previous logic; is it ok?) My interpretation of the logic is that the data wants to fall back to latn, and when the currency formatting symbols are missing then copy the non-currency ones. the SYMBOL_DEFAULTS should always be last-resort values. so i think the correct way is to load nsName, then latn, then do the currency-to-standard fallback, then last resort. Correct data according to what appears to be the data model beats "avoid consuming Latin".
Sign in to reply to this message.
https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:909: // If we didn't find the field name, leave now On 2016/04/28 18:27:55, markus.icu wrote: > On 2016/04/26 21:50:14, sffc wrote: > > I prefer the style with less nesting, but I can change it if you want. > > Either is ok, but I prefer the more-nested version because > a) it keeps the loop variable scoped to just the loop > b) it has to test for i==length only once > > If you were to move the map-key-to-index into a separate function, then the call > site would look much like what you have here, but it does not seem worth moving > it out. Done. https://codereview.appspot.com/295950043/diff/20001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:1002: String[] numberElements = data[0]; On 2016/04/28 18:27:55, markus.icu wrote: > Please add a TODO comment about this. Done. https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... File main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java (right): https://codereview.appspot.com/295950043/diff/80001/main/classes/core/src/com... main/classes/core/src/com/ibm/icu/text/DecimalFormatSymbols.java:986: resolveMissingMonetarySeparators(data[0]); On 2016/04/28 22:45:51, markus.icu wrote: > On 2016/04/28 21:03:33, sffc wrote: > > I did it this way for the case when the locale NS is missing the monetary > > separators but has everything else. If we can populate the monetary > separators > > here, then we can avoid consuming Latin. (This is a slight change from the > > previous logic; is it ok?) > > My interpretation of the logic is that the data wants to fall back to latn, and > when the currency formatting symbols are missing then copy the non-currency > ones. the SYMBOL_DEFAULTS should always be last-resort values. so i think the > correct way is to load nsName, then latn, then do the currency-to-standard > fallback, then last resort. > > Correct data according to what appears to be the data model beats "avoid > consuming Latin". Thanks, that makes more sense. I will change the code so that it only resolves monetary separators after Latin.
Sign in to reply to this message.
A few more changes recommended by Markus
Sign in to reply to this message.
LGTM I assume you have run the test suite and it passes.
Sign in to reply to this message.
Fixing MissingResourceException found by unit tests
Sign in to reply to this message.
still LGTM
Sign in to reply to this message.
Changing logic to expose MissingResourceException for Latin numbering systems
Sign in to reply to this message.
LGTM
Sign in to reply to this message.
|