|
|
Created:
7 years, 9 months ago by michelzel Modified:
7 years, 8 months ago CC:
gsutil-crs_google.com Visibility:
Public. |
DescriptionUI Metadata Implementation
Patch Set 1 #
Total comments: 30
Patch Set 2 : CR feedback round 1 thobrla #
Total comments: 20
Patch Set 3 : CR feedback round 2 thobrla #
Total comments: 2
Patch Set 4 : CR feedback round 3 thobrla #
Total comments: 16
Patch Set 5 : CR feedback round 4 thobrla + metadata tests #
Total comments: 8
Patch Set 6 : CR feedback round 5 thobrla #
Total comments: 15
Patch Set 7 : CR feedback round 6 thobrla + round 1 yarbrough #
Total comments: 2
Patch Set 8 : CR feedback round 7 thobrla #Patch Set 9 : #
Total comments: 4
MessagesTotal messages: 21
UI Metadata Implementation
Sign in to reply to this message.
More review comments to come later. https://codereview.appspot.com/302240043/diff/1/gslib/command.py File gslib/command.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/command.py#newcode1829 gslib/command.py:1829: try: # NameExpansionResult with size field on data commands. If we're checking is_instance above, whey not also use that check here as well? https://codereview.appspot.com/302240043/diff/1/gslib/tests/test_rm.py File gslib/tests/test_rm.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/tests/test_rm.py#newcode85 gslib/tests/test_rm.py:85: stderr = self._CleanRmUiOutputBeforeChecking(stderr) Don't we still want to check that we got a UI message for the final number of objects? https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py#newco... gslib/thread_message.py:46: """Creates a FileMessage. No longer a FileMessage? https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py#newco... gslib/thread_message.py:50: num_objects: Number of objects updated since last MetadataMessage. Default When will this ever be > 1?
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/1/gslib/command.py File gslib/command.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/command.py#newcode1829 gslib/command.py:1829: try: # NameExpansionResult with size field on data commands. On 2016/07/22 23:54:29, thobrla wrote: > If we're checking is_instance above, whey not also use that check here as well? I thought (and I remembered of testing it) that the field size would only exist on data operations. However, I checked it again and it also works for metadata operation so we can definitely avoid this try-except statement. https://codereview.appspot.com/302240043/diff/1/gslib/tests/test_rm.py File gslib/tests/test_rm.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/tests/test_rm.py#newcode85 gslib/tests/test_rm.py:85: stderr = self._CleanRmUiOutputBeforeChecking(stderr) On 2016/07/22 23:54:29, thobrla wrote: > Don't we still want to check that we got a UI message for the final number of > objects? Yes, I'll work on that during the weekend. This should be pretty simple. https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py#newco... gslib/thread_message.py:46: """Creates a FileMessage. On 2016/07/22 23:54:29, thobrla wrote: > No longer a FileMessage? Acknowledged. https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py#newco... gslib/thread_message.py:50: num_objects: Number of objects updated since last MetadataMessage. Default On 2016/07/22 23:54:30, thobrla wrote: > When will this ever be > 1? It's not supposed to be, at least the way I implemented it. It could be useful to keep it for testing or if we find too expensive to send a MetadataMessage for each object, and decide to send them in periodic updates instead (however, that's not the case today because processing those messages is really fast). But otherwise we could remove it. What do you think?
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcode73 gslib/ui_controller.py:73: class MetadataManager(object): There seems to be a lot of shared code and arguments between this and the DataManager class. I don't feel strongly about it, so feel free not to do this for this review, but consider making a common subclass so that: 1. There is less code duplication 2. The differences between the MetadataManager and the DataManager are clearer. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:289: width = len(', ETA ' + PrettyTime(1)) Would it be easier to calculate this padding once at the beginning of the function? (also in DataManager) https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:292: stream.write(string_to_print + ' \r') Could you document why we print this exact number of spaces? (also in DataManager) https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:810: def _InvalidManagerMessageCombination(self, status_message): Why does this live in Manager even though it references DataManager and MetadataManager? Those classes should be responsible for understanding what inputs are valid. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:897: # Should not process a MetadataMessage with a DataManager or Wouldn't this indicate a program bug? It seems like we should at least log it. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:931: def put(self, status_message, timeout=None): # pylint: disable=invalid-name, unused-argument, line-too-long You can avoid line-too-long by doing: # pylint: disable=invalid-name, unused-argument def put(self, status_message, timeout=None): self.ui_controller.Call(status_message, self.stream) # pylint: enable=invalid-name, unused-argument
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/1/gslib/command.py File gslib/command.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/command.py#newcode1829 gslib/command.py:1829: try: # NameExpansionResult with size field on data commands. On 2016/07/23 00:29:59, michelzel wrote: > On 2016/07/22 23:54:29, thobrla wrote: > > If we're checking is_instance above, whey not also use that check here as > well? > > I thought (and I remembered of testing it) that the field size would only exist > on data operations. However, I checked it again and it also works for metadata > operation so we can definitely avoid this try-except statement. Update: indeed, there are some tests that fail here. Those are NameExpansionResults that do not have 'size' (some of the tests were related with seek_ahead_thread, not sure if that's the reason). https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcode73 gslib/ui_controller.py:73: class MetadataManager(object): On 2016/07/25 16:53:45, thobrla wrote: > There seems to be a lot of shared code and arguments between this and the > DataManager class. > > I don't feel strongly about it, so feel free not to do this for this review, but > consider making a common subclass so that: > 1. There is less code duplication > 2. The differences between the MetadataManager and the DataManager are clearer. I grouped the constructor, HandleSeekAheadMessage, HandleProducerThreadMessagem ShouldPrintSpinner, PrintSpinner and ShouldPrintProgress on the Manager subclass. I think the other functions work better separated. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:289: width = len(', ETA ' + PrettyTime(1)) On 2016/07/25 16:53:45, thobrla wrote: > Would it be easier to calculate this padding once at the beginning of the > function? (also in DataManager) We won't need this anymore. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:292: stream.write(string_to_print + ' \r') On 2016/07/25 16:53:45, thobrla wrote: > Could you document why we print this exact number of spaces? (also in > DataManager) I had actually forgotten about this, it was just a random number of spaces to make sure output was clear. Now that we have defined a model that fits on the console width, the number of spaces should be console width - used width. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:810: def _InvalidManagerMessageCombination(self, status_message): On 2016/07/25 16:53:45, thobrla wrote: > Why does this live in Manager even though it references DataManager and > MetadataManager? Those classes should be responsible for understanding what > inputs are valid. And we should make one for each, right? I'll do that, the conditions will be definitely simpler and clearer. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:897: # Should not process a MetadataMessage with a DataManager or On 2016/07/25 16:53:45, thobrla wrote: > Wouldn't this indicate a program bug? It seems like we should at least log it. I made this thinking about our discussion on the mv command: it first copies the files (creating a DataManager) and then deletes the previous files (which will generate some MetadataMessages). https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:931: def put(self, status_message, timeout=None): # pylint: disable=invalid-name, unused-argument, line-too-long On 2016/07/25 16:53:45, thobrla wrote: > You can avoid line-too-long by doing: > > # pylint: disable=invalid-name, unused-argument > def put(self, status_message, timeout=None): > self.ui_controller.Call(status_message, self.stream) > # pylint: enable=invalid-name, unused-argument Awesome, thanks!
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/1/gslib/command.py File gslib/command.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/command.py#newcode1829 gslib/command.py:1829: try: # NameExpansionResult with size field on data commands. On 2016/07/25 21:16:45, michelzel wrote: > On 2016/07/23 00:29:59, michelzel wrote: > > On 2016/07/22 23:54:29, thobrla wrote: > > > If we're checking is_instance above, whey not also use that check here as > > well? > > > > I thought (and I remembered of testing it) that the field size would only > exist > > on data operations. However, I checked it again and it also works for metadata > > operation so we can definitely avoid this try-except statement. > > Update: indeed, there are some tests that fail here. Those are > NameExpansionResults that do not have 'size' (some of the tests were related > with seek_ahead_thread, not sure if that's the reason). Can we cover this with attribute checks instead of a bare exception handler: if args.expanded_result: json_expanded_result = json.loads(args.expanded_result) if 'size' in json_expanded_result: ... https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py#newco... gslib/thread_message.py:50: num_objects: Number of objects updated since last MetadataMessage. Default On 2016/07/23 00:29:59, michelzel wrote: > On 2016/07/22 23:54:30, thobrla wrote: > > When will this ever be > 1? > > It's not supposed to be, at least the way I implemented it. It could be useful > to keep it for testing or if we find too expensive to send a MetadataMessage for > each object, and decide to send them in periodic updates instead (however, > that's not the case today because processing those messages is really fast). But > otherwise we could remove it. What do you think? If we're not using it, I think we should remove it on the principle that we shouldn't add code until it's needed. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:810: def _InvalidManagerMessageCombination(self, status_message): On 2016/07/25 21:16:45, michelzel wrote: > On 2016/07/25 16:53:45, thobrla wrote: > > Why does this live in Manager even though it references DataManager and > > MetadataManager? Those classes should be responsible for understanding what > > inputs are valid. > > And we should make one for each, right? I'll do that, the conditions will be > definitely simpler and clearer. Acknowledged. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:897: # Should not process a MetadataMessage with a DataManager or On 2016/07/25 21:16:45, michelzel wrote: > On 2016/07/25 16:53:45, thobrla wrote: > > Wouldn't this indicate a program bug? It seems like we should at least log > it. > > I made this thinking about our discussion on the mv command: it first copies the > files (creating a DataManager) and then deletes the previous files (which will > generate some MetadataMessages). I see - let's call out that case directly as an example. That potentially creates problems in the order in which we instantiates the managers, right? For example, if we get a MetadataMessage first, we'll instantiate the metadata manager? https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:73: class Manager(object): Can we choose a more specific name for this class (consider: StatusMessageManager)? multiprocessing has a Manager class that makes this name overloaded. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:83: """Instantiates a MetadataManager. This is no longer a MetadataManager, right? https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:101: console_width: Width to display on console. This determines how much we'll pad, but if I set the number lower, do I get a different output? Documentation should be clearer on this. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:116: # Only used on data operations Should this only exist in the data class, then? Or perhaps the comment should just say that this remains 0 in the metadata case? https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:202: """Instantiates a MetadataManager. If the arguments are identical to the superclass's init, it's okay to omit this and use a one-line docstring: """See argument documentation in XManager base class.""" https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:225: first_throughput_latency, custom_time, Keyword arguments should be initialized with keywords, i.e.: update_time_period=update_time_period, ... You can indent these arguments by 4 on the next line so that you have more space. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:228: # Time at last info update displayed. This code also seems shared between the two classes? https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:355: # else: Remove this commented-out code? https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:365: def InvalidManagerMessageCombination(self, status_message): Naming is a bit strange here - does this return true or false if the message is invalid? Consider function name: CanHandleStatusMessage https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:756: # else: Here also, commented out code
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:73: class Manager(object): On 2016/07/25 22:16:50, thobrla wrote: > Can we choose a more specific name for this class (consider: > StatusMessageManager)? multiprocessing has a Manager class that makes this name > overloaded. Yes, sure. My bad. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:83: """Instantiates a MetadataManager. On 2016/07/25 22:16:50, thobrla wrote: > This is no longer a MetadataManager, right? Acknowledged. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:101: console_width: Width to display on console. On 2016/07/25 22:16:50, thobrla wrote: > This determines how much we'll pad, but if I set the number lower, do I get a > different output? Documentation should be clearer on this. Edited the description. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:116: # Only used on data operations On 2016/07/25 22:16:51, thobrla wrote: > Should this only exist in the data class, then? Or perhaps the comment should > just say that this remains 0 in the metadata case? It's the second. The reason for that is to allow using the same HandleSeekAheadMessage. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:202: """Instantiates a MetadataManager. On 2016/07/25 22:16:50, thobrla wrote: > If the arguments are identical to the superclass's init, it's okay to omit this > and use a one-line docstring: > > """See argument documentation in XManager base class.""" Oh ok, that's what I was trying to ask you, sorry if I wasn't clear. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:225: first_throughput_latency, custom_time, On 2016/07/25 22:16:50, thobrla wrote: > Keyword arguments should be initialized with keywords, i.e.: > update_time_period=update_time_period, ... > > You can indent these arguments by 4 on the next line so that you have more > space. Got it. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:228: # Time at last info update displayed. On 2016/07/25 22:16:51, thobrla wrote: > This code also seems shared between the two classes? It is. I think only old_objects_finished is specific to MetadataManager here. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:355: # else: On 2016/07/25 22:16:51, thobrla wrote: > Remove this commented-out code? Didn't realize it was still there, sorry. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:365: def InvalidManagerMessageCombination(self, status_message): On 2016/07/25 22:16:50, thobrla wrote: > Naming is a bit strange here - does this return true or false if the message is > invalid? > > Consider function name: CanHandleStatusMessage It returns true, because it is invalid. But I can easily change that. https://codereview.appspot.com/302240043/diff/40001/gslib/ui_controller.py#ne... gslib/ui_controller.py:756: # else: On 2016/07/25 22:16:50, thobrla wrote: > Here also, commented out code Same.
Sign in to reply to this message.
There are still some unaddressed comments, particularly around the ordering of instantiation of managers. https://codereview.appspot.com/302240043/diff/60001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/60001/gslib/ui_controller.py#ne... gslib/ui_controller.py:102: visual output, just the space padding. For proper Trailing whitespace here and elsewhere, please run lint before submitting patches.
Sign in to reply to this message.
Besides those comments, I also realized that we might need two sources on DataManagers: one for the total size and the other for the number of objects. Even though they tend to be correlated, sometimes we receive a SeekAheadMessage or a ProducerThreadMessage that updates only num_objects, for instance. Because of that, I decided to change a bit the way PrintProgress works, and I now calculate each portion of the string separately. This also addresses Brandon's comment on the File UI CL: while I'm not exactly using Format, I think the construction of string_to_print looks better and more readable now! I'll work on some metadata tests tomorrow morning (I've done a couple of tests for hash and rewrite, but decided not to post them here so we can focus on the metadata part for now). https://codereview.appspot.com/302240043/diff/1/gslib/command.py File gslib/command.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/command.py#newcode1829 gslib/command.py:1829: try: # NameExpansionResult with size field on data commands. On 2016/07/25 22:16:50, thobrla wrote: > On 2016/07/25 21:16:45, michelzel wrote: > > On 2016/07/23 00:29:59, michelzel wrote: > > > On 2016/07/22 23:54:29, thobrla wrote: > > > > If we're checking is_instance above, whey not also use that check here as > > > well? > > > > > > I thought (and I remembered of testing it) that the field size would only > > exist > > > on data operations. However, I checked it again and it also works for > metadata > > > operation so we can definitely avoid this try-except statement. > > > > Update: indeed, there are some tests that fail here. Those are > > NameExpansionResults that do not have 'size' (some of the tests were related > > with seek_ahead_thread, not sure if that's the reason). > > Can we cover this with attribute checks instead of a bare exception handler: > > if args.expanded_result: > json_expanded_result = json.loads(args.expanded_result) > if 'size' in json_expanded_result: > ... Yes, changed it. https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/thread_message.py#newco... gslib/thread_message.py:50: num_objects: Number of objects updated since last MetadataMessage. Default On 2016/07/25 22:16:50, thobrla wrote: > On 2016/07/23 00:29:59, michelzel wrote: > > On 2016/07/22 23:54:30, thobrla wrote: > > > When will this ever be > 1? > > > > It's not supposed to be, at least the way I implemented it. It could be useful > > to keep it for testing or if we find too expensive to send a MetadataMessage > for > > each object, and decide to send them in periodic updates instead (however, > > that's not the case today because processing those messages is really fast). > But > > otherwise we could remove it. What do you think? > > If we're not using it, I think we should remove it on the principle that we > shouldn't add code until it's needed. Removed. https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:897: # Should not process a MetadataMessage with a DataManager or On 2016/07/25 22:16:50, thobrla wrote: > On 2016/07/25 21:16:45, michelzel wrote: > > On 2016/07/25 16:53:45, thobrla wrote: > > > Wouldn't this indicate a program bug? It seems like we should at least log > > it. > > > > I made this thinking about our discussion on the mv command: it first copies > the > > files (creating a DataManager) and then deletes the previous files (which will > > generate some MetadataMessages). > > I see - let's call out that case directly as an example. That potentially > creates problems in the order in which we instantiates the managers, right? For > example, if we get a MetadataMessage first, we'll instantiate the metadata > manager? Yes, but I don't see any good way of avoiding that (maybe creating two managers regardless, but I think we should be able to do something better). When would this happen, though? https://codereview.appspot.com/302240043/diff/60001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/60001/gslib/ui_controller.py#ne... gslib/ui_controller.py:102: visual output, just the space padding. For proper On 2016/07/25 23:46:49, thobrla wrote: > Trailing whitespace here and elsewhere, please run lint before submitting > patches. Sorry, I tried to send you fast so you could take a look. I added a post-it on my screen to see if I don't forget to run lint ever again!
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:897: # Should not process a MetadataMessage with a DataManager or On 2016/07/26 07:42:46, michelzel wrote: > On 2016/07/25 22:16:50, thobrla wrote: > > On 2016/07/25 21:16:45, michelzel wrote: > > > On 2016/07/25 16:53:45, thobrla wrote: > > > > Wouldn't this indicate a program bug? It seems like we should at least > log > > > it. > > > > > > I made this thinking about our discussion on the mv command: it first copies > > the > > > files (creating a DataManager) and then deletes the previous files (which > will > > > generate some MetadataMessages). > > > > I see - let's call out that case directly as an example. That potentially > > creates problems in the order in which we instantiates the managers, right? > For > > example, if we get a MetadataMessage first, we'll instantiate the metadata > > manager? > > Yes, but I don't see any good way of avoiding that (maybe creating two managers > regardless, but I think we should be able to do something better). When would > this happen, though? I think we're protected against this for now because the mv command does the copy and the delete on the same thread, but it's possible that will not always be the case. We're creating a dependency on message ordering here (that one type always comes before the other) and that's likely to lead to bugs in the future. So I think we need to address this. One way to do that potentially is to always have DataManager supercede MetadataManager if we get a data message. My thinking is, if we get a data message and we have a MetadataManager: 1. Save the state from the metadata manager 2. Upgrade it to the data manager with a starting number of objects equal to what the metadata manager has seen. https://codereview.appspot.com/302240043/diff/80001/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/80001/gslib/thread_message.py#n... gslib/thread_message.py:52: self.num_objects = 1 Where is this still used? https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:165: return No need for return statement here https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:343: throughput = ' %.2f' % self.throughput + ' objects/s' Avoid string concatenation with: ' %.2f objects/s' % self.throughput https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:354: string_to_print = (char_to_print + objects_completed + Use format or %s so you don't do a lot of concatenation unnecessarily. Also this would enable you to get rid of the leading/trailing spaces in the messages above. '%s %s %s %s' % (variables...) '{char_to_print} ...'.format( {'char_to_print': char_to_print, ... }) https://pyformat.info/ https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:692: # An example of bytes_progress would be '[101.0 MiB/1.0 GiB]'. 101.0 is 5 characters and 1.0 is 3 characters - is that intended? It seems like they should be the same? https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:720: string_to_print = (char_to_print + objects_completed + bytes_progress + Same comment as above.
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/1/gslib/ui_controller.py#newcod... gslib/ui_controller.py:897: # Should not process a MetadataMessage with a DataManager or On 2016/07/26 16:54:07, thobrla wrote: > On 2016/07/26 07:42:46, michelzel wrote: > > On 2016/07/25 22:16:50, thobrla wrote: > > > On 2016/07/25 21:16:45, michelzel wrote: > > > > On 2016/07/25 16:53:45, thobrla wrote: > > > > > Wouldn't this indicate a program bug? It seems like we should at least > > log > > > > it. > > > > > > > > I made this thinking about our discussion on the mv command: it first > copies > > > the > > > > files (creating a DataManager) and then deletes the previous files (which > > will > > > > generate some MetadataMessages). > > > > > > I see - let's call out that case directly as an example. That potentially > > > creates problems in the order in which we instantiates the managers, right? > > For > > > example, if we get a MetadataMessage first, we'll instantiate the metadata > > > manager? > > > > Yes, but I don't see any good way of avoiding that (maybe creating two > managers > > regardless, but I think we should be able to do something better). When would > > this happen, though? > > I think we're protected against this for now because the mv command does the > copy and the delete on the same thread, but it's possible that will not always > be the case. We're creating a dependency on message ordering here (that one > type always comes before the other) and that's likely to lead to bugs in the > future. So I think we need to address this. > > One way to do that potentially is to always have DataManager supercede > MetadataManager if we get a data message. My thinking is, if we get a data > message and we have a MetadataManager: > 1. Save the state from the metadata manager > 2. Upgrade it to the data manager with a starting number of objects equal to > what the metadata manager has seen. Ok, I'll try to implement that. However, if we are concerned about a race condition, I don't think we need to worry about the number of objects the metadata manager has seen, since they will end up being counted by the latter data messages, right? https://codereview.appspot.com/302240043/diff/80001/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/80001/gslib/thread_message.py#n... gslib/thread_message.py:52: self.num_objects = 1 On 2016/07/26 16:54:07, thobrla wrote: > Where is this still used? I thought I was using this on _HandleMetadataMessage, but I'm not, so we can remove it. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:165: return On 2016/07/26 16:54:08, thobrla wrote: > No need for return statement here Acknowledged. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:343: throughput = ' %.2f' % self.throughput + ' objects/s' On 2016/07/26 16:54:08, thobrla wrote: > Avoid string concatenation with: > ' %.2f objects/s' % self.throughput Acknowledged. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:354: string_to_print = (char_to_print + objects_completed + On 2016/07/26 16:54:08, thobrla wrote: > Use format or %s so you don't do a lot of concatenation unnecessarily. > > Also this would enable you to get rid of the leading/trailing spaces in the > messages above. > > '%s %s %s %s' % (variables...) > '{char_to_print} ...'.format( > {'char_to_print': char_to_print, > ... > }) > > https://pyformat.info/ The thing with the leading/trailing spaces is that we might use an extra space if a field is missing. If we use something like '(...) {percentage_completed} (...)'.format(...) and percentage_completed is not calculated, we will be printing an extra space. If you think this is too minor, I can adjust it later, but I'll leave the leading/trailing spaces for now and add the format. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:692: # An example of bytes_progress would be '[101.0 MiB/1.0 GiB]'. On 2016/07/26 16:54:08, thobrla wrote: > 101.0 is 5 characters and 1.0 is 3 characters - is that intended? It seems like > they should be the same? I don't know. Because the total size tends to be relatively fixed throughout our operation, I think it would be better not to allocate an unnecessary number of spaces for it beforehand. This is different for the progress, as it is constantly changing, so I think space-padding just for the progress might be a good alternative. Obs: I'll use a fixed width of 9 characters (5 for number + ' GiB', for instance). It could technically go up to 10 characters with something like (1010.1 MiB) but I don't think we should waste a space almost unused for the whole operation. Let me know if you think differently. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:720: string_to_print = (char_to_print + objects_completed + bytes_progress + On 2016/07/26 16:54:08, thobrla wrote: > Same comment as above. Acknowledged.
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:354: string_to_print = (char_to_print + objects_completed + On 2016/07/26 19:14:29, michelzel wrote: > On 2016/07/26 16:54:08, thobrla wrote: > > Use format or %s so you don't do a lot of concatenation unnecessarily. > > > > Also this would enable you to get rid of the leading/trailing spaces in the > > messages above. > > > > '%s %s %s %s' % (variables...) > > '{char_to_print} ...'.format( > > {'char_to_print': char_to_print, > > ... > > }) > > > > https://pyformat.info/ > > The thing with the leading/trailing spaces is that we might use an extra space > if a field is missing. If we use something like > '(...) {percentage_completed} (...)'.format(...) and percentage_completed is not > calculated, we will be printing an extra space. If you think this is too minor, > I can adjust it later, but I'll leave the leading/trailing spaces for now and > add the format. As discussed, if this happens then the UI would jump around anyway. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:692: # An example of bytes_progress would be '[101.0 MiB/1.0 GiB]'. On 2016/07/26 19:14:29, michelzel wrote: > On 2016/07/26 16:54:08, thobrla wrote: > > 101.0 is 5 characters and 1.0 is 3 characters - is that intended? It seems > like > > they should be the same? > > I don't know. Because the total size tends to be relatively fixed throughout our > operation, I think it would be better not to allocate an unnecessary number of > spaces for it beforehand. This is different for the progress, as it is > constantly changing, so I think space-padding just for the progress might be a > good alternative. > > Obs: I'll use a fixed width of 9 characters (5 for number + ' GiB', for > instance). It could technically go up to 10 characters with something like > (1010.1 MiB) but I don't think we should waste a space almost unused for the > whole operation. Let me know if you think differently. Okay, using fewer space for the fixed size makes sense to me. But shouldn't we ensure the number of characters is fixed? Meaning 123 GiB as opposed to 123.0 GiB? https://codereview.appspot.com/302240043/diff/120001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/120001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:648: stderr = self.RunGsUtil(['-m', 'setmeta', '-R', '-h', This will be subject to eventual bucket listing consistency, so consider passing the object arguments directly as opposed to using recursion on the bucket. Otherwise you can model the eventual listing consistency handling code around some of the simpler rm tests. https://codereview.appspot.com/302240043/diff/120001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:672: def test_ui_acl_with_m_flag(self): Still need unit tests for the metadata cases. https://codereview.appspot.com/302240043/diff/120001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/120001/gslib/ui_controller.py#n... gslib/ui_controller.py:332: percentage_completed = percentage + '% Done' Again, use substitution: '%s% Done' % percentage https://codereview.appspot.com/302240043/diff/120001/gslib/ui_controller.py#n... gslib/ui_controller.py:852: # avoid a possible race condition. Describe the race condition in more detail, or at least describe that DataManager trumps MetadataManager and that if we have mixed operations we'll use a data UI
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:354: string_to_print = (char_to_print + objects_completed + On 2016/07/26 19:47:41, thobrla wrote: > On 2016/07/26 19:14:29, michelzel wrote: > > On 2016/07/26 16:54:08, thobrla wrote: > > > Use format or %s so you don't do a lot of concatenation unnecessarily. > > > > > > Also this would enable you to get rid of the leading/trailing spaces in the > > > messages above. > > > > > > '%s %s %s %s' % (variables...) > > > '{char_to_print} ...'.format( > > > {'char_to_print': char_to_print, > > > ... > > > }) > > > > > > https://pyformat.info/ > > > > The thing with the leading/trailing spaces is that we might use an extra space > > if a field is missing. If we use something like > > '(...) {percentage_completed} (...)'.format(...) and percentage_completed is > not > > calculated, we will be printing an extra space. If you think this is too > minor, > > I can adjust it later, but I'll leave the leading/trailing spaces for now and > > add the format. > > As discussed, if this happens then the UI would jump around anyway. Done. https://codereview.appspot.com/302240043/diff/80001/gslib/ui_controller.py#ne... gslib/ui_controller.py:692: # An example of bytes_progress would be '[101.0 MiB/1.0 GiB]'. On 2016/07/26 19:47:41, thobrla wrote: > On 2016/07/26 19:14:29, michelzel wrote: > > On 2016/07/26 16:54:08, thobrla wrote: > > > 101.0 is 5 characters and 1.0 is 3 characters - is that intended? It seems > > like > > > they should be the same? > > > > I don't know. Because the total size tends to be relatively fixed throughout > our > > operation, I think it would be better not to allocate an unnecessary number of > > spaces for it beforehand. This is different for the progress, as it is > > constantly changing, so I think space-padding just for the progress might be a > > good alternative. > > > > Obs: I'll use a fixed width of 9 characters (5 for number + ' GiB', for > > instance). It could technically go up to 10 characters with something like > > (1010.1 MiB) but I don't think we should waste a space almost unused for the > > whole operation. Let me know if you think differently. > > Okay, using fewer space for the fixed size makes sense to me. But shouldn't we > ensure the number of characters is fixed? Meaning 123 GiB as opposed to 123.0 > GiB? Added a auxiliary function AdjustWidth to do what we discussed. https://codereview.appspot.com/302240043/diff/120001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/120001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:648: stderr = self.RunGsUtil(['-m', 'setmeta', '-R', '-h', On 2016/07/26 19:47:41, thobrla wrote: > This will be subject to eventual bucket listing consistency, so consider passing > the object arguments directly as opposed to using recursion on the bucket. > Otherwise you can model the eventual listing consistency handling code around > some of the simpler rm tests. Fixed. https://codereview.appspot.com/302240043/diff/120001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:672: def test_ui_acl_with_m_flag(self): On 2016/07/26 19:47:41, thobrla wrote: > Still need unit tests for the metadata cases. Created a couple. Just tested the manager hierarchy and checked if MetadataMessages are being correctly processed. Let me know if there's something else. https://codereview.appspot.com/302240043/diff/120001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/120001/gslib/ui_controller.py#n... gslib/ui_controller.py:332: percentage_completed = percentage + '% Done' On 2016/07/26 19:47:42, thobrla wrote: > Again, use substitution: '%s% Done' % percentage As discussed, this produces an error when running, as warned by lint. Keeping this version for now. https://codereview.appspot.com/302240043/diff/120001/gslib/ui_controller.py#n... gslib/ui_controller.py:852: # avoid a possible race condition. On 2016/07/26 19:47:42, thobrla wrote: > Describe the race condition in more detail, or at least describe that > DataManager trumps MetadataManager and that if we have mixed operations we'll > use a data UI Should be clearer now. https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1104: time.sleep(0.1) # Just to ensure ui_controller processes the message. I know this is not good practice, but the tests are not dependent on how long the program sleeps. This is just to ensure the ui_controller processes the message before we check something that depends on this processing. If there's a better way of doing this, let me know. https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:878: # SeekAheadThread, should be auto-sufficient). There's a trailing space here, please ignore it (it should be the only one, though!).
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:74: """Adjusts proper width for UI output. Based on this description alone, I can't tell what this function does. https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:888: if (isinstance(self.manager, DataManager) and Checking the class of the manager to decide whether it can handle a certain type of message feels like bad OO. Why not pass the message to the manager regardless and let it decide whether it's interested in the message?
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1033: Unit tests for AdjustWidth? https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1074: # We should not have estimated the number of objects as 100 in the UI. The comment (100) doesn't match the code (130). https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1104: time.sleep(0.1) # Just to ensure ui_controller processes the message. On 2016/07/27 07:18:21, michelzel wrote: > I know this is not good practice, but the tests are not dependent on how long > the program sleeps. This is just to ensure the ui_controller processes the > message before we check something that depends on this processing. If there's a > better way of doing this, let me know. Instead of asserting this on the UI controller directly, can you just ensure the right messages are output in the case where we get a metadata message followed by a data message? This would allow you to ensure this scenario works. Alternatively, if you wanted to test the UI controller without time, couldn't you do it single-threaded via the MainThreadUIQueue? https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:74: """Adjusts proper width for UI output. On 2016/07/27 16:09:35, yarbrough wrote: > Based on this description alone, I can't tell what this function does. Suggest function name: BytesToFixedWidthString https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:878: # SeekAheadThread, should be auto-sufficient). What does "auto-sufficient" mean? I don't understand the part in the parentheses.
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1033: On 2016/07/27 17:21:34, thobrla wrote: > Unit tests for AdjustWidth? Created. https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1074: # We should not have estimated the number of objects as 100 in the UI. On 2016/07/27 17:21:34, thobrla wrote: > The comment (100) doesn't match the code (130). Acknowledged. https://codereview.appspot.com/302240043/diff/180001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:1104: time.sleep(0.1) # Just to ensure ui_controller processes the message. On 2016/07/27 17:21:34, thobrla wrote: > On 2016/07/27 07:18:21, michelzel wrote: > > I know this is not good practice, but the tests are not dependent on how long > > the program sleeps. This is just to ensure the ui_controller processes the > > message before we check something that depends on this processing. If there's > a > > better way of doing this, let me know. > > Instead of asserting this on the UI controller directly, can you just ensure the > right messages are output in the case where we get a metadata message followed > by a data message? This would allow you to ensure this scenario works. > > Alternatively, if you wanted to test the UI controller without time, couldn't > you do it single-threaded via the MainThreadUIQueue? Yes, I think using the MainThreadUIQueue is indeed the best way of doing this. https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:74: """Adjusts proper width for UI output. On 2016/07/27 17:21:34, thobrla wrote: > On 2016/07/27 16:09:35, yarbrough wrote: > > Based on this description alone, I can't tell what this function does. > > Suggest function name: BytesToFixedWidthString Changed the name and the description. https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:878: # SeekAheadThread, should be auto-sufficient). On 2016/07/27 17:21:34, thobrla wrote: > What does "auto-sufficient" mean? I don't understand the part in the > parentheses. Sorry, mixed with Portuguese, it was supposed to be self-sufficient (=auto-suficiente in Portuguese!). I updated the description nonetheless. https://codereview.appspot.com/302240043/diff/180001/gslib/ui_controller.py#n... gslib/ui_controller.py:888: if (isinstance(self.manager, DataManager) and On 2016/07/27 16:09:35, yarbrough wrote: > Checking the class of the manager to decide whether it can handle a certain type > of message feels like bad OO. Why not pass the message to the manager regardless > and let it decide whether it's interested in the message? I restored CanHandleMessage. It is now supposed to return 1 if the manager can handle the message, 0 if it cannot, or -1 if it cannot handle the message and the manager should be replaced.
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/220001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/220001/gslib/ui_controller.py#n... gslib/ui_controller.py:396: if (isinstance(status_message, FileMessage) or This seems brittle if we add further message classes. Can you reverse the logic so that we assert only the types of messages we can handle as opposed to omitting the ones we can't?
Sign in to reply to this message.
https://codereview.appspot.com/302240043/diff/220001/gslib/ui_controller.py File gslib/ui_controller.py (right): https://codereview.appspot.com/302240043/diff/220001/gslib/ui_controller.py#n... gslib/ui_controller.py:396: if (isinstance(status_message, FileMessage) or On 2016/07/27 20:30:09, thobrla wrote: > This seems brittle if we add further message classes. Can you reverse the logic > so that we assert only the types of messages we can handle as opposed to > omitting the ones we can't? I see what you're saying. I thought it would be similar, as we potentially have to adjust the function either way (depending on the type of message we are adding), but it does make more sense to list all the 'handleable' messages here.
Sign in to reply to this message.
LGTM
Sign in to reply to this message.
LGTM I had just a couple of small nitpicks. https://codereview.appspot.com/302240043/diff/280001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/280001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:90: def _FindAppropriateComplement(metadata): Maybe use a more descriptive word than 'complement'? Maybe 'descriptionString' or 'typeDescriptorSuffix'? https://codereview.appspot.com/302240043/diff/280001/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/280001/gslib/thread_message.py#... gslib/thread_message.py:42: """Creates a MetadataMessage. This line should describe MetadataMessage.
Sign in to reply to this message.
Awesome, I made the changes as described below. If there's something else to change in the description, let me know and I can add that after merging. https://codereview.appspot.com/302240043/diff/280001/gslib/tests/test_ui.py File gslib/tests/test_ui.py (right): https://codereview.appspot.com/302240043/diff/280001/gslib/tests/test_ui.py#n... gslib/tests/test_ui.py:90: def _FindAppropriateComplement(metadata): On 2016/07/28 19:49:07, yarbrough wrote: > Maybe use a more descriptive word than 'complement'? Maybe 'descriptionString' > or 'typeDescriptorSuffix'? Changed for _FindAppropriateDescriptionString. https://codereview.appspot.com/302240043/diff/280001/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/302240043/diff/280001/gslib/thread_message.py#... gslib/thread_message.py:42: """Creates a MetadataMessage. On 2016/07/28 19:49:07, yarbrough wrote: > This line should describe MetadataMessage. Added "A MetadataMessage simply indicates that a metadata operation on a given object has been successfully done. The only passed argument is the time when such operation has finished."
Sign in to reply to this message.
|