|
|
Created:
5 years, 10 months ago by Mike Schwartz Modified:
5 years, 9 months ago Reviewers:
thobrla CC:
houglum, gsutil-crs_google.com, dshorten Visibility:
Public. |
DescriptionAddress problem where seek_ahead_iterator races with commands being run, resulting in errors being …
Patch Set 1 #
Total comments: 12
Patch Set 2 : Changes per review comments #MessagesTotal messages: 12
This addresses b/77655855
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. Catching this exception here terminates the iterator. Is there a way we can catch it lower down the stack so that iterator can continue? https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:118: _PutToQueueWithTimeout(self.status_queue, If we do decide to terminate the iterator, should we withhold this message since the estimate is likely wrong (and potentially *very* wrong)? My concern is that if you look at the priority of the estimation made by the UI thread here: https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/ui_controller... we'll treat the seek ahead result as the source of truth until the producer thread completes 100%, which may not happen until the very end of the command.
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/27 20:43:54, thobrla wrote: > Catching this exception here terminates the iterator. Is there a way we can > catch it lower down the stack so that iterator can continue? How about if I add an ignore_os_errors param to the WildcardIterator constructor, and then catch the exception at https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... and ignore it there so we continue iterating? https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:118: _PutToQueueWithTimeout(self.status_queue, On 2018/06/27 20:43:54, thobrla wrote: > If we do decide to terminate the iterator, should we withhold this message since > the estimate is likely wrong (and potentially *very* wrong)? > > My concern is that if you look at the priority of the estimation made by the UI > thread here: > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/ui_controller... > > we'll treat the seek ahead result as the source of truth until the producer > thread completes 100%, which may not happen until the very end of the command. I think this problem is addressed by my above proposal.
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/27 20:56:24, Mike Schwartz wrote: > On 2018/06/27 20:43:54, thobrla wrote: > > Catching this exception here terminates the iterator. Is there a way we can > > catch it lower down the stack so that iterator can continue? > > How about if I add an ignore_os_errors param to the WildcardIterator > constructor, and then catch the exception at > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > and ignore it there so we continue iterating? This approach SGTM
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/27 22:01:48, thobrla wrote: > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > On 2018/06/27 20:43:54, thobrla wrote: > > > Catching this exception here terminates the iterator. Is there a way we can > > > catch it lower down the stack so that iterator can continue? > > > > How about if I add an ignore_os_errors param to the WildcardIterator > > constructor, and then catch the exception at > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > and ignore it there so we continue iterating? > > This approach SGTM Looking at the code, I'm now not sure if this is possible. seek_ahead_iterator is created at https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., by wrapping two layers of iterators over the base diff_iterator. I think that means that we'd need to make the base diff_iterator ignore OS errors, which we don't want to do because it would impact correctness of the rsync data operations. I suppose we could define a nextIgnoringOsErrors function on the base iterator, and have diff_iterator call that function. I'm not super fond of that approach, since we're then no longer actually using the iterator interface; but it's the only approach I can see taking here. Sugggestions welcome.
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/28 20:00:07, Mike Schwartz wrote: > On 2018/06/27 22:01:48, thobrla wrote: > > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > > On 2018/06/27 20:43:54, thobrla wrote: > > > > Catching this exception here terminates the iterator. Is there a way we > can > > > > catch it lower down the stack so that iterator can continue? > > > > > > How about if I add an ignore_os_errors param to the WildcardIterator > > > constructor, and then catch the exception at > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > > and ignore it there so we continue iterating? > > > > This approach SGTM > > Looking at the code, I'm now not sure if this is possible. seek_ahead_iterator > is created at > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., > by wrapping two layers of iterators over the base diff_iterator. I think that > means that we'd need to make the base diff_iterator ignore OS errors, which we > don't want to do because it would impact correctness of the rsync data > operations. > > I suppose we could define a nextIgnoringOsErrors function on the base iterator, > and have diff_iterator call that function. I'm not super fond of that approach, > since we're then no longer actually using the iterator interface; but it's the > only approach I can see taking here. > > Sugggestions welcome. I see two main possibilities: 1. Do the ugly work of plumbing a new argument through the "real" iterators so that we can skip errors deep within the iterator in the seek-ahead case. 2. Just blow up the SeekAheadIterator (as this code currently does), but don't send the message below and rely on the producer thread to create the estimate. Given your investigation, I prefer option #2; #1 seems like too much code complexity to solve this specific problem.
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/28 20:14:03, thobrla wrote: > On 2018/06/28 20:00:07, Mike Schwartz wrote: > > On 2018/06/27 22:01:48, thobrla wrote: > > > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > > > On 2018/06/27 20:43:54, thobrla wrote: > > > > > Catching this exception here terminates the iterator. Is there a way we > > can > > > > > catch it lower down the stack so that iterator can continue? > > > > > > > > How about if I add an ignore_os_errors param to the WildcardIterator > > > > constructor, and then catch the exception at > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > > > and ignore it there so we continue iterating? > > > > > > This approach SGTM > > > > Looking at the code, I'm now not sure if this is possible. seek_ahead_iterator > > is created at > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., > > by wrapping two layers of iterators over the base diff_iterator. I think that > > means that we'd need to make the base diff_iterator ignore OS errors, which we > > don't want to do because it would impact correctness of the rsync data > > operations. > > > > I suppose we could define a nextIgnoringOsErrors function on the base > iterator, > > and have diff_iterator call that function. I'm not super fond of that > approach, > > since we're then no longer actually using the iterator interface; but it's the > > only approach I can see taking here. > > > > Sugggestions welcome. > > I see two main possibilities: > > 1. Do the ugly work of plumbing a new argument through the "real" iterators so > that we can skip errors deep within the iterator in the seek-ahead case. > > 2. Just blow up the SeekAheadIterator (as this code currently does), but don't > send the message below and rely on the producer thread to create the estimate. > > Given your investigation, I prefer option #2; #1 seems like too much code > complexity to solve this specific problem. Is #2 different from what my current CL does?
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/28 20:20:41, Mike Schwartz wrote: > On 2018/06/28 20:14:03, thobrla wrote: > > On 2018/06/28 20:00:07, Mike Schwartz wrote: > > > On 2018/06/27 22:01:48, thobrla wrote: > > > > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > > > > On 2018/06/27 20:43:54, thobrla wrote: > > > > > > Catching this exception here terminates the iterator. Is there a way > we > > > can > > > > > > catch it lower down the stack so that iterator can continue? > > > > > > > > > > How about if I add an ignore_os_errors param to the WildcardIterator > > > > > constructor, and then catch the exception at > > > > > > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > > > > and ignore it there so we continue iterating? > > > > > > > > This approach SGTM > > > > > > Looking at the code, I'm now not sure if this is possible. > seek_ahead_iterator > > > is created at > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., > > > by wrapping two layers of iterators over the base diff_iterator. I think > that > > > means that we'd need to make the base diff_iterator ignore OS errors, which > we > > > don't want to do because it would impact correctness of the rsync data > > > operations. > > > > > > I suppose we could define a nextIgnoringOsErrors function on the base > > iterator, > > > and have diff_iterator call that function. I'm not super fond of that > > approach, > > > since we're then no longer actually using the iterator interface; but it's > the > > > only approach I can see taking here. > > > > > > Sugggestions welcome. > > > > I see two main possibilities: > > > > 1. Do the ugly work of plumbing a new argument through the "real" iterators so > > that we can skip errors deep within the iterator in the seek-ahead case. > > > > 2. Just blow up the SeekAheadIterator (as this code currently does), but don't > > send the message below and rely on the producer thread to create the estimate. > > > > Given your investigation, I prefer option #2; #1 seems like too much code > > complexity to solve this specific problem. > > Is #2 different from what my current CL does? Yes: I am proposing that we return instead of calling _PutToQueueWithTimeout (which signals the UI thread that seek ahead work has completed its estimation).
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/28 20:24:12, thobrla wrote: > On 2018/06/28 20:20:41, Mike Schwartz wrote: > > On 2018/06/28 20:14:03, thobrla wrote: > > > On 2018/06/28 20:00:07, Mike Schwartz wrote: > > > > On 2018/06/27 22:01:48, thobrla wrote: > > > > > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > > > > > On 2018/06/27 20:43:54, thobrla wrote: > > > > > > > Catching this exception here terminates the iterator. Is there a > way > > we > > > > can > > > > > > > catch it lower down the stack so that iterator can continue? > > > > > > > > > > > > How about if I add an ignore_os_errors param to the WildcardIterator > > > > > > constructor, and then catch the exception at > > > > > > > > > > > > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > > > > > and ignore it there so we continue iterating? > > > > > > > > > > This approach SGTM > > > > > > > > Looking at the code, I'm now not sure if this is possible. > > seek_ahead_iterator > > > > is created at > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., > > > > by wrapping two layers of iterators over the base diff_iterator. I think > > that > > > > means that we'd need to make the base diff_iterator ignore OS errors, > which > > we > > > > don't want to do because it would impact correctness of the rsync data > > > > operations. > > > > > > > > I suppose we could define a nextIgnoringOsErrors function on the base > > > iterator, > > > > and have diff_iterator call that function. I'm not super fond of that > > > approach, > > > > since we're then no longer actually using the iterator interface; but it's > > the > > > > only approach I can see taking here. > > > > > > > > Sugggestions welcome. > > > > > > I see two main possibilities: > > > > > > 1. Do the ugly work of plumbing a new argument through the "real" iterators > so > > > that we can skip errors deep within the iterator in the seek-ahead case. > > > > > > 2. Just blow up the SeekAheadIterator (as this code currently does), but > don't > > > send the message below and rely on the producer thread to create the > estimate. > > > > > > Given your investigation, I prefer option #2; #1 seems like too much code > > > complexity to solve this specific problem. > > > > Is #2 different from what my current CL does? > > Yes: I am proposing that we return instead of calling _PutToQueueWithTimeout > (which signals the UI thread that seek ahead work has completed its estimation). I see. So I understand: Does that mean the SeekAheadThread will again get created and run? And, is it possible we'll encounter the race again, and so on, and never finish?
Sign in to reply to this message.
https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/28 20:37:47, Mike Schwartz wrote: > On 2018/06/28 20:24:12, thobrla wrote: > > On 2018/06/28 20:20:41, Mike Schwartz wrote: > > > On 2018/06/28 20:14:03, thobrla wrote: > > > > On 2018/06/28 20:00:07, Mike Schwartz wrote: > > > > > On 2018/06/27 22:01:48, thobrla wrote: > > > > > > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > > > > > > On 2018/06/27 20:43:54, thobrla wrote: > > > > > > > > Catching this exception here terminates the iterator. Is there a > > way > > > we > > > > > can > > > > > > > > catch it lower down the stack so that iterator can continue? > > > > > > > > > > > > > > How about if I add an ignore_os_errors param to the WildcardIterator > > > > > > > constructor, and then catch the exception at > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > > > > > > and ignore it there so we continue iterating? > > > > > > > > > > > > This approach SGTM > > > > > > > > > > Looking at the code, I'm now not sure if this is possible. > > > seek_ahead_iterator > > > > > is created at > > > > > > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., > > > > > by wrapping two layers of iterators over the base diff_iterator. I think > > > that > > > > > means that we'd need to make the base diff_iterator ignore OS errors, > > which > > > we > > > > > don't want to do because it would impact correctness of the rsync data > > > > > operations. > > > > > > > > > > I suppose we could define a nextIgnoringOsErrors function on the base > > > > iterator, > > > > > and have diff_iterator call that function. I'm not super fond of that > > > > approach, > > > > > since we're then no longer actually using the iterator interface; but > it's > > > the > > > > > only approach I can see taking here. > > > > > > > > > > Sugggestions welcome. > > > > > > > > I see two main possibilities: > > > > > > > > 1. Do the ugly work of plumbing a new argument through the "real" > iterators > > so > > > > that we can skip errors deep within the iterator in the seek-ahead case. > > > > > > > > 2. Just blow up the SeekAheadIterator (as this code currently does), but > > don't > > > > send the message below and rely on the producer thread to create the > > estimate. > > > > > > > > Given your investigation, I prefer option #2; #1 seems like too much code > > > > complexity to solve this specific problem. > > > > > > Is #2 different from what my current CL does? > > > > Yes: I am proposing that we return instead of calling _PutToQueueWithTimeout > > (which signals the UI thread that seek ahead work has completed its > estimation). > > I see. > > So I understand: Does that mean the SeekAheadThread will again get created and > run? And, is it possible we'll encounter the race again, and so on, and never > finish? No, if this run function exits, then the thread will terminate without ever reporting an estimate (which seems desirable in this scenario).
Sign in to reply to this message.
PTAL https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py File gslib/seek_ahead_thread.py (right): https://codereview.appspot.com/367720043/diff/1/gslib/seek_ahead_thread.py#ne... gslib/seek_ahead_thread.py:112: # operations, so ignore it. On 2018/06/29 16:09:26, thobrla wrote: > On 2018/06/28 20:37:47, Mike Schwartz wrote: > > On 2018/06/28 20:24:12, thobrla wrote: > > > On 2018/06/28 20:20:41, Mike Schwartz wrote: > > > > On 2018/06/28 20:14:03, thobrla wrote: > > > > > On 2018/06/28 20:00:07, Mike Schwartz wrote: > > > > > > On 2018/06/27 22:01:48, thobrla wrote: > > > > > > > On 2018/06/27 20:56:24, Mike Schwartz wrote: > > > > > > > > On 2018/06/27 20:43:54, thobrla wrote: > > > > > > > > > Catching this exception here terminates the iterator. Is there > a > > > way > > > > we > > > > > > can > > > > > > > > > catch it lower down the stack so that iterator can continue? > > > > > > > > > > > > > > > > How about if I add an ignore_os_errors param to the > WildcardIterator > > > > > > > > constructor, and then catch the exception at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/wildcard_iter... > > > > > > > > and ignore it there so we continue iterating? > > > > > > > > > > > > > > This approach SGTM > > > > > > > > > > > > Looking at the code, I'm now not sure if this is possible. > > > > seek_ahead_iterator > > > > > > is created at > > > > > > > > > > > > > > > > > > > > > https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/commands/rsyn..., > > > > > > by wrapping two layers of iterators over the base diff_iterator. I > think > > > > that > > > > > > means that we'd need to make the base diff_iterator ignore OS errors, > > > which > > > > we > > > > > > don't want to do because it would impact correctness of the rsync data > > > > > > operations. > > > > > > > > > > > > I suppose we could define a nextIgnoringOsErrors function on the base > > > > > iterator, > > > > > > and have diff_iterator call that function. I'm not super fond of that > > > > > approach, > > > > > > since we're then no longer actually using the iterator interface; but > > it's > > > > the > > > > > > only approach I can see taking here. > > > > > > > > > > > > Sugggestions welcome. > > > > > > > > > > I see two main possibilities: > > > > > > > > > > 1. Do the ugly work of plumbing a new argument through the "real" > > iterators > > > so > > > > > that we can skip errors deep within the iterator in the seek-ahead case. > > > > > > > > > > 2. Just blow up the SeekAheadIterator (as this code currently does), but > > > don't > > > > > send the message below and rely on the producer thread to create the > > > estimate. > > > > > > > > > > Given your investigation, I prefer option #2; #1 seems like too much > code > > > > > complexity to solve this specific problem. > > > > > > > > Is #2 different from what my current CL does? > > > > > > Yes: I am proposing that we return instead of calling _PutToQueueWithTimeout > > > (which signals the UI thread that seek ahead work has completed its > > estimation). > > > > I see. > > > > So I understand: Does that mean the SeekAheadThread will again get created and > > run? And, is it possible we'll encounter the race again, and so on, and never > > finish? > > No, if this run function exits, then the thread will terminate without ever > reporting an estimate (which seems desirable in this scenario). Agreed.
Sign in to reply to this message.
|