Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(2036)

Issue 303350043: Finalize PerformanceSummary implementation

Can't Edit
Can't Publish+Mail
Start Review
Created:
7 years, 8 months ago by jessicazhu
Modified:
7 years, 8 months ago
Reviewers:
houglum, thobrla
CC:
gsutil-crs_google.com
Visibility:
Public.

Description

Note: Opened a separate CL due to rebasing. Feature-wise this CL adds to 302360043 with (1) number of objects/files transferred and (2) timestamp. This also fixes the testMetricsPosting failure due to exact ordering. Commit message: Implement PerformanceSummary analytics collection This commit implements collection of performance statistics after cp and rsync are run to send to Google Analytics.

Patch Set 1 : #

Total comments: 9

Patch Set 2 : Address Patch Set 1 comments #

Patch Set 3 : Modify thread idle/execution time collection #

Total comments: 14

Patch Set 4 : Address Patch Set 3 comments #

Unified diffs Side-by-side diffs Delta from patch set Stats (+134 lines, -126 lines) Patch
M gslib/command.py View 1 2 3 7 chunks +58 lines, -37 lines 0 comments Download
M gslib/metrics.py View 1 2 3 5 chunks +28 lines, -3 lines 0 comments Download
M gslib/tests/test_metrics.py View 1 2 3 11 chunks +34 lines, -58 lines 0 comments Download
M gslib/thread_message.py View 1 2 3 2 chunks +11 lines, -17 lines 0 comments Download
M gslib/ui_controller.py View 1 2 3 6 chunks +3 lines, -11 lines 0 comments Download

Messages

Total messages: 7
jessicazhu
https://codereview.appspot.com/303350043/diff/40001/gslib/command.py File gslib/command.py (left): https://codereview.appspot.com/303350043/diff/40001/gslib/command.py#oldcode1923 gslib/command.py:1923: args_iterator.ProcessSourceUrlType(self.status_queue, args) This was a rebasing error. https://codereview.appspot.com/303350043/diff/40001/gslib/tests/test_metrics.py File ...
7 years, 8 months ago (2016-08-18 21:27:00 UTC) #1
thobrla
https://codereview.appspot.com/303350043/diff/40001/gslib/tests/test_metrics.py File gslib/tests/test_metrics.py (right): https://codereview.appspot.com/303350043/diff/40001/gslib/tests/test_metrics.py#newcode702 gslib/tests/test_metrics.py:702: # The metric body might not be in the ...
7 years, 8 months ago (2016-08-18 23:22:42 UTC) #2
jessicazhu
https://codereview.appspot.com/303350043/diff/40001/gslib/tests/test_metrics.py File gslib/tests/test_metrics.py (right): https://codereview.appspot.com/303350043/diff/40001/gslib/tests/test_metrics.py#newcode702 gslib/tests/test_metrics.py:702: # The metric body might not be in the ...
7 years, 8 months ago (2016-08-19 17:02:52 UTC) #3
jessicazhu
As discussed offline, made changes to thread idle time collection.
7 years, 8 months ago (2016-08-19 20:45:56 UTC) #4
thobrla
https://codereview.appspot.com/303350043/diff/140001/gslib/command.py File gslib/command.py (right): https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode2065 gslib/command.py:2065: @CaptureAndLogException Since this exception can happen during normal execution, ...
7 years, 8 months ago (2016-08-19 21:07:12 UTC) #5
thobrla
https://codereview.appspot.com/303350043/diff/40001/gslib/thread_message.py File gslib/thread_message.py (right): https://codereview.appspot.com/303350043/diff/40001/gslib/thread_message.py#newcode338 gslib/thread_message.py:338: self.collected_info = {'uses_slice': uses_slice} On 2016/08/19 17:02:52, jessicazhu wrote: ...
7 years, 8 months ago (2016-08-19 21:16:17 UTC) #6
jessicazhu
7 years, 8 months ago (2016-08-19 23:11:11 UTC) #7
https://codereview.appspot.com/303350043/diff/40001/gslib/thread_message.py
File gslib/thread_message.py (right):

https://codereview.appspot.com/303350043/diff/40001/gslib/thread_message.py#n...
gslib/thread_message.py:338: self.collected_info = {'uses_slice': uses_slice}
On 2016/08/19 21:16:17, thobrla wrote:
> On 2016/08/19 17:02:52, jessicazhu wrote:
> > On 2016/08/18 23:22:41, thobrla wrote:
> > > Why did we eliminate all of these?
> > 
> > Originally we were posting has_file_src, has_cloud_src, is_daisy_chain, and
> > provider_type from the NameExpansionIterator wrapper. I changed that so that
> > this info would be logged at the end of the top-level Apply call, which
> doesn't
> > require a post to the status queue. Originally num_objects_transferred
updated
> > that metric every time a FileMessage was posted, but now this number gets
> logged
> > only when a FinalMessage is posted. I left collected_info as a dictionary so
> > that it would be easy in the future to add more parameters to this if
> necessary.
> 
> I don't feel strongly about this, but we shouldn't extend this to be a dict
> until we actually need it.

Done.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py
File gslib/command.py (right):

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2065: @CaptureAndLogException
On 2016/08/19 21:07:12, thobrla wrote:
> Since this exception can happen during normal execution, do we really want to
> log it?  Also the error message will say it happened during metrics
collection,
> which is not quite accurate.

Done.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2150: self.total_execution_time += exec_time
On 2016/08/19 21:07:12, thobrla wrote:
> Since we initialize total_execution_time to self.start_block_time - init_time,
> won't the first call record double the actual execution time for thread
startup?
> 
> Since end_block_time = init_time, could we just leave this as-is and
initialize
> total_execution_time to 0 in init?

No; the first call to WorkerThread._StartBlockedTime initializes the thread
stat, but doesn't call ThreadStat.StartBlockedTime. That being said, it makes
more sense to do what you're saying, and call ThreadStat.StartBlockedTime after
its initialization.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2157: def AggregateStat(self, end_time):
On 2016/08/19 21:07:12, thobrla wrote:
> We're only aggregating idle time here, not execution time, so should this
> function be called AggregateIdleTime?

We're aggregating one or the other.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2159: # Apply ended not during a task_queue.get() call, or
there was an exception
On 2016/08/19 21:07:12, thobrla wrote:
> Consider moving these comments inside the if block.

Done.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2159: # Apply ended not during a task_queue.get() call, or
there was an exception
On 2016/08/19 21:07:12, thobrla wrote:
> Consider: "Apply ended before we blocked on task_queue.get()"

Done.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2164: # Apply ended during a task_queue.get() call, or there
was an exception
On 2016/08/19 21:07:12, thobrla wrote:
> Consider: "Apply ended while we were blocked on task_queue.get()"

Done.

https://codereview.appspot.com/303350043/diff/140001/gslib/command.py#newcode...
gslib/command.py:2168: elif self.start_block_time > self.end_block_time:
On 2016/08/19 21:07:12, thobrla wrote:
> There's a third case where start_block_time and end_block_time are equal -
what
> do we do in that case (it seems like we'll leave idle time alone but it may be
> useful to explain why).

In the third case I don't think we do anything -- should I add an else: pass
with a comment?
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b