|
|
Created:
9 years ago by f-s-p Modified:
8 years, 10 months ago Reviewers:
Joachim Metz CC:
log2timeline-dev_googlegroups.com Visibility:
Public. |
Description.
Patch Set 1 #
Total comments: 5
MessagesTotal messages: 16
https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out.py File plaso/output/json_out.py (right): https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out.py#newc... plaso/output/json_out.py:56: if 'pathspec' in json_dict and isinstance(json_dict['pathspec'], I opt to line break after isinstance( and continue with a 4 space indentation to prevent codeblocks forming on the right e.g. if u'pathspec' in json_dict and isinstance( json_dict['pathspec'], basestring): Also add use u'' for text strings and b'' for byte strings/streams to indicate their difference. https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out.py#newc... plaso/output/json_out.py:58: # convert JSON string to a dictionary It is not directly clear what this code block does. I use the following criteria for myself if I have not seen this code in 6 months would I still be able to tell what it is supposed to do. If the answer is no I should change the code. However this change is likely best done in JsonEventObjectSerializer so that we change the path spec to a dict before serialization, and the same for other objects like the event tag. Maybe add a method WriteSerializedFlattened or change WriteSerialized not to do a double serialization.
Sign in to reply to this message.
https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out_test.py File plaso/output/json_out_test.py (right): https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out_test.py... plaso/output/json_out_test.py:90: expected_event_body = ( Please make sure this file is up to date: https://github.com/log2timeline/plaso/blob/master/plaso/output/json_out_test.py Some recent fixes for flaky test in Windows.
Sign in to reply to this message.
https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out.py File plaso/output/json_out.py (right): https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out.py#newc... plaso/output/json_out.py:56: if 'pathspec' in json_dict and isinstance(json_dict['pathspec'], On 2015/03/27 07:12:18, Joachim Metz wrote: > I opt to line break after isinstance( and continue with a 4 space indentation to > prevent codeblocks forming on the right e.g. > > if u'pathspec' in json_dict and isinstance( > json_dict['pathspec'], basestring): > > Also add use u'' for text strings and b'' for byte strings/streams to indicate > their difference. Acknowledged. https://codereview.appspot.com/224720043/diff/1/plaso/output/json_out.py#newc... plaso/output/json_out.py:58: # convert JSON string to a dictionary On 2015/03/27 07:12:18, Joachim Metz wrote: > It is not directly clear what this code block does. I use the following criteria > for myself if I have not seen this code in 6 months would I still be able to > tell what it is supposed to do. If the answer is no I should change the code. > > However this change is likely best done in JsonEventObjectSerializer so that we > change the path spec to a dict before serialization, and the same for other > objects like the event tag. Maybe add a method WriteSerializedFlattened or > change WriteSerialized not to do a double serialization. At first glance, it seems a simple change to edit dfvfs json_serializer to not double-serialize its 'parent' field, and then change the plaso JsonEventObjectSerializer to similarly not double-serialize its 'pathspec'. However, this causes a cascade of changes to ReadSerialized in both plaso and dfvfs, where we would have to test those attributes in order to support both the new and old forms. Alternatively, we can add a get_dict() function to PathSpec in dfvfs/path/path_spec.py that returns a dictionary. The implementation of that function could be similar to WriteSerialized but instead of "return json.dumps(attributes)" it could just "return attributes". The latter feels cleaner to me. Which would you prefer?
Sign in to reply to this message.
ReadSerialized would need to be changed as well. I was thinking along the lines of writing a custom JSONEncoder/JSONDecoder. That hooks the PathSpecJSON serializer for PathSpec objects and equiv for other objects. I can have a look if you want. I'm not in favor of adding get_dict() method, it feels like a Python style hack that will backfire somewhere in the future. A PathSpecDict (de)serializer solution maybe.
Sign in to reply to this message.
know that: https://github.com/log2timeline/plaso/commit/c9ef97618447438b6bf283e56e7364ec... is going to affect your CL. So please update from origin master.
Sign in to reply to this message.
On 2015/04/01 06:31:32, Joachim Metz wrote: > know that: > https://github.com/log2timeline/plaso/commit/c9ef97618447438b6bf283e56e7364ec... > is going to affect your CL. So please update from origin master. FYI I've changed the dfvfs JSON serializer a bit: https://codereview.appspot.com/220560043/
Sign in to reply to this message.
On 2015/04/01 19:27:18, Joachim Metz wrote: > On 2015/04/01 06:31:32, Joachim Metz wrote: > > know that: > > > https://github.com/log2timeline/plaso/commit/c9ef97618447438b6bf283e56e7364ec... > > is going to affect your CL. So please update from origin master. > > FYI I've changed the dfvfs JSON serializer a bit: > https://codereview.appspot.com/220560043/ Thank you! It looks like you are not converting the 'parent' back to an object in _ConvertDictToObject(). Is this intentional? Also, I do not see anything in the code that would handle the old JSON format. Does this mean any output generated by the old code would not be supported by the new code? This may be a problem for anyone using log2timeline's --serializer-format=json option.
Sign in to reply to this message.
> Thank you! It looks like you are not converting the 'parent' back to an object > in _ConvertDictToObject(). Is this intentional? I primarily whipped up the CL as a show case, I'll make sure to add a read test before it gets committed. > Also, I do not see anything in the code that would handle the old JSON format. > Does this mean any output generated by the old code would not be supported by > the new code? This may be a problem for anyone using log2timeline's > --serializer-format=json option. I agree that it is not elegant to just break formats, but is it really worth investing the time to support both? I'm wondering is there a real need for this if so we could build a converter if needed.
Sign in to reply to this message.
Sorry not paying attention > It looks like you are not converting the 'parent' back to an object > in _ConvertDictToObject(). Is this intentional? Yes from: https://docs.python.org/2/library/json.html object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders (e.g. JSON-RPC class hinting). It will call the _ConvertDictToObject() function for every dict and then assign the PathSpec object to parent instead of the JSON dict.
Sign in to reply to this message.
FYI this CL is affected by: https://codereview.appspot.com/224140043/
Sign in to reply to this message.
On 2015/04/08 16:23:17, Joachim Metz wrote: > FYI this CL is affected by: https://codereview.appspot.com/224140043/ I likely have re-implemented your CL, but I needed to move this forward due to the recent dfvfs changes: https://codereview.appspot.com/229760043/ And there has been no reply in 6 days.
Sign in to reply to this message.
On 2015/04/08 20:25:32, Joachim Metz wrote: > On 2015/04/08 16:23:17, Joachim Metz wrote: > > FYI this CL is affected by: https://codereview.appspot.com/224140043/ > > I likely have re-implemented your CL, but I needed to move this forward due to > the recent dfvfs changes: > https://codereview.appspot.com/229760043/ > > And there has been no reply in 6 days. Thanks. Your CL(?) looks like it takes care of what I need. I think I am just too used to the git(hub|lab) way of handling code submissions. This CodeReview really threw me off, especially when I had to create a Google account just to use it and then it kept sending update e-mails to that account even though I (thought I) told it about my e-mail address. I'll wait a few days in case there are any other comments and then close out this one, or feel free to do so without me.
Sign in to reply to this message.
CL => Change list (http://en.wikipedia.org/wiki/Revision_control) Both of them have pros and cons, codereview (or rietveld) is far from optimal Our main problem with the github PR (pull request) interface does not fit our review needs: * it spams the reviewer with a message for every comment * it does not offer a patch set type of review, so it is hard to see what changed during review steps * it does not handle large CL well, codereview somewhat better * as far as I know no easy way to stack (diff base) CLs for review * makes a mess of the commit history, every commit in the PR is reflected in the master branch (e.g. https://github.com/ForensicArtifacts/artifacts/commits/master) So it looks easier for the occasional submitter but is larger burden on the reviewer. The advantage of github PR is the integration with third party services. So I'm working on a hybrid workflow: https://codereview.appspot.com/198670043/ > I'll wait a few days in case there are any other comments and then close out this one, or feel free to do so without me. Please do, I cannot close the issue for you.
Sign in to reply to this message.
On 2015/04/09 04:33:31, Joachim Metz wrote: > CL => Change list (http://en.wikipedia.org/wiki/Revision_control) > > Both of them have pros and cons, codereview (or rietveld) is far from optimal > > Our main problem with the github PR (pull request) interface does not fit our > review needs: > * it spams the reviewer with a message for every comment > * it does not offer a patch set type of review, so it is hard to see what > changed during review steps > * it does not handle large CL well, codereview somewhat better > * as far as I know no easy way to stack (diff base) CLs for review > * makes a mess of the commit history, every commit in the PR is reflected in the > master branch (e.g. > https://github.com/ForensicArtifacts/artifacts/commits/master) > > So it looks easier for the occasional submitter but is larger burden on the > reviewer. > > The advantage of github PR is the integration with third party services. > So I'm working on a hybrid workflow: > https://codereview.appspot.com/198670043/ > > > I'll wait a few days in case there are any other comments and then close out > this one, or feel free to do so without me. > > Please do, I cannot close the issue for you. Will close out the issue and my pull request as soon as your changes are accepted. Regarding the reviewer process, you may be interested in this article: http://blog.spreedly.com/2014/06/24/merge-pull-request-considered-harmful/ It talks about some of the things you are concerned with, such as avoiding making a mess of the commit history. Looking forward to seeing your changes to the process, especially if any use of pull requests.
Sign in to reply to this message.
> Regarding the reviewer process, you may be interested in this article: > http://blog.spreedly.com/2014/06/24/merge-pull-request-considered-harmful/ Thanks for the tip, we recently also discovered https://reviewable.io/ Looks like it might be worth a trial.
Sign in to reply to this message.
|