Hi all,
The patch changes the DB insert to use executemany().
I found a problem when importing listings into the DB, as shown below:
loop:  # 1 million listings
    condition  # the loop with the condition costs 100 seconds
    insert the listing  # the insert operation costs 190 seconds
But if we take the condition out, the insert operation costs only 40 seconds.
So I think the sqlite transaction is affected here: if the time interval
between two insert operations is a little long, the first operation executes
commit() and the data is written to disk, which leads to more time cost.
Now the function is changed to:
loop:
    condition
    append to a list
loop over the list:
    insert the listing
The new function reduces the total time by about 1/3.
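To make the two-step structure above concrete, here is a minimal sketch using the standard sqlite3 module. It is not the code from the CL: the table name, the columns, and the helper names insert_rows, import_listings and keep are all hypothetical, chosen only for illustration. It collects the rows that pass the condition first, then inserts them with one executemany() call inside a single transaction.

```python
import sqlite3


def insert_rows(conn, rows):
    """Insert all rows with one executemany() call in a single transaction."""
    # "with conn" commits once on success and rolls back on an exception.
    with conn:
        conn.executemany(
            "INSERT INTO listings (id, name) VALUES (?, ?)", rows
        )


def import_listings(conn, listings, keep):
    # Step 1: evaluate the (possibly slow) condition and collect rows in a list.
    rows = [(item["id"], item["name"]) for item in listings if keep(item)]
    # Step 2: insert the collected rows in one batch.
    insert_rows(conn, rows)
    return len(rows)


# Hypothetical schema and data, only to show the call pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (id INTEGER PRIMARY KEY, name TEXT)")
listings = [{"id": i, "name": "listing %d" % i} for i in range(1000)]
inserted = import_listings(conn, listings, lambda item: item["id"] % 2 == 0)
```

Whether this gives the same speedup as in the CL depends on how the real code opens and commits its transactions, so treat it as an illustration of the pattern rather than a benchmark.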
Please help review. Thanks.
Wenxin, please also help review the CL and add your comments. http://codereview.appspot.com/898044/diff/1/3 File python/transitfeed_editor.py (right): ...
Hi,
Thanks for your review.
http://codereview.appspot.com/898044/diff/1/3
File python/transitfeed_editor.py (right):
http://codereview.appspot.com/898044/diff/1/3#newcode90
python/transitfeed_editor.py:90: return str
On 2010/04/14 09:40:06, wLiu.sjtu wrote:
> Will replacing empty string to None reduce time?
No.
At line 1031 below, the function exports the listings from the DB to objects
and skips converting empty strings to None, so it throws errors there if a
value is not None but is blank.
So I added this function.
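For readers without the diff open, a conversion of this kind might look roughly like the sketch below. This is an assumption about its shape, not the function at line 90 of the CL; the name empty_to_none and the sample row are hypothetical.

```python
def empty_to_none(value):
    """Return None for an empty string, otherwise return the value unchanged."""
    # Only the exact empty string is mapped; other falsy values such as 0
    # are kept, since they can be legitimate column values.
    if value == "":
        return None
    return value


# Hypothetical row, only to show the call pattern.
row = ("stop_1", "", "Main St")
cleaned = tuple(empty_to_none(v) for v in row)
```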
http://codereview.appspot.com/898044/diff/1/3#newcode1020
python/transitfeed_editor.py:1020:
On 2010/04/14 09:40:06, wLiu.sjtu wrote:
> Please add comments saying that InsertRows will improve the total insert
> operation speed.
For example:
feed: sandieg.zip
Before the code change, running the project took 452 seconds; now it takes
311 seconds on my computer.
On Thu, Apr 15, 2010 at 5:21 PM, <quguangfan@gmail.com> wrote:
> Hi,
>
> Thank for your review.
>
>
>
> http://codereview.appspot.com/898044/diff/1/3
> File python/transitfeed_editor.py (right):
>
> http://codereview.appspot.com/898044/diff/1/3#newcode90
> python/transitfeed_editor.py:90: return str
> On 2010/04/14 09:40:06, wLiu.sjtu wrote:
>
>> Will replacing empty string to None reduce time?
>>
>
> No
>
> In 1031 lines below, the function will export the listings from DB to
> Objects and skips converting empty string to None, so the problems will
> throw ERRORs on here if it's not None and it's blank.
> So I added this function.
So it's a bug and you fixed it. Please add a comment before line 1031 to
explain why you did it.
>
>
> http://codereview.appspot.com/898044/diff/1/3#newcode1020
> python/transitfeed_editor.py:1020:
> On 2010/04/14 09:40:06, wLiu.sjtu wrote:
>> Please add comments saying that InsertRows will improve the total insert
>> operation speed.
>
> For example:
> feed: sandieg.zip
>
> before changing the code, it costs 452 seconds with running the project
> and now it costs 311 seconds on my computer.
I mean you should add that as a comment in the code.
>
>
> http://codereview.appspot.com/898044/show
>
Issue 898044: Changed the insert function to reduce the time cost
Created 14 years ago by quguangfan
Modified 7 years, 4 months ago
Reviewers: leio.chen, wliu.sjtu_gmail.com, baiming
Base URL: http://scheduleeditor.googlecode.com/svn/trunk/
Comments: 4