Description

This patch is intended to fix http://bugs.python.org/issue22448. Option b) was implemented.
The criteria indicating that scheduled-event filtering (to remove cancelled events) should be performed are:
1) There are at least 100 events scheduled. This prevents needless reheapifying when there are only a handful of events.
2) More than 50% of the scheduled events are cancelled. Using a percentage ensures that a list of 10,000 events isn't reheaped over some small fixed number of cancelled events. Beyond that, 50% was chosen simply because it seemed a reasonable threshold.
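The two-part condition described above can be sketched in isolation. This is a minimal sketch, not the committed code: the constant names mirror ones proposed later in this review, and the handle class in the test below is a stand-in for asyncio's TimerHandle.

```python
import heapq

# Thresholds as discussed: skip cleanup for small heaps, and only
# rebuild once cancelled entries dominate the scheduled list.
_MIN_SCHEDULED_TIMER_HANDLES = 100
_MIN_CANCELLED_TIMER_HANDLES_FRACTION = 0.5

def maybe_cleanup(scheduled, cancelled_count):
    """Rebuild the heap without cancelled handles when both thresholds hit.

    `scheduled` is a heapified list of objects with a `_cancelled`
    attribute. Returns (new_scheduled, new_cancelled_count).
    """
    sched_count = len(scheduled)
    if (sched_count > _MIN_SCHEDULED_TIMER_HANDLES and
            cancelled_count / sched_count >
                _MIN_CANCELLED_TIMER_HANDLES_FRACTION):
        # Drop every cancelled handle in one pass, then restore the
        # heap invariant on the survivors.
        scheduled = [h for h in scheduled if not h._cancelled]
        heapq.heapify(scheduled)
        cancelled_count = 0
    return scheduled, cancelled_count
```

Note the short-circuit: the size check runs first, so the division can never see a zero `sched_count` on an empty heap.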
Note that _cancel_count is not always completely accurate, nor does it need to be:
- Events that have already moved to _ready may be cancelled, and race conditions may occur if two threads cancel the same timer handle at the same time.
- Other threads may cancel events while _scheduled events are being rebuilt.
Note that I added a condition (in events.py) to prevent repeated cancellation from corrupting _cancel_count. Once I had added that, it made sense to include the rest of the method in the condition as well. This has the added benefit that the string representation of the arguments is not wiped on repeated cancellation.
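The guard described above can be sketched like this. It is a stand-alone sketch: FakeLoop and the attribute values are illustrative stand-ins, not the real asyncio classes, though `_timer_handle_cancelled` matches the callback name settled on later in this review.

```python
class TimerHandle:
    """Minimal sketch of the cancellation guard described above."""
    def __init__(self, loop):
        self._loop = loop
        self._cancelled = False
        self._args = ('some', 'args')

    def cancel(self):
        # Guard: only the first cancel() notifies the loop and clears
        # the args, so repeated calls neither skew the cancel count
        # nor wipe the repr of the arguments.
        if not self._cancelled:
            self._cancelled = True
            self._loop._timer_handle_cancelled(self)
            self._args = None

class FakeLoop:
    """Stand-in loop that just counts cancellation notifications."""
    def __init__(self):
        self._timer_cancelled_count = 0

    def _timer_handle_cancelled(self, handle):
        self._timer_cancelled_count += 1
```

Calling cancel() twice on the same handle increments the count exactly once.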
test_base_events.py and test_events.py were both updated to test these changes.
Patch Set 1
Patch Set 2: pep8 line length
Patch Set 3: Test that cancelled events at head of scheduled list are accounted for
Patch Set 4: Moved double cancellation test to debug test
Patch Set 5: Removed blank indentation
Patch Set 6: Better documentation for test case
Patch Set 7: Removed redundant leading spaces on blank lines
Total comments: 2
Patch Set 8: Demonstrates options on how to stop overcounting events + misc feedback fixes
Patch Set 9: Track scheduled handles by flag inline
Total comments: 13
Patch Set 10: Various nits fixed
Patch Set 11: Test code that ties into new constants fixed
Total comments: 5
Patch Set 12: Minor fixes based on latest feedback
Total comments: 1

Messages (24 total)
https://codereview.appspot.com/145220043/diff/110001/asyncio/base_events.py
File asyncio/base_events.py (right):

https://codereview.appspot.com/145220043/diff/110001/asyncio/base_events.py#n...
asyncio/base_events.py:148: self._cancel_count = 0
Maybe we can come up with a bit more descriptive name?

https://codereview.appspot.com/145220043/diff/110001/asyncio/base_events.py#n...
asyncio/base_events.py:995: if sched_count > 100 and self._cancel_count / sched_count > 0.5:
100 and 0.5 should be constants defined somewhere at the top of the file.

Also, while I understand what you want to do with the second check (cleanup only when 50% of tasks are cancelled), I'm not sure it's such a good idea. I would just set the threshold to a somewhat higher number (say 1000) and clean up the queue every time we hit it. After all, it takes Python about 50ms to rebuild a list of 1,000,000 items.
On 2014/09/20 22:42:44, yselivanov wrote:
> https://codereview.appspot.com/145220043/diff/110001/asyncio/base_events.py#n...
> asyncio/base_events.py:148: self._cancel_count = 0
> Maybe we can come up with a bit more descriptive name?

self._cancelled_count = 0 ? Anything much more descriptive would be a fairly long name. I haven't thought of anything better; any ideas?

> https://codereview.appspot.com/145220043/diff/110001/asyncio/base_events.py#n...
> asyncio/base_events.py:995: if sched_count > 100 and self._cancel_count /
> sched_count > 0.5:
> 100 and 0.5 should be constants defined somewhere on top of the file.

No problem doing this, but I will wait until the following is resolved before I do.

> Also, while I understand what you want to do with the second check (cleanup only
> when 50% of tasks are cancelled), I'm not sure it's such a good idea. I would
> just set the threshold to a bit higher number (say 1000) and cleanup the queue
> every time we hit it. After all, it takes python about 50ms to rebuild a list of
> 1000000 items.

I understand your reasoning, but I strongly disagree. Taking the 1,000,000 case: if it takes 50ms to rebuild the 1,000,000-item list, that list would have been built from 500,000 active events and 500,000 cancelled events.

i) The current algorithm would take 50ms once there were 500,000 uncancelled events alongside 500,000 cancelled ones.
ii) The proposed change (static threshold of 1000) would take ~ 500 * 25ms == 12.5 seconds!

Explanation: once the system reached a steady state of 500,000 uncancelled events (and if 1M == 50ms, then 0.5M == 25ms), a 25ms penalty would be paid every time 1000 events were cancelled. IMO, a core element such as an event scheduler should never be O(N) (which it still is with a static limit).
Thanks, this is a good start. Some remarks in addition to Yuri's review: - You are effectively adding a new interface to the event loop class. It is legal to implement a new event loop that derives from AbstractEventLoop but not from BaseEventLoop. So you either have to add _handle_cancelled to AbstractEventLoop (so people know that they have to implement it), or you have to check whether it exists before calling it. - The _scheduled list can only contain TimerHandle instances; I think the _handle_cancelled call should be made in that class's cancel() method, so you don't count cancellations of immediate callbacks. - I'm in agreement with you that we should avoid O(N) behavior and I like your current approach to rebuild at a fixed percentage. I'm not sure how to tune the percentage at which we decide to rebuild. Note that the heapify() call contributes way more than the list comprehension. - You are overcounting cancelled events -- calling cancel() after the callback has already run is common, e.g. sleep() does this. But I'm not sure of the best way to avoid this. Setting _cancelled when the callback is run feels wrong (it affects the repr) and Handle objects may be run multiple times (when used as I/O callbacks). Maybe a separate flag that only TimerHandle instances have. But you could still overcount, as the TimerHandle is transferred from _scheduled to _ready once it is due, but it may still be cancelled before it gets run. Perhaps none of this matters.
On 2014/09/21 02:30:59, GvR wrote: > Thanks, this is a good start. Some remarks in addition to Yuri's review: On the topic of Yuri's review: Constants were defined at the top of the file. Renaming has become less of a concern since I have pulled the scheduled logic into a separate class (details below), and all names are now relative to that class. > > - You are effectively adding a new interface to the event loop class. It is > legal to implement a new event loop that derives from AbstractEventLoop but not > from BaseEventLoop. So you either have to add _handle_cancelled to > AbstractEventLoop (so people know that they have to implement it), or you have > to check whether it exists before calling it. I have added the cancelled callback to AbstractEventLoop. > > - The _scheduled list can only contain TimerHandle instances; I think the > _handle_cancelled call should be made in that class's cancel() method, so you > don't count cancellations of immediate callbacks. Done. Additionally, I have renamed the method to _timer_handle_cancelled > > - I'm in agreement with you that we should avoid O(N) behavior and I like your > current approach to rebuild at a fixed percentage. I'm not sure how to tune the > percentage at which we decide to rebuild. Note that the heapify() call > contributes way more than the list comprehension. No ideas on how to tune the percentage. 50% was chosen simply on the heuristic "If there are more cancelled events than non-cancelled events, it's probably time to clean". > > - You are overcounting cancelled events -- calling cancel() after the callback > has already run is common, e.g. sleep() does this. But I'm not sure of the best > way to avoid this. Setting _cancelled when the callback is run feels wrong (it > affects the repr) and Handle objects may be run multiple times (when used as I/O > callbacks). Maybe a separate flag that only TimerHandle instances have. 
> But you could still overcount, as the TimerHandle is transferred from _scheduled to
> _ready once it is due, but it may still be cancelled before it gets run.
> Perhaps none of this matters.

The algorithm will still function correctly if the count is off, but it will affect performance, especially with the cancel semantics of sleep. It could be handled by tracking whether the TimerHandle is in the _scheduled list or not. If we can track that, then on cancellation we adjust the count when the handle is in _scheduled, and ignore the cancellation otherwise.

I made a fairly major change to the code to include two different ways we could track that. Additionally, since in some cases it resulted in keeping 3 variables in sync, I made separate SchedBy* classes. (This was a bit of a pain since the unit tests expect direct access to a list when testing, but I made it work at least for the purposes of demonstration. We could always pull all variables back into BaseEventLoop and discard the classes if that is preferred, or we could alter the tests and get rid of a lot of the methods in the SchedBy* classes that are marked "for testing purposes only".)

You can flip between the two types by editing lines 301 and 302 in base_events.py. SchedByFlag uses a _scheduled flag in TimerHandle that it modifies to track whether the handle is in the _scheduled list or not. SchedBySet uses a set to keep track of what exists in the _scheduled list and what does not. Note that _scheduled would be removed from TimerHandle if the set-based algorithm were chosen, as it would not be needed.

Finally, note that it is very likely these SchedBy* classes are buggy. No testing has been done yet; they exist merely to demonstrate the concept. I would probably also rename the surviving class if we decide to use a separate class and delete the other. Looking forward to feedback.
Note: Apologies for the terrible formatting of my last post, I didn't realize it automatically breaks lines.
On 2014/09/21 18:59:03, chatgris wrote:
> Note: Apologies for the terrible formatting of my last post, I didn't realize it
> automatically breaks lines.

Well, if you *do* want to break lines, break them around 72 chars, then you'll be safe. :-)

Anyway, let's look at the various proposed solutions:

(A) over-count cancellations when cancel() happens to a handle that's not scheduled, like in your original proposal
(B) track whether a handle is in the scheduled queue
(B1) using a flag in each timer handle
(B2) using a separate set of scheduled timer handles

I don't think B2 is needed -- the TimerHandle class is ours to modify, and having an extra field in every instance takes up less space than the space required for the set, *unless* the percentage of cancelled timer handles is under 25% or so. (Back-of-the-envelope calculation after observing that the memory taken by a set of N elements is around 32 bytes per element, i.e. 4 pointers -- the extra slot only costs one pointer per timer handle.) The set version is also always slower and takes more code.

So IMO the contest is between (A) and (B1). Your current version of B1 looks pretty clunky -- perhaps you can inline it? There are really only very few choke points -- timer handles are only placed in the scheduled queue by call_at() and _add_callback(), and only removed by _run_once() (in two different places).

(Plus, I think _add_callback() should never be called with a TimerHandle argument. There's a test for this, but it's an internal method and none of the call sites should ever create a TimerHandle. I think these things are all relics from earlier designs.)

I'll refrain from more in-detail review until you've got a new version up. FWIW, we've now missed the 3.4.2 release candidate and I think this is too much new stuff to sneak into a release at the last moment, so there's less time pressure.
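Inlined at the choke points listed above, the B1 bookkeeping might look roughly like this. This is a simplified sketch, not asyncio's API: call_at() takes no callback here and cancel() lives on the loop for brevity, whereas in asyncio cancellation is a method on the handle itself.

```python
import heapq

class Timer:
    """Sketch of option B1: a _scheduled flag on each timer handle."""
    def __init__(self, when):
        self._when = when
        self._cancelled = False
        self._scheduled = False   # True only while in loop._scheduled

    def __lt__(self, other):
        return self._when < other._when

class Loop:
    def __init__(self):
        self._scheduled = []
        self._timer_cancelled_count = 0

    def call_at(self, when):
        # Choke point 1: entering the scheduled heap sets the flag.
        timer = Timer(when)
        timer._scheduled = True
        heapq.heappush(self._scheduled, timer)
        return timer

    def _pop_due(self):
        # Choke point 2: leaving the heap clears the flag, so a later
        # cancel() on an already-run timer is not counted.
        timer = heapq.heappop(self._scheduled)
        timer._scheduled = False
        return timer

    def cancel(self, timer):
        if not timer._cancelled:
            timer._cancelled = True
            if timer._scheduled:
                self._timer_cancelled_count += 1
```

With this, cancelling a timer that has already been popped (the sleep() pattern) no longer inflates the count, which was the overcounting concern above.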
On 2014/09/22 18:14:30, GvR wrote: > On 2014/09/21 18:59:03, chatgris wrote: > > Note: Apologies for the terrible formatting of my last post, I didn't realize > it > > automatically breaks lines. > > Well, if you *do* want to break lines, break them around 72 chars, then you'll > be safe. :-) > > Anyway, let's look at the various proposed solutions: > > (A) over-count cancellations when cancel() happens to a handle that's not > scheduled, like in your original proposal > (B) track whether a handle is in the scheduled queue > (B1) using a flag in each timer handle > (B2) using a separate set of scheduled timer handles > > I don't think B2 is needed -- the TimerHandle class is ours to modify, and > having an extra field in every instance takes up less space than the space > required for the set, *unless* the percentage of cancelled timer handles is > under 25% or so. (Back-of-the-envelop calculation after observing that the > memory taken by a set of N elements is around 32 bytes per element, i.e. 4 > pointers -- the extra slot only costs one pointer per timer handle.) The set > version is also always slower and takes more code. > > So IMO the contest is between (A) and (B1). We could tune the % cleanup to be 25% to keep memory usage down for (B2). IMO, the contest is between (B1) and (B2), because if (B2) is being rejected due to performance, then (A) should be rejected as well (due to the increased frequency of heapq.heapify). > Your current version of B1 looks > pretty clunky -- perhaps you can inline it? There are really only very few > choke points -- timer handles are only placed in the scheduled queue by > call_at() and _add_callback(), and only removed by _run_once() (inn two > different places). (B1) has been inlined in the latest patch set. > > (Plus, I think _add_callback() should never be called with a TimerHandle > argument. There's a test for this, but it's an internal method and none of the > call sites should ever create a TimerHandle. 
> I think these things are all relics
> from earlier designs.)

Well, I'm happy to remove the "if isinstance(handle, events.TimerHandle):" block from _add_callback() if you'd like me to. Based on your reasoning, it appears that it is redundant.
The patch is getting better and better! I'm +1 for the B1 strategy you guys have decided upon. Below are some nits and suggestions.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py
File asyncio/base_events.py (right):

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:45: _MIN_SCHEDULED_HANDLES = 100
I'd prefer to see the TIMER word in these names too.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:155: self._cancelled_count = 0
How about _timer_cancelled_count?

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:1002: heapq.heappop(self._scheduled)._scheduled = False
I'd split this line in two.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:1007: and self._cancelled_count / sched_count > _MIN_CANCELLED_HANDLES_PCT):
This indent looks shaggy...

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:1056: handle._scheduled = False
Maybe we should add a method to TimerHandle for True/False flips? scheduled()/unscheduled()?

https://codereview.appspot.com/145220043/diff/150001/asyncio/events.py
File asyncio/events.py (right):

https://codereview.appspot.com/145220043/diff/150001/asyncio/events.py#newcod...
asyncio/events.py:249: def _timer_handle_cancelled(self, handle):
A docstring here would be nice.

https://codereview.appspot.com/145220043/diff/150001/tests/test_base_events.py
File tests/test_base_events.py (right):

https://codereview.appspot.com/145220043/diff/150001/tests/test_base_events.p...
tests/test_base_events.py:321: for x in range(50):
It would also be great if you could tie your test logic to the new constants you have defined (as somebody may go and tweak them later), not numbers.
I'm with Yuri's comments (except where I explicitly disagree :-). Here are some nits.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py
File asyncio/base_events.py (right):

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:49: _MIN_CANCELLED_HANDLES_PCT = 0.5
It is not a percentage (that would be 50), it's a fraction.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:976: if isinstance(handle, events.TimerHandle):
Fine to replace this with assert not isinstance(handle, events.TimerHandle). You also have to delete the one test that tests this.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:1004: # Remove delayed calls that were cancelled if their number is too high
I suggest moving this before the preceding while-loop. In the case where we decide to re-heapify, any effort spent on popping cancelled items from the front is wasted.

https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
asyncio/base_events.py:1056: handle._scheduled = False
On 2014/09/23 16:56:04, yselivanov wrote:
> Maybe we should add a method to TimerHandle for True/False flips?
> scheduled()/unscheduled()?
I don't think so. The event loop already reaches into *Handle instances; the only public method is cancel(), which is for the app, not for internal use.

https://codereview.appspot.com/145220043/diff/150001/asyncio/events.py
File asyncio/events.py (right):

https://codereview.appspot.com/145220043/diff/150001/asyncio/events.py#newcod...
asyncio/events.py:188:
You don't need this blank line.

https://codereview.appspot.com/145220043/diff/150001/tests/test_events.py
File tests/test_events.py (right):

https://codereview.appspot.com/145220043/diff/150001/tests/test_events.py#new...
tests/test_events.py:1900: '<Handle cancelled noop(1, 2) at %s:%s created at %s:%s>'
This line is too long.
Added a new patch set to fix nits; I will reply to a few specific issues in two replies.

On 2014/09/23 16:56:05, yselivanov wrote:
> https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
> asyncio/base_events.py:1007: and self._cancelled_count / sched_count >
> _MIN_CANCELLED_HANDLES_PCT):
> This indent looks shaggy...

Prefixing with the "and" was my attempt to resolve this issue, per PEP 8:

"When the conditional part of an if-statement is long enough to require that it be written across multiple lines, it's worth noting that the combination of a two character keyword (i.e. if), plus a single space, plus an opening parenthesis creates a natural 4-space indent for the subsequent lines of the multiline conditional. This can produce a visual conflict with the indented suite of code nested inside the if-statement, which would also naturally be indented to 4 spaces."

However, it has been changed as you requested.

> https://codereview.appspot.com/145220043/diff/150001/tests/test_base_events.p...
> tests/test_base_events.py:321: for x in range(50):
> It would also be great if you can tie your tests logic to the new constants you
> have defined (as somebody may go and tweak them later), not numbers.

This is done. Note that on line 303 I added:

    # This test is invalid if _MIN_SCHEDULED_TIMER_HANDLES is too low
    self.assertLessEqual(cancelled_count + not_cancelled_count,
                         base_events._MIN_SCHEDULED_TIMER_HANDLES)

Since unittest doesn't have assertInconclusive, and I wanted at least 2 of each type to ensure that looping was performed (as opposed to a single removal), I have effectively added a lower limit of 5 to base_events._MIN_SCHEDULED_TIMER_HANDLES for the purposes of the test.
> https://codereview.appspot.com/145220043/diff/150001/asyncio/base_events.py#n...
> asyncio/base_events.py:1004: # Remove delayed calls that were cancelled if their
> number is too high
> I suggest move this before the preceding while-loop. In the case where we decide
> to re-heapify, any effort spent on popping cancelled items from the front is
> wasted.

Additionally, I have moved the while-loop to the else portion of the if statement, since if the rebuild branch has already run, it should not be possible to have a cancelled event at the head of the queue.

> https://codereview.appspot.com/145220043/diff/150001/tests/test_events.py#new...
> tests/test_events.py:1900: '<Handle cancelled noop(1, 2) at %s:%s created at
> %s:%s>'
> This line is too long.

I have also fixed the previous statement I copied it from, for consistency.
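Putting that reordering together, the pruning at the head of _run_once() would then look roughly like this. This is a sketch with a simplified loop object, not the committed source; the constant names follow the ones discussed in this review.

```python
import heapq

_MIN_SCHEDULED_TIMER_HANDLES = 100
_MIN_CANCELLED_TIMER_HANDLES_FRACTION = 0.5

def prune_scheduled(loop):
    """Either rebuild the whole heap (when the thresholds trip) or pop
    cancelled handles off its head -- never both, since a freshly
    rebuilt heap cannot have a cancelled head.
    """
    sched_count = len(loop._scheduled)
    if (sched_count > _MIN_SCHEDULED_TIMER_HANDLES and
            loop._timer_cancelled_count / sched_count >
                _MIN_CANCELLED_TIMER_HANDLES_FRACTION):
        # Rebuild branch: drop all cancelled handles at once.
        for handle in loop._scheduled:
            if handle._cancelled:
                handle._scheduled = False
        loop._scheduled = [h for h in loop._scheduled
                           if not h._cancelled]
        heapq.heapify(loop._scheduled)
        loop._timer_cancelled_count = 0
    else:
        # Cheap branch: only pop cancelled handles sitting at the head.
        while loop._scheduled and loop._scheduled[0]._cancelled:
            loop._timer_cancelled_count -= 1
            handle = heapq.heappop(loop._scheduled)
            handle._scheduled = False
```

Running the threshold check first means no effort is wasted popping heads that a rebuild would have discarded anyway, which was the point of the suggestion above.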
I just realized that the tests I changed to tie into the new constants are incorrect. Please hold while I fix them.
On 2014/09/23 18:43:27, chatgris wrote: > I just realized that the tests I changed to tie into the new constants is > incorrect. Please hold while I fix them. This should now be fixed.
Looks good, aside from just a couple of really small notes. I also like how you refactored your unittest; now it's much easier to follow.

https://codereview.appspot.com/145220043/diff/160005/asyncio/base_events.py
File asyncio/base_events.py (right):

https://codereview.appspot.com/145220043/diff/160005/asyncio/base_events.py#n...
asyncio/base_events.py:44: # cancelled handles is performed
Please add '.' at the end of the sentence.

https://codereview.appspot.com/145220043/diff/160005/asyncio/events.py
File asyncio/events.py (right):

https://codereview.appspot.com/145220043/diff/160005/asyncio/events.py#newcod...
asyncio/events.py:188: super().cancel()
I'd move the super call out of the 'if' statement.

https://codereview.appspot.com/145220043/diff/160005/asyncio/events.py#newcod...
asyncio/events.py:249: raise NotImplementedError
We need a docstring here.

https://codereview.appspot.com/145220043/diff/160005/tests/test_base_events.py
File tests/test_base_events.py (right):

https://codereview.appspot.com/145220043/diff/160005/tests/test_base_events.p...
tests/test_base_events.py:310: self.loop._process_events = mock.Mock()
Please move this line to the top of the test, so it's immediately obvious how we patch self.loop here.

https://codereview.appspot.com/145220043/diff/160005/tests/test_base_events.p...
tests/test_base_events.py:323: 0 < base_events._MIN_CANCELLED_TIMER_HANDLES_FRACTION < 1.0)
This check should also be at the top of the test.
On 2014/09/23 19:41:02, yselivanov wrote: > https://codereview.appspot.com/145220043/diff/160005/asyncio/base_events.py#n... > asyncio/base_events.py:44: # cancelled handles is performed > Please add '.' at the end of the sentence. I also did this on line 48 for consistency.
Looks good now.
Yuri, if you're happy with it, I have no further comments. Can you commit this for Joshua?
I was in the middle of committing the patch when I saw one more thing I'd like us to discuss. Please see the comment below.

https://codereview.appspot.com/145220043/diff/200001/asyncio/base_events.py
File asyncio/base_events.py (right):

https://codereview.appspot.com/145220043/diff/200001/asyncio/base_events.py#n...
asyncio/base_events.py:1006: self._scheduled = [x for x in self._scheduled if not x._cancelled]
Just noticed that we have a double iteration over what might potentially be a huge list, and _run_once() is a central part of asyncio.

Let's think about combining this list comprehension and the above loop into one thing, like this:

    new_scheduled = []
    for handle in self._scheduled:
        if handle._cancelled:
            handle._scheduled = False
        else:
            new_scheduled.append(handle)
    self._scheduled = heapify(new_scheduled)

Or should we use deque for new_scheduled? Can you test if there are any performance benefits in rewriting this piece?
On 2014/09/25 02:55:48, yselivanov wrote:
> Just noticed that we have a double iteration over what might be a huge list
> potentially, and _run_once() is a central part of asyncio.
>
> Let's think about combining this list comprehension and the above loop in one
> thing, like this:
>
>     new_scheduled = []
>     for handle in self._scheduled:
>         if handle._cancelled:
>             handle._scheduled = False
>         else:
>             new_scheduled.append(handle)
>     self._scheduled = heapify(new_scheduled)
>
> Or should we use deque for new_scheduled? Can you test if there are any
> performance benefits in rewriting this piece?

Heh, I suggested this in an earlier review. I expect it will be only a modest win, since heapify() is O(N) and its constant is much larger. But yeah, good idea to try. Note that heapify() doesn't return anything -- it rearranges the argument in place. So new_scheduled must be a list.
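Combining Yuri's single pass with Guido's correction (heapq.heapify() works in place and returns None) gives a sketch like the following. The handle objects are stand-ins with just the attributes the pass touches.

```python
import heapq

def rebuild(scheduled):
    """Single pass over the old heap: clear the _scheduled flag on
    cancelled handles, keep the rest, then heapify the survivors.

    Corrected per the note above: heapq.heapify() returns None and
    rearranges its argument in place, so the new list is built first
    and heapified afterwards, never assigned from heapify's result.
    """
    new_scheduled = []
    for handle in scheduled:
        if handle._cancelled:
            handle._scheduled = False
        else:
            new_scheduled.append(handle)
    heapq.heapify(new_scheduled)   # in place; do NOT assign its result
    return new_scheduled
```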
On 2014/09/25 03:16:59, GvR wrote:
> Heh, I suggested this in an earlier review. I expect it will be only a modest
> win, since heapify() is O(N) and its constant is much larger. But yeah, good
> idea to try.

Sorry, I don't recall seeing this in an earlier review. It was something I briefly thought about, but my internal heuristic said that calling append would be slower than the list comprehension, since in my experience Python function calls have a lot of overhead.

However, in the interests of completeness, below is the program I ran to test this.

Based on that program, over the course of 5 runs:

Option A took between 12.25 and 12.4 seconds.
Option B took between 13.2 and 14.4 seconds.
Option C took around 17 seconds. (NOTE: Option C is not correct; after the first iteration it runs on a list half the size, and since it's slower I didn't go to the effort of creating an apples-to-apples comparison. But I thought I'd try an in-place method as well.)

It appears that the option using the least Python/most C code is the winner, and that is what is already in the patch.

Note that the above timing is the elapsed time the program prints out, which excludes creation of the list.

====================

import asyncio
import time

loop = asyncio.get_event_loop()

def cb(self):
    pass

for x in range(1000000):
    th = loop.call_later(3600, cb)
    if x % 2 == 1:
        th.cancel()

start = time.time()

orig_data = loop._scheduled

for x in range(100):
    loop._scheduled = orig_data

    """Option A
    for handle in loop._scheduled:
        if handle._cancelled:
            handle._scheduled = False

    loop._scheduled = [x for x in loop._scheduled if not x._cancelled]
    """

    """Option B
    new_scheduled = []
    for handle in loop._scheduled:
        if handle._cancelled:
            handle._scheduled = False
        else:
            new_scheduled.append(handle)

    loop._scheduled = new_scheduled
    """

    """Option C
    This is not correct as it is repeatedly operating on a list half the size
    offs = 0
    for i in range(len(loop._scheduled)):
        handle = loop._scheduled[i]
        if handle._cancelled:
            handle._scheduled = False
            offs += 1
        else:
            loop._scheduled[i-offs] = handle

    del loop._scheduled[len(loop._scheduled) - offs:]
    """

end = time.time()

print(end - start)
Joshua, I've just committed the patch to CPython and tulip:

- https://hg.python.org/cpython/rev/2a868c9f8f15
- https://hg.python.org/cpython/rev/a6aaacb2b807
- https://code.google.com/p/tulip/source/detail?r=df3fa0556765e51ed06551af6fc2e...

Thank you for the patch and for your patience during the code review!
Sounds good to me. It's possible that the difference between A and B is due to the lookup of the append method -- list comprehensions use append too, but they don't have to look it up each time.

--Guido van Rossum (python.org/~guido)
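Guido's point about append lookup can be observed with a small stand-alone comparison. This is a sketch, separate from the benchmark program above; absolute timings are machine-dependent, and hoisting `out.append` into a local roughly mimics what a comprehension does implicitly.

```python
import timeit

data = list(range(10_000))

def comp():
    # List comprehension: the append is implicit and not looked up
    # per item.
    return [x for x in data if x % 2]

def loop_append():
    out = []
    # Hoist the attribute lookup once, as a comprehension effectively
    # does; without this line the loop pays a lookup per appended item.
    append = out.append
    for x in data:
        if x % 2:
            append(x)
    return out

t_comp = timeit.timeit(comp, number=20)
t_loop = timeit.timeit(loop_append, number=20)
print("comprehension: %.4fs, append loop: %.4fs" % (t_comp, t_loop))
```

Both variants produce the same list; only the per-item overhead differs.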
Thanks Yuri and Joshua!! This was a very useful exercise.

--Guido van Rossum (python.org/~guido)