Categories
Blather Programming Python Software

Deferreds En Masse

Part 18: Deferreds En Masse

This continues the introduction started here. You can find an index to the entire series here.

Introduction

In the last Part we learned a new way of structuring sequential asynchronous callbacks using a generator. Thus, including deferreds, we now have two techniques for chaining asynchronous operations together.

Sometimes, though, we want to run a group of asynchronous operations in “parallel”. Since Twisted is single-threaded they won’t really run concurrently, but the point is we want to use asynchronous I/O to work on a group of tasks as fast as possible. Our poetry clients, for example, download poems from multiple servers at the same time, rather than one server after another. That was the whole point of using Twisted for getting poetry, after all.

And, as a result, all our poetry clients have had to solve this problem: how do you know when all the asynchronous operations you have started are done? So far we have solved this by collecting our results into a list (like the results list in client 7.0) and checking the length of the list. We have to be careful to collect failures as well as successful results, otherwise a single failure will cause the program to run forever, thinking there’s still work left to do.

As you might expect, Twisted includes an abstraction you can use to solve this problem and we’re going to take a look at it today.

The DeferredList

The DeferredList class allows us to treat a list of deferred objects as a single deferred. That way we can start a bunch of asynchronous operations and get notified only when all of them have finished (regardless of whether they succeeded or failed). Let’s look at some examples.

In deferred-list/deferred-list-1.py you will find this code:

from twisted.internet import defer

def got_results(res):
    print 'We got:', res

print 'Empty List.'
d = defer.DeferredList([])
print 'Adding Callback.'
d.addCallback(got_results)

And if you run it, you will get this output:

Empty List.
Adding Callback.
We got: []

Some things to notice:

  • A DeferredList is created from a Python list. In this case the list is empty, but we’ll soon see that the list elements must all be Deferred objects.
  • A DeferredList is itself a deferred (it inherits from Deferred). That means you can add callbacks and errbacks to it just like you would a regular deferred.
  • In the example above, our callback was fired as soon as we added it, so the DeferredList must have fired right away. We’ll discuss that more in a second.
  • The result of the deferred list was itself a list (empty).

Now look at deferred-list/deferred-list-2.py:

from twisted.internet import defer

def got_results(res):
    print 'We got:', res

print 'One Deferred.'
d1 = defer.Deferred()
d = defer.DeferredList([d1])
print 'Adding Callback.'
d.addCallback(got_results)
print 'Firing d1.'
d1.callback('d1 result')

Now we are creating our DeferredList with a 1-element list containing a single deferred. Here’s the output we get:

One Deferred.
Adding Callback.
Firing d1.
We got: [(True, 'd1 result')]

More things to notice:

  • This time the DeferredList didn’t fire its callback until we fired the deferred in the list.
  • The result is still a list, but now it has one element.
  • The element is a tuple whose second value is the result of the deferred in the list.

Let’s try putting two deferreds in the list (deferred-list/deferred-list-3.py):

from twisted.internet import defer

def got_results(res):
    print 'We got:', res

print 'Two Deferreds.'
d1 = defer.Deferred()
d2 = defer.Deferred()
d = defer.DeferredList([d1, d2])
print 'Adding Callback.'
d.addCallback(got_results)
print 'Firing d1.'
d1.callback('d1 result')
print 'Firing d2.'
d2.callback('d2 result')

And here’s the output:

Two Deferreds.
Adding Callback.
Firing d1.
Firing d2.
We got: [(True, 'd1 result'), (True, 'd2 result')]

At this point it’s pretty clear the result of a DeferredList, at least for the way we’ve been using it, is a list with the same number of elements as the list of deferreds we passed to the constructor. And the elements of that result list contain the results of the original deferreds, at least if the deferreds succeed. That means the DeferredList itself doesn’t fire until all the deferreds in the original list have fired. And a DeferredList created with an empty list fires right away since there aren’t any deferreds to wait for.

What about the order of the results in the final list? Consider deferred-list/deferred-list-4.py:

from twisted.internet import defer

def got_results(res):
    print 'We got:', res

print 'Two Deferreds.'
d1 = defer.Deferred()
d2 = defer.Deferred()
d = defer.DeferredList([d1, d2])
print 'Adding Callback.'
d.addCallback(got_results)
print 'Firing d2.'
d2.callback('d2 result')
print 'Firing d1.'
d1.callback('d1 result')

Now we are firing d2 first and then d1. Note the deferred list is still constructed with d1 and d2 in their original order. Here’s the output:

Two Deferreds.
Adding Callback.
Firing d2.
Firing d1.
We got: [(True, 'd1 result'), (True, 'd2 result')]

The output list has the results in the same order as the original list of deferreds, not the order those deferreds happened to fire in. Which is very nice, because we can easily associate each individual result with the operation that generated it (for example, which poem came from which server).

Alright, what happens if one or more of the deferreds in the list fails? And what are those True values doing there? Let’s try the example in deferred-list/deferred-list-5.py:

from twisted.internet import defer

def got_results(res):
    print 'We got:', res

d1 = defer.Deferred()
d2 = defer.Deferred()
d = defer.DeferredList([d1, d2], consumeErrors=True)
d.addCallback(got_results)
print 'Firing d1.'
d1.callback('d1 result')
print 'Firing d2 with errback.'
d2.errback(Exception('d2 failure'))

Now we are firing d1 with a normal result and d2 with an error. Ignore the consumeErrors option for now, we’ll get back to it. Here’s the output:

Firing d1.
Firing d2 with errback.
We got: [(True, 'd1 result'), (False, <twisted.python.failure.Failure <type 'exceptions.Exception'>>)]

Now the tuple corresponding to d2 has a Failure in slot two, and False in slot one. At this point it should be pretty clear how a DeferredList works (but see the Discussion below):

  • A DeferredList is constructed with a list of deferred objects.
  • A DeferredList is itself a deferred whose result is a list of the same length as the list of deferreds.
  • The DeferredList fires after all the deferreds in the original list have fired.
  • Each element of the result list corresponds to the deferred in the same position as the original list. If that deferred succeeded, the element is (True, result) and if the deferred failed, the element is (False, failure).
  • A DeferredList never fails, since the result of each individual deferred is collected into the list no matter what (but again, see the Discussion below).

Now let’s talk about that consumeErrors option we passed to the DeferredList. If we run the same code but without passing the option (deferred-list/deferred-list-6.py), we get this output:

Firing d1.
Firing d2 with errback.
We got: [(True, 'd1 result'), (False, >twisted.python.failure.Failure >type 'exceptions.Exception'<<)]
Unhandled error in Deferred:
Traceback (most recent call last):
Failure: exceptions.Exception: d2 failure

If you recall, the “Unhandled error in Deferred” message is generated when a deferred is garbage collected and the last callback in that deferred failed. The message is telling us we haven’t caught all the potential asynchronous failures in our program. So where is it coming from in our example? It’s clearly not coming from the DeferredList, since that succeeds. So it must be coming from d2.

A DeferredList needs to know when each deferred it is monitoring fires. And the DeferredList does that in the usual way — by adding a callback and errback to each deferred. And by default, the callback (and errback) return the original result (or failure) after putting it in the final list. And since returning the original failure from the errback triggers the next errback, d2 remains in the failed state after it fires.

But if we pass consumeErrors=True to the DeferredList, the errback added by the DeferredList to each deferred will instead return None, thus “consuming” the error and eliminating the warning message. We could also handle the error by adding our own errback to d2, as in deferred-list/deferred-list-7.py.

Client 8.0

Version 8.0 of our Get Poetry Now! client uses a DeferredList to find out when all the poetry has finished (or failed). You can find the new client in twisted-client-8/get-poetry.py. Once again the only change is in poetry_main. Let’s look at the important changes:

    ...
    ds = []

    for (host, port) in addresses:
        d = get_transformed_poem(host, port)
        d.addCallbacks(got_poem)
        ds.append(d)

    dlist = defer.DeferredList(ds, consumeErrors=True)
    dlist.addCallback(lambda res : reactor.stop())

You may wish to compare it to the same section of client 7.0.

In client 8.0, we don’t need the poem_done callback or the results list. Instead, we put each deferred we get back from get_transformed_poem into a list (ds) and then create a DeferredList. Since the DeferredList won’t fire until all the poems have finished or failed, we just add a callback to the DeferredList to shutdown the reactor. In this case, we aren’t using the result from the DeferredList, we just need to know when everything is finished. And that’s it!

Discussion

We can visualize how a DeferredList works in Figure 37:

Figure 37: the result of a DeferredList
Figure 37: the result of a DeferredList

Pretty simple, really. There are a couple options to DeferredList we haven’t covered, and which change the behavior from what we have described above. We will leave them for you to explore in the Exercises below.

In the next Part we will cover one more feature of the Deferred class, a feature recently introduced in Twisted 10.1.0.

Suggested Exercises

  1. Read the source code for the DeferredList.
  2. Modify the examples in deferred-list to experiment with the optional constructor arguments fireOnOneCallback and fireOnOneErrback. Come up with scenarios where you would use one or the other (or both).
  3. Can you create a DeferredList using a list of DeferredLists? If so, what would the result look like?
  4. Modify client 8.0 so that it doesn’t print out anything until all the poems have finished downloading. This time you will use the result from the DeferredList.
  5. Define the semantics of a DeferredDict and then implement it.

Leave a Reply

Discover more from krondo

Subscribe now to keep reading and get access to the full archive.

Continue reading