An Introduction to Asynchronous Programming and Twisted

Part 7: An Interlude,  Deferred

This continues the introduction started here. You can find an index to the entire series here.

Callbacks and Their Consequences

In Part 6 we came face-to-face with this fact: callbacks are a fundamental aspect of asynchronous programming with Twisted. Rather than just a way of interfacing with the reactor, callbacks will be woven into the structure of any Twisted program we write. So using Twisted, or any reactor-based asynchronous system, means organizing our code in a particular way, as a series of “callback chains” invoked by a reactor loop.

Even an API as simple as our get_poetry function required callbacks, two of them in fact: one for normal results and one for errors. Since, as Twisted programmers, we’re going to have to make so much use of them, we should spend a little bit of time thinking about the best ways to use callbacks, and what sort of pitfalls we might encounter.

Consider this piece of code that uses the Twisted version of get_poetry from client 3.1:

...
def got_poem(poem):
    print poem
    reactor.stop()

def poem_failed(err):
    print >>sys.stderr, 'poem download failed'
    print >>sys.stderr, 'I am terribly sorry'
    print >>sys.stderr, 'try again later?'
    reactor.stop()

get_poetry(host, port, got_poem, poem_failed)

reactor.run()

The basic plan here is clear:

  1. If we get the poem, print it out.
  2. If we don’t get the poem, print out an Error Haiku.
  3. In either case, end the program.

The ‘synchronous analogue’ to the above code might look something like this:

...
try:
    poem = get_poetry(host, port) # the synchronous version of get_poetry
except Exception, err:
    print >>sys.stderr, 'poem download failed'
    print >>sys.stderr, 'I am terribly sorry'
    print >>sys.stderr, 'try again later?'
    sys.exit()
else:
    print poem
    sys.exit()

So the callback is like the else block and the errback is like the except. That means invoking the errback is the asynchronous analogue to raising an exception and invoking the callback corresponds to the normal program flow.

What are some of the differences between the two versions? For one thing, in the synchronous version the Python interpreter will ensure that, as long as get_poetry raises any kind of exception at all, for any reason, the except block will run. If we trust the interpreter to run Python code correctly we can trust that error block to run at the right time.

Contrast that with the asynchronous version: the poem_failed errback is invoked by our code, the clientConnectionFailed method of the PoetryClientFactory. We, not Python, are in charge of making sure the error code runs if something goes wrong. So we have to make sure to handle every possible error case by invoking the errback with a Failure object. Otherwise, our program will become “stuck” waiting for a callback that never comes.

That shows another difference between the synchronous and asynchronous versions. If we didn’t bother catching the exception in the synchronous version (by not using a try/except), the Python interpreter would “catch” it for us and crash to show us the error of our ways. But if we forget to “raise” our asynchronous exception (by calling the errback function in PoetryClientFactory), our program will just run forever, blissfully unaware that anything is amiss.

Clearly, handling errors in an asynchronous program is important, and also somewhat tricky. You might say that handling errors in asynchronous code is actually more important than handling the normal case, as things can go wrong in far more ways than they can go right. Forgetting to handle the error case is a common mistake when programming with Twisted.

Here’s another fact about the synchronous code above: either the else block runs exactly once, or the except block runs exactly once (assuming the synchronous version of get_poetry doesn’t enter an infinite loop). The Python interpreter won’t suddenly decide to run them both or, on a whim, run the else block twenty-seven times. And it would be basically impossible to program in Python if it did!

But again, in the asynchronous case we are in charge of running the callback or the errback. Knowing us, we might make some mistakes. We could call both the callback and the errback, or invoke the callback twenty-seven times. That would be unfortunate for the users of get_poetry. Although the docstring doesn’t explicitly say so, it really goes without saying that, like the else and except blocks in a try/except statement, either the callback will run exactly once or the errback will run exactly once, for each specific call to get_poetry. Either we get the poem or we don’t.

Imagine trying to debug a program that makes three poetry requests and gets seven callback invocations and two errback invocations. Where would you even start? You’d probably end up writing your callbacks and errbacks to detect when they got invoked a second time for the same get_poetry call and throw an exception right back. Take that, get_poetry.

One more observation: both versions have some duplicate code. The asynchronous version has two calls to reactor.stop and the synchronous version has two calls to sys.exit. We might refactor the synchronous version like this:

...
try:
    poem = get_poetry(host, port) # the synchronous version of get_poetry
except Exception, err:
    print >>sys.stderr, 'poem download failed'
    print >>sys.stderr, 'I am terribly sorry'
    print >>sys.stderr, 'try again later?'
else:
    print poem

sys.exit()

Can we refactor the asynchronous version in a similar way? It’s not really clear that we can, since the callback and errback are two different functions. Do we have to go back to a single callback to make this possible?

Ok, here are some of the insights we’ve discovered about programming with callbacks:

  1. Calling errbacks is very important. Since errbacks take the place of except blocks, users need to be able to count on them. They aren’t an optional feature of our APIs.
  2. Not invoking callbacks at the wrong time is just as important as calling them at the right time. For a typical use case, the callback and errback are mutually exclusive and invoked exactly once.
  3. Refactoring common code might be harder when using callbacks.

We’ll have more to say about callbacks in future Parts, but for now this is enough to see why Twisted might have an abstraction devoted to managing them.

The Deferred

Since callbacks are used so much in asynchronous programming, and since using them correctly can, as we have discovered, be a bit tricky, the Twisted developers created an abstraction called a Deferred to make programming with callbacks easier. The Deferred class is defined in twisted.internet.defer.

The word “deferred” is either a verb or an adjective in everyday English, so it might sound a little strange used as a noun. Just know that, from now on, when I use the phrase “the deferred” or “a deferred”, I’m referring to an instance of the Deferred class. We’ll talk about why it is called Deferred in a future Part. It might help to mentally add the word “result” to each phrase, as in “the deferred result”. As we will eventually see, that’s really what it is.

A deferred contains a pair of callback chains, one for normal results and one for errors. A newly-created deferred has two empty chains. We can populate the chains by adding callbacks and errbacks and then fire the deferred with either a normal result (here’s your poem!) or an exception (I couldn’t get the poem, and here’s why). Firing the deferred will invoke the appropriate callbacks or errbacks in the order they were added. Figure 12 illustrates a deferred instance with its callback/errback chains:

Figure 12: A Deferred

Figure 12: A Deferred

Let’s try this out. Since deferreds don’t use the reactor, we can test them out without starting up the loop.

You might have noticed a method on Deferred called setTimeout that does use the reactor. It is deprecated and will cease to exist in a future release. Pretend it’s not there and don’t use it.

Our first example is in twisted-deferred/defer-1.py:

from twisted.internet.defer import Deferred

def got_poem(res):
    print 'Your poem is served:'
    print res

def poem_failed(err):
    print 'No poetry for you.'

d = Deferred()

# add a callback/errback pair to the chain
d.addCallbacks(got_poem, poem_failed)

# fire the chain with a normal result
d.callback('This poem is short.')

print "Finished"

This code makes a new deferred, adds a callback/errback pair with the addCallbacks method, and then fires the “normal result” chain with the callback method. Of course, it’s not much of a chain since it only has a single callback, but no matter. Run the code and it produces this output:

Your poem is served:
This poem is short.
Finished

That’s pretty simple. Here are some things to notice:

  1. Just like the callback/errback pairs we used in client 3.1, the callbacks we add to this deferred each take one argument, either a normal result or an error result. It turns out that deferreds support callbacks and errbacks with multiple arguments, but they always have at least one, and the first argument is always either a normal result or an error result.
  2. We add callbacks and errbacks to the deferred in pairs.
  3. The callback method fires the deferred with a normal result, the method’s only argument.
  4. Looking at the order of the print output, we can see that firing the deferred invokes the callbacks immediately. There’s nothing asynchronous going on at all. There can’t be, since no reactor is running. It really boils down to an ordinary Python function call.

Ok, let’s push the other button. The example in twisted-deferred/defer-2.py fires the deferred’s errback chain:

from twisted.internet.defer import Deferred
from twisted.python.failure import Failure

def got_poem(res):
    print 'Your poem is served:'
    print res

def poem_failed(err):
    print 'No poetry for you.'

d = Deferred()

# add a callback/errback pair to the chain
d.addCallbacks(got_poem, poem_failed)

# fire the chain with an error result
d.errback(Failure(Exception('I have failed.')))

print "Finished"

And after running that script we get this output:

No poetry for you.
Finished

So firing the errback chain is just a matter of calling the errback method instead of the callback method, and the method argument is the error result. And just as with callbacks, the errbacks are invoked immediately upon firing.

In the previous example we are passing a Failure object to the errback method like we did in client 3.1. That’s just fine, but a deferred will turn ordinary Exceptions into Failures for us. We can see that with twisted-deferred/defer-3.py:

from twisted.internet.defer import Deferred

def got_poem(res):
    print 'Your poem is served:'
    print res

def poem_failed(err):
    print err.__class__
    print err
    print 'No poetry for you.'

d = Deferred()

# add a callback/errback pair to the chain
d.addCallbacks(got_poem, poem_failed)

# fire the chain with an error result
d.errback(Exception('I have failed.'))

Here we are passing a regular Exception to the errback method. In the errback, we are printing out the class and the error result itself. We get this output:

twisted.python.failure.Failure
[Failure instance: Traceback (failure with no frames): : I have failed.
]
No poetry for you.

This means when we use deferreds we can go back to working with ordinary Exceptions and the Failures will get created for us automatically. A deferred will guarantee that each errback is invoked with an actual Failure instance.

We tried pressing the callback button and we tried pressing the errback button. Like any good engineer, you probably want to start pressing them over and over. To make the code shorter, we’ll use the same function for both the callback and the errback. Just remember they get different return values; one is a result and the other is a failure. Check out twisted-deferred/defer-4.py:

from twisted.internet.defer import Deferred
def out(s): print s
d = Deferred()
d.addCallbacks(out, out)
d.callback('First result')
d.callback('Second result')
print 'Finished'

Now we get this output:

First result
Traceback (most recent call last):
  ...
twisted.internet.defer.AlreadyCalledError

This is interesting! A deferred will not let us fire the normal result callbacks a second time. In fact, a deferred cannot be fired a second time no matter what, as demonstrated by these examples:

Notice those final print statements are never called. The callback and errback methods are raising genuine Exceptions to let us know we’ve already fired that deferred. Deferreds help us avoid one of the pitfalls we identified with callback programming. When we use a deferred to manage our callbacks, we simply can’t make the mistake of calling both the callback and the errback, or invoking the callback twenty-seven times. We can try, but the deferred will raise an exception right back at us, instead of passing our mistake onto the callbacks themselves.

Can deferreds help us to refactor asynchronous code? Consider the example in twisted-deferred/defer-8.py:

import sys

from twisted.internet.defer import Deferred

def got_poem(poem):
    print poem
    from twisted.internet import reactor
    reactor.stop()

def poem_failed(err):
    print >>sys.stderr, 'poem download failed'
    print >>sys.stderr, 'I am terribly sorry'
    print >>sys.stderr, 'try again later?'
    from twisted.internet import reactor
    reactor.stop()

d = Deferred()

d.addCallbacks(got_poem, poem_failed)

from twisted.internet import reactor

reactor.callWhenRunning(d.callback, 'Another short poem.')

reactor.run()

This is basically our original example above, with a little extra code to get the reactor going. Notice we are using callWhenRunning to fire the deferred after the reactor starts up. We’re taking advantage of the fact that callWhenRunning accepts additional positional- and keyword-arguments to pass to the callback when it is run. Many Twisted APIs that register callbacks follow this same convention, including the APIs to add callbacks to deferreds.

Both the callback and the errback stop the reactor. Since deferreds support chains of callbacks and errbacks, we can refactor the common code into a second link in the chains, a technique illustrated in twisted-deferred/defer-9.py:

import sys

from twisted.internet.defer import Deferred

def got_poem(poem):
    print poem

def poem_failed(err):
    print >>sys.stderr, 'poem download failed'
    print >>sys.stderr, 'I am terribly sorry'
    print >>sys.stderr, 'try again later?'

def poem_done(_):
    from twisted.internet import reactor
    reactor.stop()

d = Deferred()

d.addCallbacks(got_poem, poem_failed)
d.addBoth(poem_done)

from twisted.internet import reactor

reactor.callWhenRunning(d.callback, 'Another short poem.')

reactor.run()

The addBoth method adds the same function to both the callback and errback chains. And we can refactor asynchronous code after all.

Note: there is a subtlety in the way this deferred would actually execute its errback chain. We’ll discuss it in a future Part, but keep in mind there is more to learn about deferreds.

Summary

In this Part we analyzed callback programming and identified some potential problems. We also saw how the Deferred class can help us out:

  1. We can’t ignore errbacks, they are required for any asynchronous API. Deferreds have support for errbacks built in.
  2. Invoking callbacks multiple times will likely result in subtle, hard-to-debug problems. Deferreds can only be fired once, making them similar to the familiar semantics of try/except statements.
  3. Programming with plain callbacks can make refactoring tricky. With deferreds, we can refactor by adding links to the chain and moving code from one link to another.

We’re not done with the story of deferreds, there are more details of their rationale and behavior to explore. But we’ve got enough to start using them in our poetry client, so we’ll do that in Part 8.

Suggested Exercises

  1. The last example ignores the argument to poem_done. Print it out instead. Make got_poem return a value and see how that changes the argument to poem_done.
  2. Modify the last two deferred examples to fire the errback chains. Make sure to fire the errback with an Exception.
  3. Read the docstrings for the addCallback and addErrback methods on Deferred.

15 thoughts on “An Introduction to Asynchronous Programming and Twisted”

  1. Hi!

    Minor nitpick:

    d.addCallbacks(lambda r: out(r), lambda e: out(e))

    If the goal is to make it short, what’s wrong with:

    d.addCallbacks(out, out)

    Also, on recent Python 2.xen you can from __future__ import print_function, and turn that into:

    d.addCallbacks(print, print)

    1. Of course, you are right. I’m not sure what I was thinking there, though perhaps I meant to indicate, with the different argument names, that the callback and errback arguments receive different kinds of values. But I think your version is better, I’ll make that change with the next update, thank you. That’s a nice tip on the __future__ import, though I think I’ll leave it 2.x style.

      1. Yeah, I agree, the second one is definitely best. print, print is just too confusing for people reading 2.x code.

        More minor nitpicking: there’s a convenience method for blocks that conceptually match with “finally” clauses, much like this one:

        d.addBoth(out)

        Which, as the name suggests, adds the cb to both the cb and eb chains.

        Thanks very much for your posts, again
        lvh

        1. Yes, addBoth is very handy. But one thing I’ve learned teaching (I also teach Python to new programmers): don’t try to introduce too many new things at once :)

  2. Hi, exercise 1 is very interesting.

    I assumed it was because the callbacks/errbacks were chained and looked at twisted.internet.defer.addCallbacks() to verify. I also tried chaining more by combining addBoth(), addCallback(), addErrback(), and addCallbacks().

    Very nice introduction to Deferreds. Thank you.

  3. Thank you very much for the work and passion you put in this book! You write in the style i usually learn things – i need to know WHY is it like this, not just HOW to do things.

    >That shows another difference between the synchronous and asynchronous versions. If we didn’t bother catching the exception in the synchronous version (by not using a try/except), the Python interpreter would “catch” it for us and crash to show us the error of our ways. But if we don’t bother calling the errback function in PoetryClientFactory, our program will just run forever, blissfully unaware that anything is amiss.

    I think it’s worth modifying the paragraph to emphasize that it’s not asynchronous Python programming issue not catching the exception. It’s Twisted reactor’s way to work. At least that is what to came to my mind when i read: “That shows another difference between the synchronous and asynchronous versions.”

    1. What do you think of the new version? I’m trying to emphasize that in synchronous code the Python interpreter will guarantee that an except: block will run when it’s supposed to, while in the asynchronous code we’ve been using it’s mostly up to us to make sure that happens.

      1. I got stuck on this sentence: “That shows another difference between the synchronous and asynchronous versions.”
        So i would just modify it like this:
        “That shows another difference between the synchronous (where unhandled exceptions are shown by Python interpreter) and asynchronous (where undhandled exceptions are just logged) versions.”
        But you decide – this just my opinion.

  4. Hi Dave, I have another question:
    Is there a way I can add a callback that will invoke a calllater? In pseudo code this is what I want to do:
    d = maybeDeferred(….)
    d.addCallback(method1)
    if condition is true:
    d.addCallback(callLater(delay, method2, return-from-method1))

    1. Certainly, but in your pseudo-code you are calling callLater
      immediately, instead of in the callback. What you want is something like:

      d.addCallback(lambda res: reactor.callLater(...))

      Makes sense?

Leave a Reply