An Introduction to Asynchronous Programming and Twisted

Part 10: Poetry Transformed

This continues the introduction started here. You can find an index to the entire series here.

Client 5.0

Now we’re going to add some transformation logic to our poetry client, along the lines suggested in Part 9. But first, I have a shameful and humbling confession to make: I don’t know how to write the Byronification Engine. It is beyond my programming abilities. So instead, I’m going to implement a simpler transformation, the Cummingsifier. The Cummingsifier is an algorithm that takes a poem and returns a new poem like the original but written in the style of e.e. cummings. Here is the Cummingsifier algorithm in its entirety:

def cummingsify(poem):
    return poem.lower()

Unfortunately, this algorithm is so simple it never actually fails, so in client 5.0, located in twisted-client-5/get-poetry.py, we use a modified version of cummingsify that randomly does one of the following:

  1. Return a cummingsified version of the poem.
  2. Raise a GibberishError.
  3. Raise a ValueError.

In this way we simulate a more complicated algorithm that sometimes fails in unexpected ways.
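To make the three outcomes concrete, here is a minimal sketch of what such a randomly failing cummingsify might look like. It is an illustration written in Python 3 syntax, not necessarily the exact code in the client file, and the way GibberishError is defined here is an assumption.

```python
import random


class GibberishError(Exception):
    """Assumed definition: raised when the poem is judged to be gibberish."""


def cummingsify(poem):
    """Lower-case the poem, or randomly fail in one of two ways."""

    def success():
        return poem.lower()

    def gibberish():
        raise GibberishError()

    def bug():
        raise ValueError()

    # Pick one of the three behaviours with equal probability and run it.
    return random.choice([success, gibberish, bug])()
```

Calling cummingsify repeatedly on the same poem should eventually produce all three outcomes.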

The only other changes in client 5.0 are in the poetry_main function:

def poetry_main():
    addresses = parse_args()

    from twisted.internet import reactor

    poems = []
    errors = []

    def try_to_cummingsify(poem):
        try:
            return cummingsify(poem)
        except GibberishError:
            raise
        except:
            print 'Cummingsify failed!'
            return poem

    def got_poem(poem):
        print poem
        poems.append(poem)

    def poem_failed(err):
        print >>sys.stderr, 'The poem download failed.'
        errors.append(err)

    def poem_done(_):
        if len(poems) + len(errors) == len(addresses):
            reactor.stop()

    for address in addresses:
        host, port = address
        d = get_poetry(host, port)
        d.addCallback(try_to_cummingsify)
        d.addCallbacks(got_poem, poem_failed)
        d.addBoth(poem_done)

    reactor.run()

So when the program downloads a poem from the server, it will either:

  1. Print the cummingsified (lower-cased) version of the poem.
  2. Print “Cummingsify failed!” followed by the original poem.
  3. Print “The poem download failed.”

Although we have retained the ability to download from multiple servers, when you are testing out client 5.0 it’s easier to just use a single server and run the program multiple times, until you see all three different outcomes. Also try running the client on a port with no server.

Let’s draw the callback/errback chain we create on each Deferred we get back from get_poetry:

Figure 19: the deferred chain in client 5.0

Note the pass-through errback that gets added by addCallback. It passes whatever Failure it receives on to the next errback (poem_failed). Thus, poem_failed can handle failures from both get_poetry (i.e., the deferred is fired with the errback method) and the cummingsify function.

Also note the hauntingly beautiful drop-shadow around the border of the deferred in Figure 19. It doesn’t signify anything other than me discovering how to do it in Inkscape. Expect more drop-shadows in the future.

Let’s analyze the different ways our deferred can fire. The case where we get a poem and the cummingsify function works correctly is shown in Figure 20:

Figure 20: when we download a poem and transform it correctly

In this case no callback fails, so control flows down the callback line. Note that poem_done receives None as its result, since got_poem doesn’t actually return a value. If we wanted subsequent callbacks to have access to the poem, we would modify got_poem to return the poem explicitly.

Figure 21 shows the case where we get a poem, but cummingsify raises a GibberishError:

Figure 21: when we download a poem and get a GibberishError

Since the try_to_cummingsify callback re-raises a GibberishError, control switches to the errback line and poem_failed is called with the exception as its argument (wrapped in a Failure, of course).

And since poem_failed doesn’t raise an exception, or return a Failure, after it is done control switches back to the callback line. If we want poem_failed to handle the error completely, then returning None is a reasonable behavior. On the other hand, if we wanted poem_failed to take some action, but still propagate the error, we could change poem_failed to return its err argument and processing would continue down the errback line.

Note that in the current code neither got_poem nor poem_failed ever fail themselves, so the poem_done errback will never be called. But it’s safe to add it in any case and doing so represents an instance of “defensive” programming, as either got_poem or poem_failed might have bugs we don’t know about. Since the addBoth method ensures that a particular function will run no matter how the deferred fires, using addBoth is analogous to adding a finally clause to a try/except statement.

Now examine the case where we download a poem and the cummingsify function raises a ValueError, displayed in Figure 22:

Figure 22: when we download a poem and cummingsify fails

This is the same as Figure 20, except got_poem receives the original version of the poem instead of the transformed version. The switch happens entirely inside the try_to_cummingsify callback, which traps the ValueError with an ordinary try/except statement and returns the original poem instead. The deferred object never sees that error at all.

Lastly, we show the case where we try to download a poem from a non-existent server in Figure 23:

Figure 23: when we cannot connect to a server

As before, poem_failed returns None so afterwards control switches to the callback line.

Client 5.1

In client 5.0 we are trapping exceptions from cummingsify in our try_to_cummingsify callback using an ordinary try/except statement, rather than letting the deferred catch them first. There isn’t necessarily anything wrong with this strategy, but it’s instructive to see how we might do this differently.

Let’s suppose we wanted to let the deferred catch both GibberishError and ValueError exceptions and send them to the errback line. To preserve the current behavior our subsequent errback needs to check to see if the error is a ValueError and, if so, handle it by returning the original poem, so that control goes back to the callback line and the original poem gets printed out.

But there’s a problem: the errback wouldn’t get the original poem, it would get the Failure-wrapped ValueError raised by the cummingsify function. To let the errback handle the error, we need to arrange for it to receive the original poem.

One way to do that is to modify the cummingsify function so the original poem is included in the exception. That’s what we’ve done in client 5.1, located in twisted-client-5/get-poetry-1.py. We changed the ValueError exception into a custom CannotCummingsify exception which takes the original poem as the first argument.

If cummingsify were a real function in an external module, then it would probably be best to wrap it with another function that trapped any exception that wasn’t GibberishError and raise a CannotCummingsify exception instead. With this new setup, our poetry_main function looks like this:

def poetry_main():
    addresses = parse_args()

    from twisted.internet import reactor

    poems = []
    errors = []

    def cummingsify_failed(err):
        if err.check(CannotCummingsify):
            print 'Cummingsify failed!'
            return err.value.args[0]
        return err

    def got_poem(poem):
        print poem
        poems.append(poem)

    def poem_failed(err):
        print >>sys.stderr, 'The poem download failed.'
        errors.append(err)

    def poem_done(_):
        if len(poems) + len(errors) == len(addresses):
            reactor.stop()

    for address in addresses:
        host, port = address
        d = get_poetry(host, port)
        d.addCallback(cummingsify)
        d.addErrback(cummingsify_failed)
        d.addCallbacks(got_poem, poem_failed)
        d.addBoth(poem_done)

    reactor.run()

And each deferred we create has the structure pictured in Figure 24:

Figure 24: the deferred chain in client 5.1

Examine the cummingsify_failed errback:

    def cummingsify_failed(err):
        if err.check(CannotCummingsify):
            print 'Cummingsify failed!'
            return err.value.args[0]
        return err

We are using the check method on Failure objects to test whether the exception embedded in the Failure is an instance of CannotCummingsify. If so, we return the first argument to the exception (the original poem) and thus handle the error. Since the return value is not a Failure, control returns to the callback line. Otherwise, we return the Failure itself and send (re-raise) the error down the errback line. As you can see, the exception is available as the value attribute on the Failure.

Figure 25 shows what happens when we get a CannotCummingsify exception:

Figure 25: when we get a CannotCummingsify error

So when we are using a deferred, we can sometimes choose whether we want to use try/except statements to handle exceptions, or let the deferred re-route errors to an errback.
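Earlier we suggested that if cummingsify were a real function in an external module, it would be best to wrap it. Here is a minimal sketch of such a wrapper in Python 3 syntax. The names wrap_cummingsify and safe_cummingsify are hypothetical, and both exception classes are defined locally as assumptions.

```python
class GibberishError(Exception):
    """Assumed definition, for illustration only."""


class CannotCummingsify(Exception):
    """Assumed definition: the original poem is the first argument."""


def wrap_cummingsify(func):
    """Return a version of func that reports unexpected errors uniformly."""

    def safe_cummingsify(poem):
        try:
            return func(poem)
        except GibberishError:
            # Let this one through untouched.
            raise
        except Exception:
            # Any other error becomes CannotCummingsify, carrying the poem.
            raise CannotCummingsify(poem)

    return safe_cummingsify
```

The wrapped function can then be added as a callback in place of the original, and an errback like cummingsify_failed can recover the poem from args[0].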

Summary

In Part 10 we updated our poetry client to make use of the Deferred’s ability to route errors and results down the chain. Although the example was rather artificial, it did illustrate how control flow in a deferred switches back and forth between the callback and errback lines depending on the result of each stage.

So now we know everything there is to know about deferreds, right? Not yet! We’re going to explore some more features of deferreds in a future Part. But first we’ll take a little detour and, in Part 11, implement a Twisted version of our poetry server.

Suggested Exercises

  1. Figure 25 shows one of the four possible ways the deferreds in client 5.1 can fire. Draw the other three.
  2. Use the deferred simulator to simulate all possible firings for clients 5.0 and 5.1. To get you started, this simulator program can represent the case where the try_to_cummingsify function succeeds in client 5.0:
    r poem p
    r None r None
    r None r None

19 thoughts on “An Introduction to Asynchronous Programming and Twisted”

  1. Great tutorial! thanks!

    Personally, I’m interested in how to make a blocking function not block, so that deferreds make sense in a server. With calls to a database we have runInteraction, but how do we write such non-blocking functions ourselves? Say, in your example, the server checks whether the client sent a valid request with a validateRequest function. Of course this function runs fast, but it still blocks.

    1. Hi, glad you like the tutorial. There’s no general way to make a blocking function not block; it all depends on what you are doing. If you are doing pure computation and you just need the CPU for a long chunk of time, then here are a few options:

      1. Break up your computation into chunks and use, for example, LoopingCall to periodically make progress on the work.
      2. Use deferToThread to run the computation in a thread.
      3. Use ampoule, a third-party package that lets you send work to sub-processes.
  2. Hi again, Dave, I’ve completed your excellent tutorial and I think I’ve understood most of it. As I commented in a previous chapter, most of the Twisted tutorials on the web are outdated (I always find the “From twisted.internet.app import Application” at the beginning of the examples).

    So, trying to learn a little more, I’d like to know if you know of free software projects that use Twisted and that might be a good source of knowledge if I take a look at their code.
    I’ve recently subscribed to the twisted-python mailing list and dug into the twistedmatrix documentation, and I’ve found it almost impossible to find real examples of code using the current Twisted API (even in the twistedmatrix documentation FAQ I discovered code that no longer works when I tried to test the persistence examples).
    Also, running “apt-cache rdepends python-twisted” on my computer lists only a few Twisted projects.

    Thanks again for this wonderful tutorial. I’m looking forward to future sequels.
    José L.

    1. Hey José! You are welcome, I’m having a lot of fun writing it. I didn’t think it would be this long, but I can’t seem to stop.

      When it comes to reading Twisted code, I really recommend starting with Twisted itself, especially some of the ‘outer layers’ like the protocol implementations. Since this code was mainly written by the Twisted authors, it’s a great example of how to use Twisted the right way.

      After that, you might check out this page on the Twisted website that lists quite a few projects that use Twisted.

  3. Hi, I was wondering whether creating an advanced asynchronous port scanner using Twisted could be a viable GSoC 13 project? Like including some of the advanced Nmap features? What’s your take on this?

    1. The Twisted developers would have the final say on that, of course, but my guess is that they would not be interested. A port scanner is a very specialized piece of networking code, and very few applications would actually need it (aside from a port scanner itself).

  4. Thanks again for your wonderful guide to the twisted world.

    While reading the example in this part, I thought of the idea of sending the deferred as deep as the PoemProtocol. That way it would not bother the factory with passing along the call chain. To my mind it is more straightforward to have a factory deal only with the creation of protocols.

    Is it a good idea to allow the protocol to communicate with the application layer by the deferred? Or is it a more common practice to wrap the protocol in the factory and make the factory a proxy of all the protocols it creates?

    Thanks.

    1. I think it’s reasonable to have the protocol talk to the application layer, especially in the case of a client connection where the factory is only ever going to make one connection. There are Twisted APIs for that situation that allow you to connect with only a protocol.

      Now since the factory is a long-lived object that can keep global state about all protocols and connections, for some applications it might still make a lot of sense for the factory to talk with the application layer instead. The joys and problems of design! :)

  5. Fantastic tutorial. Your approach really crystallizes Twisted’s approach to async, especially all the otherwise “odd” helper functions implemented in many of the Twisted examples.

    As an aside, this would be a great place to reintroduce (or use) trap(), which is only mentioned in the suggested exercises of part 6. Since I had already reviewed some Twisted stuff before finding this tutorial, I didn’t read the Suggested Exercises in earlier chapters and was surprised when it wasn’t used here.
