Part 14: When a Deferred Isn’t
This continues the introduction started here. You can find an index to the entire series here.
Introduction
In this part we’re going to learn another aspect of the Deferred class. To motivate the discussion, we’ll add one more server to our stable of poetry-related services. Suppose we have a large number of internal clients who want to get poetry from the same external server. But this external server is slow and already over-burdened by the insatiable demand for poetry across the Internet. We don’t want to contribute to that poor server’s problems by sending all our clients there too.
So instead we’ll make a caching proxy server. When a client connects to the proxy, the proxy will either fetch the poem from the external server or return a cached copy of a previously retrieved poem. Then we can point all our clients at the proxy and our contribution to the external server’s load will be negligible. We illustrate this setup in Figure 30:

Consider what happens when a client connects to the proxy to get a poem. If the proxy’s cache is empty, the proxy must wait (asynchronously) for the external server to respond before sending a poem back. So far so good, we already know how to handle that situation with an asynchronous function that returns a deferred. On the other hand, if there’s already a poem in the cache, the proxy can send it back immediately, no need to wait at all. So the proxy’s internal mechanism for getting a poem will sometimes be asynchronous and sometimes synchronous.
So what do we do if we have a function that is only asynchronous some of the time? Twisted provides a couple of options, and they both depend on a feature of the Deferred class we haven’t used yet: you can fire a deferred before you return it to the caller.
This works because, although you cannot fire a deferred twice, you can add callbacks and errbacks to a deferred after it has fired. And when you do so, the deferred simply continues firing the chain from where it last left off. One important thing to note is that an already-fired deferred may fire the new callback (or errback, depending on the state of the deferred) immediately, i.e., right when you add it.
Consider Figure 31, showing a deferred that has been fired:

If we were to add another callback/errback pair at this point, then the deferred would immediately fire the new callback, as in Figure 32:

The callback (not the errback) is fired because the previous callback succeeded. If it had failed (raised an Exception or returned a Failure) then the new errback would have been called instead.
We can test out this new feature with the example code in twisted-deferred/defer-11.py. Read and run that script to see how a deferred behaves when you fire it and then add callbacks. Note how in the first example each new callback is invoked immediately (you can tell from the order of the print output).
The second example in that script shows how we can pause() a deferred so it doesn’t fire the callbacks right away. When we are ready for the callbacks to fire, we call unpause(). That’s actually the same mechanism the deferred uses to pause itself when one of its callbacks returns another deferred. Nifty!
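To see the mechanics without a reactor, here is a toy model of these two behaviors in plain Python. It is only a sketch: ToyDeferred is a made-up class, not Twisted’s Deferred, and it handles only the success path (no errbacks, no chained deferreds).

```python
class ToyDeferred(object):
    """A toy model of a deferred's callback chain (success path only)."""

    def __init__(self):
        self.callbacks = []   # pending callbacks, in order
        self.fired = False
        self.paused = False
        self.result = None

    def _run(self):
        # Run pending callbacks, but only once fired and not paused.
        if self.fired and not self.paused:
            while self.callbacks:
                self.result = self.callbacks.pop(0)(self.result)

    def add_callback(self, fn):
        self.callbacks.append(fn)
        self._run()   # an already-fired deferred runs new callbacks now

    def callback(self, result):
        assert not self.fired, 'cannot fire a deferred twice'
        self.fired = True
        self.result = result
        self._run()

    def pause(self):
        self.paused = True

    def unpause(self):
        self.paused = False
        self._run()   # resume the chain where it left off

# Fire first, then add a callback: it runs immediately.
d = ToyDeferred()
d.callback('poem')
d.add_callback(lambda r: r.upper())
print(d.result)   # POEM

# Pause a fired deferred: new callbacks wait for unpause().
d2 = ToyDeferred()
d2.callback('poem')
d2.pause()
d2.add_callback(lambda r: r + '!')
print(d2.result)  # still the original 'poem'
d2.unpause()
print(d2.result)  # poem!
```

Twisted’s real Deferred uses the same resume-where-you-left-off idea, with the pause applied automatically when one of its callbacks returns another deferred.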
Proxy 1.0
Now let’s look at the first version of the poetry proxy in twisted-server-1/poetry-proxy.py. Since the proxy acts as both a client and a server, it has two pairs of Protocol/Factory classes, one for serving up poetry, and one for getting a poem from the external server. We won’t bother looking at the code for the client pair, it’s the same as in previous poetry clients.
But before we look at the server pair, we’ll look at the ProxyService, which the server-side protocol uses to get a poem:
class ProxyService(object):

    poem = None # the cached poem

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def get_poem(self):
        if self.poem is not None:
            print 'Using cached poem.'
            return self.poem

        print 'Fetching poem from server.'
        factory = PoetryClientFactory()
        factory.deferred.addCallback(self.set_poem)
        from twisted.internet import reactor
        reactor.connectTCP(self.host, self.port, factory)
        return factory.deferred

    def set_poem(self, poem):
        self.poem = poem
        return poem
The key method there is get_poem. If there’s already a poem in the cache, that method just returns the poem itself. On the other hand, if we haven’t got a poem yet, we initiate a connection to the external server and return a deferred that will fire when the poem comes back. So get_poem is a function that is only asynchronous some of the time.
How do you handle a function like that? Let’s look at the server-side protocol/factory pair:
class PoetryProxyProtocol(Protocol):

    def connectionMade(self):
        d = maybeDeferred(self.factory.service.get_poem)
        d.addCallback(self.transport.write)
        d.addBoth(lambda r: self.transport.loseConnection())

class PoetryProxyFactory(ServerFactory):

    protocol = PoetryProxyProtocol

    def __init__(self, service):
        self.service = service
The factory is straightforward: it’s just saving a reference to the proxy service so that protocol instances can call the get_poem method. The protocol is where the action is. Instead of calling get_poem directly, the protocol uses a wrapper function from the twisted.internet.defer module named maybeDeferred.
The maybeDeferred function takes a reference to another function, plus some optional arguments to call that function with (we aren’t using any here). Then maybeDeferred will actually call that function and:
- If the function returns a deferred, maybeDeferred returns that same deferred, or
- If the function returns a Failure, maybeDeferred returns a new deferred that has been fired (via .errback()) with that Failure, or
- If the function returns a regular value, maybeDeferred returns a deferred that has already been fired with that value as the result, or
- If the function raises an exception, maybeDeferred returns a deferred that has already been fired (via .errback()) with that exception wrapped in a Failure.
In other words, the return value from maybeDeferred is guaranteed to be a deferred, even if the function you pass in never returns a deferred at all. This allows us to safely call a synchronous function (even one that fails with an exception) and treat it like an asynchronous function returning a deferred.
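The four cases above can be sketched in plain Python. This is only a toy illustration, not Twisted’s implementation: toy_maybe_deferred, ToyDeferred, and ToyFailure are made-up stand-ins for maybeDeferred, Deferred, and Failure.

```python
class ToyFailure(object):
    """Stand-in for twisted.python.failure.Failure."""
    def __init__(self, error):
        self.error = error

class ToyDeferred(object):
    """Stand-in for Deferred, recording how it was fired."""
    def __init__(self):
        self.called = False
        self.result = None
    def callback(self, result):
        self.called = True
        self.result = result
    def errback(self, failure):
        self.called = True
        self.result = failure

def toy_maybe_deferred(f, *args):
    try:
        result = f(*args)
    except Exception as e:
        d = ToyDeferred()
        d.errback(ToyFailure(e))      # case 4: exception -> pre-failed deferred
        return d
    if isinstance(result, ToyDeferred):
        return result                 # case 1: pass the deferred through
    d = ToyDeferred()
    if isinstance(result, ToyFailure):
        d.errback(result)             # case 2: Failure -> pre-failed deferred
    else:
        d.callback(result)            # case 3: plain value -> pre-fired deferred
    return d

d = toy_maybe_deferred(lambda: 'poem')
print(d.called, d.result)   # the caller always gets a fired deferred

def angry_poet():
    raise ValueError('no poem for you')

assert isinstance(toy_maybe_deferred(angry_poet).result, ToyFailure)
```

Whatever the wrapped function does, the caller receives a deferred and can attach callbacks and errbacks uniformly.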
Note 1: There will still be a subtle difference, though. A deferred returned by a synchronous function has already been fired, so any callbacks or errbacks you add will run immediately, rather than in some future iteration of the reactor loop.
Note 2: In hindsight, perhaps naming a function that always returns a deferred “maybeDeferred” was not the best choice, but there you go.
Once the protocol has a real deferred in hand, it can just add some callbacks that send the poem to the client and then close the connection. And that’s it for our first poetry proxy!
Running the Proxy
To try out the proxy, start up a poetry server, like this:
python twisted-server-1/fastpoetry.py --port 10001 poetry/fascination.txt
And now start a proxy server like this:
python twisted-server-1/poetry-proxy.py --port 10000 10001
It should tell you that it’s proxying poetry on port 10000 for the server on port 10001.
Now you can point a client at the proxy:
python twisted-client-4/get-poetry.py 10000
We’ll use an earlier version of the client that isn’t concerned with poetry transformations. You should see the poem appear in the client window and some text in the proxy window saying it’s fetching the poem from the server. Now run the client again and the proxy should confirm it is using the cached version of the poem, while the client should show the same poem as before.
Proxy 2.0
As we mentioned earlier, there’s an alternative way to implement this scheme. This is illustrated in Poetry Proxy 2.0, located in twisted-server-2/poetry-proxy.py. Since we can fire deferreds before we return them, we can make the proxy service return an already-fired deferred when there’s already a poem in the cache. Here’s the new version of the get_poem method on the proxy service:
def get_poem(self):
    if self.poem is not None:
        print 'Using cached poem.'
        # return an already-fired deferred
        return succeed(self.poem)

    print 'Fetching poem from server.'
    factory = PoetryClientFactory()
    factory.deferred.addCallback(self.set_poem)
    from twisted.internet import reactor
    reactor.connectTCP(self.host, self.port, factory)
    return factory.deferred
The defer.succeed function is just a handy way to make an already-fired deferred given a result. Read the implementation for that function and you’ll see it’s simply a matter of making a new deferred and then firing it with .callback(). If we wanted to return an already-failed deferred we could use defer.fail instead.
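The idea fits in a couple of lines. Again a toy model: ToyDeferred, toy_succeed, and toy_fail are hypothetical stand-ins for Twisted’s Deferred, defer.succeed, and defer.fail.

```python
class ToyDeferred(object):
    """Stand-in for Deferred, just recording its result."""
    def __init__(self):
        self.result = None
    def callback(self, result):
        self.result = result
    def errback(self, failure):
        self.result = failure

def toy_succeed(result):
    # make a new deferred and fire it immediately with the result
    d = ToyDeferred()
    d.callback(result)
    return d

def toy_fail(failure):
    # the failure-side twin: a deferred pre-fired via its errback
    d = ToyDeferred()
    d.errback(failure)
    return d

print(toy_succeed('poem').result)   # poem
```
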
In this version, since get_poem always returns a deferred, the protocol class no longer needs to use maybeDeferred (though it would still work if it did, as we learned above):
class PoetryProxyProtocol(Protocol):

    def connectionMade(self):
        d = self.factory.service.get_poem()
        d.addCallback(self.transport.write)
        d.addBoth(lambda r: self.transport.loseConnection())
Other than these two changes, the second version of the proxy is just like the first, and you can run it in the same way we ran the original version.
Summary
In this Part we learned that deferreds can be fired before they are returned, which lets us use them in synchronous (or sometimes-synchronous) code. And we have two ways to do that:
- We can use maybeDeferred to handle a function that sometimes returns a deferred and other times returns a regular value (or throws an exception), or
- We can pre-fire our own deferreds, using defer.succeed and defer.fail, so our “semi-synchronous” functions always return a deferred no matter what.
Which technique we choose is really up to us. The former emphasizes the fact that our functions aren’t always asynchronous while the latter makes the client code simpler. Perhaps there’s not a definitive argument for choosing one over the other.
Both techniques are made possible because we can add callbacks and errbacks to a deferred after it has fired. And that explains the curious fact we discovered in Part 9 and the twisted-deferred/defer-unhandled.py example. We learned that an “unhandled error” in a deferred, in which either the last callback or errback fails, isn’t reported until the deferred is garbage collected (i.e., there are no more references to it in user code). Now we know why — since we could always add another callback pair to a deferred which does handle that error, it’s not until the last reference to a deferred is dropped that Twisted can say the error was not handled.
Now that you’ve spent so much time exploring the Deferred class, which is located in the twisted.internet package, you may have noticed it doesn’t actually have anything to do with the Internet. It’s just an abstraction for managing callbacks. So what’s it doing there? That is an artifact of Twisted’s history. In the best of all possible worlds (where I am paid millions of dollars to play in the World Ultimate Frisbee League), the defer module would probably be in twisted.python. Of course, in that world you would probably be too busy fighting crime with your super-powers to read this introduction. I suppose that’s life.
So is that it for deferreds? Do we finally know all their features? For the most part, we do. But Twisted includes alternate ways of using deferreds that we haven’t explored yet (we’ll get there!). And in the meantime, the Twisted developers have been beavering away adding new stuff. In an upcoming release, the Deferred class will acquire a brand new capability. We’ll introduce it in a future Part, but first we’ll take a break from deferreds and look at some other aspects of Twisted, including testing in Part 15.
Suggested Exercises
- Modify the twisted-deferred/defer-11.py example to illustrate pre-failing deferreds using .errback(). Read the documentation and implementation of the defer.fail function.
- Modify the proxy so that a cached poem older than 2 hours is discarded, causing the next poetry request to re-request it from the server.
- The proxy is supposed to avoid contacting the server more than once, but if several client requests come in at the same time when there is no poem in the cache, the proxy will make multiple poetry requests. It’s easier to see if you use a slow server to test it out. Modify the proxy service so that only one request is generated. Right now the service only has two states: either the poem is in the cache or it isn’t. You will need to recognize a third state indicating a request has been made but not completed. When the get_poem method is called in the third state, add a new deferred to a list of ‘waiters’. That new deferred will be the result of the get_poem method. When the poem finally comes back, fire all the waiting deferreds with the poem and transition to the cached state. On the other hand, if the poem fails, fire the .errback() method of all the waiters and transition to the non-cached state.
- Add a transformation proxy to the proxy service. This service should work like the original transformation service, but use an external server to do the transformations.
- Consider this hypothetical piece of code:
d = some_async_function() # d is a Deferred
d.addCallback(my_callback)
d.addCallback(my_other_callback)
d.addErrback(my_errback)
Suppose that when the deferred d is returned on line 1, it has not been fired. Is it possible for that deferred to fire while we are adding our callbacks and errback on lines 2-4? Why or why not?
26 replies on “When a Deferred Isn’t”
As always, great!
small typo “The key method there is get_poetry.” It is get_poem not get_poetry
Can’t wait for other parts 😉
Nice catch, thank you!
Hi Dave
Your articles are so great – and I wish I had time to read them 🙂
An advanced(?) exercise that could be used here is to imagine that the proxy takes an identifier (perhaps just a string title) of the poem it is to fetch. The exercise is to think of a way to arrange for a request for poem P to receive the poem properly if a previous request (from another client) for P is already in flight. I.e., it’s neither of your 2 cases a) the poem is not cached or b) the poem is cached, it’s the intermediate one. Make sense?
I hit this situation and wrote a little decorator for it that I like a lot. It could use a little improvement, but the code at http://paste.pocoo.org/show/204769/ gets the job done.
You use it like:
class PoemProxy:
    @DeferredPooler
    def get_poem(self, poem):
        # Do your normal stuff here, and return a deferred.
That’s it.
Thanks again!
Terry
Neato 🙂 I love it.
Would you mind writing some description of what’s going on there?
I’ve written something similar myself 🙂 Here’s how I would describe it:
I wonder if it might be better in _callOthers to delete the key first, in case the callbacks trigger another call to the decorator with the same arguments?
Maybe it’s the law of diminishing returns taking effect, but I have been going through your exercises for hours and haven’t had any problems until now. For exercise 3, the only change I made to the Service.get_poem was this. One client prints the full poem, and one client prints a blank string. Any hints on what else I could have missed? I absolutely love this site and your writing/teaching style. It works so well!
if self.request_made == True:
    # request has been made but not completed
    return self.factory.deferred

print 'Fetching poem from server.'
self.factory = PoetryClientFactory()
self.factory.deferred.addCallback(self.set_poem)
from twisted.internet import reactor
reactor.connectTCP(self.host, self.port, self.factory)
self.request_made = True
return self.factory.deferred
Thanks for the kind words!
So consider that you are returning the same deferred for calls that are ‘sharing’ the same request. Which means that one deferred will have multiple callbacks added to it. The second (and third, etc.) callbacks will not receive the original result of the deferred, but the result from the previous callback.
In this situation, I think you will want to return a separate deferred object for each request, even if they are ‘sharing’ the same underlying request to the server. Otherwise, the different users of the service could end up stepping on each other’s callbacks, if you see what I mean.
What if you kept a list of deferreds instead?
I was wondering if that was my problem when I went back and re-read your question. At first I was looking at a DeferredList to see if that does what I want. http://twistedmatrix.com/documents/12.1.0/api/twisted.internet.defer.DeferredList.html
But I am not so sure that’s what I really need. I probably just need to sleep on it, can only absorb so much in one day. Thanks again, I will report back when it clicks!
Sure, good luck!
Sometimes…you just need to sleep on it. I am not sure this is the best answer, but it works. If I could have done it better, let me know. Thanks so much!
class ProxyService(object):

    poem = None # the cached poem
    request_made = False
    waiters = []

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def get_poem(self):
        if self.poem is not None:
            print 'Using cached poem.'
            return self.poem

        if self.request_made == True:
            # request has been made but not completed
            self.waiters.append(Deferred())
            return self.waiters[-1]

        print 'Fetching poem from server.'
        self.factory = PoetryClientFactory()
        self.factory.deferred.addCallback(self.set_poem)
        from twisted.internet import reactor
        reactor.connectTCP(self.host, self.port, self.factory)
        self.request_made = True
        self.waiters.append(Deferred())
        return self.waiters[-1]

    def set_poem(self, poem):
        self.request_made = False
        self.poem = poem
        for waiter in self.waiters:
            waiter.callback(poem)
I think you got it! One thing to beware of is the class-level attribute ‘waiters’. If there was more than one instance of ProxyService in your program you wouldn’t want them to share waiters amongst each other (since they would probably be pointing to different servers).
Good point, thanks!
Hi Dave, thanks a lot for a great tutorial!!
I was wondering, why do we need a lambda function in
d.addBoth(lambda r: self.transport.loseConnection())?
Hi Alex, that is because the transport.loseConnection method takes no arguments, but the callbacks and errbacks of a deferred are always given at least one argument: the result of the callback, or the failure of the errback. So we use the lambda to ‘swallow’ the result/error.
Got it! Thank you, Dave!
The function name of maybeDeferred means “Invoke a function that may or may not return a deferred.” http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.maybeDeferred.html
The tutorial is really nice, thank you.
Thanks very much!
Hi dave, U R amazing 🙂
I haven’t had any problems with the exercises until the third one 🙁
You said: “… On the other hand, if the poem fails, fire the .errback() method of all the waiters and transition to the non-cached state.”
My code is almost similar to Nathan’s code (look above). I don’t know how to transfer an already-created deferred to the non-cached state!
Let me explain more:
“get_poem” returns a deferred:
> if self.request_made == True:
>     self.waiters.append(Deferred())
>     return self.waiters[-1]
After that, “connectionMade” adds callbacks to the deferred:
> d.addCallback(self.transport.write)
> d.addBoth(lambda r: self.transport.loseConnection())
So I can’t transfer a deferred whose first callback is “transport.write” to the non-cached state!!!!
I hope I explained it clearly 🙂
Thanks
Hello! Glad you like the series. What I mean by ‘non-cached state’ is that the caching service will re-attempt to get that poem when the next request for it comes in. In other words, it will not cache ‘errors’, only successful poems. Does that make sense?
Unfortunately the code cannot be viewed anymore as the Pocoo team has retired their service.
Hi Dave, congrats and thanks for your work. It’s amazing.
I just have a very basic question:
I understand that every connection from a client to the proxy creates its own protocol instance.
Does this mean that there will never be a collision between variables in different connections, such as ‘factory’ or ‘d’ (the deferred object), even in the case that those connections happen simultaneously?
Since the abstraction layer is pretty high I sometimes lose track of what’s actually going on ‘beyond’, and reading the code I feel that values in get_poetry() for one connection could be interfered with by the arrival of another simultaneous connection which redefines these values.
Thanks.
Hi Andreu, glad you liked the tutorial!
Each connection in the server gets its own Protocol instance, so those are never shared. There is, however, only one factory for the server, so each Protocol instance will have a reference to the same factory. That allows the connections to share state. To what extent they do is up to you.