Part 19: I Thought I Wanted It But I Changed My Mind
This continues the introduction started here. You can find an index to the entire series here.
Twisted is an ongoing project, and the Twisted developers regularly add new features and extend old ones. With the release of Twisted 10.1.0, the developers added a new capability, cancellation, to the Deferred class, which we’re going to investigate today.
Asynchronous programming decouples requests from responses and thus raises a new possibility: between asking for the result and getting it back you might decide you don’t want it anymore. Consider the poetry proxy server from Part 14. Here’s how the proxy worked, at least for the first request of a poem:
- A request for a poem comes in.
- The proxy contacts the real server to get the poem.
- Once the poem is complete, send it to the original client.
Which is all well and good, but what if the client hangs up before getting the poem? Maybe they requested the complete text of Paradise Lost and then decided they really wanted a haiku by Kojo. Now our proxy is stuck with downloading the first one and that slow server is going to take a while. Better to close the connection and let the slow server go back to sleep.
Recall Figure 15, a diagram that shows the conceptual flow of control in a synchronous program. In that figure we see function calls going down, and exceptions going back up. If we wanted to cancel a synchronous function call (and this is just hypothetical) the flow control would go in the same direction as the function call, from high-level code to low-level code as in Figure 38:
Of course, in a synchronous program that isn’t possible because the high-level code doesn’t even resume running until the low-level operation is finished, at which point there is nothing to cancel. But in an asynchronous program the high-level code gets control of the program before the low-level code is done, which at least raises the possibility of canceling the low-level request before it finishes.
In a Twisted program, the lower-level request is embodied by a Deferred object, which you can think of as a “handle” on the outstanding asynchronous operation. The normal flow of information in a deferred is downward, from low-level code to high-level code, which matches the flow of return information in a synchronous program. Starting in Twisted 10.1.0, high-level code can send information back in the other direction: it can tell the low-level code it doesn’t want the result anymore. See Figure 39:
Let’s take a look at a few sample programs to see how canceling deferreds actually works. Note that to run the examples and other code in this Part you will need Twisted 10.1.0 or later. Consider deferred-cancel/defer-cancel-1.py:
    from twisted.internet import defer

    def callback(res):
        print 'callback got:', res

    d = defer.Deferred()
    d.addCallback(callback)
    d.cancel()
    print 'done'
With the new cancellation feature, the Deferred class got a new method called cancel. The example code makes a new deferred, adds a callback, and then cancels the deferred without firing it. Here’s the output:
    done
    Unhandled error in Deferred:
    Traceback (most recent call last):
    Failure: twisted.internet.defer.CancelledError:
Ok, so canceling a deferred appears to cause the errback chain to run, and our regular callback is never called at all. Also notice the error is a twisted.internet.defer.CancelledError, a custom Exception that means the deferred was canceled (but keep reading!). Let’s try adding an errback in deferred-cancel/defer-cancel-2.py:
    from twisted.internet import defer

    def callback(res):
        print 'callback got:', res

    def errback(err):
        print 'errback got:', err

    d = defer.Deferred()
    d.addCallbacks(callback, errback)
    d.cancel()
    print 'done'
Now we get this output:
    errback got: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
    ]
    done
So we can ‘catch’ the errback from a cancel just like any other deferred failure.
Ok, let’s try firing the deferred and then canceling it, as in deferred-cancel/defer-cancel-3.py:
    from twisted.internet import defer

    def callback(res):
        print 'callback got:', res

    def errback(err):
        print 'errback got:', err

    d = defer.Deferred()
    d.addCallbacks(callback, errback)
    d.callback('result')
    d.cancel()
    print 'done'
Here we fire the deferred normally with the callback method and then cancel it. Here’s the output:
    callback got: result
    done
Our callback was invoked (just as we would expect) and then the program finished normally, as if cancel had never been called at all. So it seems canceling a deferred has no effect if it has already fired (but keep reading!).
What if we fire the deferred after we cancel it, as in deferred-cancel/defer-cancel-4.py?
    from twisted.internet import defer

    def callback(res):
        print 'callback got:', res

    def errback(err):
        print 'errback got:', err

    d = defer.Deferred()
    d.addCallbacks(callback, errback)
    d.cancel()
    d.callback('result')
    print 'done'
In that case we get this output:
    errback got: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
    ]
    done
Interesting! That’s the same output as the second example, where we never fired the deferred at all. So if the deferred has been canceled, firing the deferred normally has no effect. But why doesn’t d.callback('result') raise an error, since you’re not supposed to be able to fire a deferred more than once, and the errback chain has clearly run?
Consider Figure 39 again. Firing a deferred with a result or failure is the job of lower-level code, while canceling a deferred is an action taken by higher-level code. Firing the deferred means “Here’s your result”, while canceling a deferred means “I don’t want it any more”. And remember that canceling is a new feature, so most existing Twisted code is not written to handle cancel operations. But the Twisted developers have made it possible for us to cancel any deferred we want to, even if the code we got the deferred from was written before Twisted 10.1.0.
To make that possible, the cancel method actually does two things:
- Tell the Deferred object itself that you don’t want the result if it hasn’t shown up yet (i.e., the deferred hasn’t been fired), and thus to ignore any subsequent invocation of callback or errback.
- And, optionally, tell the lower-level code that is producing the result to take whatever steps are required to cancel the operation.
Since older Twisted code is going to go ahead and fire that canceled deferred anyway, step #1 ensures our program won’t blow up if we cancel a deferred we got from an older library.
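To make those two steps concrete, here is a toy model of the behavior just described. It is only a sketch and not Twisted’s implementation: ToyDeferred, ToyCancelled, and the log list are hypothetical names invented for illustration, the class supports a single callback/errback pair, and it is written for Python 3 rather than the Python 2 used by the listings in this Part.

```python
class ToyCancelled(Exception):
    """Hypothetical stand-in for twisted.internet.defer.CancelledError."""


class ToyDeferred:
    """A drastically simplified deferred: one callback, one errback."""

    def __init__(self, canceller=None):
        self.canceller = canceller       # optional hook owned by low-level code
        self.fired = False
        self.ignore_late_firing = False
        self.on_result = None
        self.on_error = None

    def add_callbacks(self, on_result, on_error):
        self.on_result, self.on_error = on_result, on_error

    def _fire(self, handler, value):
        if self.fired:
            if self.ignore_late_firing:
                self.ignore_late_firing = False   # swallow one late firing
                return
            raise RuntimeError("already fired")
        self.fired = True
        handler(value)

    def callback(self, result):
        self._fire(self.on_result, result)

    def errback(self, error):
        self._fire(self.on_error, error)

    def cancel(self):
        if self.fired:
            return                        # nothing left to cancel
        self.ignore_late_firing = True    # step 1: tolerate a late result
        if self.canceller is not None:
            self.canceller(self)          # step 2: let low-level code abort
        if not self.fired:                # the canceller may have fired us
            self.errback(ToyCancelled())


log = []
d = ToyDeferred()
d.add_callbacks(lambda res: log.append(("callback", res)),
                lambda err: log.append(("errback", type(err).__name__)))
d.cancel()              # higher-level code: "I don't want it any more"
d.callback("result")    # older low-level code fires anyway; silently ignored
print(log)              # [('errback', 'ToyCancelled')]
```

The late call to callback is dropped rather than raising, which is the property that makes it safe to cancel a deferred produced by code that knows nothing about cancellation.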
This means we are always free to cancel a deferred, and we’ll be sure not to get the result if it hasn’t arrived (even if it arrives later). But canceling the deferred might not actually cancel the asynchronous operation. Aborting an asynchronous operation requires a context-specific action. You might need to close a network connection, roll back a database transaction, kill a sub-process, et cetera. And since a deferred is just a general-purpose callback organizer, how is it supposed to know what specific action to take when you cancel it? Or, alternatively, how could it forward the cancel request to the lower-level code that created and returned the deferred in the first place? Say it with me now:
I know, with a callback!
Canceling Deferreds, Really
Alright, take a look at deferred-cancel/defer-cancel-5.py:
    from twisted.internet import defer

    def canceller(d):
        print "I need to cancel this deferred:", d

    def callback(res):
        print 'callback got:', res

    def errback(err):
        print 'errback got:', err

    d = defer.Deferred(canceller) # created by lower-level code
    d.addCallbacks(callback, errback) # added by higher-level code
    d.cancel()
    print 'done'
This code is basically like the second example, except there is a third callback (canceller) that’s passed to the Deferred when we create it, rather than added afterwards. This callback is in charge of performing the context-specific actions required to abort the asynchronous operation (only if the deferred is actually canceled, of course). The canceller callback is necessarily part of the lower-level code that returns the deferred, not the higher-level code that receives the deferred and adds its own callbacks and errbacks.
Running the example produces this output:
    I need to cancel this deferred: <Deferred at 0xb7669d2cL>
    errback got: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
    ]
    done
As you can see, the canceller callback is given the deferred whose result we no longer want. That’s where we would take whatever action we need to in order to abort the asynchronous operation. Notice that canceller is invoked before the errback chain fires. In fact, we may choose to fire the deferred ourselves at this point with any result or error of our choice (thus preempting the CancelledError failure). Both possibilities are illustrated in deferred-cancel/defer-cancel-6.py and deferred-cancel/defer-cancel-7.py.
Let’s do one more simple test before we fire up the reactor. We’ll create a deferred with a canceller callback, fire it normally, and then cancel it. You can see the code in deferred-cancel/defer-cancel-8.py. By examining the output of that script, you can see that canceling a deferred after it has been fired does not invoke the canceller callback. And that’s as we would expect since there’s nothing to cancel.
The examples we’ve looked at so far haven’t had any actual asynchronous operations. Let’s make a simple program that invokes one asynchronous operation, then we’ll figure out how to make that operation cancellable. Consider the code in deferred-cancel/defer-cancel-9.py:
    from twisted.internet.defer import Deferred

    def send_poem(d):
        print 'Sending poem'
        d.callback('Once upon a midnight dreary')

    def get_poem():
        """Return a poem 5 seconds later."""
        from twisted.internet import reactor
        d = Deferred()
        reactor.callLater(5, send_poem, d)
        return d

    def got_poem(poem):
        print 'I got a poem:', poem

    def poem_error(err):
        print 'get_poem failed:', err

    def main():
        from twisted.internet import reactor
        reactor.callLater(10, reactor.stop) # stop the reactor in 10 seconds

        d = get_poem()
        d.addCallbacks(got_poem, poem_error)

        reactor.run()

    main()
This example includes a get_poem function that uses the reactor’s callLater method to asynchronously return a poem five seconds after get_poem is called. The main function calls get_poem, adds a callback/errback pair, and then starts up the reactor. We also arrange (again using callLater) to stop the reactor in ten seconds. Normally we would do this by attaching a callback to the deferred, but you’ll see why we do it this way shortly.
Running the example produces this output (after the appropriate delay):
    Sending poem
    I got a poem: Once upon a midnight dreary
And after ten seconds our little program comes to a stop. Now let’s try canceling that deferred before the poem is sent. We’ll just add this bit of code to cancel the deferred after two seconds (well before the five second delay on the poem itself):
reactor.callLater(2, d.cancel) # cancel after 2 seconds
The complete program is in deferred-cancel/defer-cancel-10.py, which produces the following output:
    get_poem failed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
    ]
    Sending poem
This example clearly illustrates that canceling a deferred does not necessarily cancel the underlying asynchronous request. After two seconds we see the output from our errback, printing out the CancelledError as we would expect. But then after five seconds we still see the output from send_poem (although the callback on the deferred doesn’t fire).
At this point we’re just in the same situation as deferred-cancel/defer-cancel-4.py. “Canceling” the deferred causes the eventual result to be ignored, but doesn’t abort the operation in any real sense. As we learned above, to make a truly cancelable deferred we must add a cancel callback when the deferred is created.
What does this new callback need to do? Take a look at the documentation for the callLater method. The return value of callLater is another object, implementing IDelayedCall, with a cancel method we can use to prevent the delayed call from being executed.
That’s pretty simple, and the updated code is in deferred-cancel/defer-cancel-11.py. The relevant changes are all in the get_poem function:
    def get_poem():
        """Return a poem 5 seconds later."""

        def canceler(d):
            # They don't want the poem anymore, so cancel the delayed call
            delayed_call.cancel()

            # At this point we have three choices:
            #   1. Do nothing, and the deferred will fire the errback
            #      chain with CancelledError.
            #   2. Fire the errback chain with a different error.
            #   3. Fire the callback chain with an alternative result.

        d = Deferred(canceler)

        from twisted.internet import reactor
        delayed_call = reactor.callLater(5, send_poem, d)

        return d
In this new version, we save the return value from callLater so we can use it in our cancel callback. The only thing our callback needs to do is invoke delayed_call.cancel(). But as we discussed above, we could also choose to fire the deferred ourselves. The latest version of our example produces this output:
    get_poem failed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
    ]
As you can see, the deferred is canceled and the asynchronous operation has truly been aborted (i.e., we don’t see the output from send_poem).
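One detail of the cancelable get_poem is easy to miss: the cancel callback refers to delayed_call before that name has been assigned. This works because a Python closure looks up a closed-over variable when the inner function runs, not when it is defined. A minimal sketch in plain Python 3 with no Twisted involved (make_handle, canceler, and handle are hypothetical names chosen for illustration):

```python
def make_handle():
    def canceler():
        # `handle` is a free variable: it is looked up in the enclosing
        # scope when canceler() runs, not when canceler is defined.
        return handle.upper()

    # The compiler has already recorded `handle` as a closed-over name,
    # even though nothing has been assigned to it yet.
    free_vars = canceler.__code__.co_freevars

    handle = "delayed call"   # assigned after canceler was defined
    return canceler, free_vars


canceler, free_vars = make_handle()
print(free_vars)    # ('handle',)
print(canceler())   # DELAYED CALL
```

Calling canceler before the assignment would raise a NameError. In get_poem that cannot happen, because the deferred is not returned to the caller until after the result of callLater has been stored.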
Poetry Proxy 3.0
As we discussed in the Introduction, the poetry proxy server is a good candidate for implementing cancellation, as it allows us to abort the poem download if it turns out that nobody wants it (i.e., the client closes the connection before we send the poem). Version 3.0 of the proxy, located in twisted-server-4/poetry-proxy.py, implements deferred cancellation. The first change is in the PoetryProxyProtocol:
    class PoetryProxyProtocol(Protocol):

        def connectionMade(self):
            self.deferred = self.factory.service.get_poem()
            self.deferred.addCallback(self.transport.write)
            self.deferred.addBoth(lambda r: self.transport.loseConnection())

        def connectionLost(self, reason):
            if self.deferred is not None:
                deferred, self.deferred = self.deferred, None
                deferred.cancel() # cancel the deferred if it hasn't fired
You might compare it to the older version. The two main changes are:
- Save the deferred we get from get_poem so we can cancel it later if we need to.
- Cancel the deferred when the connection is closed. Note this also cancels the deferred after we actually get the poem, but as we discovered in the examples, canceling a deferred that has already fired has no effect.
Now we need to make sure that canceling the deferred actually aborts the poem download. For that we need to change the ProxyService:
    class ProxyService(object):

        poem = None # the cached poem

        def __init__(self, host, port):
            self.host = host
            self.port = port

        def get_poem(self):
            if self.poem is not None:
                print 'Using cached poem.'
                # return an already-fired deferred
                return succeed(self.poem)

            def canceler(d):
                print 'Canceling poem download.'
                factory.deferred = None
                connector.disconnect()

            print 'Fetching poem from server.'
            deferred = Deferred(canceler)
            deferred.addCallback(self.set_poem)
            factory = PoetryClientFactory(deferred)
            from twisted.internet import reactor
            connector = reactor.connectTCP(self.host, self.port, factory)
            return factory.deferred

        def set_poem(self, poem):
            self.poem = poem
            return poem
Again, you may wish to compare this with the older version. This class has a few more changes:
- We save the return value from reactor.connectTCP, an IConnector object. We can use the disconnect method on that object to close the connection.
- We create the deferred with a canceler callback. That callback is a closure which uses the connector to close the connection. But first it sets the factory.deferred attribute to None. Otherwise, the factory might fire the deferred with a “connection closed” errback before the deferred itself fires with a CancelledError. Since this deferred was canceled, having the deferred fire with CancelledError seems more explicit.
You might also notice we now create the deferred in the ProxyService instead of the PoetryClientFactory. Since the canceler callback needs to access the IConnector object, the ProxyService ends up being the most convenient place to create the deferred.
And, as in one of our earlier examples, our canceler callback is implemented as a closure. Closures seem to be very useful when implementing cancel callbacks!
Let’s try out our new proxy. First start up a slow server. It needs to be slow so we actually have time to cancel:
python blocking-server/slowpoetry.py --port 10001 poetry/fascination.txt
Now we can start up our proxy (remember you need Twisted 10.1.0 or later):
python twisted-server-4/poetry-proxy.py --port 10000 10001
Now we can start downloading a poem from the proxy using any client, or even just curl pointed at the proxy’s port (10000 in the command above).
After a few seconds, press Ctrl-C to stop the client or the curl process. In the terminal running the proxy you should see this output:
    Fetching poem from server.
    Canceling poem download.
And you should see the slow server has stopped printing output for each bit of poem it sends, since our proxy hung up. You can start and stop the client multiple times to verify each download is canceled each time. But if you let the poem run to completion, then the proxy caches the poem and sends it immediately after that.
One More Wrinkle
We said several times above that canceling an already-fired deferred has no effect. Well, that’s not quite true. In Part 13 we learned that the callbacks and errbacks attached to a deferred may return deferreds themselves. And in that case, the original (outer) deferred pauses the execution of its callback chains and waits for the inner deferred to fire (see Figure 28).
Thus, even though a deferred has fired, the higher-level code that made the asynchronous request may not have received the result yet, because the callback chain is paused waiting for an inner deferred to finish. So what happens if the higher-level code cancels that outer deferred? In that case the outer deferred does not cancel itself (it has already fired after all); instead, the outer deferred cancels the inner deferred.
So when you cancel a deferred, you might not be canceling the main asynchronous operation, but rather some other asynchronous operation triggered as a result of the first. Whew!
We can illustrate this with one more example. Consider the code in deferred-cancel/defer-cancel-12.py:
    from twisted.internet import defer

    def cancel_outer(d):
        print "outer cancel callback."

    def cancel_inner(d):
        print "inner cancel callback."

    def first_outer_callback(res):
        print 'first outer callback, returning inner deferred'
        return inner_d

    def second_outer_callback(res):
        print 'second outer callback got:', res

    def outer_errback(err):
        print 'outer errback got:', err

    outer_d = defer.Deferred(cancel_outer)
    inner_d = defer.Deferred(cancel_inner)

    outer_d.addCallback(first_outer_callback)
    outer_d.addCallbacks(second_outer_callback, outer_errback)

    outer_d.callback('result')

    # at this point the outer deferred has fired, but is paused
    # on the inner deferred.

    print 'canceling outer deferred.'
    outer_d.cancel()

    print 'done'
In this example we create two deferreds, the outer and the inner, and have one of the outer callbacks return the inner deferred. First we fire the outer deferred, and then we cancel it. The example produces this output:
    first outer callback, returning inner deferred
    canceling outer deferred.
    inner cancel callback.
    outer errback got: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.CancelledError'>:
    ]
    done
As you can see, canceling the outer deferred does not cause the outer cancel callback to fire. Instead, it cancels the inner deferred so the inner cancel callback fires, and then the outer errback receives the CancelledError (from the inner deferred).
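If the forwarding is hard to picture, a toy model can help. The sketch below is not Twisted’s code: ToyDeferred, ToyCancelled, fire, and the log list are hypothetical names, failures are represented as plain exception values rather than Failure objects, an inner deferred is assumed not to have fired before the outer one pauses on it, and the code is Python 3 rather than the Python 2 of the listings in this Part. It models only the pause on an inner deferred and the rule that a cancel on a paused outer deferred is passed along to the inner one.

```python
class ToyCancelled(Exception):
    """Hypothetical stand-in for CancelledError."""


class ToyDeferred:
    """Toy deferred whose callback chain can pause on another ToyDeferred."""

    def __init__(self, name, log):
        self.name, self.log = name, log
        self.fired = False
        self.waiting_on = None   # inner deferred we are paused on, if any
        self.chain = []          # (callback, errback) pairs

    def add_callbacks(self, callback, errback=None):
        self.chain.append((callback, errback))

    def fire(self, value):
        self.fired = True
        self._run(value)

    def _run(self, value):
        while self.chain:
            callback, errback = self.chain.pop(0)
            handler = errback if isinstance(value, Exception) else callback
            if handler is None:
                continue         # no handler for this kind of value: pass it on
            value = handler(value)
            if isinstance(value, ToyDeferred):
                # Pause: resume our own chain when the inner one delivers.
                self.waiting_on = value
                value.add_callbacks(self._resume, self._resume)
                return

    def _resume(self, value):
        self.waiting_on = None
        self._run(value)

    def cancel(self):
        if not self.fired:
            self.log.append(self.name + " canceller runs")
            self.fire(ToyCancelled())
        elif self.waiting_on is not None:
            # Already fired but paused: forward the cancel to the inner one.
            self.waiting_on.cancel()


log = []
outer = ToyDeferred("outer", log)
inner = ToyDeferred("inner", log)
outer.add_callbacks(lambda res: inner)   # first callback returns the inner
outer.add_callbacks(
    lambda res: log.append("outer callback: %r" % (res,)),
    lambda err: log.append("outer errback: " + type(err).__name__))
outer.fire("result")   # outer has fired, but is now paused on inner
outer.cancel()         # forwarded to inner
print(log)   # ['inner canceller runs', 'outer errback: ToyCancelled']
```

Only the inner canceller runs, and the cancellation error travels back up through the outer errback, which is the same ordering as the output shown above.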
You may wish to stare at that code a while, and try out variations to see how they affect the outcome.
Canceling a deferred can be a very useful operation, allowing our programs to avoid work they no longer need to do. And as we have seen, it can be a little bit tricky, too.
One very important fact to keep in mind is that canceling a deferred doesn’t necessarily cancel the underlying asynchronous operation. In fact, as of this writing, most deferreds won’t really “cancel”, since most Twisted code was written prior to Twisted 10.1.0 and hasn’t been updated. This includes many of the APIs in Twisted itself! Check the documentation and/or the source code to find out whether canceling the deferred will truly cancel the request, or simply ignore it.
And the second important fact is that simply returning a deferred from your asynchronous APIs will not necessarily make them cancelable in the complete sense of the word. If you want to implement canceling in your own programs, you should study the Twisted source code to find more examples. Cancellation is a brand new feature so the patterns and best practices are still being worked out.
At this point we’ve learned just about everything about Deferreds and the core concepts behind Twisted. Which means there’s not much more to introduce, as the rest of Twisted consists mainly of specific applications, like web programming or asynchronous database access. So in the next couple of Parts we’re going to take a little detour and look at two other systems that use asynchronous I/O to see how some of their ideas relate to the ideas in Twisted. Then, in the final Part, we will wrap up and suggest ways to continue your Twisted education.
- Did you know you can spell canceled with one or two els? It’s true. It all depends on what sort of mood you’re in.
- Peruse the source code of the Deferred class, paying special attention to the implementation of cancellation.
- Search the Twisted 10.1.0 source code for examples of deferreds with cancel callbacks. Study their implementation.
- Make the deferred returned by the get_poetry method of one of our poetry clients cancelable.
- Make a reactor-based example that illustrates canceling an outer deferred which is paused on an inner deferred. If you use callLater you will need to choose the delays carefully to ensure the outer deferred is canceled at the right moment.
- Find an asynchronous API in Twisted that doesn’t support a true cancel and implement cancellation for it. Submit a patch to the Twisted project. Don’t forget unit tests!
7 replies on “I Thought I Wanted It But I Changed My Mind”
Thanks a ton for this article. I was updating my Twisted.Web code to support cancellation (or at least, not throw a bunch of errors on closed connections.) It was somewhat confusing, but your examples helped clarify the conditions under which each callback or error callback is run. I now have it properly working at my web request level, and hopefully soon will be able to actually have my actual database transactions abort properly on cancellation.
Cool, you are welcome!
Okay this is more of a Python scoping question I suppose but I find it astonishing that your use of delayed_call in defer-cancel-11.py works.
It looks like a closure but delayed_call had not been defined in get_poem when the function was created. I’m not even entirely sure why I find that weird but I do. :/
How does Python know to take this variable from the enclosing scope and store it with the closure if it has not been defined yet?
Does that make any sense? 🙂
It is definitely a closure and they can be a little brain-bendy 🙂 Closures
seem to work very well for the cancel callbacks.
I haven’t looked exactly at how Python implements closures, but evidently
it is doing some syntactic analysis of the closure code to identify the local
variables it will need to close over from the enclosing scope (I don’t think
it just blindly hangs on to all the locals, but I could be mistaken there).