Twistier Poetry

Part 5: Twistier Poetry

This continues the introduction started here. You can find an index to the entire series here.

Abstract Expressionism

In Part 4 we made our first poetry client that uses Twisted. It works pretty well, but there is definitely room for improvement.

First of all, the client includes code for mundane details like creating network sockets and receiving data from those sockets. Twisted provides support for these sorts of things so we don’t have to implement them ourselves every time we write a new program. This is especially helpful because asynchronous I/O requires a few tricky bits involving exception handling as you can see in the client code. And there are even more tricky bits if you want your code to work on multiple platforms. If you have a free afternoon, search the Twisted sources for “win32” to see how many corner cases that platform introduces.

Another problem with the current client is error handling. Try running version 1.0 of the Twisted client and tell it to download from a port with no server. It just crashes. We could fix the current client, but error handling is easier with the Twisted APIs we’ll be using today.

Finally, the client isn’t particularly re-usable. How would another module get a poem with our client? How would the “calling” module know when the poem had finished downloading? We can’t write a function that simply returns the text of the poem as that would require blocking until the entire poem is read. This is a real problem but we’re not going to fix it today — we’ll save that for future Parts.

We’re going to fix the first and second problems using a higher-level set of APIs and Interfaces. The Twisted framework is loosely composed of layers of abstractions and learning Twisted means learning what those layers provide, i.e., what APIs, Interfaces, and implementations are available for use in each one. Since this is an introduction we’re not going to study each abstraction in complete detail or do an exhaustive survey of every abstraction that Twisted offers. We’re just going to look at the most important pieces to get a better feel for how Twisted is put together. Once you become familiar with the overall style of Twisted’s architecture, learning new parts on your own will be much easier.

In general, each Twisted abstraction is concerned with one particular concept. For example, the 1.0 client from Part 4 uses IReadDescriptor, the abstraction of a “file descriptor you can read bytes from”. A Twisted abstraction is usually defined by an Interface specifying how an object embodying that abstraction should behave. The most important thing to keep in mind when learning a new Twisted abstraction is this:

Most higher-level abstractions in Twisted are built by using lower-level ones, not by replacing them.

So when you are learning a new Twisted abstraction, keep in mind both what it does and what it does not do. In particular, if some earlier abstraction A implements feature F, then F is probably not implemented by any other abstraction. Rather, if another abstraction B needs feature F, it will use A rather than implement F itself. (In general, an implementation of B will either sub-class an implementation of A or refer to another object that implements A).

Networking is a complex subject, and thus Twisted contains lots of abstractions. By starting with lower levels first, we are hopefully getting a clearer picture of how they all get put together in a working Twisted program.

Loopiness in the Brain

The most important abstraction we have learned so far, indeed the most important abstraction in Twisted, is the reactor. At the center of every program built with Twisted, no matter how many layers that program might have, there is a reactor loop spinning around and making the whole thing go. Nothing else in Twisted provides the functionality the reactor offers. Much of the rest of Twisted, in fact, can be thought of as “stuff that makes it easier to do X using the reactor” where X might be “serve a web page” or “make a database query” or some other specific feature. Although it’s possible to stick with the lower-level APIs, like the client 1.0 does, we have to implement more things ourselves if we do. Moving to higher-level abstractions generally means writing less code (and letting Twisted handle the platform-dependent corner cases).

But when we’re working at the outer layers of Twisted it can be easy to forget the reactor is there. In any Twisted program of reasonable size, relatively few parts of our code will actually use the reactor APIs directly. The same is true for some of the other low-level abstractions. The file descriptor abstractions we used in client 1.0 are so thoroughly subsumed by higher-level concepts that they basically disappear in real Twisted programs (they are still used on the inside, we just don’t see them as such).

As far as the file descriptor abstractions go, that’s not really a problem. Letting Twisted handle the mechanics of asynchronous I/O frees us to concentrate on whatever problem we are trying to solve. But the reactor is different. It never really disappears. When you choose to use Twisted you are also choosing to use the Reactor Pattern, and that means programming in the “reactive style” using callbacks and cooperative multi-tasking. If you want to use Twisted correctly, you have to keep the reactor’s existence (and the way it works) in mind. We’ll have more to say about this in Part 6, but for now our message is this:

Figure 5 and Figure 6 are the most important diagrams in this introduction.

We’ll keep using diagrams to illustrate new concepts, but those two Figures are the ones that you need to burn into your brain, so to speak. Those are the pictures I constantly have in mind while writing programs with Twisted.

Before we dive into the code, there are three new abstractions to introduce: Transports, Protocols, and Protocol Factories.

Transports

The Transport abstraction is defined by ITransport in the main Twisted interfaces module. A Twisted Transport represents a single connection that can send and/or receive bytes. For our poetry clients, the Transports are abstracting TCP connections like the ones we have been making ourselves in earlier versions. But Twisted also supports I/O over UNIX Pipes and UDP sockets among other things. The Transport abstraction represents any such connection and handles the details of asynchronous I/O for whatever sort of connection it represents.

If you scan the methods defined for ITransport, you won’t find any for receiving data. That’s because Transports always handle the low-level details of reading data asynchronously from their connections, and give the data to us via callbacks. Along similar lines, the write-related methods of Transport objects may choose not to write the data immediately to avoid blocking. Telling a Transport to write some data means “send this data as soon as you can do so, subject to the requirement to avoid blocking”. The data will be written in the order we provide it, of course.
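To make the “send this as soon as you can” contract concrete, here is a toy model in plain Python 3. It is not Twisted’s implementation: ToyTransport, its flush method, and the list standing in for the network are all hypothetical names invented for this sketch. The only point is that write returns immediately and the data goes out later, in order.

```python
from collections import deque


class ToyTransport:
    """A hypothetical stand-in for a Transport; not Twisted's real code."""

    def __init__(self, sink):
        self.sink = sink          # pretend this list is the network
        self.pending = deque()    # data accepted but not yet sent

    def write(self, data):
        # Never blocks: just remember the data and return immediately.
        self.pending.append(data)

    def flush(self, max_chunks=1):
        # A real event loop would trigger this when the socket becomes
        # writable. In this sketch we simply call it by hand.
        for _ in range(min(max_chunks, len(self.pending))):
            self.sink.append(self.pending.popleft())


network = []
transport = ToyTransport(network)
transport.write(b"first ")
transport.write(b"second")
sent_before_flush = list(network)   # snapshot: nothing has been sent yet
transport.flush(max_chunks=2)       # now both chunks go out, in order
```

After the two write calls nothing has reached the pretend network. Only the later flush moves the chunks, and it moves them in the order they were written.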

We generally don’t implement our own Transport objects or create them in our code. Rather, we use the implementations that Twisted already provides and which are created for us when we tell the reactor to make a connection.

Protocols

Twisted Protocols are defined by IProtocol in the same interfaces module. As you might expect, Protocol objects implement protocols. That is to say, a particular implementation of a Twisted Protocol should implement one specific networking protocol, like FTP or IMAP or some nameless protocol we invent for our own purposes. Our poetry protocol, such as it is, simply sends all the bytes of the poem as soon as a connection is established, while the close of the connection signifies the end of the poem.

Strictly speaking, each instance of a Twisted Protocol object implements a protocol for one specific connection. So each connection our program makes (or, in the case of servers, accepts) will require one instance of a Protocol. This makes Protocol instances the natural place to store both the state of “stateful” protocols and the accumulated data of partially received messages (since we receive the bytes in arbitrary-sized chunks with asynchronous I/O).
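As a rough illustration of why per-connection instances are a convenient home for partial data, here is a small Python 3 sketch. It assumes a made-up newline-delimited protocol (unlike our poetry protocol, where the close of the connection ends the message), and ToyLineProtocol is a hypothetical class, not something Twisted provides.

```python
class ToyLineProtocol:
    """Hypothetical per-connection object that buffers partial messages."""

    def __init__(self):
        self.buffer = b""     # bytes received but not yet a full message
        self.messages = []    # complete messages seen so far

    def dataReceived(self, data):
        self.buffer += data
        # Peel off every complete line; keep any trailing partial line.
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self.messages.append(line)


# Two connections, two instances, two independent buffers.
conn_a = ToyLineProtocol()
conn_b = ToyLineProtocol()
for chunk in (b"he", b"llo\nwor", b"ld\n"):
    conn_a.dataReceived(chunk)
conn_b.dataReceived(b"partial")
```

The chunk boundaries fall in the middle of messages, yet conn_a still ends up with two complete messages, while conn_b keeps its unfinished bytes to itself.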

So how do Protocol instances know what connection they are responsible for? If you look at the IProtocol definition, you will find a method called makeConnection. This method is a callback and Twisted code calls it with a Transport instance as the only argument. The Transport is the connection the Protocol is going to use.

Twisted includes a large number of ready-built Protocol implementations for various common protocols. You can find a few simpler ones in twisted.protocols.basic. It’s a good idea to check the Twisted sources before you write a new Protocol to see if there’s already an implementation you can use. But if there isn’t, it’s perfectly OK to implement your own, as we will do for our poetry clients.

Protocol Factories

So each connection needs its own Protocol and that Protocol might be an instance of a class we implement ourselves. Since we will let Twisted handle creating the connections, Twisted needs a way to make the appropriate Protocol “on demand” whenever a new connection is made. Making Protocol instances is the job of Protocol Factories.

As you’ve probably guessed, the Protocol Factory API is defined by IProtocolFactory, also in the interfaces module. Protocol Factories are an example of the Factory design pattern and they work in a straightforward way. The buildProtocol method is supposed to return a new Protocol instance each time it is called. This is the method that Twisted uses to make a new Protocol for each new connection.

Get Poetry 2.0: First Blood.0

Alright, let’s take a look at version 2.0 of the Twisted poetry client. The code is in twisted-client-2/get-poetry.py. You can run it just like the others and get similar output so I won’t bother posting output here. This is also the last version of the client that prints out task numbers as it receives bytes. By now it should be clear that all Twisted programs work by interleaving tasks and processing relatively small chunks of data at a time. We’ll still use print statements to show what is going on at key moments, but the clients won’t be quite as verbose in the future.

In client 2.0, sockets have disappeared. We don’t even import the socket module and we never refer to a socket object, or a file descriptor, in any way. Instead, we tell the reactor to make the connections to the poetry servers on our behalf like this:

factory = PoetryClientFactory(len(addresses))

from twisted.internet import reactor

for address in addresses:
    host, port = address
    reactor.connectTCP(host, port, factory)

The connectTCP method is the one to focus on. The first two arguments should be self-explanatory. The third is an instance of our PoetryClientFactory class. This is the Protocol Factory for poetry clients and passing it to the reactor allows Twisted to create instances of our PoetryProtocol on demand.

Notice that we are not implementing either the Factory or the Protocol from scratch, unlike the PoetrySocket objects in our previous client. Instead, we are sub-classing the base implementations that Twisted provides in twisted.internet.protocol. The primary Factory base class is twisted.internet.protocol.Factory, but we are using the ClientFactory sub-class which is specialized for clients (processes that make connections instead of listening for connections like a server).

We are also taking advantage of the fact that the Twisted Factory class implements buildProtocol for us. We call the base class implementation in our sub-class:

def buildProtocol(self, address):
    proto = ClientFactory.buildProtocol(self, address)
    proto.task_num = self.task_num
    self.task_num += 1
    return proto

How does the base class know what Protocol to build? Notice we are also setting the class attribute protocol on PoetryClientFactory:

class PoetryClientFactory(ClientFactory):

    task_num = 1

    protocol = PoetryProtocol # tell base class what proto to build

The base Factory class implements buildProtocol by instantiating the class we set on protocol (i.e., PoetryProtocol) and setting the factory attribute on that new instance to be a reference to its “parent” Factory. This is illustrated in Figure 8:

As we mentioned above, the factory attribute on Protocol objects allows Protocols created with the same Factory to share state. And since Factories are created by “user code”, that same attribute allows Protocol objects to communicate results back to the code that initiated the request in the first place, as we will see in Part 6.

Note that while the factory attribute on Protocols refers to an instance of a Protocol Factory, the protocol attribute on the Factory refers to the class of the Protocol. In general, a single Factory might create many Protocol instances.
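The relationship between the two attributes can be sketched in a few lines of plain Python 3. This is a simplified stand-in written for illustration, not Twisted’s source: ToyFactory, ToyPoetryProtocol, and ToyPoetryFactory are hypothetical names, and the addresses are arbitrary.

```python
class ToyFactory:
    """Hypothetical, simplified stand-in for a Protocol Factory base class."""

    protocol = None  # subclasses set this to a Protocol *class*

    def buildProtocol(self, address):
        proto = self.protocol()   # instantiate the class, one per connection
        proto.factory = self      # back-reference to the factory *instance*
        return proto


class ToyPoetryProtocol:
    poem = ""


class ToyPoetryFactory(ToyFactory):
    protocol = ToyPoetryProtocol  # a class, not an instance


factory = ToyPoetryFactory()
first = factory.buildProtocol(("127.0.0.1", 10000))
second = factory.buildProtocol(("127.0.0.1", 10001))
```

Each call produces a distinct protocol object, and both objects point back at the same factory, which is what lets them share state through it.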

The second stage of Protocol construction connects a Protocol with a Transport, using the makeConnection method. We don’t have to implement this method ourselves since the Twisted base class provides a default implementation. By default, makeConnection stores a reference to the Transport on the transport attribute and sets the connected attribute to a True value, as depicted in Figure 9:

Figure 9: a Protocol meets its Transport
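In code, the default behavior described above amounts to something like the following Python 3 sketch. ToyProtocol, Greeter, and FakeTransport are hypothetical names made up for this example. The connectionMade hook mirrors the method of the same name on Twisted’s Protocol objects, but everything else here is a simplification rather than the real base class.

```python
class FakeTransport:
    """Hypothetical transport that just records what it is asked to write."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class ToyProtocol:
    """Hypothetical stand-in for a Protocol base class."""

    connected = False
    transport = None   # class-level default until a connection is attached

    def makeConnection(self, transport):
        self.connected = True
        self.transport = transport
        self.connectionMade()

    def connectionMade(self):
        # Hook for subclasses; does nothing by default.
        pass


class Greeter(ToyProtocol):
    def connectionMade(self):
        # The transport is guaranteed to be attached by this point.
        self.transport.write(b"hello")


fake_transport = FakeTransport()
proto = Greeter()
state_before = (proto.connected, proto.transport)
proto.makeConnection(fake_transport)
```

Before makeConnection runs, the protocol has no transport at all. Afterwards it holds a reference to the transport and can start using it.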

Once initialized in this way, the Protocol can start performing its real job — translating a lower-level stream of data into a higher-level stream of protocol messages (and vice-versa for 2-way connections). The key method for processing incoming data is dataReceived, which our client implements like this:

def dataReceived(self, data):
    self.poem += data
    msg = 'Task %d: got %d bytes of poetry from %s'
    print msg % (self.task_num, len(data), self.transport.getPeer())

Each time dataReceived is called we get a new sequence of bytes (data) in the form of a string. As always with asynchronous I/O, we don’t know how much data we are going to get so we have to buffer it until we receive a complete protocol message. In our case, the poem isn’t finished until the connection is closed, so we just keep adding the bytes to our .poem attribute.

Note we are using the getPeer method on our Transport to identify which server the data is coming from. We are only doing this to be consistent with earlier clients. Otherwise our code wouldn’t need to use the Transport explicitly at all, since we never send any data to the servers.

Let’s take a quick look at what’s going on when the dataReceived method is called. In the same directory as our 2.0 client, there is another client called twisted-client-2/get-poetry-stack.py. This is just like the 2.0 client except the dataReceived method has been changed like this:

def dataReceived(self, data):
    traceback.print_stack()
    os._exit(0)

With this change the program will print a stack trace and then quit the first time it receives some data. You could run this version like so:

python twisted-client-2/get-poetry-stack.py 10000

And you will get a stack trace like this:

File "twisted-client-2/get-poetry-stack.py", line 125, in <module>
    poetry_main()

... # I removed a bunch of lines here

File ".../twisted/internet/tcp.py", line 463, in doRead  # Note the doRead callback
    return self.protocol.dataReceived(data)
File "twisted-client-2/get-poetry-stack.py", line 58, in dataReceived
    traceback.print_stack()

There’s the doRead callback we used in client 1.0! As we noted before, Twisted builds new abstractions by using the old ones, not by replacing them. So there is still an IReadDescriptor implementation hard at work, it’s just implemented by Twisted instead of our code. If you are curious, Twisted’s implementation is in twisted.internet.tcp. If you follow the code, you’ll find that the same object implements IWriteDescriptor and ITransport too. So the IReadDescriptor is actually the Transport object in disguise. We can visualize a dataReceived callback with Figure 10:

Once a poem has finished downloading, the PoetryProtocol object notifies its PoetryClientFactory:

def connectionLost(self, reason):
    self.poemReceived(self.poem)

def poemReceived(self, poem):
    self.factory.poem_finished(self.task_num, poem)

The connectionLost callback is invoked when the transport’s connection is closed. The reason argument is a twisted.python.failure.Failure object with additional information on whether the connection was closed cleanly or due to an error. Our client just ignores this value and assumes we received the entire poem.

The factory shuts down the reactor after all the poems are done. Once again we assume the only thing our program is doing is downloading poems, which makes PoetryClientFactory objects less reusable. We’ll fix that in the next Part, but notice how the poem_finished callback keeps track of the number of poems left to go:

...
    self.poetry_count -= 1

    if self.poetry_count == 0:
        ...

If we were writing a multi-threaded program where each poem was downloaded in a separate thread we would need to protect this section of code with a lock in case two or more threads invoked poem_finished at the same time. Otherwise we might end up shutting down the reactor twice (and getting a traceback for our troubles). But with a reactive system we needn’t bother. The reactor can only make one callback at a time, so this problem just can’t happen.

Our new client also handles a failure to connect with more grace than the 1.0 client. Here’s the callback on the PoetryClientFactory class which does the job:

def clientConnectionFailed(self, connector, reason):
    print 'Failed to connect to:', connector.getDestination()
    self.poem_finished()

Note the callback is on the factory, not on the protocol. Since a protocol is only created after a connection is made, the factory gets the news when a connection cannot be established.
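Putting the last few snippets together, the factory’s bookkeeping might look roughly like the Python 3 sketch below. This is a reconstruction under stated assumptions, not the contents of the actual file: the default arguments on poem_finished are inferred from the fact that it is called both with two arguments and with none, the poems dictionary is a guess at how results could be stored, and the stop callable is a hypothetical stand-in for shutting down the reactor.

```python
class ToyPoetryFactory:
    """Hypothetical sketch of the factory's bookkeeping; not the real file."""

    def __init__(self, poetry_count, stop):
        self.poetry_count = poetry_count  # poems still outstanding
        self.poems = {}                   # task number -> poem text
        self.stop = stop                  # stand-in for stopping the reactor

    def poem_finished(self, task_num=None, poem=None):
        # Called with arguments on success, with none on a failed connection.
        if task_num is not None:
            self.poems[task_num] = poem
        self.poetry_count -= 1
        if self.poetry_count == 0:
            self.stop()

    def clientConnectionFailed(self, connector, reason):
        # A failed connection still counts as a finished task.
        self.poem_finished()


stops = []
factory = ToyPoetryFactory(3, stop=lambda: stops.append("stopped"))
factory.poem_finished(1, "first poem")
factory.clientConnectionFailed(connector=None, reason=None)
stops_before_last = list(stops)       # two of three done: not stopped yet
factory.poem_finished(3, "third poem")
```

Because the callbacks run one at a time, the counter reaches zero exactly once and the stop action fires exactly once, with no lock in sight.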

A simpler client

Although our new client is pretty simple already, we can make it simpler if we dispense with the task numbers. The client should really be about the poetry, after all. There is a simplified 2.1 version in twisted-client-2/get-poetry-simple.py.

Wrapping Up

Client 2.0 uses Twisted abstractions that should be familiar to any Twisted hacker. And if all we wanted was a command-line client that printed out some poetry and then quit, we could even stop here and call our program done. But if we wanted some re-usable code, some code that we could embed in a larger program that needs to download some poetry but also do other things, then we still have some work to do. In Part 6 we’ll take a first stab at it.

Suggested Exercises

1. Use callLater to make the client time out if a poem hasn’t finished after a given interval. Use the loseConnection method on the transport to close the connection on a timeout, and don’t forget to cancel the timeout if the poem finishes on time.
2. Use the stacktrace method to analyze the callback sequence that occurs when connectionLost is invoked.

78 replies on “Twistier Poetry”

Cool! Best Twisted tutorial ever! Thanks so much!

Glad you like it!

Excellent! There is no excuse now for people interested in asynchronous programming not to understand and accept Twisted as *the* framework in Python.

This really helps visual learners too. Thank you very much.

You’re welcome!

s/we invented for our own purposes/we invent for our own purposes/

s/Well making Protocol instances/Well, making Protocol instances/

(sorry for the multiple trivial fixes)

Not at all, I appreciate the help.

As a programmer who is new to Twisted I must say: Great tutorial!

I’m looking forward to Part 6 AND at least one part explaining deferreds!

Thank you! Part 7 will introduce deferreds, while Part 6 will provide the motivation for using them.

Great introduction to Twisted! Thank you very much for writing these articles up.

As for exercise #1 in part 6, Twisted is returning a reason of type ConnectionDone in all cases, i.e. the connection was done cleanly. Am I doing something wrong?
According to the docs, I should expect a ConnectionLost reason.

http://pastebin.com/m564634f0

To terminate the servers, I tried sending the interrupt and quit signals, as well as kill -9

I am running Twisted 9.0.0 on Mac OS X 10.6 and system python 2.6.1

Hey Olivier, you’re on the right track, it’s just that making tcp connections fail is harder than I realized. Since the OS will close the connection when the process exits, it still gets closed cleanly on the client side. You’d probably need to use two servers and then physically disconnect the cable to get a failed connection, and even then you would need to wait for the tcp connection to timeout.

I’m going to take out this exercise, it’s not so good for a tutorial, I think. One thing, though, is when you are checking the class of ‘reason’: the ‘reason’ argument is a Failure, not the exception itself. Look at the ‘check’ method on Failure objects to see how you can test what type of exception it is wrapping.

Glad you like the introduction, and thanks for helping me debug it 🙂

Thanks for a great introduction! My head is bleeding slightly from trying to learn Python, Twisted and RabbitMQ all at the same time, but your tutorials have made a lot more sense than the Twisted docs.

Thanks! I’ve been playing with RabbitMQ myself, it’s a neat system.

A very good “hands-on” for the Protocol and ProtocolFactory classes, the diagrams were pretty helpful.

Being able to cancel reactor.callLater() is cool too.

Hey, these tutorials are super helpful, thanks so much for writing them. My pathetic contribution is to alert you of a typo: “At the center of every program built with Twisted, no matter how may” should be “…how many”.

Thank you, much appreciated!

OMG, I love your tutorial. Thanks to you, I lost my fear and browse the source code to learn more about Twisted. Thank you very much!

You are welcome! Glad it’s been helpful.

First of all, thank you for this great work-tutorial.
One small question, near the end of this part you write that ‘Since a protocol is only created after a connection is made, the factory gets the news when a connection cannot be established.’, but earlier you write that a protocol’s construction is made by the actions in figures 8 & 9, that have the reverse ‘chronological’ order (protocol is created firstly and after that it is connected with a transport object). So which order is the right?
Thank you,
George

Hey George, the connection (Transport) is created before the Protocol object. The diagrams don’t really show the Transport being created at all, that is done by the connectTCP call. If I have some time, I’ll update them to make that clearer.

You can probably tell how quickly I’m going through these :p. Ebullient thanks in order, yet again Dave.

I got stuck on that point that George brought up, but like Pingu said, I lost my fear of the source code and dug in. For anyone suffering the same fate: I discovered that the Transport gets created before the protocol, but the protocol gets a None placeholder for its ‘protocol.transport’ attribute until makeConnection is called. Overriding this method lets you access the transport before the connection is started. See ***Spoiler*** http://pastebin.com/fv3yzzQu

Nicely done! You might also check out the connectionMade method on protocol objects, which
gets called right after the protocol has been hooked up to the transport.

WOW! Learnt more here than I ever did from the 2 books on Twisted network programming. Question – I was working fine with the protocol.LineReceiver > datareceived call-back until my client message exceeded the TCP frame size. Now, I get only partial messages from the client, is there a standard protocol to manage this or do I have to write my own protocol? Thanks mate.
STAN
Australia

Hey Stan, glad you like the series. For the LineReceiver protocol you’ll want to
use the lineReceived method (I may not have spelled it quite right) instead of
the dataReceived method. The dataReceived method is the ‘raw’ stream of bytes
which can come in arbitrary size chunks. The lineReceived method is specific to
the LineReceiver and it is called for each complete line you get. Make sense?

Great tutorials! I´m learning so much! 🙂

I have one correction though. The method used to identify which server the data is coming from is getPeer(), while getHost() is for the local side of the connection.

Good catch, fixed!

I too am thrilled at finding such an awesome course on using twisted – complete with proper diagrams and starting from the very beginning.

One thing that I noticed when reading the twisted sources was that some of the classes are new-style classes and some are not. Newcomers to python may have learned to call the base class methods with super (classname, self).__the_superclass_method__(). Some of the classes in twisted (esp. the lower-level, longer-established classes) are calling the base class methods directly like baseclass.__the_superclass_method__() (if I’m understanding correctly).

Looking forward to reading the rest …

Hi Brenda, thanks for the kind words! Since Twisted has been around for a number
of years now, it probably contains a mix of Python styles as Python has evolved over
time. I imagine the core developers plan on eventually moving everything to the new
style, but they take backwards compatibility very seriously (thankfully) so the changes
tend to happen gradually.

Wish I’d read this comment earlier, would have saved me a bit of headache! But then I wouldn’t have learned about old- and new-style classes at all. 😉 Wanted to say that. While I wouldn’t say I’m a newcomer, I am (was) ignorant to super() and its nuances.

Oh, and while Googling, I found a couple responses to a 2012 Twisted mailing list which give the impression that, like Dave said, it’s intentional for backward compatibility. But the first of these two links suggests they don’t plan on changing going forward.

http://permalink.gmane.org/gmane.comp.python.twisted.web/4930
http://twistedmatrix.com/pipermail/twisted-web/2012-October/004942.html

Hi Dave!

I need to make a server with a custom protocol, which is extremely simple but also should be very reliable. All I need are prefix messages with size and following normal messages, two-way communication. So I’ve got a question. Is there any way to set a data size limit in the dataReceived method of the Protocol class? As far as I understood, this method just returns all the data that was passed to the socket. But what if someone throws huge files with malicious aims?

Hey Alex, the protocol you describe sounds very much like netstrings. Twisted already includes a built-in Protocol class called NetstringReceiver (or something similar) which allows you to set a length limit. If the limit is passed, the Protocol will close the connection. Even if you decide not to use the built-in protocol, it is a nice, short example of how to do the kind of limiting you want.

Hi Dave!
I’ve already looked into the implementation of netstring provided by twisted, but it is just a usual class inherited from Protocol. Therefore, it is just kind of wrapper over dataReceived method. Nevertheless I decided to do some tests on this method, so I wrote a test server (based on Protocol class) and sent 3 Gb file there. It appeared that twisted automatically limits size of data which can be received with dataReceived method at one call. The size of data pieces were around 30 kb (but every time size was different though). So I don’t need to worry about this problem any more and just control the overall size inside dataReceived method.
Thanks for immediate reply and willing for help, Dave 🙂

Hi, Dave, this is my homework: http://pastebin.com/mkxeDjrf

Cool, not quite finished, right? For example, you seem to be calling os._exit, is that for debugging?
Also, looks like you might have left the parentheses off the call to self.success…

Hi Dave,
I am looking to participate in this year’s GSoC under Twisted. Could you suggest some project ideas?

Glad you are planning on doing that, it is a great way to contribute.
I would ask on the Twisted mailing list, the core folks would have the
best ideas about what Twisted could use right now.

Thanks. By the way, I looked up the twisted fellowship page, they’ve planned to include your tutorial and updating the code to suit the recent release with your permission. Great work once again!
https://twistedmatrix.com/trac/wiki/Fellowship2013

Thank you, I will certainly agree to that.

Here is my solution for exercise one :
http://pastebin.com/Y7NYG7Ls

I have also some more questions focused on python rather on twisted itself.

Class attributes vs instance attributes:

Why in PoetryProtocol class there is class attribute “poem” ?
Do all protocols need to share that ?

Plus every Protocol which inherits from BaseProtocol has a class variable “transport”. I don’t get it.
If transport is a connection then why all protocols need to share one connection?
I am pretty sure I misunderstood something here.

Please if you could explain it.

So this is a little bit of Python trickery. When you access an attribute on a Python object and the object does not have that attribute, Python will look for the attribute on the class and, if it exists, return that instead. So far so good. But if you set an attribute on a Python object then Python will always set that attribute on the object itself, even if the attribute doesn’t currently exist on the object but it does exist on the class. Thus, Python programmers will often use class attributes as “default” values for object attributes, knowing that whenever they set a non-default value, it will be set on the object and thereafter override the value on the class.
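A tiny Python 3 example shows the lookup rule in action. The Example class is hypothetical and has nothing to do with Twisted; it only mirrors the pattern of a class-level poem attribute used as a default.

```python
class Example:
    poem = ""   # class attribute, acts as the default for every instance


a = Example()
b = Example()

# Reading a.poem finds the class default; the assignment then stores the
# new value on `a` itself, leaving the class (and `b`) untouched.
a.poem += "some bytes"
```

After this runs, a has its own poem attribute while b still falls back to the empty string on the class. Note that this works cleanly for immutable defaults such as strings. A mutable default like a list would be shared by all instances if it were modified in place rather than reassigned.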

Wow thanks for your quick answer!

I get it now.

Btw. it’s really rare to find such a good tutorial ( good, because of figures, code examples,not only text and chance to ask the author a question ).

Best regards

Thank you, glad you are finding it helpful!

By the way I have a question related to Twisted, but not exactly to your tutorial. However, since you are an experienced Twisted programmer I thought you might know. I am currently working on a network game (in browser) and I plan to use Twisted + Javascript, but I miss one thing – HTML templating. It is such a hassle to write HTML output in the form of a Python string and return that in a GET/POST handler. Do you maybe know any HTML templating solution for Twisted? I’ve done a little research and I found something called Nevow, which is already dead and not supported, and also a combination of Twisted + Flask/Django for templating, but I wonder if that will be truly asynchronous.
I know this is not directly related to the topic, but I couldn’t find any contact information to send you that as private message.

You can use pretty much any templating system you want to with
Twisted Web, you just call the templating engine once you’ve got
all the data you need back from your async requests.

Hi Dave,
thanks for the detailed and well to read tutorial first!
I have a question regarding Python. I’d like to know what the benefit is of using a class attribute like:

class PoetryProtocol(Protocol):
    poem = ''

Instead of placing it in the __init__ method as an object attribute.

class PoetryProtocol(Protocol):
    def __init__(self):
        self.poem = ''

Is it just that I can then access it in the class object too?

Thanks!

Defining the attributes in the class is probably just a matter of taste, I don’t
think it has a big advantage or disadvantage.

Brilliant set of tutorials right here. This part had me wondering: are protocols also used outside the application layer? Or do transports cover everything below that?

Thanks very much! I seem to recall some recent posts on the Twisted mailing list about people using Twisted for lower-level networking, but I don’t recall the details. I think it’s a safe bet the majority of Protocol implementations are at the application layer, with Transports abstracting away the rest.

Hi Dave, as others have already said, great tutorial, thank you for it. One question regarding this part.
I would like to know if I understand the sequence of execution correctly:
– everything is started by the line “reactor.connectTCP(host, port, factory)”
– Twisted is responsible for all execution (it knows when to create the Protocol, start the connection, read data, etc.). So it knows which parts of our implementation to use and when.

Is the above correct, or am I missing something? I’m new to Python and it is hard for me to understand what exactly “connectTCP” does.

Hi Jacek, I think you have a good sense of it. The call to reactor.connectTCP tells Twisted to make a TCP connection to the host and port and use the factory to make a protocol if the connection attempt succeeds. The call to reactor.run hands control to the Twisted event loop so all that can happen.

Hi Dave. I’m new to Twisted and trying to get the hang of it. I’m unclear as to how the protocol factory knows which protocol needs to be built. We do create an instance of ‘PoetryProtocol’ inside PoetryClientFactory but we don’t pass that to buildProtocol. So how does it know what protocol is being used?
Also, I didn’t understand what is meant by “Note that while the factory attribute on Protocols refers to an instance of a Protocol Factory, the protocol attribute on the Factory refers to the class of the Protocol.”

Hello! So a protocol factory generally builds one kind of protocol. So a PoetryClientFactory always builds a PoetryProtocol. It does so because the protocol attribute on PoetryClientFactory is the PoetryProtocol class: https://github.com/jdavisp3/twisted-intro/blob/master/twisted-client-2/get-poetry.py#L73

Now after the factory creates an instance of the protocol, that instance will have a factory attribute that refers to the instance of PoetryClientFactory which created it. Does that make sense?
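The class-versus-instance relationship described above can be sketched in plain Python. `MiniFactory` is a hypothetical, simplified stand-in written for illustration, not Twisted's actual base class, and the address tuples are made up:

```python
class MiniFactory:
    # Simplified stand-in for a Twisted factory (hypothetical name).
    protocol = None  # subclasses set this to a protocol *class*

    def buildProtocol(self, addr):
        p = self.protocol()  # call the class to create a new instance
        p.factory = self     # the instance points back at this factory instance
        return p


class PoetryProtocol:
    poem = ''


class PoetryClientFactory(MiniFactory):
    protocol = PoetryProtocol  # the class itself, not PoetryProtocol()


factory = PoetryClientFactory()
proto_a = factory.buildProtocol(('127.0.0.1', 10000))
proto_b = factory.buildProtocol(('127.0.0.1', 10001))
```

Each call produces a separate `PoetryProtocol` instance, and both instances share the one `factory` object through their `factory` attribute.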

yes. I got it. thanks!!!

I had another question. So some of these functions are defined by Twisted? We have to use the same name but we could have our own implementation. Is that correct? E.g., connectionLost()

That’s right, those methods are defined by Twisted to have a particular meaning (i.e., the connection was lost) and you can implement them in your subclasses to add appropriate code to handle those situations and Twisted will call them at the right time.

[…] I am going through the example in this blog, I found that when ClientFactory instance, which is an argument to PoetryClientFactory, or Protocol […]

Best Twisted tutorial I have seen so far.

Thank you!

Awesome tutorials!
Seven years on, these tutorials are still the best Twisted tutorials I’ve ever seen. Thanks for your excellent work 🙂

Thanks very much!

I’m a computer science professor and I have to say, this is very useful stuff, but yow.. quite complicated in places. Especially since I’m a Python novice and the way Twisted uses classes, objects, interfaces, factories, etc., isn’t at all straightforward. I wish there were better discussions of the design rationale somewhere. Why is the Twisted way better than (say) just passing addresses into a method that creates clients that do asynch I/O? And why do we have to provide some implementations of certain methods but not others? Unfortunately, it still seems like we have another ocean of methods and classes to memorize in order to use this stuff, and the simplest programs are still quite involved. But that seems to be the trend post-Java.

It can be a bit daunting, for sure, though once you learn the highest level APIs you can write some very concise code. See the example mail client.

[…] For the original of this part, see: dave @ http://krondo69349291.wpcomstaging.com/?p=1522 […]

Thanks for the tutorial Dave, it’s great! I have a question for exercise #1:

The solution I wrote is here:

http://pastebin.com/PpdUAGa3

So my question is: is this a valid solution? Checking if the callLater has been called feels wrong. I thought about using the reason argument but I think both ways (receiving the poem and using loseConnection) have the same “reason” value (object).

Thank you, glad you like it!

I would recommend using the public interface method active() to check whether the timeout has already been called.

Thanks for your tutorial! This is the best twisted tutorial on the internet. I really learned a lot!

You are quite welcome, glad you liked it!

Hi Dave! Thank you a lot for this tutorial! I’ve got 1 question and 1 comment.
Question: Why do we always import the reactor locally, inside a function or a class method? Can’t we just import it at the beginning of the code as we do with other modules?
Comment: All of the links to methods, like loseConnection etc., lead to a “not found” error page. Where can I find the correct links?

Hi, glad you like it! We import the reactor locally because importing it also installs a particular implementation of the reactor if one hasn’t already been installed. Importing locally allows later code that wants to install a different reactor to do so. Now, granted, this was a while ago. Things may have changed in the latest version.
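A small sketch of the pattern, assuming Twisted is installed when the function is eventually called. The body of `poetry_main` is a placeholder rather than the tutorial's actual function:

```python
import sys


def poetry_main():
    # Importing here, rather than at the top of the module, delays
    # installing the default reactor until this function actually runs.
    from twisted.internet import reactor

    reactor.callLater(0, reactor.stop)  # placeholder work: stop right away
    reactor.run()


# Merely defining the function has not imported (or installed) a reactor.
reactor_loaded_at_import = 'twisted.internet.reactor' in sys.modules
```

Until `poetry_main()` is called, other code is still free to install a different reactor implementation first.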

Hm, it appears the twisted source code browser has changed. Hrm. Ok, I’ll have to see about fixing up those links, thanks for the report!

I think I’ve got the links working again, can you confirm?

Thanks a lot! Works perfectly!

Hi Dave, thanks for this very measured and detailed tutorial. I also struggled with the concept of the attribute called ‘protocol’ being set in the PoetryClientFactory subclass of ClientFactory to tell the factory to build PoetryProtocols, but I think I understand it now.

In the loop in poetry_main() over the 3 addresses, I wanted to see how the reactor changed in response to the 3 calls to connectTCP() and I found it quite reassuring / reinforcing to see how the _newTimedCalls list grew by 2 entries for each of the 3 calls, a DelayedCall to Client.resolveAddress() and another to Client.failIfNotConnected().

I appreciate that this is a Twisted implementation-dependent detail and may be specific to the version of Twisted that I have, 17.9.0. But are there other things in the reactor which respond to each new call to connectTCP()?

Thanks for the kind words! It’s been a while since I looked at the reactor code so I can’t remember offhand exactly what is changing in reactor state for each new connection. I would recommend studying the source itself.