Twisted Introduction

This multi-part series introduces Asynchronous Programming and the Twisted networking framework.

  1. In Which We Begin at the Beginning
  2. Slow Poetry and the Apocalypse
  3. Our Eye-beams Begin to Twist
  4. Twisted Poetry
  5. Twistier Poetry
  6. And Then We Took It Higher
  7. An Interlude, Deferred
  8. Deferred Poetry
  9. A Second Interlude, Deferred
  10. Poetry Transformed
  11. Your Poetry is Served
  12. A Poetry Transformation Server
  13. Deferred All The Way Down
  14. When a Deferred Isn’t
  15. Tested Poetry
  16. Twisted Daemonologie
  17. Just Another Way to Spell “Callback”
  18. Deferreds En Masse
  19. I Thought I Wanted It But I Changed My Mind
  20. Wheels within Wheels: Twisted and Erlang
  21. Lazy is as Lazy Doesn’t: Twisted and Haskell
  22. The End

This introduction has also been translated into other languages.

129 replies on “Twisted Introduction”

Hello Dave,
I find your intro to Twisted to be the best I have found to date: it is clear and covers the important details that support a real understanding of Twisted. I would like to ask whether you have any plans for other Twisted tutorials in the future. As an example: since the intro above couldn’t cover some of the deeper areas of Twisted, I was thinking of a multi-part tutorial (written over time) in which each part covers a different abstraction. I would love to support you and the work you do in making Twisted a joy to learn for me, a first-timer. Please let me know if there is any money I can send that would help.

Ed.

Hey Ed, thanks for the kind words! I appreciate your offer, but I’m happy to make this tutorial a labor of love. I plan to have a few more parts in this series and then take a break for a while. It’s hard work 🙂 I’m not sure whether I’m going to write more Twisted tutorials after that, but if you make it through the end of mine, then I think you’ll be well prepared to dive into the main documentation of Twisted itself. Also, Jp Calderone has a lengthy series that goes into the details of the Twisted Web framework over here.

Thanks Dave. To think of such work as a labor of love … wow.
I did look over the link you gave, and I like it. Take care, Dave. I look forward to the parts you plan to add here, as I’m learning Twisted to build a larger tool set in Python and your tutorial has been a great help in that direction.

Ed.

Hello, I understand this is not an interactive tutorial, but may I ask a question, please? There is something in part 5 that I wish to understand: the word “buildFactory” has had me scratching my head, because I can’t find it mentioned anywhere I have looked, including the protocol interfaces and parts 4 and 6 of the tutorial. I have not found a reference to it in my web searches either. Where do I find it, or do I just lack some background in Twisted that I need?

Ed.
Thank you.

Glad it was helpful for you! And thanks for the pdf, can you tell me how to go about doing that?
Send me the link to your translation and I’ll put it on the main index page.

Hi timgluz,
can you tell how to download your pdf…it is asking for username/passwd when I click on the above link.

thanx

Dear Dave
I’m sure your writeup is a great source of knowledge.

But, owing to the limits of my own brain, I fail to understand how to adapt it to my problem.

I need to write an application that:
1. Has two services in it:
1.a. a TCP server (just like a basic echo server)
1.b. an XMPP client/bot
2. Writes every message that comes from the XMPP server to stdout (this will be extended in the future)
3. Forwards every message received by the TCP server part to the XMPP server via the XMPP client.

I posted my code at pastebin:
1. echobot.py : http://pastebin.com/Mn4bRvCb
2. echobot.tac : http://pastebin.com/BgjMKiMC
3. the error that occurred : http://pastebin.com/6Rdge4L5

Kindly give me any enlightenment.

Sincerely
-bino-

Hey bino, I might not have time to look into your example too closely. But the traceback just looks like you are using an undefined name.
I’m not too familiar with wokkel, but perhaps you mean self.send()?

I think your tutorials are really awesome, and I love how you’re consistently putting so much work into this while remaining responsive and approachable.

This was *long* overdue, but the official documentation page now links back to you:

http://twistedmatrix.com/trac/wiki/Documentation

I couldn’t figure out whether marketing your tutorials as good-for-newbies was doing you a disservice; I’m mainly using them as a pointer for newbies who are willing to learn. If you disagree with the wording, please feel free to e-mail me about it.

Thank you, I appreciate the kind words! I am specifically aiming this tutorial for people who are new to Twisted and, more importantly, new to asynchronous programming. So your description is apt.

This guide looks good (I haven’t actually read much of it yet :-)).

Given the length, it would be better to have a PDF for an e-reader, or for printing it out, but the PDF mentioned earlier in the comments only goes up to part 10.

Any plans to produce an updated PDF? Are the parts that are in the PDF up-to-date, or have there been edits, so that one would be better off just reading the whole guide online?

There have been a few edits of the earlier parts, but no dramatic changes. I’m going to look into packaging options once I finish the parts I initially planned to write, which will take me to Part 23.

BTW, I added ‘joliprint’ links for all my posts. That’s a website that claims they can produce a PDF of any webpage. It might be useful for you, I’m not sure.

Dear Dave,

I have converted the twisted tutorial into .mobi format so that I can read it on my kindle. Do you mind if I publish the ebook version of this tutorial?

Regards,
Sangeet Kumar

Hello,
Am enjoying this tutorial very much and interested in where I can obtain the Kindle-compatible version to try out on the Kindle I just got my wife for HER birthday 🙂

TCB

Sounds like you’ll need to get a second one 🙂 I just installed a Kindle plugin, and now there
should be a Kindle It! section on the right hand toolbar where you can send a post to your Kindle.
Let me know how that works.

I’ve been skimming through this, but one thing I haven’t seen is an example of a two-way protocol, e.g. a poetry server that takes a poem name as input and returns the requested poem. Do you cover this at all?

Thanks for that 🙂 Am I also right in thinking that the factories are “single use only”, e.g. once you use a PoetryClientFactory to get one poem, you can’t use it again?

Factories are actually multi-use. For example, the client factory in Part 5 is used for every poem you download, and actually
keeps track of the number of poems we’ve gotten so far so it knows when to quit the program. That’s solved a little differently
in later versions, but it illustrates that factories can be used over and over again, and often are. Protocols, on the other hand,
are created and used for a single connection.

But in, say, twisted-client-8/get-poetry.py, the “PoetryClientFactory” keeps a single Deferred which is called back upon completion. This can’t be done more than once… have I misunderstood, or is this a different usage?

You’re absolutely right. For that case, the factory is a single-use object. So I guess I should have said that factories can be multi-use, but don’t necessarily have to be. Server factories are almost always multi-use, since most servers accept many connections. Client factories, on the other hand, are probably more of a mixed bag in that respect.

great articles! far better than ‘twisted essentials’ in my opinion. you should write a book with the name ‘asyn-model and twisted’, lol.

No, i was trying to insert some code highlighted with HTML. As the post area does not have help, i didn’t know if it will work. It didn’t.
And i didn’t want to add another comment stating this 🙂

There isn’t a PDF of the entire thing, but each post has a ‘joliprint’ link at the end
which will create a PDF of that article. If you have a Kindle, I added a widget on the
right to send a post to your free Kindle account as well.

I raised this issue using the Feedback option on joliprint.com. In the meantime I could create PDFs by cutting out unneeded elements from the DOM using Firebug and then printing to PDF using PDF Creator. I could send you one article so you could see if it looks OK.

Wow, thanks! Hopefully the joliprint people can fix that and it will just work. But in the meantime if you want to send me that article that would be cool, though I’d hate to have you do a lot of busywork if it’s a pain.

Hello Dave,

I am writing my thesis right now about performance on the Internet and wonder which kind of license your tutorial is using. In other words, am I allowed to use three images from your work? Mainly those about the sync and async models.
Thanks for your work!

Hey Lukas, I guess I should give the particular license some thought. I shall do so!

But the short answer is that you are perfectly welcome to use the figures as long
as you give me credit for them.

thanks,
dave

Hi, how would one handle a situation where the server closes the port? I don’t get any callbacks to the connectionlost function in a factory.

Hi sma, on a factory the callbacks you are looking for are ‘clientConnectionLost’ and ‘clientConnectionFailed’.
The first one is called when a connection is lost after it was connected. The second is called if the attempt to
connect never succeeded in the first place.

The ‘connectionLost’ callback is actually a callback on the Protocol, which is another place you can handle
lost connections (but not failed connections, those can only be handled by the factory).

Make sense?

Hi Dave.
First, thank you for the great job you have done. I believe, every reader of this excellent tutorial deeply appreciates the effort you put into it. *applause*

If you don’t mind, I have a couple of questions regarding Twisted in particular and programming in general. Hope you could answer them 🙂

Suppose you have a protocol that is almost completely symmetric, i.e. it doesn’t matter who the server is and who the client is. However, a few cases when this actually matters should be handled. For this purpose one could make two distinct but very similar implementations, subclassing both ClientFactory and ServerFactory, then doing connectTCP on a ClientFactory subclass instance and listenTCP on the ServerFactory subclass instance. But this sounds quite ridiculous, doesn’t it? So, alternatively, one could subclass the generic Factory class and use it for both serving and being a client. However, here arises a problem I’ve bumped into recently: how could one tell whether he is accepting a connection or connecting himself from within the connectionMade callback on Factory subclass?

And there’s something more generic. Suppose you have a large project that has to handle both network interaction and (ugly, huh?) a GUI. How would you structure such a project? I’ve come up with a solution where you have a base object which is in charge of everything and stores references to other objects which are responsible for specific parts of the program. For instance, you could have a base class called Application, and it could store references to classes like NetworkController, GUI, DiskController and so on. (Here I use the words “class” and “object” interchangeably; hope you’re not going to find me and make me write “Class is not an object” 9001 times for that.) These “child” objects, in turn, store references to the Application itself, so whenever a class from the network-responsible part has to reach the GUI to tell it that the bandwidth is exhausted and immediate action has to be taken, it simply calls something like self.app.gui.TellTheUserWeAreScrewed(). However, I believe there exists a better approach, but I just can’t figure out what it is 🙁 Hope you can help me.

Glad you liked the series! I will address the first question now. The second is much larger 🙂

I think your alternative solution (subclassing from Factory) will work, but you would need
to use two instances, one for listening and one for connecting. The instances will need to
know which end of the connection they are creating protocols for. You would provide that
information when you create them. Make sense?

Hey there, so I’ve been thinking about your second question. It’s a big question 🙂 It basically turns into the question “how should you write software”, since most really useful programs end up getting fairly complicated.

I don’t have a real answer to your question, even though I’ve written a lot of software, some of it ending up kind of big. To me there is a legitimate viewpoint that interprets most of the major innovations in software (functions, modules, classes, types, etc.) as different ways to answer this question, basically different strategies for dealing with complexity.

But here are some general rules of thumb that I think most programmers would agree with:

+ Build your software out of small components and make the interfaces to those components as small as possible.

+ Try to keep the dependencies of any one component small in number.

Following this strategy will make it easier to test your components and to change them over time.

Although it is more controversial, I also think there is some value in the idea of Dependency Injection, a pattern where components declare their dependencies rather than explicitly create them. Then, during runtime, the dependencies are ‘injected’ by the context, usually some sort of configuration system. This also makes it very easy to, say, substitute a mock component during testing.

The situation you are describing, where you have one big object that all the others depend on, is a common one that people end up with, I think. And whether it is good enough for your projects depends on your situation. For smallish programs, I think you can get away with that, as long as you keep that top-level object very simple (basically it’s just a container of other things without methods of its own and acts kind of like a dependency injection configuration).

Anyway, good luck on your project, if you haven’t already finished 🙂
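To make the dependency injection idea concrete, here is a minimal sketch in plain Python. All names (`ConsoleGui`, `NetworkController`, `RecordingNotifier`, `build_app`) are hypothetical, chosen to echo the question. The network component receives the object it should notify instead of reaching through a global application object, and one small function does the wiring.

```python
class ConsoleGui:
    """Stand-in for a real GUI component."""

    def show_warning(self, message):
        print("WARNING:", message)


class NetworkController:
    def __init__(self, notifier):
        # The dependency is injected; this class never creates or looks up a GUI.
        self.notifier = notifier

    def bandwidth_exhausted(self):
        self.notifier.show_warning("bandwidth exhausted")


class RecordingNotifier:
    """Mock used in tests: records warnings instead of displaying them."""

    def __init__(self):
        self.messages = []

    def show_warning(self, message):
        self.messages.append(message)


def build_app():
    # The only place that knows how the components fit together.
    gui = ConsoleGui()
    return NetworkController(notifier=gui)
```

In a test, `NetworkController(RecordingNotifier())` exercises the network logic without any GUI present, which is the substitution benefit described above.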

Hi Dave, great tutorial.
I now have a Twisted daemon running fine and dandy, which listens for and receives information over the internet from a number of GSM devices.

The problem is that a device’s IP/port changes every so often, and when this happens the old connection’s socket is left open, still waiting for more data (not in a TIME_WAIT or finished state), while another socket is created. Eventually the machine runs out of sockets and everything starts to fail.

So far I have had to restart the daemon every now and then to work around this issue, but that’s just the lamest solution. It’s not even a solution, because even if I get a cron job to do it, it will become impractical when I scale up the number of devices.

I tried to set a timeout, but since there is no specific response that the server is waiting for, the timeout is never reached.

So, is there a way to close sockets automatically when they have been idle for, say, 180 seconds, or a way to tell the sockets to close after each transmission? Or is there any other solution I can implement, or something I should change in my implementation? Anything you can think of that will still work when I have at least 1000 devices transmitting every 90 seconds.

I am running on a CentOS 5 VPS with Python 2.4.

Hi Ricardo, when you say you tried to set a timeout, what do you mean?
I’m not sure which timeout setting you were using.

But in any case, you should be able to do this in a pretty straightforward way.
In your Protocol implementation you will want to set a timeout each time a
connection is made (you can do that in the Protocol __init__) and then refresh
the timeout when any data is received (you can do that in the dataReceived method).

The timeout itself can be a DelayedCall that you create with reactor.callLater.
That object has an API that lets you reschedule the delay so you can refresh
the timeout. If the timeout is reached, you will want to call .transport.loseConnection()
to break the connection.

Make sense?

It does make sense. Let me see if I have this straight:
Every time I open a new socket I start a countdown timer (say 120 seconds), and every time I receive data I reset the timer (back to 120 seconds), so that if 120 seconds pass with no activity whatsoever I close the socket using the loseConnection method.
Sounds pretty nifty.
What I don’t get is how to do that.
What should I do in the protocol init and in dataReceived? Do you have an example?
Thank you very much for your help.

OK, now if I have 128 sockets on my machine, this solution will only allow 128 clients to be connected in a two-minute window, since the devices send data every 90 seconds (roughly). And when I do lose the connection, the device has to log in again before sending data. So it still makes me a bit uncomfortable. Isn’t there a way for many clients to share a socket, or to take care of, say, 1000 devices even if I only have 128 sockets? Or a way to drop a socket without dropping the connection, or without letting the device know its connection has been dropped, so that it doesn’t feel the need to log in again all the time?

With TCP connections each connection will use one socket,
there is no way for two different clients to share the same
TCP socket.

Is 128 just a hypothetical number or are you really limited
to 128 sockets? That seems pretty small.

BTW, the part of reactor.callLater that I don’t get is how to update the call later. I thought I could only schedule something to be called later and that was it, not that I could keep adjusting the timer. It’s a wicked idea that opens up many watchdog-like possibilities.
I will check the documentation on the callLater API.
Forgive my typos, I need to get a new keyboard soon.

Hey Ricardo, check out the documentation
for callLater — the return value is an IDelayedCall object with an API you can use to adjust the timeout (or cancel the call).

Hey Dave, I am using the twisted.internet API.
I did check on callLater, and the only remaining issue is the one with the sockets. Since I am on a virtual private server (a shared server), that’s all I get. I tried to change the number of sockets by modifying /proc/sys/net/core/somaxconn, but even as root I am unable to change it. Likely it is only changeable by the administrators.

Now, on the other hand, the devices I am listening to are also able to speak UDP.
Do you, in your expert opinion, think it would be a good idea to switch to UDP?
Would that allow me to have unlimited clients?
What is the tradeoff?

Is there a way for me to drop the socket without letting the device know, and then make a new one 90 seconds later when it transmits again, only to drop it right after the packet has been received?

I get 100-byte packets every 90 seconds from remote devices, informing me of the current state of a number of variables via GPRS/internet.

Hey Ricardo, if you close a TCP socket, the TCP protocol
will take care of informing the other side that the connection
has been dropped. At that point it is up to the client to do
the right thing and reconnect if needed — you’ll have to test
to see if these devices will do that.

With a bash shell, the command: ‘ulimit -n’ tells you the number
of open file descriptors (sockets) you can have. You can try to
increase it with ‘ulimit -n XXX’ where ‘XXX’ is the number you
want.
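The same limit can also be inspected from inside the Python process with the standard-library `resource` module (Unix only). This is a sketch; the helper names are hypothetical, and an unprivileged process can only raise its soft limit up to the hard limit.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def set_soft_fd_limit(wanted):
    """Set the soft limit to `wanted`, clamped to the hard limit."""
    soft, hard = fd_limits()
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return fd_limits()[0]
```

Calling `fd_limits()` at daemon startup and logging the result makes it easy to confirm which limit the server is actually running under.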

It sounds like UDP is an option here. It would allow you to have
essentially unlimited clients since UDP is a connectionless protocol.
As I am sure you know, UDP does not guarantee delivery. Is it ok if
sometimes packets are not delivered? If so, UDP could be a nice fit
for you. Otherwise you would have to implement your own retry
mechanism under UDP, essentially replicating parts of TCP anyway.

They do indeed, but it’s a pain that every time they first reconnect, then log in again (which is not needed), and only then send a packet. Then I call loseConnection to free the socket, but I get the same behavior all over again, when I was expecting to just get the data packet.
That’s because the connection is being closed cleanly. If there were a way I could drop it (not cleanly close it) such that the device wouldn’t know, then when it transmits again the reactor listening on the port would accept the incoming packet, open a socket, process it, and close it again, so that the socket is used only for the time it takes the packet to be delivered.
I did this and got repeated login attempts from the device, which the server answered successfully.
That is, the device would successfully log in, then loseConnection would close the connection, and then the device would attempt to log in again instead of just transmitting the data.

The devices did what I wanted back when I had my first issue: the GPRS network would rotate the device’s IP/port, and the device would open another socket when it sent the next data packet (without trying to log in again), leaving the other socket in an ESTABLISHED state and eventually making the server run out of sockets even when I had only one or two devices.

Hi, ricardo. If you are still reading this, I’m doing the same thing with GPS/GPRS trackers, which have mostly the same behaviour. I found that the ‘timeout’ solution is quite appropriate in our situation (I use TCP connections). The only difference is that I don’t know the exact refresh interval for each device, so I use an algorithm to predict the timeout value.

BTW, if you have a list of open connections where each connection is mapped to its GPS device (I do), then for every new login request from a device you can consider all of that device’s old connections in the list to be lost, and manually ‘lose’ them.
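A minimal sketch of such a connection list, in plain Python. The `ConnectionRegistry` name and its methods are hypothetical; it assumes each protocol instance has a `transport` with `loseConnection()`, as Twisted protocols do, and that the device ID is known once the login packet has been parsed.

```python
class ConnectionRegistry:
    """Maps each device ID to the protocol instance currently serving it."""

    def __init__(self):
        self._by_device = {}

    def login(self, device_id, protocol):
        # A new login means any older connection for this device is stale.
        old = self._by_device.get(device_id)
        if old is not None and old is not protocol:
            old.transport.loseConnection()
        self._by_device[device_id] = protocol

    def logout(self, device_id, protocol):
        # Only forget the entry if it still points at this connection.
        if self._by_device.get(device_id) is protocol:
            del self._by_device[device_id]

    def __len__(self):
        return len(self._by_device)
```

The registry would typically live on the factory, with the protocol calling `login` when it parses a login packet and `logout` from its connectionLost method.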

Another thing for ricardo: if you’re limited in the number of simultaneously open TCP connections, you could try closing the connection after receiving the first data packet (the one that follows the login packet). In that case, each subsequent portion of data will consist of two packets: login and data.

Hi Dave,

I have just started your introduction and I am enjoying it very much.

I was also looking for something to do to give back for what the Python community has been giving me for the past two years. I have noticed that no one is working on a Spanish translation of the tutorial. How about I do it?

Please feel free to contact me. I’ll be waiting for your permission!

Marcial

Permission granted! Thanks very much. I’m especially excited about
this because I’ve been slowly trying to learn Spanish and having
my own words translated will be like having my own Rosetta Stone 🙂

Dude, thank you for this tutorial!

I spent like a week on the twisted documentation without getting my head around it.
This really brought light into the cave.

Hey Dave!

I just started to read the blog, but the link to Part 2 of the article is broken; it’s redirecting to the Index page. Please make the change, as Part 2 has many references elsewhere. I am anxious to read the full article. 🙂

Utkarsh

Whoops, should be fixed. I’ll fix the others. I switched to more meaningful link names, but I need to update the links on the posts.
