Twisted Poetry

Part 4: Twisted Poetry

This continues the introduction started here. You can find an index to the entire series here.

Our First Twisted Client

Although Twisted is probably more often used to write servers, clients are simpler than servers and we’re starting out as simply as possible. Let’s try out our first poetry client written with Twisted. The source code is in twisted-client-1/get-poetry.py. Start up some poetry servers as before:

python blocking-server/slowpoetry.py --port 10000 poetry/ecstasy.txt --num-bytes 30
python blocking-server/slowpoetry.py --port 10001 poetry/fascination.txt
python blocking-server/slowpoetry.py --port 10002 poetry/science.txt

And then run the client like this:

python twisted-client-1/get-poetry.py 10000 10001 10002

And you should get some output like this:

Task 1: got 60 bytes of poetry from 127.0.0.1:10000
Task 2: got 10 bytes of poetry from 127.0.0.1:10001
Task 3: got 10 bytes of poetry from 127.0.0.1:10002
Task 1: got 30 bytes of poetry from 127.0.0.1:10000
Task 3: got 10 bytes of poetry from 127.0.0.1:10002
Task 2: got 10 bytes of poetry from 127.0.0.1:10001
...
Task 1: 3003 bytes of poetry
Task 2: 623 bytes of poetry
Task 3: 653 bytes of poetry
Got 3 poems in 0:00:10.134220

This looks just like the output of our non-Twisted asynchronous client, which isn’t surprising, as the two are doing essentially the same thing. Let’s take a look at the source code to see how it works. Open up the client in your editor so you can examine the code we are discussing.

Note: As I mentioned in Part 1, we will begin our use of Twisted by using some very low-level APIs. By doing this we bypass some of the layers of Twisted’s abstractions so we can learn Twisted from the “inside out”. But this means a lot of the APIs we will learn in the beginning are not often used when writing real code. Just keep in mind that these early programs are learning exercises, not examples of how to write production software.

The Twisted client starts up by creating a set of PoetrySocket objects. A PoetrySocket initializes itself by creating a real network socket, connecting to a server, and switching to non-blocking mode:

self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.sock.connect(address)
self.sock.setblocking(0)

Eventually we’ll get to a level of abstraction where we aren’t working with sockets at all, but for now we still need to. After creating the network connection, a PoetrySocket passes itself to the reactor via the addReader method:

# tell the Twisted reactor to monitor this socket for reading
from twisted.internet import reactor
reactor.addReader(self)

This method gives Twisted a file descriptor you want to monitor for incoming data. Why are we passing Twisted an object instead of a file descriptor and a callback? And how will Twisted know what to do with our object since Twisted certainly doesn’t contain any poetry-specific code? Trust me, I’ve looked. Open up the twisted.internet.interfaces module and follow along with me.

Twisted Interfaces

There are a number of sub-modules in Twisted called interfaces. Each one defines a set of Interface classes. As of version 8.0, Twisted uses zope.interface as the basis for those classes, but the details of that package aren’t so important for us. We’re just concerned with the Interface sub-classes in Twisted itself, like the ones you are looking at now.

One of the principal purposes of Interfaces is documentation. As a Python programmer you are doubtless familiar with Duck Typing, the notion that the type of an object is principally defined not by its position in a class hierarchy but by the public interface it presents to the world. Thus two objects which present the same public interface (i.e., walk like a duck, quack like a …) are, as far as duck typing is concerned, the same sort of thing (a duck!). Well, an Interface is a somewhat formalized way of specifying just what it means to walk like a duck.

A quick note on terminology: with zope.interface we say that a class implements an interface and instances of that class provide the interface (assuming it is the instances upon which we invoke the methods defined by the interface). We will try to stick to that terminology in our discussion.

Skip down the twisted.internet.interfaces source code until you come to the definition of the addReader method. It is declared in the IReactorFDSet Interface and should look something like this:

def addReader(reader):
    """
    I add reader to the set of file descriptors to get read events for.

    @param reader: An L{IReadDescriptor} provider that will be checked for
                   read events until it is removed from the reactor with
                   L{removeReader}.

    @return: C{None}.
    """

IReactorFDSet is one of the Interfaces that Twisted reactors provide. Thus, any Twisted reactor has a method called addReader that works as described by the docstring above. The method declaration does not have a self argument because it is solely concerned with defining a public interface, and the self argument is part of the implementation (i.e., the caller does not have to pass self explicitly). Interface objects are never instantiated or used as base classes for real implementations.

Note 1: Technically, IReactorFDSet would only be provided by reactors that support waiting on file descriptors. As far as I know, that currently includes all available reactors.

Note 2: It is possible to use Interfaces for more than documentation. The zope.interface module allows you to explicitly declare that a class implements one or more interfaces, and comes with mechanisms to examine these declarations at run-time. Also supported is the concept of adaptation, the ability to dynamically provide a given interface for an object that might not support that interface directly. But we’re not going to delve into these more advanced use cases.

Note 3: You might notice a similarity between Interfaces and Abstract Base Classes, a recent addition to the Python language. We will not be exploring their similarities and differences here, but you might be interested in reading an essay by Glyph, the Twisted project founder, that touches on that subject.

According to the docstring above, the reader argument of addReader should provide the IReadDescriptor interface. And that means our PoetrySocket objects have to do just that.

Scrolling through the module to find this new interface, we see:

class IReadDescriptor(IFileDescriptor):

    def doRead():
        """
        Some data is available for reading on your descriptor.
        """

And you will find an implementation of doRead on our PoetrySocket class. It reads data from the socket asynchronously, whenever it is called by the Twisted reactor. So doRead is really a callback, but instead of passing it directly to Twisted, we pass in an object with a doRead method. This is a common idiom in the Twisted framework — instead of passing a function you pass an object that must provide a given Interface. This allows us to pass a set of related callbacks (the methods defined by the Interface) with a single argument. It also lets the callbacks communicate with each other through shared state stored on the object.
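To make the object-as-callback idiom concrete, here is a minimal sketch that uses only the standard library. TinyReactor and CollectingReader are hypothetical names invented for this illustration, and this is not how Twisted's reactors are actually implemented. It only has the same shape: the loop is handed an object, asks it for its file descriptor, and calls its doRead method when data is waiting. The convention that doRead returns a reason once the connection is finished is a simplification as well.

```python
import select
import socket


class TinyReactor:
    """Hypothetical stand-in for a reactor; NOT Twisted's implementation."""

    def __init__(self):
        self.readers = set()

    def addReader(self, reader):
        self.readers.add(reader)

    def removeReader(self, reader):
        self.readers.discard(reader)

    def run_once(self, timeout=1.0):
        # select() accepts any object that has a fileno() method.
        ready, _, _ = select.select(list(self.readers), [], [], timeout)
        for reader in ready:
            why = reader.doRead()
            if why:
                # The reader reported that its connection is finished.
                self.removeReader(reader)
                reader.connectionLost(why)


class CollectingReader:
    """Hypothetical reader: related callbacks sharing state on one object."""

    def __init__(self, sock):
        self.sock = sock
        self.data = b''
        self.closed = False

    def fileno(self):
        return self.sock.fileno()

    def doRead(self):
        chunk = self.sock.recv(1024)
        if not chunk:
            return 'connection closed'   # empty read: the peer hung up
        self.data += chunk               # state shared with other callbacks

    def connectionLost(self, reason):
        self.closed = True
        self.sock.close()


reactor = TinyReactor()
ours, theirs = socket.socketpair()
reader = CollectingReader(ours)
reactor.addReader(reader)

theirs.sendall(b'some poetry')
reactor.run_once()      # doRead is called because data is waiting
theirs.close()
reactor.run_once()      # doRead sees the close; connectionLost follows
```

Notice that the loop never needs to know anything about poetry. Everything it needs is reachable through the one object it was given.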

So what other callbacks are provided on PoetrySocket objects? Notice that IReadDescriptor is a sub-class of IFileDescriptor. That means any object that provides IReadDescriptor must also provide IFileDescriptor. And if you do some more scrolling, you will find:

class IFileDescriptor(ILoggingContext):
    """
    A file descriptor.
    """

    def fileno():
        ...

    def connectionLost(reason):
        ...

I left out the docstrings above, but the purpose of these callbacks is fairly clear from the names: fileno should return the file descriptor we want to monitor, and connectionLost is called when the connection is closed. And you can see our PoetrySocket objects provide those methods as well.

Finally, IFileDescriptor inherits from ILoggingContext. I won’t bother to show it here, but that’s why we need to include the logPrefix callback. You can find the details in the interfaces module.

Note: You might notice that doRead is returning special values to indicate when the socket is closed. How did I know to do that? Basically, it didn’t work without it and I peeked at Twisted’s implementation of the same interface to see what to do. You may wish to sit down for this: sometimes software documentation is wrong or incomplete. Perhaps when you have recovered from the shock, I’ll have finished Part 5.

More on Callbacks

Our new Twisted client is really quite similar to our original asynchronous client. Both clients connect their own sockets, and read data from those sockets (asynchronously). The main difference is the Twisted client doesn’t need its own select loop — it uses the Twisted reactor instead.

The doRead callback is the most important one. Twisted calls it to tell us there is some data ready to read from our socket. We can visualize the process in Figure 7:

Figure 7: the doRead callback

Each time the callback is invoked it’s up to us to read all the data we can and then stop without blocking. And as we said in Part 3, Twisted can’t stop our code from misbehaving (from blocking needlessly). We can do just that and see what happens. In the same directory as our Twisted client is a broken client called twisted-client-1/get-poetry-broken.py. This client is identical to the one you’ve been looking at, with two exceptions:

  1. The broken client doesn’t bother to make the socket non-blocking.
  2. The doRead callback just keeps reading bytes (and possibly blocking) until the socket is closed.

Now try running the broken client like this:

python twisted-client-1/get-poetry-broken.py 10000 10001 10002

You’ll get some output that looks something like this:

Task 1: got 3003 bytes of poetry from 127.0.0.1:10000
Task 3: got 653 bytes of poetry from 127.0.0.1:10002
Task 2: got 623 bytes of poetry from 127.0.0.1:10001
Task 1: 3003 bytes of poetry
Task 2: 623 bytes of poetry
Task 3: 653 bytes of poetry
Got 3 poems in 0:00:10.132753

Aside from a slightly different task order this looks like our original blocking client. But that’s because the broken client is a blocking client. By using a blocking recv call in our callback, we’ve turned our nominally asynchronous Twisted program into a synchronous one. So we’ve got the complexity of a select loop without any of the benefits of asynchronicity.

The sort of multi-tasking capability that an event loop like Twisted provides is cooperative. Twisted will tell us when it’s OK to read or write to a file descriptor, but we have to play nice by only transferring as much data as we can without blocking. And we must avoid making other kinds of blocking calls, like os.system. Furthermore, if we have a long-running computational (CPU-bound) task, it’s up to us to split it up into smaller chunks so that I/O tasks can still make progress if possible.
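As a rough illustration of splitting a CPU-bound job into chunks, here is a small sketch. The names count_squares and run_round_robin are made up for this example, and the round-robin loop is only a stand-in for an event loop. In a real Twisted program each slice of work would be handed back to the reactor rather than to a hand-written loop like this one.

```python
from collections import deque


def count_squares(name, n, chunk_size, log, results):
    """Sum i*i for i < n, doing one chunk of work per turn."""
    total = 0
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        total += sum(i * i for i in range(start, stop))
        log.append(name)    # record that this task got a turn
        yield               # hand control back so other tasks can run
    results[name] = total


def run_round_robin(tasks):
    """Hypothetical stand-in for an event loop: one chunk per task per pass."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue        # this task is finished; drop it
        queue.append(task)


log = []
results = {}
run_round_robin([
    count_squares('big', 10, 3, log, results),
    count_squares('small', 4, 2, log, results),
])
```

The small task finishes long before the big one does, because the big task gives up control after every chunk instead of running to completion first.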

Note that there is a sense in which our broken client still works: it does manage to download all the poetry we asked it to. It’s just that it can’t take advantage of the efficiencies of asynchronous I/O. Now you might notice the broken client still runs a lot faster than the original blocking client. That’s because the broken client connects to all the servers at the start of the program. Since the servers start sending data immediately, and since the OS will buffer some of the incoming data for us even if we don’t read it (up to a limit), our broken client is effectively receiving data from the other servers even though it is only reading from one at a time.
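The buffering effect is easy to see in isolation. The sketch below uses a connected pair of local sockets from the standard library rather than the poetry servers, so the names and the 600-byte payload are only illustrative. One side sends and closes before the other side has read anything, and the data is still there when the reader finally gets around to it.

```python
import socket

server_side, client_side = socket.socketpair()

# The "server" sends a short poem right away; nobody is reading yet.
poem = b'x' * 600
server_side.sendall(poem)
server_side.close()

# ... imagine the client is busy here, blocked on some other socket ...

# When the client finally reads, the bytes are already waiting in the OS buffer.
received = b''
while True:
    chunk = client_side.recv(1024)
    if not chunk:       # empty read: the other end closed the connection
        break
    received += chunk
client_side.close()
```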

But this “trick” only works for small amounts of data, like our short poems. If we were downloading, say, the three 20 million-word epic sagas that chronicle one hacker’s attempt to win his true love by writing the world’s greatest Lisp interpreter, the operating system buffers would quickly fill up and our broken client would be scarcely more efficient than our original blocking one.

Wrapping Up

I don’t have much more to say about our first Twisted poetry client. You might note the connectionLost callback shuts down the reactor after there are no more PoetrySockets waiting for poems. That’s not such a great technique since it assumes we aren’t doing anything else in the program other than download poetry, but it does illustrate a couple more low-level reactor APIs, removeReader and getReaders.

There are Writer equivalents to the Reader APIs we used in this client, and they work in analogous ways for file descriptors we want to send data to. Consult the interfaces file for more details. Reading and writing have separate APIs because the select call distinguishes between those two kinds of events (a file descriptor becoming available for reading or writing, respectively). It is, of course, possible to wait for both events on the same file descriptor.
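Here is a small sketch of that distinction using select directly on a local socket pair (standard library only, not Twisted code). The same socket is passed in both the read list and the write list, and select reports each kind of readiness separately.

```python
import select
import socket

ours, theirs = socket.socketpair()

# Nothing has been sent to us yet: the socket is writable but not readable.
readable, writable, _ = select.select([ours], [ours], [], 0)
before = (ours in readable, ours in writable)

theirs.sendall(b'hello')

# Now the same descriptor is reported for both kinds of event.
readable, writable, _ = select.select([ours], [ours], [], 1.0)
after = (ours in readable, ours in writable)

ours.close()
theirs.close()
```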

In Part 5, we will write a second version of our Twisted poetry client using some higher-level abstractions, and learn some more Twisted Interfaces and APIs along the way.

Suggested Exercises

  1. Fix the client so that a failure to connect to a server does not crash the program.
  2. Use callLater to make the client timeout if a poem hasn’t finished after a given interval. Read about the return value of callLater so you can cancel the timeout if the poem finishes on time.

95 replies on “Twisted Poetry”

Hello!
Thanks for great tutorial!

I just don’t get how to solve your second suggested exercise. Would you please explain how to do it? Why should I call my function with callLater? I’d like to call it now and if it hasn’t finished in some period of time, then it should be canceled (like socket timeout). But callLater will call my function later and not now.

Thanks!

Hey Petr, you’re right. Just calling callLater on the function itself won’t work. Here’s the idea:

1. Invoke your operation as you normally would. In this case that means creating the PoetrySocket objects.

2. Invoke callLater on another function whose job it is to cancel the first one, if the first one hasn’t already finished. What the second function actually does is going to be context-specific. For a PoetrySocket object, that probably means unregistering itself from the reactor and closing the raw socket. Does that make sense?
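The two steps above can be sketched without a running reactor. Everything in this example is hypothetical: FakeClock is a hand-driven scheduler written for the illustration, DelayedCall is only loosely modelled on the kind of object callLater returns (something with cancel and active methods), and Download stands in for a PoetrySocket.

```python
class DelayedCall:
    """Hypothetical handle for a scheduled call."""

    def __init__(self, when, func, args):
        self.when, self.func, self.args = when, func, args
        self.cancelled = False
        self.called = False

    def cancel(self):
        self.cancelled = True

    def active(self):
        return not (self.cancelled or self.called)


class FakeClock:
    """Hypothetical scheduler that is advanced by hand, not by real time."""

    def __init__(self):
        self.now = 0.0
        self.calls = []

    def callLater(self, delay, func, *args):
        call = DelayedCall(self.now + delay, func, args)
        self.calls.append(call)
        return call

    def advance(self, seconds):
        self.now += seconds
        for call in sorted(self.calls, key=lambda c: c.when):
            if call.active() and call.when <= self.now:
                call.called = True
                call.func(*call.args)


class Download:
    """Hypothetical stand-in for a PoetrySocket with a timeout."""

    def __init__(self, clock, timeout):
        self.state = 'running'
        # The operation starts now; only the *cancellation* is scheduled.
        self.timeout_call = clock.callLater(timeout, self.timed_out)

    def timed_out(self):
        # A real PoetrySocket would also unregister itself and close its socket.
        self.state = 'timed out'

    def finished(self):
        self.state = 'finished'
        if self.timeout_call.active():
            self.timeout_call.cancel()   # finished in time: drop the timeout


clock = FakeClock()
fast = Download(clock, timeout=5)
slow = Download(clock, timeout=5)

clock.advance(2)
fast.finished()     # done after 2 seconds, so its timeout is cancelled
clock.advance(4)    # 6 seconds in total: only the slow download times out
```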

Yes, canceling asynchronous operations is often convenient. With the release of version 10.1.0, Twisted added some features for doing that. I’ll be discussing them in Part 19.

Hi Dave and thanks for the great tutorial on Twisted!
I think there is a problem in twisted-client-1 / get-poetry.py, doRead lines 89-95:
bytes = ''
while True:
    try:
        bytes += self.sock.recv(1024)
        if not bytes:
            break

This will loop forever since bytes will contain the first 1024 bytes of the poem
at the first iteration of the while loop and bytes is not empty after that.

Maybe this is better, it reads the poem with 4 iterations of the while loop.

bytesread = self.sock.recv(1024)
if bytesread:
    print "in doRead:while bytesread:%s, length:%d" % (bytesread, len(bytesread))
    bytes += bytesread
if not bytesread:
    break

Hey Coruja, did you actually run the client? 🙂 The client works as is because we set the socket to non-blocking mode,
and eventually we get an exception because there are no more bytes to read and then we break out of the loop. And it takes a
lot more recv() calls to get a poem from the slow server because the bytes only come in a few at a time. The 1024
is an upper limit on the number of bytes to read, not necessarily how many you actually get.
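The two ways out of a read loop on a non-blocking socket can be shown side by side. This is a standalone Python 3 sketch on a local socket pair, not the tutorial's client (which is Python 2 and checks errno.EWOULDBLOCK on socket.error). The helper name read_available is made up for the example. Note that the test for a closed connection is applied to the chunk just read, not to the accumulated data.

```python
import socket


def read_available(sock):
    """Read whatever is waiting on a non-blocking socket.

    Returns (data, closed).
    """
    data = b''
    while True:
        try:
            chunk = sock.recv(1024)     # 1024 is an upper limit, not a promise
        except BlockingIOError:
            # Nothing more to read right now: stop and wait to be called again.
            return data, False
        if not chunk:
            # An empty read means the other side closed the connection.
            return data, True
        data += chunk


ours, theirs = socket.socketpair()
ours.setblocking(False)

theirs.sendall(b'first verse')
first = read_available(ours)    # leaves the loop via BlockingIOError

theirs.close()
second = read_available(ours)   # leaves the loop via the empty read
ours.close()
```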

Hi Coruja, I think I might have accidentally deleted one of your messages that got marked as spam. I was cleaning out my wordpress spam queue and thought I saw your name there. But my finger was already pressing the delete key 🙁

Hi Dave,
I did not post more messages after the previous one.
Of course I have run the client twisted-client-1/get-poetry.py on both the slowpoetry.py and the fastpoetry.py and this client hangs forever for me. I can’t show you any output since there are no print statements within the while loop, I can just contemplate my CPU usage topping out at 100%. I am using Twisted 10.0.0 on Ubuntu 10.04

Great catch. As long as the delay on the server side is not 0, the client will run into an infinite loop.

Hi Dave,

Once again Windows throws an error with the parsed address in the client. Line 40 in the twisted client needs to be changed from

if ':' not in addr:
    host = ''

to

if ':' not in addr:
    host = '127.0.0.1'

I suspect that this problem is universal for all the code using the parse_args() function.

Dave,

Just been through my copy of the code with find and replace, and now have a version of all clients which should work. Would you like me to send this to you and save yourself five mins, and if so, how?

Hi Thomas, I would appreciate that. Do you have ‘git’ installed, are you able to generate a patch?
That would be best for me. But if not, you can just send me a zip file with your code.

thanks,
dave

Unfortunately I don’t have git. Using windows restricts my ability to use git quite a lot. Where should I send the zip file???

Woops 🙂 the first thing I noticed was the case difference and so I assumed that, despite knowing that wikipedia is case insensitive. One day I’ll slow down and think more 🙂

The blocking client can be fixed fairly easily. The only reason it “blocked” and read in each poem in its entirety is because of the “while True:” loop in the doRead.
If the doRead function was changed to the following then it works:

def doRead(self):
    poem = ''

    bytes = self.sock.recv(1024)

    msg = 'Task %d: got %d bytes of poetry from %s'
    print msg % (self.task_num, len(bytes), self.format_addr())

    if not bytes:
        return main.CONNECTION_DONE
    else:
        self.poem += bytes

I thought this might cause problems if the data received was larger than 1024 bytes, but I made my blocking server send the whole poem (using ecstasy) and doRead was called until all the data was received/read in.

I don’t know if there is a benefit either way using blocking or non-blocking sockets in this case. The select/reactor takes care of guaranteeing that there will be data available when the doRead is called so there will be no wait either way.

There may be a benefit of removing the while loop from the non-blocking case also. If there was lots of data coming from several sources the loop could cause data to be continually read from one source until the “flood” of data stopped/slowed. With the loop removed then each source gets one read before moving to the next, this allows all sources to perform their reads even if one is being overwhelmed (I am assuming a fair servicing of each source by the reactor).

Hey Erik, that’s a good point about the while loop. Twisted’s select loop means that doRead is only ever called
when there is at least some data to read so the first call to recv() should never block. Using setblocking(0) is
probably a good idea to be safe, though, as it might work differently on other platforms.

Limiting the amount of data to read in one ‘gulp’ to prevent starving other sockets is also a good idea, and is how
Twisted’s actual socket code works.

Small point, but when reading stuff for the first time I’m very literal!

class PoetrySocket(object):

    poem = ''

This is a class variable, which must be a mistake? When you later go:

self.poem += bytes

You’re assigning self.poem, not touching the previously defined class variable of the same name. (Although it uses that for the initialisation – it evaluates to self.poem = PoetrySocket.poem + bytes)

So probably the initialisation should be moved from class scope to __init__()

Hey Doug, I guess it’s debatable style, but using class variables to provide fixed
defaults for instance variables is not an uncommon practice.

You will find the same pattern in the Twisted source code itself. You are, of course,
free to avoid doing so yourself 🙂
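For readers who want to see the mechanics Doug describes, here is a tiny standalone sketch. PoemHolder and SharedLines are hypothetical classes made up for the illustration. With an immutable default such as a string, the augmented assignment creates a per-instance attribute and leaves the class attribute alone. With a mutable default that is modified in place, the value really is shared, which is the usual argument for being careful with this pattern.

```python
class PoemHolder(object):
    poem = ''               # class attribute used as an instance default


a = PoemHolder()
b = PoemHolder()
a.poem += 'a line of verse'     # same as: a.poem = PoemHolder.poem + '...'


class SharedLines(object):
    lines = []              # mutable default: one list shared by all instances


x = SharedLines()
y = SharedLines()
x.lines.append('oops')      # modifies the shared list in place
```

After this runs, a has its own poem attribute while b still sees the empty class default, but y.lines contains the item appended through x.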

Hey – thanks for the quick reply!

Um.. yeh I’ve just noticed it’s everywhere. I personally don’t like it – it’s misleading, it duplicates names into the class, and you have to look in 2 places to see what is ‘supposed’ to be an instance initialisation. But at least I know to expect it 🙂

Thanks again. Enjoying the series!

Especially something like this in twisted-client-2/get-poetry.py

class PoetryClientFactory(ClientFactory):

    task_num = 1

    protocol = PoetryProtocol # tell base class what proto to build

One of them is a true class variable, and is used by the superclass. The other is meant to be an instance default. Anyway – all good.

—————————–
I tried Erik’s solution of removing the outer ‘while’ loop, but changed the number of bytes received per call to 4. Below is a snippet of the modified code:
————–snippet of func: doRead in get-poetry.py (modified) ——————————————————-
bytes = ''

# while True:
try:
    bytesread = self.sock.recv(4) # changed to 4
    # if not bytesread:
    #     break
    # else:
    bytes += bytesread
except socket.error, e:
    if e.args[0] == errno.EWOULDBLOCK:
        # break
        pass
    return main.CONNECTION_LOST
—————————————————————————–

Then I ran the code, and it worked well.
(‘python twisted-client-1/get-poetry.py 10000 10001 10002’ ) (Servers are the same as this article(part4) mentioned)

——————————– snippet of output ————————
Task 1: got 2 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # get 1st 4 B
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # get 2nd 4B
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # … 3rd
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # 4th
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # 5th
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # 6th
Task 1: got 4 bytes of poetry from 127.0.0.1:10000 # 7th
Task 1: got 2 bytes of poetry from 127.0.0.1:10000 # get last 2B. 4B * 7 + 2B = 30B (server;10000 sends 30B a time)
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 2 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 4 bytes of poetry from 127.0.0.1:10000
Task 1: got 2 bytes of poetry from 127.0.0.1:10000
Task 1: got 3 bytes of poetry from 127.0.0.1:10000
Task 1 finished
Task 1: 3003 bytes of poetry
Task 2: 615 bytes of poetry
Task 3: 653 bytes of poetry
Got 3 poems in 0:00:10.126736

———————————————————————-
My conclusion:
server:10000 sends 30 bytes at a time, and server:10001 and server:10002 send 10 bytes at a time, so they all send more per write than the receiver reads per call (4 bytes).
The only explanation I can think of is this: as long as there is data left in the buffer to read (whether fresh or stale), the reactor will keep calling our doRead function until the entire buffer is consumed. (Notes: 1. By ‘stale data’ I mean data in the buffer that the receiver has not yet read, because it reads more slowly than the server sends. 2. We have already removed the while loop from doRead, so we rely on the reactor to make sure the data in the read buffer gets consumed.)

For example (‘*’ means a byte in the buffer, ‘|’ means the reactor calls doRead to receive data;
the client’s buffer holds 30 bytes received from the server,
and the client takes 4 bytes from the buffer each time the reactor invokes it):

|****|****|****|****|****|****|****|**
. 4 . 4 . 4 . 4 . 4 . 4 . 4 . 2

Am I right ?
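This behaviour can be reproduced without Twisted. The sketch below (standard library only, on a local socket pair) asks select whether the socket is readable, reads at most 4 bytes, and then asks again, much as a select-based loop would between doRead calls. Because select reports a socket as readable for as long as unread data remains, the loop keeps going until the buffer is drained. On a typical system the chunk sizes come out as seven 4s followed by a 2, matching the diagram above.

```python
import select
import socket

ours, theirs = socket.socketpair()
theirs.sendall(b'x' * 30)       # one 30-byte write, like the server on port 10000

chunk_sizes = []
while True:
    # Ask again each time, as a select-based loop would between callbacks.
    readable, _, _ = select.select([ours], [], [], 0)
    if not readable:
        break                   # buffer drained: nothing left to report
    chunk_sizes.append(len(ours.recv(4)))   # read at most 4 bytes per "doRead"

ours.close()
theirs.close()
```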

Hey Lauren, it’s looking good! One thing, I had intended the ‘not crash’
portion of Part 1 to mean that it would still download from any servers
that were actually working, even if one (or more) were not.

To answer your second question, the settimeout socket method
only makes sense for blocking sockets. It in effect declares that you are
only willing to block while reading or writing to a socket for so long.

But in Twisted, or any other asynchronous I/O system, you never block on
sockets. In effect, the socket timeout is always zero. Blocking on a socket
would basically defeat the whole point of using asynchronous I/O which is
to only service the ports which are not going to block. So to do things like
timeouts on individual sockets, you need another mechanism.

By the way, callLater does end up setting a timeout, but it’s
on the select() (or poll(), etc.) call, not on
individual sockets.

Excuse me if this is mentioned in the later sections, but I am still working through this tutorial. For a while I was unable to understand how the Twisted framework knows about the existence of the Poetry class, even though it implements the interface.
Then I saw this import, which completed the puzzle:

from twisted.internet import main
if __name__ == '__main__':
    poetry_main()

Adding a note here so that others who may not see the link get a clue.

Thanks for the great tutorials. I’m really learning a lot!

For exercise 1, I modified ‘init’ as follows:

def __init__(self, task_num, address):
    self.task_num = task_num
    self.address = address
    self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if self.sock.connect_ex(address) == 0:
        self.sock.setblocking(0)
        reactor.addReader(self)
    else:
        print 'Could not connect to Task', task_num

It works with one or more of my servers off, but is it better to look for the error somewhere else?
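As a standalone Python 3 illustration of the connect_ex approach in this comment, the sketch below wraps the call in a helper (try_connect is a made-up name) that reports the failure by name instead of raising. The two addresses are created on the fly for the demonstration: one has a listener, and the other was bound and then closed, so it almost certainly has none. This is one possible way to handle the failure, not necessarily the best place to look for the error, and the connect itself is still a blocking call.

```python
import errno
import socket


def try_connect(address):
    """Return (sock, None) on success or (None, error_name) on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    err = sock.connect_ex(address)      # returns an errno instead of raising
    if err != 0:
        sock.close()
        return None, errno.errorcode.get(err, str(err))
    return sock, None


# A port with a listener ...
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
good_address = listener.getsockname()

# ... and a port that was just released, so nothing should be listening.
unused = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
unused.bind(('127.0.0.1', 0))
dead_address = unused.getsockname()
unused.close()

good_sock, good_err = try_connect(good_address)
bad_sock, bad_err = try_connect(dead_address)

connected = good_sock is not None
if good_sock is not None:
    good_sock.close()
listener.close()
```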

Hey Dave,
now I am learning asynchronous programming!
First of all, thank you for such a wonderful explanation of a not-so-easy-to-understand topic.
The question is: why do we need callLater now, if connectionLost will react as its callback?
Please explain.

Glad you are enjoying the series!

You are using callLater to timeout the request before the other end hangs up.
(Imagine the other side is ‘stuck’ and is never going to close the connection).

Yeaaaaap, now I understand the reason. I was only thinking about this particular program, not about anything bigger.
callLater nicely solves the problem of getting stuck in the ‘reactor’ pattern.
Hmmmm, could you recommend some articles on using Comet and gevent, please?

I need some consultation Dave.
First of all, one Twisted process (one application) uses one CPU. If I have 8 CPUs, how can I use them all?
That means 8 Twisted processes must collaborate with each other, but how?

Hi Rustem, you have a few options. First, you can use threads in Twisted, see the deferToThread API call. Of course, you still have to contend with Python’s global interpreter lock, so that only really makes sense if your threads are calling out to, say, C libs that release the GIL. But another option is to run multiple processes. One possibility is to have a master process that sends work to slaves. See the open source project Ampoule for an example of doing that, and possibly a library to use.

Hi Dave,
Many thanks for excellent series of articles.
I have just noticed that you wrote “… you pass an object that must implement a given Interface …” somewhere in Part 4 of your article. As I read in the zope.interface documentation, objects do not implement an interface, they provide interface(s). It may not be a big deal for those who are familiar with zope.interface concepts, but for newcomers like me it might be confusing.

Hi Senol, glad you like them. I guess I haven’t read the zope.interface docs
too closely, are you saying they specifically use the term ‘provides’ and
never ‘implements’?

Actually they use both terms. The term ‘implements’ is used for the class and the term ‘provides’ is used for objects (instance objects). They describe these two terms as in the following quotation from ZopeGuideInterfaces/Declaring interfaces

Now you should familiarise two more terms to understand other concepts. First one is ‘provide’ and the other one is ‘implement’. Object provides interfaces and classes implement interfaces. In other words, objects provide interfaces that their classes implement.

It seems to me that the “while” loop inside the PoetrySocket’s doRead method is unnecessary. I removed it, removed the break statements that assumed it, reordered some conditionals, and ran the twisted-client-1/get-poetry.py client with three slowpoetry.py servers. The result was a successful reading of several poems, as far as I can tell.

On second thought, perhaps the “while” loop would be a good thing if the socket were receiving messages that were larger than its “recv” buffer. Then the loop would repeat until everything the server sent on that socket was read.
If the while loop is removed then there’s no difference if the server always sends less than the python socket object reads in a single socket.recv operation, but if the server sends a larger message then… hmmm…

Exactly, you’ve summarized the issue precisely. I include the while loop to illustrate the issue, and the basic way you go about handling it
if you want to read multiple times from the socket, but I don’t claim my solution is the definitive “right answer”.

Hi Dave, I need some help with the doRead() method.

def doRead(self):
    bytes = ''

    while True:
        try:
            bytesread = self.sock.recv(1024)
            if not bytesread:
                break
            else:
                bytes += bytesread
        except socket.error, e:
            if e.args[0] == errno.EWOULDBLOCK:
                break
            return main.CONNECTION_LOST

    if not bytes:
        print 'Task %d finished' % self.task_num
        return main.CONNECTION_DONE
    else:
        msg = 'Task %d: got %d bytes of poetry from %s'
        print msg % (self.task_num, len(bytes), self.format_addr())

    self.poem += bytes

I am clear on all the code within the while loop. But what confuses me is the next line.

If the “if not bytes” condition is true, it means bytes is still an empty string. Then how does it make sense to treat the task as finished?

Hello Indradhanush, when bytes is the empty string, it means that the connection
has been closed (usually by the server, since the client does not close it). And in our
poetry protocol, we have defined the close of the connection to mean the
end of the poem. In other protocols, of course, the fact that the connection has
closed may not mean you finished — maybe it was closed halfway through and you
just never got to the end. Using the connection close to indicate the end of a
poem makes our examples simpler, but it’s not really best practice.

The bytes variable won’t become empty during that particular
invocation of doRead. So the socket will stay
in the twisted reactor and during the next iteration, the
closure of the socket will cause the select loop to return
it as ready for reading. Then we will discover that the
socket has closed (bytes will be empty now) and we will
be done.
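The sequence described above can be sketched with the standard library alone. This is an illustration in Python 3, not the tutorial's client, and read_until_closed and demo_close are hypothetical names. It shows that select reports a closed connection as readable, and that the recv which follows returns an empty string:

```python
import select
import socket


def read_until_closed(sock):
    """Collect data until recv returns b'', which signals that the peer closed."""
    poem = b""
    while True:
        # Wait until the socket is readable. A closed connection
        # counts as readable too.
        select.select([sock], [], [])
        data = sock.recv(1024)
        if not data:
            return poem
        poem += data


def demo_close():
    server, client = socket.socketpair()
    server.sendall(b"a short poem")
    server.close()  # closing the connection marks the end of the poem
    try:
        return read_until_closed(client)
    finally:
        client.close()
```

The first pass through the loop gets the poem text. The second pass is triggered only by the close, and that is the one where the data is empty.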

It takes a while, but you will get it. Remember that doRead
will be repeatedly called by the reactor (like in the earlier example where
we explicitly used select). Try putting in some more print statements
so you can see what is happening.

Hi Dave,
In the first exercise, I tried to use reactor.callWhenRunning() to make the socket connection. That way, a connection error exception would be handled by the reactor. But it failed. My code looks like this:

def __init__(self, task_num, address):
    self.task_num = task_num
    self.address = address
    self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    #self.sock.connect(address)
    self.sock.setblocking(0)

    from twisted.internet import reactor
    # connect when the reactor starts to let the reactor handle exceptions.
    reactor.callWhenRunning(self.connect)
    # tell the Twisted reactor to monitor this socket for reading
    #reactor.addReader(self)

def connect(self):
    self.sock.connect(self.address)
    from twisted.internet import reactor
    reactor.addReader(self)

And I got an exception:
socket.error: [Errno 115] Operation now in progress

Could you please tell me what is wrong with it? Thanks!
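A note on the error itself, offered as a likely explanation rather than a definitive diagnosis: in the snippet above, connect is now called after setblocking(0), whereas the commented-out line called it before. On a non-blocking socket, connect returns immediately while the connection is still being set up, and Python reports that as EINPROGRESS (errno 115 on Linux). The sketch below uses only the standard library and Python 3, and nonblocking_connect and demo_connect are hypothetical names. It shows one way to start a non-blocking connect and wait for it to complete:

```python
import errno
import select
import socket


def nonblocking_connect(address, timeout=5.0):
    """Start a connect on a non-blocking socket and wait for it to finish."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # connect_ex returns an error code instead of raising an exception.
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, "connect failed")
    # The socket becomes writable once the connection attempt has finished.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect timed out")
    # SO_ERROR tells us whether the attempt succeeded (0) or failed.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, "connect failed")
    return sock


def demo_connect():
    # A local listening socket stands in for a poetry server.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    try:
        sock = nonblocking_connect(listener.getsockname())
        peer = sock.getpeername()
        sock.close()
        return peer == listener.getsockname()
    finally:
        listener.close()
```

The simpler alternative, which the commented-out line in the snippet reflects, is to connect while the socket is still blocking and switch to non-blocking mode afterwards.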

Hi Dave, thanks for an excellent tutorial.

Regarding the parent question, what if I want my connect not to block the reactor? I have a Bluetooth socket where I have to implement my own FileDescriptor and the connect can block for a while. My current solution is to do this via deferToThread which feels a bit like cheating…

I’m sorry if you cover this in a later chapter, only read the first 4 yet 🙂

Hello! Glad you liked the tutorial. Using a thread isn’t cheating at all and Twisted itself sometimes does that, see, for example ThreadedResolver.

Hi Dave. This is a solution for exercise 2: (Please keep in mind I’m a self-taught amateur programmer and quite new at it):

First add ‘self.terminate = False’ to the beginning of the __init__() function, and add reactor.callLater(2, self.checkFin) at its end.

Define the checkFin() function as follows:

def checkFin(self):
    self.terminate = True

Finally, at the penultimate line of the doRead function add this:

if self.terminate == True:
    self.sock.close()
    print('Process timed out, poem retrieval aborted.')

Hi Kareem, I don’t think that would work. What if doRead never gets called after two seconds (because no data showed up to be read)? I think you will need to do something else in checkFin.
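To illustrate the objection, here is a sketch of a timeout that fires even when no data ever arrives. It deliberately swaps Twisted for a plain select loop from the standard library (Python 3), so it is not a solution to the exercise itself, and read_with_deadline and demo_deadline are hypothetical names. The point is that the timeout has to act on its own rather than wait for the read handler to notice a flag:

```python
import select
import socket
import time


def read_with_deadline(sock, seconds):
    """Read until the peer closes or the deadline passes, whichever is first."""
    deadline = time.monotonic() + seconds
    received = b""
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return received, "timed out"
        # select wakes up on data OR when the remaining time runs out,
        # so the timeout fires even if nothing ever arrives.
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            return received, "timed out"
        data = sock.recv(1024)
        if not data:
            return received, "finished"
        received += data


def demo_deadline():
    silent_server, client = socket.socketpair()
    try:
        # The server side never sends anything and never closes.
        return read_with_deadline(client, 0.2)
    finally:
        silent_server.close()
        client.close()
```

In the Twisted version, the equivalent idea would be for the function scheduled with callLater to do the cleanup itself.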

You can also use the return object of the callLater function as follows:

First add:

cancelCheckFin = reactor.callLater(2, self.checkFin)

to the end of the __init__() function

and modify the doRead function as follows:

def doRead(self):
    bytes = ''

    while True:
        try:
            bytesread = self.sock.recv(1024)
            if not bytesread:
                cancelCheckFin.cancel()  # added line
                break
            else:
                bytes += bytesread
        except socket.error, e:
            if e.args[0] == errno.EWOULDBLOCK:
                break
            return main.CONNECTION_LOST

    if not bytes:
        print 'Task %d finished' % self.task_num
        return main.CONNECTION_DONE
    else:
        msg = 'Task %d: got %d bytes of poetry from %s'
        print msg % (self.task_num, len(bytes), self.format_addr())

    if self.terminate:
        print('grace period over, poem transfer aborted')
        self.sock.close()

    if self.terminate == True:
        self.sock.close()
        print('Process timed out, poem retrieval aborted.')
    self.poem += bytes

That’s almost a solution for canceling the timeout, but I think you would need self.cancelCheckFin = reactor.callLater(...)

Otherwise cancelCheckFin is just a local variable in __init__ and the doRead method will not have access to it.

Thanks Dave. I’ll keep reading, this is all new to me (networking in general), hopefully I’ll get the hang of it. I’m eventually going to try using Twisted to make a game server for a simple 2-D game for my boys. Just for the fun of it. If I’m understanding all this correctly, the client side will have to be plain non-Twisted asynchronous programming because I can’t relinquish control to the reactor on the client side, where the game loop will reside. Correct?

Hi Kareem! That’s a great project, game programming is a fun way to learn. If the client is in Python, then both client and server can certainly be in Twisted. As you point out, a game client typically has a ‘game loop’. This is simply a loop that waits for things to happen (the user clicks here, or presses this key, or a time tick happens and the state of the world needs to be updated, etc.). Well that’s all the reactor loop is doing, too, it’s just waiting for network events. So the reactor loop can be the game loop. Or the game loop can be the reactor loop. If you program the client with GTK or QT, there are adaptors that allow Twisted to use the GTK or QT event loops as the Twisted loop. There might be one for PyGame, too, I’m not sure.

Hi Dave,

I got a question when I read this:

‘Furthermore, if we have a long-running computational (CPU-bound) task, it’s up to us to split it up into smaller chunks so that I/O tasks can still make progress if possible’

I don’t understand what is meant by I/O tasks. Do you mean functions like recv() that take data out of a socket, or do you mean the process of receiving data from a server through a connection and putting it into a socket?

For the former case:
I understand this.
If the callback is a long-running computational task, then of course the ‘select’ loop is blocked while the callback is running. Thus no other recv() can be called.

For the latter case:
I think it should make progress even while a long-running computational callback is running.

And why are we supposed to split it into smaller pieces? It doesn’t speed up the overall performance, since it will not block on I/O.

Hi Tommy, by I/O tasks I mean both reading and writing to a socket. The select statement is used to wait for both reading and writing to sockets because both of those actions may block. So a long-running function will block both readers and writers waiting for their turn in the select loop. Does that make sense?
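As a rough sketch of what splitting a computation into chunks buys you, the example below alternates one chunk of CPU work with one non-blocking check for I/O. It uses only the standard library (Python 3) rather than the reactor, and sum_in_chunks, compute_and_serve and demo_chunks are hypothetical names:

```python
import select
import socket


def sum_in_chunks(n, chunk_size):
    """Sum range(n) one chunk at a time, yielding the running total."""
    total = 0
    for start in range(0, n, chunk_size):
        total += sum(range(start, min(start + chunk_size, n)))
        yield total  # pause here so the loop can service I/O


def compute_and_serve(n, chunk_size, sock):
    """Alternate between one chunk of computation and one I/O poll."""
    chunks_done = 0
    total = 0
    served_after_chunk = None
    for total in sum_in_chunks(n, chunk_size):
        chunks_done += 1
        # A timeout of 0 means: poll, do not wait.
        readable, _, _ = select.select([sock], [], [], 0)
        if readable and served_after_chunk is None:
            sock.recv(1024)
            served_after_chunk = chunks_done
    return total, chunks_done, served_after_chunk


def demo_chunks():
    a, b = socket.socketpair()
    try:
        a.sendall(b"ping")  # data is already waiting when the work starts
        return compute_and_serve(1000, 100, b)
    finally:
        a.close()
        b.close()
```

The total takes just as long to compute either way, but the waiting data is handled after the first chunk instead of after the last one. That responsiveness, not raw speed, is the reason for splitting the work.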

This tutorial is excellent, I am a Python programmer with some experience but never had contact with anything but single-thread, synchronous programming. Nevertheless, with the examples here I feel I am really understanding the ideas. I tweak and check that really I know what is going on. So, thanks for it!!

I have a small doubt/comment. When I read the class PoetrySocket I was surprised to find poem as a class attribute. I thought: “all poems are going to get mixed up in a single text variable”. I added a loop with print instructions after they are read, and found that in fact that is not the case, they are correctly stored in the poem attribute of each instance of PoetrySocket. But then, why declare the class attribute poem, empty? Why not set self.poem = '' as the first line of the constructor, instead?

Glad you like the tutorial!

Regarding the class attribute — that’s a bit of a shortcut for making default instance attributes.
If you don’t set the attribute on the instance it uses the class attribute, but setting an attribute on an instance always sets the instance-level attribute not the class-level one. Setting it in the constructor would work just as well and is probably better as it is less surprising.
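A small sketch of that lookup rule, using a made-up class (PoetrySketch is a hypothetical name, not the tutorial's PoetrySocket):

```python
class PoetrySketch(object):
    poem = ''  # class attribute, acts as a default for every instance

    def add(self, text):
        # Reads self.poem (falling back to the class attribute the first
        # time), then assigns the result to an instance attribute.
        self.poem += text


first = PoetrySketch()
second = PoetrySketch()
first.add('roses are red')

# Only `first` has its own instance attribute now. `second` and the class
# itself still see the empty default.
first_has_own = 'poem' in vars(first)
second_has_own = 'poem' in vars(second)
```

This works safely because strings are immutable, so += always rebinds the name on the instance. With a mutable default such as a list, an in-place change like append would modify the single object shared through the class, which is one more reason setting the attribute in the constructor is less surprising.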

Thanks Dave,

I understand. In fact, I needed to perform a few simple tests to realize that when you evaluate self.att within a method (att being a class attribute), you access the class attribute. But the first assignment self.att = … seems to create the instance attribute att, and any evaluation of self.att after that returns that instance attribute. This behavior of Python bit me; I would not say it follows the Principle of Least Astonishment.

Here is my attempt at exercise 2, any feedback? https://gist.github.com/alpha-beta-soup/37e47fa6ddccd3bb733f

When the PoetrySocket is initialised, I add the callLater with a complete_timeout variable that specifies how long a poem may take to complete (5 seconds is crazily low, but I want it to trigger). The method forceClose just calls connectionLost with an explanation of why it is being closed. If the task finishes, the timeout is cancelled (just before main.CONNECTION_DONE is returned). I suppose it would be better not to start the callLater during initialisation, but rather right before reactor.run()?

Everything seems to behave as I expect.

Hey Dave, can you help me? This is my attempt for both exercises -> http://pastebin.com/9E7SESUD . It does work but I keep getting this error when it finishes and I can’t figure out why:

twisted.internet.error.ReactorNotRunning: Can’t stop reactor that isn’t running.

What am I doing wrong?

By the way, great course, I was lost before I found it, thanks!!

Glad you like it! It looks like you might be stopping the reactor every time a connection is lost, but the code is making more than one connection. You want to stop the reactor only after all the connections are done. So you’ll need to keep track of how many connections have been made and lost. Make sense?
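One possible shape for that bookkeeping, as a plain-Python sketch with hypothetical names (ConnectionTracker, connection_made, connection_lost and on_all_done are all invented for illustration). In a Twisted program, the callback passed in would be something like reactor.stop:

```python
class ConnectionTracker(object):
    """Counts open connections and fires a callback when the last one closes."""

    def __init__(self, on_all_done):
        self.open_connections = 0
        self.on_all_done = on_all_done

    def connection_made(self):
        self.open_connections += 1

    def connection_lost(self):
        self.open_connections -= 1
        if self.open_connections == 0:
            # Only the final lost connection triggers the callback.
            self.on_all_done()


def demo_tracker():
    stops = []
    tracker = ConnectionTracker(lambda: stops.append('stop'))
    for _ in range(3):
        tracker.connection_made()
    for _ in range(3):
        tracker.connection_lost()
    return stops
```

With three connections, the callback runs exactly once, which avoids asking an already stopped reactor to stop again.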

Hey Dave, a lot of sense. The problem was that I included the self.sock.setblocking(0) part inside the try/except.

Thank you!

Hello Dave,

I am writing a multiplayer game server in Python with Twisted over TCP, and all the clients are mobile apps. The issue is that the server gets stuck or hangs after some time and uses 100% CPU. What might the issue be? Please help.

Hi Dave,
I have a question. Why are the times used by twisted-client-1/get-poetry.py and twisted-client-1/get-poetry-broken.py the same? I thought they should be different, since the blocking client should take more time.
Thank you!

Good question! The reason is that the broken client does make a connection to all the servers at once and all the servers immediately start writing data. There is some buffering in TCP, so by the time the blocking client is ready to read from the second and third servers, all the data is ready to be read. Make sense?
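The buffering effect is easy to see in a small experiment. This sketch uses only the standard library (Python 3) with a local socket standing in for a poetry server. The name demo_buffering and the 600-byte payload are arbitrary choices for illustration:

```python
import socket


def demo_buffering():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        # The "server" writes its whole poem and closes before the client
        # has called recv even once.
        server_side.sendall(b"x" * 600)
        server_side.close()
        # Only now does the client start reading. The bytes were held in
        # the operating system's buffers in the meantime.
        received = b""
        while True:
            data = client.recv(1024)
            if not data:
                break
            received += data
        return len(received)
    finally:
        client.close()
        listener.close()
```

Nothing is lost while the client is busy elsewhere, as long as the data fits in the buffers. That is why a client that reads the servers one after another can finish in about the same time as one that reads them concurrently.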
