Article:   Untwisting Python Network Programming
Subject:   Why untwisting?
Date:   2006-08-11 10:34:28
From:   glyph@divmod.com
Many of the examples in this article are wrong. I'll just point out the errors in the telnet example, because I'm more familiar with Twisted than non-Twisted code, and I also don't have too much time to spend reviewing it...


By overriding Telnet's dataReceived method, it short-circuits the machinery which actually speaks the telnet protocol. By inspecting the 'data' argument for full strings, it depends on implementation accidents of the transport to deliver whole messages - TCP does not guarantee that, and although packets are rarely fragmented at such small sizes, as example code this is very bad form.


Try looking at TelnetTransport and ITelnetProtocol for a more correct way to implement this; you should probably use something like StatefulTelnetProtocol to verify that you're processing whole lines, and to deal with the negotiation of various telnet options.
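
To make the fragmentation point concrete, here is a minimal, Twisted-free sketch of the buffering any line-oriented protocol has to do before it can treat its input as whole messages. The `LineBuffer` class and its `feed` method are hypothetical names made up for this illustration, not Twisted APIs, and the code is written for Python 3 rather than the Python 2.4 discussed here. Twisted's line-oriented protocol classes take care of this kind of buffering for you.

```python
class LineBuffer:
    """Accumulate raw chunks and hand back only complete lines.

    Hypothetical helper for illustration; not part of Twisted.
    """

    def __init__(self, delimiter=b"\r\n"):
        self.delimiter = delimiter
        self._pending = b""  # bytes received so far that do not yet end a line

    def feed(self, chunk):
        """Add one received chunk; return the list of lines it completed."""
        self._pending += chunk
        # Everything before the last delimiter is complete; the tail is kept.
        *lines, self._pending = self._pending.split(self.delimiter)
        return lines


buf = LineBuffer()
received = []
# The same byte stream may arrive split at arbitrary points.
for chunk in (b"sto", b"p\r\nqu", b"it\r", b"\n"):
    received.extend(buf.feed(chunk))
print(received)
```

A handler that compares each raw chunk against a full command string would miss both commands in this run, because neither arrives in a single chunk.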


"For basic programs such as the command-line client of this example, the Python core networking modules are more desirable due to the simplicity and performance advantages."


What do you mean by "performance advantages"? Have you done any measurements indicating that the core networking modules are ever faster?



  • Why untwisting?
    2006-08-13 22:19:59  Kendrew

    Thank you for pointing out the possibility of TCP segmenting, which may cause problems to dataReceived.

    Regarding performance, I observed that the core modules run faster than Twisted when running the programs. I also took some rough measurements; here are the results:


    Using core modules:
    start server (no net): 0.6040 sec
    send mails (smtp): 1.5585 sec
    view mails (pop3): 0.7506 sec
    delete mails (pop3): 0.5159 sec
    stop server (telnet): 0.5063 sec

    Using Twisted:
    start server (no net): 0.6668 sec
    send mails (smtp): 2.4919 sec
    view mails (pop3): 1.4418 sec
    delete mails (pop3): 1.2992 sec
    stop server (telnet): 2.1045 sec

    (Windows XP, Python 2.4.3, Twisted 2.4.0)


    These numbers are the average of 10 runs, with the mail server running on localhost. While the measurements are by no means rigorous, they basically agree with my observations. Of course, the differences may not be significant in real use, when network delay accounts for the majority of the execution time.
    • performance
      2006-08-14 08:16:53  radix

      How were these numbers reached? Can you show us any Python code or commands that you used to obtain them?

      I notice that your synchronous version of the telnet code immediately closes the socket whereas the twisted version is waiting for the *server* to end the connection; this could definitely be affecting your times. If you add a transport.loseConnection call after your write call, the semantics should line up better, and I imagine performance will be closer to what we expect.

      Also, why are you calling the private "_write" method in the telnet example?
      • performance
        2006-08-17 04:57:06  Kendrew

        The time includes starting the interpreter and running the whole script. It is true that the import statements (and any networking delay) contribute the majority of the execution time in these small programs. In fact, before measuring I tidied the import statements so that only required items are imported before use.

        For the telnet case, I measured again with the addition of transport.loseConnection() and using StatefulTelnetProtocol instead of Telnet. The Twisted telnet version runs faster than before, as expected:


        Using core modules:
        start server (no net): 0.5809 sec
        send mails (smtp): 1.3601 sec
        view mails (pop3): 0.7007 sec
        delete mails (pop3): 0.5187 sec
        stop server (telnet): 0.5124 sec

        Using Twisted:
        start server (no net): 0.5959 sec
        send mails (smtp): 2.2488 sec
        view mails (pop3): 1.4274 sec
        delete mails (pop3): 1.3074 sec
        stop server (telnet): 1.3213 sec


        If you're interested, here is the Python program used to measure the timing. It just invokes the various operations of the two networking programs and takes the average.


        #!/usr/bin/python
        # file: mail-timeit.py
        # Measures the timing of invoking mail-core.py and mail-twisted.py

        from time import sleep
        from timeit import Timer
        cmdline = ''

        def doit(cmd, arg, array, rest):
            # Time one run of "cmd arg", record it, then sleep for rest seconds.
            global cmdline
            cmdline = cmd + ' ' + arg
            print; print cmdline
            array.append(Timer('os.system(cmdline)',
                               'import os; from __main__ import cmdline').timeit(1))
            sleep(rest)

        def dostat(cmd, times):
            stat = [[], [], [], [], []] # for 1, s, v, d, 0

            for i in range(times):
                doit(cmd, '1', stat[0], 12)
                doit(cmd, 's', stat[1], 5)
                doit(cmd, 'v', stat[2], 1)
                doit(cmd, 'd', stat[3], 1)
                doit(cmd, '0', stat[4], 2)

            return stat

        def avgstat(stat, fr, to):
            # Average the timings in positions fr..to-1 of each list.
            return [ sum(i[fr:to]) / (to-fr) for i in stat ]

        def printavgs(avgs):
            labels = [
                'start server (no net)',
                'send mails (smtp)',
                'view mails (pop3)',
                'delete mails (pop3)',
                'stop server (telnet)']
            for i, j in zip(labels, avgs):
                print '%25s: %.4f sec' % (i, j)

        if __name__ == '__main__':
            times = 11

            # 11 runs each; the first run is skipped, leaving 10 to average.
            stat1 = dostat('mail-core.py', times)
            avgs1 = avgstat(stat1, 1, times)

            stat2 = dostat('mail-twisted.py', times)
            avgs2 = avgstat(stat2, 1, times)

            # print stat1
            printavgs(avgs1)
            # print stat2
            printavgs(avgs2)

        # end of mail-timeit.py

    • Why untwisting?
      2006-08-14 08:15:01  glyph@divmod.com

      Are you measuring the time it takes to perform a task, or the amount of time it takes to start the interpreter, load every module, perform the task, and shut down the interpreter?

      Twisted has more code in it than the Python standard library version, so unless you've carefully optimized the package for importing, the amount of time spent loading code will dwarf the amount of time spent actually doing anything.
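
One way to tell the two costs apart is to time the import and the task as separate phases inside a single process. The sketch below is only an illustration under stated assumptions: it is written for Python 3, `timed` is a hypothetical helper, and the standard-library `json` module stands in for a large package such as Twisted so that the example runs anywhere.

```python
import importlib
import sys
import time


def timed(func):
    """Return (result, elapsed seconds) for a single call to func."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


# Phase 1: loading code. json is a stand-in for a much larger package.
# Dropping the cached entry makes the top-level module execute again;
# its submodules may still be cached, so this understates a cold import.
sys.modules.pop("json", None)
module, import_seconds = timed(lambda: importlib.import_module("json"))

# Phase 2: the task itself, measured with the module already loaded.
_, task_seconds = timed(lambda: module.dumps({"command": "stop"}))

print("import: %.6f sec" % import_seconds)
print("task:   %.6f sec" % task_seconds)
```

Timing a whole `os.system` invocation adds interpreter start-up and shutdown on top of both phases, so a difference in that total says little about the networking code alone.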
      • Short examples don't show event-driven benefits
        2006-08-23 11:11:38  andypurshottam

        The main advantages of event-driven programming become visible when large amounts of data can be processed incrementally ("streaming") and when there are multiple event sources, especially a GUI toolkit. Small, cute programming examples typically do not need these resources. The smallest example I have seen of an application that needs and benefits from event-driven programming is a TCP proxy spy with a GUI window, like tcpwatch (done with the async code from medusa, but it would be an instructive example with any event-driven system).

        I found I really understood POE after completing such a program, and would advise those trying to learn an event-driven system to code one. The only problem is that doing so is not trivial, especially since the small examples that come with most systems do not explain how to do the tricky things needed to code a proxy that also has a GUI or standard I/O, and possibly multiple network connections.

        Andy (andypurshottam@gmail.com)
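
As a rough illustration of the "multiple event sources, processed incrementally" point, here is a minimal sketch of a single-threaded event loop built on the Python 3 standard library `selectors` module. It is far short of a proxy with a GUI. The two socket pairs are an assumption made so the example is self-contained; they stand in for independent network peers.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = {"a": b"", "b": b""}

# Two connected socket pairs stand in for two independent network peers.
a_send, a_recv = socket.socketpair()
b_send, b_recv = socket.socketpair()
for name, sock in (("a", a_recv), ("b", b_recv)):
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, data=name)

# Each peer writes a short message and then closes its end.
a_send.sendall(b"hello from a")
b_send.sendall(b"hello from b")
a_send.close()
b_send.close()

# The event loop: serve whichever source is ready, one small chunk at a time.
while sel.get_map():
    for key, _events in sel.select():
        chunk = key.fileobj.recv(4)  # deliberately small reads
        if chunk:
            received[key.data] += chunk
        else:  # an empty read means the peer closed the connection
            sel.unregister(key.fileobj)
            key.fileobj.close()

sel.close()
print(received)
```

Neither source blocks the other, and each message is assembled from several partial reads, which is the behaviour a proxy needs once real connections and a GUI are added.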