
Weblog:   The Python community has too many deceptive XML benchmarks
Subject:   Trying my own tool (Gnosis Utils)
Date:   2005-01-24 19:47:49
From:   David_Mertz

I always like the breath of fresh air Uche brings to most topics. His benchmark examples are nicely down to earth (I would point out that I almost always do exactly the same thing--including full code--when I benchmark tools in my articles).


Anyway, with no real a priori sense of how it would come out, I decided to try gnosis.xml.objectify in the mix. I like my API best and all :-).


First, the script used:


$ cat time_xo.py
from gnosis.xml.objectify import make_instance, walk_xo, tagname
ot = make_instance('ot/ot.xml')
for node in walk_xo(ot):
    if tagname(node) == 'v' and 'begat' in node.PCDATA:
        print node.PCDATA


I don't use the gnosis.xml.objectify.utils.XPath() function here, though I could. That's because I don't really believe XPath is entirely Pythonic.
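
To make the contrast between the two styles concrete, here is a minimal sketch that uses the standard library's xml.etree.ElementTree rather than gnosis.xml.objectify, since the signature of XPath() is not shown in this post. The sample document is a made-up stand-in for ot/ot.xml (its element names other than `v` are assumptions), the function names are hypothetical, and the code targets a current Python 3 rather than the python2.3 used above.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for ot/ot.xml; the real file's structure is assumed, not known.
SAMPLE = """<tstmt>
  <book>
    <chapter>
      <v>In the beginning God created the heaven and the earth.</v>
      <v>And Adam lived an hundred and thirty years, and begat a son.</v>
      <v>And Seth lived an hundred and five years, and begat Enos.</v>
    </chapter>
  </book>
</tstmt>"""

root = ET.fromstring(SAMPLE)


def begat_by_walk(root):
    """Explicit loop over every element, filtering in plain Python."""
    return [el.text for el in root.iter()
            if el.tag == "v" and el.text and "begat" in el.text]


def begat_by_path(root):
    """Same filter, but selecting the <v> elements with a path expression."""
    return [el.text for el in root.findall(".//v")
            if el.text and "begat" in el.text]


walk_hits = begat_by_walk(root)
path_hits = begat_by_path(root)
```

Both functions return the same verses; which one reads better is the matter of taste raised above.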


The timings are quite consistent between five runs:



$ time python2.3 time_xo.py > verses


real 0m7.200s
user 0m5.790s
sys 0m0.350s
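
One way to check that repeated runs agree, sketched here under assumptions: `time_runs` and `workload` are hypothetical names, and `workload` is only a placeholder for the parse-and-filter step in time_xo.py. The sketch is written for a current Python 3, and it measures in-process wall-clock time, so unlike the shell's `time` it excludes interpreter startup.

```python
import statistics
import time


def time_runs(func, runs=5):
    """Call func() `runs` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def workload():
    # Hypothetical stand-in for the parse-and-filter step in time_xo.py.
    return sum(i * i for i in range(100_000))


timings = time_runs(workload)
spread = max(timings) - min(timings)
summary = "mean %.4fs, spread %.4fs" % (statistics.mean(timings), spread)
```

A small spread relative to the mean is what "quite consistent" would look like in these terms.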


Oh... I run on a quite different architecture than Uche, but the Pystone on my Powerbook is just about the same as Uche's:



$ uname -a
Darwin gnosis-powerbook.local 7.7.0 Darwin Kernel Version 7.7.0: Sun Nov 7 16:06:51 PST 2004; root:xnu/xnu-517.9.5.obj~1/RELEASE_PPC Power Macintosh powerpc
$ python /sw/lib/python2.3/test/pystone.py
Pystone(1.1) time for 50000 passes = 3.04
This machine benchmarks at 16447.4 pystones/second

  • Trying my own tool (Gnosis Utils)
    2005-01-24 20:03:21  David_Mertz

    I am thinking, BTW, of wrapping cElementTree.iterparse() into another gnosis.xml.objectify parser. Currently, there's a painfully slow DOM parser, and a reasonably fast EXPAT parser in there. But the design makes it easy to plug in something else. I have vaguely wanted to create an RXPU parser too (and more recently LXML)... but if I think cElementTree is even faster, I might just do that.

    I like my (more Pythonic) API better than that in ElementTree, but if I get speed, why not take advantage of /F's underlying work? Of course, there're a zillion things I want to get around to, so it's not quite a promise.
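
As a rough idea of what such a wrapper would build on, here is a minimal sketch of incremental parsing with iterparse(). It is not the gnosis.xml.objectify parser interface, which is not shown in this thread. It uses xml.etree.ElementTree from a current Python 3 standard library, where the separate cElementTree module no longer exists. The sample data and the `begat_verses` name are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up stand-in for ot/ot.xml; only the <v> element name comes from the post.
SAMPLE = b"""<tstmt>
  <book>
    <chapter>
      <v>In the beginning God created the heaven and the earth.</v>
      <v>And Adam lived an hundred and thirty years, and begat a son.</v>
    </chapter>
  </book>
</tstmt>"""


def begat_verses(source):
    """Yield the text of each <v> containing 'begat', parsing incrementally."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "v":
            text = elem.text or ""
            if "begat" in text:
                yield text
            elem.clear()  # drop the verse's text once it has been handled


verses = list(begat_verses(io.BytesIO(SAMPLE)))
```

Because each verse is handled as soon as its end tag is seen, the filter never needs a second pass over a fully built tree.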
