| Weblog: | The Python community has too many deceptive XML benchmarks | |
| Subject: | "There's a Riot Goin' On" | |
| Date: | 2005-01-24 17:06:26 | |
| From: | uche | |
|
Sly Stone said it.
Showing messages 1 through 2 of 2.
-
"There's a Riot Goin' On"
2005-01-25 03:07:14 faassen
I wouldn't say the speed of parsing XML into a Pythonic datastructure is completely alien to people's use. It can be done a lot more slowly, as has been shown in the past over and over, and cElementTree can do it very quickly.
That means we can now be far less concerned with parsing overhead. Since the structure is already Python-style, the overhead of ElementTree API calls can then be minimal, as is shown by the fast performance of the find operation in ElementTree. The pure-Python ElementTree find() can sometimes even beat libxml2's XPath, which is implemented in C.
lxml.etree can parse very quickly too, using the underlying libxml2 library. Unfortunately it isn't "done" at that point if you want to use the ElementTree API, as there are Python proxies to be produced while the user accesses the XML. This has been made fairly fast by now, but it still lags behind ElementTree. For libxml2's native XPath this proxy overhead is far smaller, and you can get down to business right away.
If you want to know how I know all this, see my blog for a lot of benchmarking over the last couple of weeks. I didn't have a 'begat' test yet, but I did test a simple //v test, as Uche did in an earlier article.
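A rough sketch of what a simple //v test might look like with the standard-library ElementTree follows. The sample document, the `count_v` helper and the repeat counts are made up for illustration; they are not the data or code from the benchmarks mentioned above. Note also that on current Python versions `xml.etree.ElementTree` picks up the C accelerator automatically, so there is no separate cElementTree import.

```python
import timeit
import xml.etree.ElementTree as ET

# Hypothetical sample document: 1000 <c> elements, each holding three <v> elements.
SAMPLE_XML = "<root>" + "<c><v>1</v><v>2</v><v>3</v></c>" * 1000 + "</root>"


def count_v(root):
    """Count all <v> descendants, the ElementTree analogue of the XPath //v."""
    return len(root.findall(".//v"))


# Parse once up front so that only the search is timed below.
root = ET.fromstring(SAMPLE_XML)
v_count = count_v(root)

# Three rounds of ten searches each; keep the best round and divide by ten.
find_seconds = min(timeit.repeat(lambda: count_v(root), number=10, repeat=3)) / 10
print(f"{v_count} <v> elements, {find_seconds:.6f} s per findall")
```

Timing the search separately from the parse keeps the two costs discussed above (parsing overhead versus API call overhead) from being mixed into one number.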




What kind of excuse is that?
You're the one who brought up the whole thing, yet it seems that you have done a worse job at benchmarking than others. Very ironic.
I think your benchmarking method is very ad-hoc and you'd be better served if you fixed the glaring errors and posted an updated version of your findings.
I'm getting incomparably better results with cElementTree on a similar laptop: running the same program as you do, but benchmarking it with timeit, I get around 0.25 seconds/run. I could not test your framework since your FTP system is down.
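For reference, a timeit-based measurement of this kind could be sketched as below. The input document `DOC`, the `parse_once` helper and the repeat counts are assumptions for illustration only, not the program or data discussed in this thread, and the numbers it prints will differ by machine.

```python
import timeit
import xml.etree.ElementTree as ET

# Hypothetical input; substitute the document used in the benchmark being compared.
DOC = "<doc>" + "<item id='1'><name>x</name></item>" * 5000 + "</doc>"


def parse_once():
    """Parse the whole document into an element tree and return its root."""
    return ET.fromstring(DOC)


# repeat() returns one total time per round; the minimum is the least noisy estimate.
rounds = timeit.repeat(parse_once, number=5, repeat=3)
per_run = min(rounds) / 5
print(f"best of 3 rounds: {per_run:.4f} seconds/run")
```

Reporting the best of several rounds, rather than a single wall-clock run, reduces the effect of other activity on the machine.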