Last week on XML-DEV I suggested that there should be a competition with prize money to stimulate development of faster XML parsers suitable for high-transaction work. Elliotte Harold even quoted my (not original) line: "Open Source is not a free lunch, it is stone soup."
There are precedents for industry consortia sponsoring such development, the OpenMP implementation contest for one. And groups such as OSDL offer fellowships to make sure strategic software is maintained.
At the tail of this discussion, Robin Berjon of the W3C Efficient XML Interchange WG announced a competition for a fast XML parser. His WG wants to find the fastest XML parser in order to have some benchmark to see whether the non-XML (but XML infoset carrying and XML API interfaceable) binary formats being considered deliver enough improvement to be worthwhile.
This is exactly the right kind of initiative from the W3C. It addresses my hobbyhorse that current Open Source parsers have not been written for raw speed: in fact, I see from Perl benchmarks that there is a 30 to 1 performance difference between different parsers and interfaces. Many parsers have also not benefitted from recent advances in optimization, for example my technique for scanning with SSE on x86. (It is both sad and a tribute to James Clark that expat, about the first XML parser, is still about the fastest Open Source XML parser for some document types.) Because the parsers have not been written for speed, the relative performance of optimized binary formats is mercurial: it depends heavily on which parser you measure them against.
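To give a feel for how much the choice of parser and interface alone can matter, here is a minimal timing sketch. It is not the Perl benchmark mentioned above and not the WG's test suite: it simply runs the same document through three parsers that ship with Python (expat directly, ElementTree, and minidom). The `make_document`, `count_with_*` and `best_time` names and the synthetic `<txn>` records are all made up for illustration, and the numbers you get will vary by machine and document type.

```python
import time
import xml.etree.ElementTree as ET
import xml.parsers.expat
from xml.dom import minidom


def make_document(records: int) -> bytes:
    """Build a synthetic document (hypothetical test data, not a real workload)."""
    rows = "".join(
        f'<txn id="{i}"><amount>{i * 1.5}</amount><memo>payment {i}</memo></txn>'
        for i in range(records)
    )
    return f"<batch>{rows}</batch>".encode("utf-8")


def count_with_expat(data: bytes) -> int:
    """Count elements with raw expat callbacks; no tree is built."""
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(data, True)
    return count


def count_with_elementtree(data: bytes) -> int:
    """Count elements after building an ElementTree."""
    return sum(1 for _ in ET.fromstring(data).iter())


def count_with_minidom(data: bytes) -> int:
    """Count elements after building a full DOM."""
    return len(minidom.parseString(data).getElementsByTagName("*"))


def best_time(func, data: bytes, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over a few runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    document = make_document(20000)
    for func in (count_with_expat, count_with_elementtree, count_with_minidom):
        print(f"{func.__name__}: {best_time(func, document):.3f}s")
```

All three functions do the same job (count the elements), so any spread in the timings comes from the parser and the interface layered on top of it, which is the kind of gap a parser competition would need to pin down before binary formats can be judged fairly.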
However, I don’t know if the competition will work. There needs to be money involved. It is unreasonable to expect someone of the quality of, say, SAXON’s Michael Kay to work without either support or the chance of reward. The software strategists and information architects at banks, Fortune 500s, governments, militaries and other consortia who use Open Source software should consider the benefit in time, timeliness, cost, ROI, efficiency, lower hardware outlay, etc. that, say, a 50% speed-up in XML parsing time would bring: is this worth contributing to, say, a prize equal to two weeks’ salary of a good programmer (say US$4000)? That’s one hundred bucks each, if the cost is shared among the 40 members of your consortium.
I see it as a no-brainer, myself. Institutional users of Open Source software have a great opportunity to guide development in ways that suit their organization: to encourage real products and open research that can also potentially feed back into even proprietary parsers.
I think the current top two strategic investments for encouraging higher transaction rates in Open Source XML software would be, first, for some group to put up a prize for the W3C Efficient XML Interchange competition (perhaps with a longer deadline), and second, for people who use Michael Kay’s SAXON software to take out corporate licenses. (Michael uses the higher-end product licenses to support work on the lower-end Open Source product.)
Indeed, I think Open Source development does not deliver its best results when the customer is a large corporation at a financial distance from the developer. So banks, governments, corporations, consortia, vendors and so on really have a vital interest in stimulating Open Source development that meets their needs, rather than being passive bottom feeders.
Just because some software is branded Apache or GNU or SourceForge does not mean it has been designed or optimized for high performance. In fact, often the reverse: much Open Source software is designed for exploration or maintainability or to meet a particular real or research need of the original developer. If you look at the Apache Xalan 1.0 code, for example, there is almost no attempt at optimization. And even the development plan for the Apache Xalan 2.0 code has very little emphasis on optimization. (Indeed, Apache is a byword for frameworkitis.)
Bottom line: businesses that use Open Source have a vital interest in that software having appropriate performance and quality attributes. They need not be passive in this area; on the contrary, there are opportunities to influence Open Source development cheaply, through contests, licenses, fellowships, ex-gratia payments, fee-for-service customizations and the like.
One other good point made in the XML-DEV thread was that there are many good improvements to be made using existing software, just by configuring it optimally. I certainly wouldn’t want to stand in the way of a competition for, say, the best Xerces configuration manual or the easiest-to-use API. Indeed, I think as we settle down to an Open Source-only (!) software economy, this kind of cost-sharing will become more the norm.