Håkon Wium Lie’s recent CNET column is entirely dodgy in its details, but solid in its ultimate premise (can a premise be ultimate?): in all the talk about ODF and OOXML, it is important not to lose track of HTML’s potential and actual suitability for much document interchange.
I’ve endorsed this position many times, but it is worth stating it again. For simple word-processing-style documents, if you need interoperability (and you want to get it by restricting the kinds of structures in the document so that the documents can be read by many different applications and be easily repurposed), then HTML is the format to consider first: validated, standards-compliant XHTML in particular. Think of it in terms of a continuum, with HTML at one end (simple WP documents) and PDF at the other end (full page fidelity but read-only): HTML, ODF, OOXML, PDF. And certainly not to forget the ultimate premise(!) of markup: to rigorously label the important information in your documents according to its rhetorical and semantic structures, which sometimes simply requires custom schemas and microformats, extending or augmenting or even replacing the standard formats.
On this last point, Lie has a great line: speaking of ODF and OOXML … “I’m no fan of either specification. Both are basically memory dumps with angle brackets around them.” Lie thinks this is a bad thing; I think it is a necessary thing: sometimes you want to save only what can be read everywhere (the case with HTML save) but usually you want to save everything that is in your document so that when you next open it, it is exactly the same publication you saved.
W3C versus ISO
When looking at any writer on standards at the moment, it is good to establish point of view. Lie is employed by a competitor of Microsoft, Opera Software, whose business is based on standards from W3C not ISO. It is not shocking that his response to the issue of Office formats at ISO revolves around arguing that applications should follow W3C standards. It is not as much a non sequitur as it seems. Lie is the inventor of CSS: it is hardly surprising he does not want it sidelined, especially if organizations or governments adopt ISO formats ahead of W3C ones.
Perhaps it is time for W3C to take ISO seriously, befriend it, snuggle up to it, and put the core Web standards through some kind of fast-track procedure as well? Deal yourselves a better hand! The world of fast-track that ISO JTC1 has, in its wisdom, launched us into is intended to allow specifications from consortia that corporations can participate in and dominate (W3C, OASIS, Ecma, etc.) to get promoted up to ISO standard level, if national bodies vote to accept them. This is because ISO sees its core activity as enabling agreement, not sniping at rival brands.
ISO has taken a more constructive approach than denigrating the boutique standards consortia: it does not denigrate Ecma merely because it is designed to help companies make their technologies public with copyright-free specs fast; nor W3C because it may be extra-accommodating to the larger fee-payers (such as the ‘phone calls’ that Lie mentions); nor OASIS because its procedures made it susceptible to ‘branch stacking’ to favour one group’s technology. Because, at the end of the day, ISO votes are very hard for commercial organizations to manipulate. There are simply too many countries. Where manipulation is possible, of course, is when there is the appearance of a grassroots movement in many countries; however, at least some national standards bodies (and their committee members) take a dim view of lobbying on non-technical issues, and certainly of single-issue committee people with no real interest in promoting standards.
Malthus
Lie adopts an extreme view towards overlap of standards: any overlap at all brings nothing but misery and bloat. I doubt there is much sympathy for his view that XSL was unnecessary in 1999 because of CSS (CSS now has a better set of formatting properties and selectors, but where are the transformations?). And to blame MS for XSL rather ignores James Clark’s involvement. [Lie thinks ODA and SGML were competitors, even though they took extremely different approaches (binary, fixed, application-dependent format, versus text-based, free, universal): unfortunately, he does not burden us with any facts behind his mention that 1986’s ISO SGML was more complicated because of ODA, something I hadn’t heard before.]
But what about the dodgy details?
Borat!
Well, the first one is personal, and the reason I am bothering to write this. Lie says “In the past, consultants paid by Microsoft have joined standardization groups and have become sympathetic voices. Are they buying countries this time?” Is Lie saying that small nations have no interest in documents and document processing languages? Again, no detail, just slurs. (I am sensitive to this, of course; when I first read it, I wondered if Lie was referring to me obliquely, which would be fairly hypocritical considering Lie himself is undoubtedly a sympathetic vote for his company at W3C. Companies or their representatives are allowed to participate in standards work, indeed encouraged, but the difference with ISO is that they don’t get a final vote. But reading it carefully, it is clear that Lie knows that individuals don’t vote on ISO standards, and that he was saying something uncertain about Kazakhstan. (I worked with two brilliant programmers from Kazakhstan, via Russia, Israel and Australia, for many years, one of whom is now at the highest technical level of a government department here; the distance between their excellence and Borat only makes Borat all the funnier!))
Ambition
The next dodgy detail is to make blanket comparisons between HTML and ODF/OOXML. ODF and OOXML deal with many issues that HTML/CSS simply does not. What is HTML/CSS’s story for spreadsheets? What is HTML/CSS’s story for ZIP packaging? (Well, I think the W3C argument might be to say that every part should have a URL and be available on the web. W3C’s worldview is bounded by the web.)
Don’t laugh at the fat girl
Lie repeats the 6000-page claim, which I thought had been retired from debate along with the more embarrassingly bad Groklaw material. But mud sticks, we say here in Australia. When you add SVG and all the other standards that ODF or HTML/CSS invoke or require to get similar general capabilities, and typeset them similarly, the numbers are not so different (in the order of 3000 to 5000 pages excluding primer material). Probably I should repeat my complexity metrics analysis now we have the final schemas out (assuming the schemas correspond to what is in the standards… the lack of SC34 checking on all fast-track standards means that national bodies vote(d) for ODF and OOXML without the most basic quality check from SC34 WG1, grrr).
Deathwish?
Lie has a strange theory that MS wants ODF and OOXML to both fail: I am not sure what possible mechanism could be used for this. “If both specifications fail, the most likely result is that the world continues to use Microsoft’s proprietary ‘doc,’ ‘xls’ and ‘ppt’ formats. This is consistent with Microsoft’s attitude in other areas in which the company is pushing closed formats. For example, the MSN Messenger protocol is not public.” Even apart from the basic problem that those formats are no longer the default save formats, I don’t get it: that MS should not finally do the right thing with one product (open up and standardise Office) because it would be inconsistent with them being bad elsewhere? With respect to Lie, this is complete crap. (We consider Hitler a monster, we consider him a monster even when he pats his dog on the head, but patting dogs on the head is a good thing even when done by monsters, and it is better for monsters to be patting dogs on the head rather than going around being monstrous. You get the idea. Just as in negotiation you need to have a position in which the opposition can consider itself to advance or not lose face, we cannot expect MS to embrace standards if we block them out of the process.)
Infidelity
But Lie is right, I think, to be alarmed by the prospect that if OOXML fails MS will revert away from open formats. I don’t see them adopting ODF as the default format for general sale. For a start, current ODF simply does not have matching capabilities. This issue of fit is strong enough that we don’t even need to get to the issue of control. We have this nice little window now where MS is inclined to open up its formats, something that the document processing community has been pleading for for years. The ODF sideshow runs the risk of screwing this up; I’ve said it before, but I say it again: being pro-ODF does not mean you have to be anti-OOXML. ODF has not been designed to be a satisfactory dump format for MS Office; OOXML has not been designed to be a suitable format for Sun’s Star Office or Open Office or IBM’s products. HTML is the format of choice for interchange of simple documents; ODF will evolve to be the format of choice for more complicated documents; OOXML is the format of choice for full-fidelity dumps from MS Office; PDF is the format of choice for non-editable page-faithful documents; all of them are good candidates for standardization, all have overlap but are worthwhile to have as cards in the deck of standards. But systems for custom markup trump all.
Brave New World
I guess, behind it all, there is this idea that there will be one true document format. The future will be beautiful because we will have full-potential HTML. Or the future will be beautiful because we have full-potential ODF. Or the future will be beautiful when we adopt the same common microformats regardless of the framework. I tend to the view that the future will never be beautiful in that kind of monochromatic (‘Stalinist’ is entirely too dramatic) way, but that we need to encourage a rich library of standard technologies, widely deployed, free, unencumbered, explicit, together with the awareness of when each is appropriate and with an adequate set of profiles and profile validators (using ISO Schematron!). Plurality. (HTML browsers are not weaker because there is GIF, JPEG and PNG, let alone TIFF, even though there is almost complete overlap.)
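To make the idea of a profile validator a little more concrete, here is a minimal sketch in Python. It is not Schematron: a real profile would be written as Schematron rules and run through a Schematron processor. This just shows the same kind of check using the standard library. The element list in `ALLOWED` and the function name `profile_violations` are hypothetical, invented for illustration; they do not come from any standard.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical "simple document" profile: the only XHTML elements permitted.
# This list is an assumption for illustration, not taken from any standard.
ALLOWED = {
    "html", "head", "title", "body",
    "h1", "h2", "p", "ul", "ol", "li",
    "em", "strong", "a",
}


def profile_violations(xhtml_text):
    """Return the sorted local names of elements that fall outside the profile.

    Raises xml.etree.ElementTree.ParseError if the input is not well-formed XML.
    """
    root = ET.fromstring(xhtml_text)
    bad = set()
    for element in root.iter():
        tag = element.tag
        if not isinstance(tag, str):
            continue  # skip comments and processing instructions, if present
        # ElementTree spells namespaced tags as "{namespace}local".
        namespace, _, local = tag.rpartition("}")
        if namespace != "{" + XHTML_NS or local not in ALLOWED:
            bad.add(local)
    return sorted(bad)


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>Memo</title></head>"
        "<body><p>Plain text and a <table><tr><td>table</td></tr></table></p></body>"
        "</html>"
    )
    print(profile_violations(sample))
```

The point of the sketch is only that a profile is a restriction you can test mechanically: a document either stays inside the agreed subset or the validator names what strayed outside it.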


I tend to the view that the future will never be beautiful in that kind of monochromatic ('Stalinist' is entirely too dramatic) way, but that we need to encourage a rich library of standard technologies, widely deployed, free, unencumbered, explicit, together with the awareness of when each is appropriate and with an adequate set of profiles and profile validators (using ISO Schematron!). Plurality. (HTML browsers are not weaker because there is GIF, JPEG and PNG, let alone TIFF, even though there is almost complete overlap.)
Bravo, Rick!
While I appreciate the suggestion that MS generously blessing us with OOXML is akin to Hitler patting his dog on the head (does Godwin apply when comparing your own client to Hitler?), I find the likening of a world where the leading document format is completely interoperable to monochromatic Stalinism a little over the top.
Dave: Thanks
Finite: Indeed. That's why I said "entirely too dramatic". We finally agree on something!
Once again your excellent grasp of the shallow technicalities of the ODF/OOXML debate seems to prevent you from understanding the underlying issues that generate the debate in the first place.
OOXML is a bug-wards compatible MS Office dump format. That means that it will allow me to manage documents in the foreseeable past. Many people, myself and a number of governments included, think that looking into the unforeseeable future is a better bet.
The core issue is that OOXML cannot be implemented by anyone who is not MS, because it will infringe upon their intellectual property. Thus, it cannot be used as an interchange format with systems not using MS Office, and it cannot be used as an interchange format with our future selves or our children and grandchildren. Given that, it is useless to me and millions of others, and its existence ensures that billions of documents will be closed.
I agree with you when you write of plurality of standards for different purposes. The problem is that OOXML's purpose is vendor lock-in, and I don't want that. I am delighted that people can make a good living transforming documents from one format to another. I just don't want to do it for my documents, or force governments and non-profits to spend money on it. Profit-directed companies will find the cheaper way on their own.
Your comment at the end of your post sums up your position perfectly, suggesting that "HTML browsers are not weaker because there is GIF, JPEG and PNG, let alone TIFF, even though there is almost complete overlap". It shows a willingness to ignore both the technical differences (lossless vs. lossy, small colour palette vs. large) and, more importantly, the intellectual property differences: the encumbrances on patented formats are a huge deal, and you keep ignoring them.
William: But the presence of PNG as a viable alternative (real or potential) killed off the GIF patent threat. Plurality is good.
It is delightful to be accused of ignoring technical differences; only recently I was accused of nitpicking, not to mention wilfully ignoring elephants. I should have used "competition" rather than "overlap" because that is the term Håkon Wium Lie uses (hence he says that ODA and SGML compete...), which is the context for my comments.
On the issue of IP and encumbrances, I have read the statements in the OASIS specs and the Ecma specs on copyright, and by Sun and MS on IP, and I don't see any problem, on the face of them. I don't ignore IP issues, I am just bored by them and by alarmists in general; I am happy to leave IP issues to ISO JTC1 and their PAS/Fast-Track lawyers and national bodies to work out.
Hmm, I made the same point about SVG in a report somewhere, but my conclusion was basically that one could hope for libraries for SVG implementation that implementors could reuse.
Maybe at some point there will be libraries for VML reuse, but I'm thinking the bets are on the SVG side in this matter.
I think that the goodness of Microsoft in opening up is viewed suspiciously not just because of previous bad behavior but because it is assumed that it would be more beneficial to them not to do so. The conclusion of that viewpoint would be that they are doing so to protect business against encroachment by ODF, thus if one has this view it is natural to say 'they want them both to fail.'
Actually this is the viewpoint I have, the view gives a good guiding narrative for why things are the way they are. But that a good narrative exists to explain events does not make the narrative true.
Now it's Spy vs Spy vs smaller spy.
I can't blame him for defending Opera's bread and butter application. That is his job. I can't find fault with using HTML as the exchange format for low complexity documents. I've done quite a bit of that and it works great. The rest of the article is filled with sleights of fact and misdirection, and that is a grinding axe.
I don't think MS will turn away from the standards game. It benefits them internally and externally as a means to focus their developers and their customers. It is part of the contracting game they have to play as do others. In the internals of the exchange of data and documents, HTML will still get the majority of the bits on the wire.
Much ado about nothing. Attention is shifting away from such issues as how to move data to how to keep them online at lower environmental costs.
We all have our dreams. My dream begins with someone from the state lottery calling me. Rick's dream begins with OOXML joining with lots of other good little standards in a fairy tale wedding down at the Justice of the ISO. The only thing is, it is merely a fantasy. Just as VHS eventually killed Beta, one of the two office document file formats will eventually drive the other one to extinction, as they really serve the same purpose, so only one is necessary. In biology-speak, they both fit into the same ecological niche, and thus the success of one will come at the expense of the loss of the other.
Let us not compare Microsoft to Stalin. They have never been that bad, and indeed never will be.
Let us, however, recognize that the elimination of competition has been a long-term strategy of that company. Viewed in that context, it is perfectly understandable why they propose a second "standard" using similar technologies in the same problem domain. OOXML's purpose is to deflect competition, pure & simple. All the rest is decoration, meant to keep decision-makers from looking too deeply at their purposes until they have already decided to continue using Microsoft-specific formats, as I mention here.
Len: Oh, I wasn't meaning to imply anything improper about Lie or his comments. His POV is important, and I sometimes use Opera myself especially when doing demos.
W^L+: Your nick is a nightmare... I am using a mate's PC which has a Japanese layout keyboard. As for winning, I think inside ODF and OOXML are many little languages, and each will compete individually. In the long term, I suspect there will be some hybrid, with both teams supporting a modified Open Packaging Convention (now Ecma only), neither side agreeing on a native WP format but both supporting import/export in each format, Tiny SVG being required for government document interchange in either format (now only ODF) but DrawML/VML allowed for strippable decoration, and MS's formulae with extensions. In other words, I think a convergence of parts driven by profiles is the way things will go over the next decade, and after that, who knows...
I've asked many experts to take out their crystal balls. A few of them had glass eyes too. (Old Ronnie Barker gag.)
@William
I think the intellectual property issue is nicely discussed by The Wraith:
http://ooxmlhoaxes.blogspot.com/2007/02/ooxml-hoax-2-standard-is-not-really.html
It also gives a good argument for why Microsoft Office can't use ODF as a default, as that is a format whose future development is still under the control of a competitor, namely Sun.
Also, I guess the recent MP3 verdict costing MS 1.5 billion dollars will make it highly unlikely that Microsoft will start using native formats they have no IP rights to themselves in the near future, unless they get some kind of guarantee that they won't be liable.
Not improper, Rick. Just disingenuous.
Why shouldn't MS publish an open standard for the document types they popularized and support? If Opera used PDF as its native format, they would support that. They support CSS for the same reason.
Len: Sometimes I think that it just is an issue of personality type. The artistic temperament favours exploration and alternatives; the soldierly temperament favours fixity and unity. (Well, actually that is a bogus stereotype: the marines, for example, don't like plans, I'm told, because their engagements don't work that way; and some artists are very keen to establish their movement to supplant the old.)
hAl,
I do not agree with The Wraith's view about ODF (see my comment in Hoax 5 and his reply), nor do I agree with yours.
Why would Microsoft HAVE to use ODF as the INTERNAL format or NOT USE it at all? To go in line with Rick's prediction, MS should simply allow reading and saving ODF as one of the 30 or so currently supported formats. People that prefer OOXML will make this their default format; governments and people that prefer ODF will use ODF.
Then the market will evolve, and will kill ODF or kill OOXML or merge them as suggested by Rick. This is the TRUE way for Microsoft to take care of interoperability and offer choice to their customers.
Luc: Yes, the details of which menu MS puts their ODF converter in are interesting to some people.
By the way, are you the Luc Bollen who (LinkedIn says) works for Market Research for IBM in Belgium?
Rick, no I don't have any links with IBM, and I'm not a market researcher. I work as project manager and consultant for a Europe based company providing IT services and consultancy.
By the way, my interests in ODF/OOXML are personal, and not linked to the work I perform for this company.