Peter Sefton’s recent blog posts are some of the few sensible, non-partisan things I have read about Open XML and ODF lately. He has some really good, detailed posts on Word 2007 and lists.
Peter is far from an anti-Microsoft partisan: indeed, he is probably a poster boy for the kind of developer Microsoft needs to attract with Open XML and Word 2007. He is one of the key players in the free ICE (Integrated Content Environment) project.
Peter says “the big point that always seems to get missed when people talk about word processing formats” is “use styles.” Styles represent a sweet spot between crappy hodgepodge unusable office documents and retargetable information as allowed (with more effort) by XML with all its information-protection mechanisms like validation.
Peter’s comments on list interoperability with ODF and Open XML are, I think, a real litmus test for how seriously we should take either (or both) as formats.
Obviously the people behind ODF and Open XML are very aware that XML formats for office documents represent a bargain-basement kind of interoperability. I spoke with Microsoft’s Jean Paoli for a couple of hours a week ago and had dinner with the ISO ODF editor Patrick Durusau in Seoul: Jean emphasized to me that Open XML was highly targeted at minute fidelity, to replicate existing .DOC etc. files, while Patrick is a long-standing member of the Topic Map community.
Open XML and ODF each have to be considered a great step forward, I think, but nevertheless only the first step. Microsoft and ODF/Sun/IBM should work hard to make sure that they (and the various leading applications that create or use those formats, notably Office and Open Office) provide the basic functionality to allow the next step forward: higher-level structural markup through styles.
And thence to rigorous markup. Rigorous markup is the name of the game, ultimately. That is where you can enforce (or at least confirm) that enough data and metadata are present to construct useful systems and archives. I think Schematron has a great prospective role here: it is the only standard schema language that lets you represent constraints overlaid on top of lesser schemas. For example, you can require that the styles provided in attributes conform to a particular pattern (in an architectural forms kind of way, much as with HTML’s class attribute), or that lists must be marked up using styles, or particular styles, in the way that Peter Sefton recommends.
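To make that last kind of constraint concrete, here is a minimal sketch of the sort of rule I have in mind, written in Python rather than Schematron just to keep it runnable anywhere. It assumes a WordprocessingML-style fragment in which a list paragraph carries a w:numPr element, and it assumes a house rule (my invention, not Peter's and not part of either format) that such paragraphs must use a style named like "ListBullet" or "ListNumber2". The function name and the sample document are likewise hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as used by Open XML word processing documents.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical house rule: list paragraphs must use a style named like
# "ListBullet" or "ListNumber2". Adjust the pattern to your own style names.
ALLOWED_LIST_STYLE = re.compile(r"^List(Bullet|Number)\d*$")


def check_list_styles(document_xml):
    """Return one message per list paragraph that breaks the style rule."""
    root = ET.fromstring(document_xml)
    problems = []
    for index, para in enumerate(root.iter(f"{{{W}}}p"), start=1):
        ppr = para.find("w:pPr", NS)
        # Only paragraphs with direct numbering properties count as list items here.
        if ppr is None or ppr.find("w:numPr", NS) is None:
            continue
        style = ppr.find("w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else None
        if style_id is None:
            problems.append(f"paragraph {index}: list item has no paragraph style")
        elif not ALLOWED_LIST_STYLE.match(style_id):
            problems.append(
                f"paragraph {index}: list item uses unexpected style {style_id!r}"
            )
    return problems


# A made-up three-paragraph document: a styled list item, an unstyled list
# item, and an ordinary paragraph.
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="ListBullet"/><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr><w:r><w:t>Styled item</w:t></w:r></w:p>
    <w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr><w:r><w:t>Unstyled item</w:t></w:r></w:p>
    <w:p><w:r><w:t>Plain paragraph</w:t></w:r></w:p>
  </w:body>
</w:document>"""

if __name__ == "__main__":
    for message in check_list_styles(SAMPLE):
        print(message)
```

Run against the sample, the check flags only the second paragraph, the list item with numbering but no style. A Schematron schema would state the same thing declaratively, as an assertion in the context of list paragraphs, and could be layered over the format's own schema without touching it. This sketch also ignores lists whose numbering comes entirely from the style definition, which a real check would need to handle.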