I think by now most people are pretty phlegmatic about the various assertions made about Open XML and ODF, rather than accepting them at face value. The sky is not falling. The boy is crying wolf. A sieve is full of holes no matter how loudly someone shouts that it is a bucket.

When, for example, one side says “Open XML normatively refers to MS’ proprietary WMF” and the other side says “Err, where? Not in the Normative References sections” and the first side says “Err, then there is an *implied* normative reference because a mention is made of it elsewhere as a possible kind of graphic that may come in from the clipboard” and the other side says “The ISO usage of ‘normative’ revolves around indispensability: isn’t ‘possible’ the opposite of ‘indispensable’?…” disinterested observers may think: surely there is a more constructive approach? These silly examples are distractions from serious concerns.

So here is what I suggest, for national bodies reviewing Open XML: adopt a set of general principles and apply them (to Open XML, ODF, and whatever). When someone raises a specific issue, verify that the issue indeed is as claimed, find the general principle, and base your responses on that, with the particular flaw as an exemplar. The tactic adopted by some activists is to read the draft text, think of the worst possible interpretation and ramification, then insist it is the case: the “normative reference” example is a good case of this. The trouble with this approach is that it won’t work; impartial reviewers will note that there is some kind of concern but that the actual issue raised is not a problem. The result will be frustration and a lack of a “meeting of the minds”. Indeed the legitimate issues that underlie some of the anti-OpenXML comments risk being unaddressed.

What kind of principles would there be? Here are a few off the top of my head:

Principle 1: A schema must allow standard data notations for atomic, embedded data fields, where such standards exist, and may also allow local, common, optimised or legacy notations.

Applying this to Open XML, for example, it would mean that where DrawingML uses EMU coordinates, it also should allow inches, cm and points. And where SpreadsheetML allows numbers for date indexes, it also should allow ISO 8601 dates directly. Do you see the difference between saying “Open XML should be banned because it uses EMU” and “Open XML should be improved to allow more than EMU”? The most important thing is that this is a superficial change to the exchange language, not to the underlying model: it doesn’t force MS to adopt a different model or require them to generate standard units. (That is a different issue: the issue of profiles or application conformance.)
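To show how superficial such a change is, here is a minimal Python sketch of a reader that accepts both the local notation and a standard one and maps each onto the same internal value. The conversion factors (914400 EMU per inch, 360000 per cm, 12700 per point) are the usual EMU definitions. Everything else is an assumption made for illustration: the unit-suffix syntax, the function names, and the choice of the 1900 date system with an epoch of 1899-12-30 (which is only right for serial numbers after February 1900) are mine, not something taken from the Open XML text.

```python
import re
from datetime import date, timedelta
from fractions import Fraction

# EMU per unit; a bare number is taken to be EMU already.
EMU_PER_UNIT = {
    "emu": Fraction(1),
    "pt": Fraction(12700),
    "cm": Fraction(360000),
    "in": Fraction(914400),
}

# Hypothetical lexical form: a decimal number plus an optional unit suffix.
_LENGTH = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([a-z]*)\s*$")


def length_to_emu(text):
    """Map '914400', '1in', '2.54cm' or '72pt' onto an integer EMU count."""
    match = _LENGTH.match(text.lower())
    if not match:
        raise ValueError(f"unrecognised length: {text!r}")
    number, unit = match.groups()
    unit = unit or "emu"
    if unit not in EMU_PER_UNIT:
        raise ValueError(f"unknown unit: {unit!r}")
    # Fractions keep the arithmetic exact; round only at the very end.
    return round(Fraction(number) * EMU_PER_UNIT[unit])


# Assumption: 1900 date system, valid for serials after February 1900.
SERIAL_EPOCH = date(1899, 12, 30)


def cell_to_date(text):
    """Accept either an ISO 8601 calendar date or a serial day number."""
    text = text.strip()
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return date.fromisoformat(text)
    return SERIAL_EPOCH + timedelta(days=int(text))
```

Under these assumptions, length_to_emu("1in"), length_to_emu("2.54cm") and length_to_emu("72pt") all give the same EMU count, and cell_to_date("39448") gives the same date as cell_to_date("2008-01-01"). The underlying model never sees which notation was used.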

Principle 2: A schema should allow direct representation of data fields, and may allow optimised forms as well.

Applying this to Open XML, we see that the string approach taken by SpreadsheetML conforms: you can have text directly or index to a shared string table. Adopting this principle lets a National Body vet the issues: if someone says “This doesn’t look like HTML! Therefore it is bad!” the NB can say “We adopt the principle that optimized references can be allowed as long as literal content is allowed too”.
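As a rough illustration of the two forms living side by side, the following Python sketch reads a simplified row in which one cell carries its text inline and another indexes into a shared string table. The fragment is cut down for the example and the table is passed in as a plain list rather than read from its own part; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Simplified row: A1 holds literal text, B1 holds an index into the table.
ROW = """<row xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <c r="A1" t="inlineStr"><is><t>Hello</t></is></c>
  <c r="B1" t="s"><v>1</v></c>
</row>"""

# Stand-in for the shared string table (normally a separate part).
SHARED_STRINGS = ["unused", "World"]


def cell_text(cell, shared_strings):
    """Return the text of a string cell, whichever form it uses."""
    kind = cell.get("t")
    if kind == "inlineStr":
        # Direct representation: the text is in the cell itself.
        return cell.find("m:is/m:t", NS).text
    if kind == "s":
        # Optimised representation: the cell value is a table index.
        return shared_strings[int(cell.find("m:v", NS).text)]
    raise ValueError(f"not a string cell: t={kind!r}")


def row_texts(row_xml, shared_strings):
    row = ET.fromstring(row_xml)
    return [cell_text(c, shared_strings) for c in row.findall("m:c", NS)]
```

A consumer that only cares about the text gets the same answer from either form, which is the point of the principle.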

Principle 3: A schema language for compound documents should support an indirect or over-riding reference mechanism for entities or resources, and may disallow a direct mechanism.

SGML and XML DTDs have a mechanism called Entities that allows indirect references. This is really important for maintenance of large documents, because it disconnects references from names: you can swap in an updated graphics file by changing a single declaration rather than every reference to it. Applying this to Open XML, OPC meets the criterion. OASIS catalogs would also probably fit the bill.
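The maintenance benefit of indirection can be sketched in a few lines of Python. This is a toy model, not the OPC or entity machinery itself: the relationship ids and file names are invented for the example.

```python
def resolve(references, relationships):
    """Turn the indirect ids used in the content into actual targets."""
    return [relationships[ref] for ref in references]


# One declaration maps a logical id to a physical resource.
relationships = {"rId1": "media/logo-2006.png"}

# The content mentions the logo three times, always by id, never by name.
body_references = ["rId1", "rId1", "rId1"]

before = resolve(body_references, relationships)

# Replacing the graphic means editing one entry; the content is untouched.
relationships["rId1"] = "media/logo-2007.png"

after = resolve(body_references, relationships)
```

With direct references, the same change would have meant finding and editing all three mentions in the content.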

Following from principles 1 and 2, an indirect reference mechanism should allow the standard notation (IRIs) but may also allow a local or optimized form. Applying this to Open XML, this principle would mean double checking that IRIs are allowed in OPC (I will check this sometime); I don’t think that OPC uses a local, optimized or legacy form. (I will check this sometime.)

Principle 4: Notations for legacy or obsolescent technologies may be included in a standard, but should be in an informative part, clause, namespace or annex.

Applying this to Open XML, the sections on VML would be marked “informative”.

Principle 5: A standard should be arranged as a modular, simply layered container, to allow plurality and evolution.

I am not sure of the ramifications for Open XML: I need to check Part 5 of the standard, which deals with extensions and future-proofing. Certainly the use of MIME types in OPC follows this principle, but it goes further than that: could DrawingML be augmented or replaced by SVG, for example? (I will check this sometime.)

Principle 6: A standard core should be platform-neutral and may allow optional platform-dependent extensions, in a separate annex, namespace or clause where appropriate.

I think Open XML is OK in this regard: it allows Word macros, Java, and other scripts, but these are not required and, IIRC, are partitioned.

Principle 7: A standard should address a market requirement, and the availability of a standard for one market or set of requirements does not preclude the development of a standard for a different market or set of requirements.

In other words, no standard should be denied merely on the grounds of “My requirements are more important than yours”. In the case of Open XML, it means that “don’t ignore the elephant in the room” arguments (that the needs for level-playing-field basic document exchange by governments and suite vendors, ODF’s supposed sweet spot, trump the needs of integrators, archivists, and so on for Office’s format to be standardized) would be rejected. (Not rejected from all consideration of course, but relegated to their proper place, which is for legislators, regulators, CIO policy makers, and profile makers, not ISO.)

Whither Interoperability

When a standard follows the kinds of principles above, it allows full fidelity (the main principle behind the design of Open XML) to meet round-tripping, API-replacement and archiving requirements, and it also sets the stage for interoperability between different systems. This is where profiles come in: in addition to the broad requirements of the standard, specific limitations are imposed so that all the different kinds of local, legacy, optimized, common-but-non-standard, and platform-dependent notations, media types, scripts and so on are avoided. ODF has just as much need for these kinds of profiles as Open XML does, as far as document interchange goes, by the way.
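A profile in this sense can be pictured as a list of restrictions checked on top of the broad standard. The Python sketch below is only a picture of the idea: the profile name, the set of permitted media types, and the way a document is reduced to a list of (part name, media type, is-script) tuples are all assumptions made for the example, not anything defined by Open XML or ODF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    allowed_media_types: frozenset
    allow_scripts: bool = False


def violations(profile, parts):
    """List the parts of a document that fall outside the profile.

    `parts` is a list of (part_name, media_type, is_script) tuples.
    """
    problems = []
    for part_name, media_type, is_script in parts:
        if is_script and not profile.allow_scripts:
            problems.append(f"{part_name}: scripts are not allowed in {profile.name}")
        elif media_type not in profile.allowed_media_types:
            problems.append(f"{part_name}: {media_type} is outside {profile.name}")
    return problems


# Hypothetical interchange profile: standard notations only, no scripts.
INTERCHANGE = Profile(
    name="interchange-profile",
    allowed_media_types=frozenset({"application/xml", "image/png", "image/svg+xml"}),
)

# A document that is fine under the broad standard but not under the profile.
full_fidelity_doc = [
    ("document.xml", "application/xml", False),
    ("chart.wmf", "image/x-wmf", False),
    ("macros.bin", "application/octet-stream", True),
]

found = violations(INTERCHANGE, full_fidelity_doc)
```

The same document is valid against the enabling standard and invalid against the profile; neither result contradicts the other, because they answer different questions.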

It is a kind of paradox: an “open” data format must be extensible, but the more that extensions are used, the more that only a closed range of applications will be able to use the document; a document format that is “open” in the sense of having a fixed definition that allows guaranteed document interchange must actually be a “closed” (non-extensible) format! The solution? The long-standing policy of SC34 is to standardise “enabling technologies” and to leave profiles to user groups and industry consortia: XML itself is an example of this. ISO SGML allows many different delimiters; the industry consortium W3C picked a particular set of delimiters and features, added some internationalization features, and re-branded their profile “XML”, which gives simpler interoperability.

In the absence of these kinds of principles, what we have is a line of argument that reduces to “Microsoft is bad, therefore anything they do or make is bad”, even when Microsoft is forced to backflip and to start doing the opposite of what they previously did: in this case, abandoning closed, binary formats. Ten years ago, Bill Gates was saying they would be crazy to open up their file formats; now they are doing it. If users and, most importantly, system integrators keep on encouraging them to further open up and adopt a more modular architecture, it bodes well for where we will be in ten years’ time. The future is mix and match.