March 2008 Archives

The IETF has promoted the next revision of the XML standard to recommendation status. Among the improvements for XML 2.0 are no more schemas, reduced processing frameworks, an expansion of namespaces, and automatic transformation of tags to other formats such as jsOff, CSV, SQL, and a compressed binary form based on two’s complement.

Eric Larson

Bill recently commented on another small flare-up on the REXML front. It is too bad that Ruby doesn’t have a better set of libraries for XML. As Bill mentions, Python does a great job with XML. He mentions ElementTree, which is definitely better than something like pure DOM. Another option is lxml, which implements the ElementTree API and includes some pretty slick objectify functionality. Ian recently performed some rather unscientific, but still interesting, benchmarks on some Python libraries for parsing HTML, and found lxml to be quite the performer. There is also the 4Suite and Amara toolset, which provides a very comprehensive suite of XML tools, including an entire XML/RDF-based document repository and a full-featured XSLT engine.
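To give a feel for why the ElementTree API is more pleasant than pure DOM, here is a minimal sketch using the xml.etree.ElementTree module from the Python standard library. The sample document, its element names, and the titles_after helper are all made up for this illustration; lxml offers a largely compatible API through lxml.etree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<library>
  <book year="2004"><title>XML Basics</title></book>
  <book year="2007"><title>Ruby and XML</title></book>
</library>
"""

def titles_after(xml_text, year):
    """Return the titles of all books published after the given year."""
    root = ET.fromstring(xml_text)
    return [
        book.findtext("title")
        for book in root.findall("book")
        if int(book.get("year")) > year
    ]

print(titles_after(SAMPLE, 2005))
```

Elements behave like lists of their children and attributes are read with get(), so there is none of the node-type bookkeeping that DOM code tends to need.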

It makes me wonder why the Ruby community has not stepped up with some better options. The Python community is very similar in that XML has not been a hallmark of the community as compared to Java or .NET. One argument could simply be time, since Python has been around a bit longer. No matter the reason, I think it is time for the Ruby community to consider stepping up and producing a healthy alternative to REXML. My first step would be to start with the libxml bindings and go from there. Both lxml and Amara have proven that utilizing a fast C library for the grunt work pays off in the end.

Lastly, I want to make it clear that REXML is still a pretty great tool. It meets the needs of many of its users, which is more than many software projects seem to accomplish. With that in mind, let’s not stop there when we can do even better to make Ruby a great language for working with XML.

One valid answer to the question in the title would be: I’m into both linked-data and RDFa. Hey, but that’s not the answer you are interested in, right? We’ll have a look into both and find a better answer by the end of this post. Oh, right, by the way, let me introduce myself briefly. I’m new to xml.com and I try to focus on Semantic Web stuff.

In the beginning, there was the URI. Kingsley recently wrote about it, coming from the plain old untyped @href hyperlink. Then there was RDF, not so well known, and still often confused with one of its serialisations, namely RDF/XML. But there are other ways to deploy RDF as well. In a couple of weeks, presumably, RDFa will be finalised by the W3C. RDFa is all about delivering structured metadata in HTML. Much like microformats, RDFa uses attributes to ‘hide’ - or, more technically: embed - metadata in HTML.

Coming back to URIs: hyperlinks were basically the success factor of the Web as we know it. Typed or semantic links are expected to be the same for the Semantic Web. TimBL wrote up the so-called linked-data principles a while ago (URI for everything, HTTP URI, RDF properties). An example might help in understanding both RDFa and linked-data; compare

this page is under <a href="http://creativecommons.org/licenses/by/2.5/">CC 2.5</a> license
this page is under <a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/">CC 2.5</a> license
 

The key is the rel="cc:license" bit. This is actually a piece of valid RDFa (stating that this content is under a certain license) and equally is a typed link. It overloads the simple @href hyperlink and lets an agent (be it a search bot or a syndication site) interpret and follow it properly. I think you get the point, right? To sum up: RDFa is the way of doing linked-data. Coming back to the initial question, I guess the main point is that both are manifestations of the real-world Semantic Web emerging these days. In the last couple of years, most of the people involved in Semantic Web stuff maybe thought ontologies and reasoning were the most important issues to deal with, but that is a bit like building a marvellous roof and finding out one day that there are no walls, and not even a foundation to put it onto.
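To make the difference between the two links concrete, here is a rough sketch of how an agent might pick out typed links, using only html.parser from the Python standard library. It is not an RDFa processor (it does not resolve the cc: prefix or produce triples), and the class and function names are invented for this illustration.

```python
from html.parser import HTMLParser

class TypedLinkParser(HTMLParser):
    """Collect (rel, href) pairs from <a> elements that carry a rel attribute."""

    def __init__(self):
        super().__init__()
        self.typed_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel, href = attributes.get("rel"), attributes.get("href")
        if rel and href:
            # rel may hold several space-separated values
            for value in rel.split():
                self.typed_links.append((value, href))

def extract_typed_links(html_text):
    """Return every typed link found in the given HTML fragment."""
    parser = TypedLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.typed_links

PLAIN = 'this page is under <a href="http://creativecommons.org/licenses/by/2.5/">CC 2.5</a> license'
TYPED = 'this page is under <a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/">CC 2.5</a> license'

print(extract_typed_links(PLAIN))  # the untyped link yields nothing
print(extract_typed_links(TYPED))  # the typed link yields a (rel, href) pair
```

The untyped link tells the agent only that two pages are connected; the typed one also says how.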

Rick Jelliffe

Patrick’s forward-looking post mortem is worth a read by everyone involved in standards over the last year.

M. David Peterson

Update: via a recent follow-up comment from Rick Jelliffe, we have ourselves our QOTD,

If DIS 29500 mark II has been accepted, then the narrowness of the victory needs to be something that Ecma and Microsoft take very seriously: standards maintenance needs to be a budgeted, normal cost of doing business. They should be aware that they are being thrown a lifeline, to some extent. If this becomes a one-off publicity stunt, as is the dire warning of MS’ competitors (and therefore, their own publicity stunts!), and timely, real maintenance is not performed, I would expect OOXML would be de-standardized at ISO.

[Original Post]
Open XML appears to clear ISO standard vote | Tech news blog - CNET News.com

Early reports Sunday indicate that Office Open XML (OOXML) appears to have enough votes to be certified an ISO standard. An official tally is not expected until Monday.

Some of you may have noticed that I decided a while back to ignore the whole OOXML/DIS 29500 debate here on XML.com. Two reasons: 1) Too much cost, not enough gain. 2) Rick Jelliffe had things covered from top to bottom, someone *MUCH* more qualified and capable than I to provide a proper perspective of what was going on and what it all meant.

M. David Peterson

I’m just getting back to Salt Lake City after spending the last 4 days in Seattle/Redmond at the Microsoft Technology Summit. Had a *GREAT* time, meeting, for the first time, a few folks that I’ve known through email and/or user groups/mailing lists and/or industry reputation (the good kind ;-) for quite some time. I hope to do a proper summary of the entire MTS08 event before the weekend comes to an end, but in the meantime…

Rick Jelliffe

[UPDATE] I thought I’d give some graphs for the results of the ballot-changes of DIS 29500 mark II. These are the results as at Wednesday, and I think they are final. (There is one non-P NB whose vote I am not sure of: I have shown it as abstain though it could be accept.)

Here is a graph of all the votes cast, showing the change from the initial ballot until now (as far as it is publicly known). This is based on all the NBs who voted. (However, this is not the count that is used to determine success…)
Graphic with estimates of total votes, showing that the absolute number of accepts has risen to the mid sixty percents while the absolute number of rejects has lowered to the mid ten percents

At ISO/IEC JTC1, national standards bodies (called NBs) nominate what kind of participation they are interested in, for each of the multiple subject-oriented Subcommittees (SCs). They can nominate in two classes: Participating Members (P-members) are supposed to maintain an active interest, attend meetings, and vote on all the standard drafts that come up. Observing Members (O-members) can vote, but they don’t have any obligations to show up to SC meetings.

Here is a graph of all the votes cast, showing the change from the initial ballot until now (as far as it is publicly known). This is based on the NBs who are O or P members. (However, this is not the count that is used to determine success…)
Graphic with estimated voting ratios for P and O NBs

Here is the vote when you just look at the P members (as far as it is known). Note that “abstain” votes have a very particular meaning in ISO: it does not mean “reject” or “protest”, it means that the voting body could not decide, or is happy to let the consensus of other NBs determine the outcome. There is no shame or difficulty with an NB voting abstain. (At earlier stages of drafts, there are “No with comments” votes: these often are “conditional yes” votes, which can explain how a “reject” vote can become an “accept” vote. At the current stage, however, no means no.)
Graphic with estimated votes for P-member NBs

Finally, now we have seen the big picture, we come to the real numbers that count. There is a negative test and a positive test. First, no more than 1/4 of all NBs who vote can be negative (ignoring abstains). This has not been reached (i.e. not enough rejections: the all-nation acceptances are over 75% on the following graph.)

Graphic with estimated voting ratios accept to reject for all NBs, showing over 80% acceptance rate

Then there is a positive test: at least 2/3 of all P members who vote should vote for acceptance (ignoring abstains again). This has been reached (i.e. enough acceptances: the P-nation acceptances are over 66.7% on this graph.)

Graphic with estimated voting ratios accept to reject of P-member NBs, showing over 70% acceptance rate
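For readers who prefer arithmetic to graphs, the two tests can be sketched in a few lines of Python. This only restates the rules as described above; the function name and the vote counts in the usage lines are hypothetical and are not the official tally.

```python
from fractions import Fraction

def ballot_passes(p_accept, p_reject, all_accept, all_reject):
    """Apply the positive and negative tests, ignoring abstentions.

    p_accept / p_reject:     votes cast by P-members only
    all_accept / all_reject: votes cast by all NBs (P-members included)
    Assumes at least one non-abstaining vote in each group.
    """
    # Positive test: at least 2/3 of voting P-members accept.
    positive = Fraction(p_accept, p_accept + p_reject) >= Fraction(2, 3)
    # Negative test: no more than 1/4 of all voting NBs reject.
    negative = Fraction(all_reject, all_accept + all_reject) <= Fraction(1, 4)
    return positive and negative

# Hypothetical counts, for illustration only.
print(ballot_passes(20, 10, 60, 15))  # both tests met
print(ballot_passes(19, 11, 60, 15))  # fails the positive test
print(ballot_passes(20, 10, 60, 21))  # fails the negative test
```

Exact fractions are used so that a result sitting right on the 2/3 boundary is not lost to floating-point rounding.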

So OOXML has been accepted, seemingly by 24 to 8, which is enough of a margin to avoid “hanging chad” clawback games.

It is clear that most NBs think DIS 29500 mark II makes a credible and acceptable or useful standard, but there is a substantial and active minority that does not.

Rick Jelliffe

The Java Community Process is the mechanism Sun set up to develop and evolve Java “in Internet time”. It brings together “a cross-section of both major stakeholders and other members of the Java community”. A group of experts make the initial draft, then “Consensus around the form and content of the draft is then built using an iterative review process that allows an ever-widening audience to review and comment on the document.”

The result is a specification, a reference (proof of concept) implementation, and a technology compatibility kit (tests).

One specification I have been interested in for a while is JSR 296, the Swing Application Framework. The JSR (Java Specification Request) was approved in May 2006. There is an implementation at Java.net.

However, I cannot figure out how to find the spec. Looking at the JCP site, there is everything about the spec, but no actual link to it. Looking at the implementation site, again no actual link to the spec. This strikes me as an entirely odd way to do business. What are they trying to hide? :-) Whatever it is, they are doing an excellent job of making sure that no-one finds it.

Looking at the site, it seems JSR 296 is marked “in-progress”. That means, I suppose, that it is still a committee draft that has not been released. After 18 months? So much for Internet time!

It seems that in order to see the draft, I will have to sign up to be a JCP member. For an individual member, it is $0, which is nice, but I have to send a fax to the other side of the world and wait for them to fax back a password. Or I can fax and send hardcopy by courier.

But even then, I don’t actually know that the JSR 296 draft is available for community review. The status is given as “In progress” but there is no mention of this status on the JCP description page. Presumably for the last 22 months the draft has been in the process of being written. I presume a draft exists, because there is software that claims to be an implementation of it.

What is interesting is that this is the opposite of the ISO process. At ISO using the normal rules, it is the early drafts (working drafts, committee drafts) that are given the most exposure and can be floated around openly, and only the very final draft standards that are supposed to be controlled (to reduce interoperability problems where people write systems according to different drafts rather than the final standard, and because the final standards are published commercially by ISO and standards organizations for cost recovery.)

While I am generally in favour of committee room secrecy, to prevent intimidation and silly marketing point-scoring and to disenfranchise armchair experts, and while I can understand that drafts can change substantially so you don’t want to have old drafts floating around, openness is better. But after 22 months, and after there is an implementation, to have no actual draft casually available is not “Internet time”, is it?

M. David Peterson

If there was any single issue with EC2 that was harder to overcome than any other, at least mentally if not physically and/or technically, it was that of not having access to a static IP that you could rely upon being there regardless of what machine it was mapped to.

That has now changed…

Amazon.com: Homepage: Amazon Web Services

We are excited to announce Elastic IP addresses and Availability Zones, two features that were among the top requests of Amazon EC2 developers. These new capabilities allow developers to achieve greater reliability and redundancy for their applications in the cloud, especially hosting websites. Unlike traditional static IP addresses, Elastic IP addresses can be dynamically remapped on the fly to point to any Amazon EC2 instance. Also available is the ability to launch instances in multiple Availability Zones, each with its own reliable, physically independent infrastructure, which allows developers to build fault resilient web applications through simple API calls.

Of course, if there was any other single issue that was the source of significant pain and/or worry, it was that of not being able to guard against a single (hardware) server meltdown. In other words, there was no way to ensure that, if you had multiple instances running, those instances were not running on the same physical piece of hardware. As per above, that has now changed as well.

*SWEET*! :D Thanks, AWS!

Rick Jelliffe

There has been so much disinformation put out about the limited review time for OpenXML that it might be salutary for people to revisit a review of the Open XML draft I put on this blog dated Thursday May 25, 2006.

You read it: May 2006. That is 22 months ago! Not “5 months”, not even 9 months as the claptrappists say. June, July, August, September, October, November, December, January 2007, February, March, April, May, June, July, August, September, October, November, December, January 2008, February, March.

To the people who are saying they have not had enough time in 22 months, I have no sympathy: you should have been reading my blog! :-)

I want to go through all of what I said then. I think it holds up really well.

A new draft of Open XML came out on my birthday. 4081 pages of PDF, and very impressive for anyone who has worked on specification and standards. Two things stick out: first how horrible XML Schema fragments are when stuck inline to document structure; second, how the implementation-neutral tone of the introduction is at odds with the elements for various kinds of Active X embedded objects. I suspect people would be a lot more comfortable if the elements for Active X embedded objects were in a different namespace, and gathered into an appendix of some kind. Antiques and curios. It will be interesting to see what the extensibility strategy will be (it hasn’t been released in this draft.)

By halfway through the Ecma period, the spec had doubled in size with extra material from its original submission of about 2000 pages from Microsoft. In the subsequent six months it increased by the same amount. So much for the idea that Ecma TC45 merely rubberstamped the original submission from Microsoft.

The comment about the horribleness of XML Schema fragments is still one I’d make. The BRM at least made them non-normative, but it did not agree to remove them entirely. I expect when people see the new generation of multi-format standards that some SC34 people are championing, where you can turn on and off normative sections, we can see the end of this clutter at reader request, which is perhaps the sweet spot.

The comment about Active X of course later became a mantra, with various demands that either DIS 29500 should have no normative reference to proprietary binaries or that it should have more (to bring them under the OSP). But it was an important issue that was addressed during the BRM and can benefit from continuing vigilance. The idea of gathering legacy proprietary elements into some kind of appendix is exactly what happened, at least for the compatibility elements, at the BRM. (I don’t know that many of the participants at the BRM would have been comfortable with namespace-based notions of conformance; I didn’t get the impression that using namespaces or schemas as tools was on many delegates’ radars, no disrespect intended.)

The extensibility strategy came out as a separate part, with no significant trouble as a technology. Though some people have subsequently discovered that extensibility and “openness” (meaning guaranteed receipt) do conflict: this is something I have repeatedly talked about: the need for profiles. On the general subject of extensibility and interoperability, Joel Spolsky has another good article this week: Martian Headsets

On the technical merits, well actually I don’t know if they matter much. I say potato. Exporting to HTML or XHTML gives people base-level interoperability for most documents, which neither ODF nor Open XML will challenge; at the high end the solution is exporting to XML using a domain-specific schema (e.g. S1000D for military & aerospace) and not ODF or Open XML at all; in the casual middle we will have ISO ODF available, perhaps as the interchange format of choice, as well as ISO Open XML (if it is accepted) for when you need to track MS Office’s capabilities closely. I think there is substantial value in a standard XML format for MS Office documents even within organizations that will mandate ODF for interchange and archiving. The availability of the alternatives reduces the need for ODF or Open XML to be the one true interchange format.

I think I still agree with everything there. (By technical merits, my point is not to do with the state of the draft, but about doctrinaire views on optimal technology which are ultimately subjective, and the benefits of plurality.)

I still think ODF is the appropriate format of choice for level-playing-field document interchange, especially for governments, though it seems ODF 1.2 and 2009 are the more realistic time-frames for this. And don’t forget about HTML!

Probably coming from the industrial publishing background biases me here: the need for dumbed down interchange formats is real sure enough, but the need for intricate close-to-the-metal feature-exposing typesetting feature access is also important for different contexts. Word’s binary formats and RTF’s weaknesses have long held Microsoft’s applications back from being happily usable in serious industrial publishing systems (or, at least, have often held back the people who adopted them.)

+1

Rick Jelliffe

Three programmers gathered at the next cubicle to mine yesterday, clucking and snorting as is their wont. I looked over to ask what was going on. “A bug in Java” they said. The problem was with ZIP files, specifically some differences between ZIP files made by different methods.

They had some files with non-breaking spaces (U+00A0) in the file name. Not something that I would do myself, but the number of people who want to use non-ASCII characters in their filenames is surely now much greater than the number of people just content with ASCII-only names. Aha, so file this under internationalization (I18n)!

The problem was, it seems, that WinZip stored the filenames using the system default encoding. But Java would read the filename using UTF-8. So sometimes the parts of a ZIP file would have the non-breaking space, and other times the same file saved by a different route would have 0xFF at that position. Now this is the kind of behaviour and problem that you would expect a decade ago, but I was surprised it still occurred.
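The mismatch is easy to reproduce at the byte level. The sketch below is in Python rather than Java, purely for brevity, and it rests on an assumption: that the “system default encoding” in this case was the old IBM PC code page 437, where U+00A0 happens to be the byte 0xFF. The file name and the read_as_utf8 helper are hypothetical.

```python
NAME = "my\u00a0file.txt"  # hypothetical file name containing U+00A0

# The name as a tool writing code page 437 would store it (an assumption
# about the encoding involved) ...
cp437_bytes = NAME.encode("cp437")
# ... and as a tool writing UTF-8 would store it.
utf8_bytes = NAME.encode("utf-8")

print(cp437_bytes)  # the non-breaking space is the single byte 0xFF
print(utf8_bytes)   # the non-breaking space is the two bytes 0xC2 0xA0

def read_as_utf8(raw):
    """Decode a stored name the way a UTF-8-only reader would."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # the stored bytes are not valid UTF-8

print(read_as_utf8(utf8_bytes) == NAME)  # round-trips cleanly
print(read_as_utf8(cp437_bytes))         # cannot be read as UTF-8
```

A lone 0xFF byte is never valid in UTF-8, so a reader that assumes UTF-8 has no way to recover the original character from the legacy form.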

Checking through Sun’s bug database, we find that this bug (or its clone) is actually the second most requested (2008-13-28). The engineer who evaluates the problem gives the excuse that Sun decided to use UTF-8 for JAR files (which use ZIP) and seems a little surprised to discover that ZIP files may actually be created by other systems too.

Looking at the bug report, we also find it was first reported 07-JUN-1999. Almost nine years ago. The bug report says it is only reported up to Java 1.4.2, however I cannot see anything in Java 1.6 that addresses it.

So what has happened? Several things:

  • Apache put out a zip implementation as part of Ant that supports different encodings. So people who needed it can use that.
  • Since September 2006 the ZIP spec has formally included a bit to state that the file name is stored using UTF-8.
  • It seems other manufacturers have increasingly used UTF-8.
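The UTF-8 bit mentioned above is bit 11 of each entry’s general purpose flags. As a small sketch of how a reader can check it, the following uses Python’s standard zipfile module, which sets the bit itself when it writes a non-ASCII name. The name_flags helper and the file names are made up for the illustration, and this says nothing about what Java’s own ZIP classes do.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: name and comment are UTF-8

def name_flags(filename):
    """Write one entry to an in-memory ZIP and report how its name was stored."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(filename, b"example")
    with zipfile.ZipFile(buffer) as archive:
        info = archive.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_NAME_FLAG)

print(name_flags("plain.txt"))         # ASCII name, flag not set
print(name_flags("my\u00a0file.txt"))  # non-ASCII name, flag set
```

A reader that honours the flag can decode flagged names as UTF-8 and fall back to a legacy code page for everything else.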

So for almost 10 years the Java version of ZIP has been broken for internationalization purposes, the fix seems to be caught in limbo (are they waiting for non-UTF-8 encodings to go away, perhaps?), and so people are forced to go to other implementations. WORA undermined! Indeed, this seems another example where Java is simply too large for Sun to maintain adequately.

But what about this angle: the current ZIP spec has an appendix on file names and encoding, which says

The ZIP format has historically supported only the original IBM PC character
encoding set, commonly referred to as IBM Code Page 437.

Which means that Sun’s policy of merely writing UTF-8 is now going against what the ZIP spec says.

Software maintenance and juggling issues on a budget are not easy. However I think it is more than plausible that had Sun gone ahead and submitted Java to ISO for standardization a decade ago, this issue would have been fixed long ago. Because ISO National Bodies give very high precedence to issues such as internationalization, accessibility, modularity, and conformance. So the lack of proper encoding support in the ZipEntry API would undoubtedly have come to the fore in the very first round: Japan never lets this kind of thing slip, for example.

By exactly the same token, if the ZIP format had been put through as a standard, proper encoding support would have undoubtedly been raised as part of the first review. Standardizing either would have been good enough to have a technical fix agreed on, published and pressure applied for a fix ahead of the demands of corporate featuritis. But standardizing both would still be best.

After Sun backed off last time, leaving so many people who had participated feeling burnt, it is hard to see that standards people won’t be deeply suspicious of them. And Sun people may not be keen to submit even to a “bullshit process” based on pragmatism and incrementalism. But Java would clearly, IMHO, be in a much better position today if it had been standardized. And so would ZIP.

Standardization as a kind of audit

What standardization of a living technology gives stakeholder companies is more than just bragging rights and ammunition to shoot their rivals with and to confuse procurement people with, tempting as those things may be. It also gives an objective audit program dictated not from the corporate POV but from (to a greater or lesser extent, depending on interest) the market and relatively disinterested third parties. Any long-term software project gets encrusted in the personal politics and idiosyncrasies of the development team, and needs a circuit-breaker. This is a view of standardization as a kind of major technical audit, particularly of the documentation but also of areas that are becoming more market-critical: standards use and compliance, openness, responsiveness, accessibility, internationalization, integratability, testability, and so on.

These are all things that established technologies need. Now of course you can get audits in each of these areas by hiring experts. That is good, but you don’t get the breadth or provable transparency that National Body participation can bring. And expert opinions still have to be evaluated in the context of the power relationships of the company, the very same relationships that allowed the problem to arise (these might be as simple as CJK requirements not having an adequate champion or I18n not being a profit center that can demand changes.) And you can get benefits from using boutique standards bodies in which vendors or their representatives can have voting rights: W3C, Ecma, OASIS, and so on. That is good too, but it is open to domination by one side or the other.

Which leaves the ISO family (e.g. ISO/IEC JTC1) as being effective forums for this kind of audit. People who think that ISO standardization is always a pushover should consider the current OOXML debate: you have MS and friends on one hand and IBM and friends on the other both pushing as hard as they can, and yet as I write neither can establish clear dominance. And these are the largest players in the world. Whether DIS 29500 mark II passes or fails it will be because national bodies decided on technical issues, not pack alliances, as far as I can tell. I am sure that neither MS nor IBM is feeling comfortable at the moment: and this is the strength of the ISO kind of procedure, regardless of the outcome.

We have all had enough experience of open source to be aware of its strengths and weaknesses now. Making something open source does not automatically mean that bugs and so on will be fixed. No silver bullet. As I wrote in this blog a couple of years ago in “Sun should open source Swing”:

it is not enough to Open Source something: the mechanism for speedy response to bug fixes and releases is crucial too.

And neither will auditing a technology by making it a standard. Nothing is automatic. But error-full systems emerge from single-strategy maintenance regimes, and the dinosaur systems such as Java and Office are full of examples of this. The ISO standardization process has many qualities to commend it to large companies as a tool for shaking things up and circuit-breaking. And we still need an ISO standard for ZIP too.

M. David Peterson

Placeholder for ongoing notes from the Microsoft Technology Summit…

Rick Jelliffe

I have been pretty disappointed in the new operating system distros I have been trying out recently. In the last three to six months there has been:

  • A horrible install of a new Mac where the Expose feature caused windows to run away when I tried to click on controls near the edges of windows. It was like some kind of demented joke or game. (The user, who was previously a dedicated PC user, now loves the Mac and thinks it is much simpler.)
  • Today I tried twice to install the new service pack for MS Vista, only to have the install fail with no useful message.
  • An attempt to install a mainstream Linux on my new PC failed when it could not detect the keyboard. I had to get the new box because another install of a newer Linux from another distro was disastrous for performance on my quite old box.
  • I got too bored to continue with another mainstream Linux install, where the DVD instructed me to first burn the image to a bootable CD.

So instead I have installed a recent Solaris Developer, from the DVD of some Linux magazine in the newsagent.

This is one of the easiest installs I have had. (I could only install onto a partition on the main disk, install onto a partition on the secondary disk failed with a bogus message about user accounts. No biggie.)

The system boots up, SAMBA works fine and detects most things I want to detect. (It is interesting that it only detected one printer on our network, however Vista only detects the other one, so that is not so bad.) It has Firefox and Thunderbird, which are what I’d use anywhere, and StarOffice, which is good enough for now; I cannot really use it or OpenOffice for making presentations until Impress gets tables in v3.0. It comes with Java installed, and Netbeans, though I’ll be downloading Eclipse for compatibility with the workgroup here.

The desktop is a nice GNOME and really uncluttered and to the point.

Best of all, it feels like UNIX. Not a half-assed wannabe, or a messy child’s toyroom, the way some Linux distros seem to be. But lots of GNU goodness. I still have to see how it copes with some issues like updates (which were the only real flaw I found in Mandrake Linux, which I was happy with for a few years.)

So I really like Solaris. It seems to suit what I want and expect better than any other OS distro I have come across yet.

But it has one big problem: the screen graphics are super ugly. In fact, so repellent as to make it unusable. I have a 1440×900 LCD monitor and this is not one of the built-in types supported. No problem, I thought, I’ll just change the appropriate xorg.conf (or whatever is the equivalent) file. But I cannot see how to do it: it looks like it is hardcoded or something. So I have some other resolution, with a half inch dangling above the screen and unreachable. And the fonts are ugly and thick: even when I turn on anti-aliasing and play with the LCD settings it makes little real difference. Unless some kind reader can make a good suggestion, it just doesn’t compare to what I have been used to under Windows, Mac or even Linuxes.

I really hope I am doing something wrong, because apart from that Solaris really seems to fit the bill for me. Maybe I have to resurrect the old CRT monitor.

[Further Adventures] I tried to install a different card, only to have a hardware problem, so I switched back to the original new card. Oops, now the thing doesn’t boot. Checking though, for some reason the BIOS had switched around which of the two hard disks to boot from. I don’t understand how this could have happened. In fact, I don’t understand why it can happen either, because I thought both hard disks would be checked for booting in any case. But swapping the order of booting from disks fixes the boot problem, and I am online again.

I looked through the X windows logs, and sure enough the VESA driver only has a limited number of screen resolutions available, and 1440×900 is not one of them. Sigh… So to use Solaris I have to either go down in resolution to fit the monitors we have, or buy in a new monitor. I was finding the wider screen really useful for Eclipse, so I guess I will have to search for something else. All this is taking a frigging long time: I expect to live for three years on a single installation, so having to go through four or five large and problematic installs is wearing me down.

Rick Jelliffe

Readers may care to amuse themselves with this doublethink: Arnaud Le Hors’ Clarification about ASF and OOXML, in which he says

In case anybody misunderstood my blog entry “Let’s be clear: The Apache Software Foundation does NOT support OOXML“, I did not mean to imply that the ASF has any official position one way or another regarding OOXML.

Err, except that the title of the blog was ASF does NOT support OOXML!

Even stranger, Le Hors is responding to a claim that he made up himself

OK, I’ll admit that nobody has claimed otherwise. Yet. But in these days and age you are never too prudent. It wouldn’t surprise me to see this or other similar fancy claim being published eventually.

So le Hors makes up a claim (that someone is saying ASF officially supports OOXML), then decries it (that ASF does not support it), then is forced to retract the decrial (that ASF has no position), then claims that that he never meant to imply what he had said in plain words in his own headline! O what a tangled web we weave! But quite a funny example of the mentality that seems to have possessed some people: truthy rather than facty.

I only read Arnaud’s blog because I got a mention. He repeats Groklaw’s decade-old story about MS secret dirty tricks to maintain control of its proprietary binary standards as de facto standards, and somehow vaguely ties me into it: daft given that I have been so open and my concern is with helping (force) MS out of its market-dominating proprietary standards. He also mentions Patrick Durusau’s change of heart I see, and repeats the IBM mantra handed down in IBM VP Sutor’s Critical questions that dissenters should expect their reputations to be at stake.

As part of IBM’s commitment to intimidation, le Hors reaches a new low, which kind of offsets the chuckles I mentioned at the top. Speaking in the context of me and other experts who dare dissent

the consequences when being caught to have failed to disclose any relevant affiliation could be far greater than they currently are. I’m not excluding judicial prosecution here.

Where are their heads at? People who wonder why I spend so much time on OOXML issues in this blog, which is time spent at my own cost (and it is a real cost: I could be doing paid work instead of writing this) should recognize that it is largely in response to this kind of bullying and intimidation, which I saw glimpses of early on as the suits started to invade ISO in 2006.

M. David Peterson

Jeff Barr’s Blog: Save the Rovers!

Apparently NASA plans to cut one of the two Mars Rovers from its budget. But Jeff Barr has a plan to help save it…

The second plan is a bit different. What if we just raise the freaking 4 million bucks ourselves and simply give it to the team? What would happen if we showed up at NASA HQ with a big plywood check (backed up by real money of course)? Could they take it, and would this make a difference? Does anyone know?

I just registered savetherovers.com and would be happy to use it as part of an organized effort to do something remarkable. I would also be willing to pitch in $100 as my part of the $4 million. That means we need just $3,999,900 more.

I’m in, Jeff! Anyone else care to join the effort? I’ll contact Jeff and see how he plans to facilitate the volunteer/donation effort and report back the result…

Stay tuned!

Rick Jelliffe

Patrick Durusau has a few more items on his website. Always worth a read for anyone interested in getting more than the party lines. Here is some of his latest TOC:

Rick Jelliffe

I was told recently that of the 250 or so standards that Ecma has put forward for fast-tracking by National Bodies at ISO/IEC, only three have failed. I thought it would be interesting to read up a little more on them.

Ecma (shooting the messenger)

Ecma makes standards on a wide variety of subjects, and has particularly strong involvement with the European and Japanese computer hardware industry. In a response to a comment on another item, I posted this list, which is of the current groups and chairman’s affiliations, to give an idea of its scope:

  • C# (Chairman from Microsoft)
  • ECMAScript (Chairman from Mozilla)
  • Business Communications (Chairman from Siemens)
  • Near Field Communications (Chairman from Sony)
  • High Rate Short Range Wireless Communications (Chairman from Sony)
  • Environmental Design Considerations (Chairman from IBM)
  • Acoustics (Chairman from HP)
  • Electromagnetic Compatibility (Chairman from Intel)
  • Optical disks and disk cartridges (Chairman from Toshiba)
  • Universal 3D (U3D) (Chairman from Boeing)
  • Holographic Information Systems (Chairman from Fujifilm)
  • OOXML (Chairman from Microsoft)
  • XPS (Chairman from Global Graphics)

Now I knew that the C++/CLI effort had failed (for what seems good reasons to me.) But I was not so sure of other efforts.

I found this article, from 10 years ago: Sun Uses ECMA as Path to ISO Java Standardization which I will look at in more detail in a moment. But there is an interesting passage halfway down the page:

In 1996 Microsoft Corp was able to shoot down another ECMA standard, the Public Windows Initiative, at this stage, thus preventing it from becoming an ISO standard. The PWI was a Sun effort to get Windows APIs put into the public domain. … Microsoft was able to mount a successful campaign against PWI at ISO on this issue.

What do we learn from that? That Ecma was happy to serve as a neutral forum. That Sun was happy to try to make use of the Fast-Track procedure when it suited them, for competitive reasons. That in fact IP buy-in from the critical stakeholder is necessary. And that MS has made a 179 degree turn on standards since a decade ago. (I am always amused at how often anti-OOXML material will, when it fails in a current objection, resort to decade-old material as if it were fresh and compelling. The company then was fleeing standardization; now they are participating and allowing significant changes. You do not have to trust or like them to acknowledge that.)

Control of the API

ISO standards are a very scary proposition for large companies. Many of them are not comfortable with any position other than dominance and stability. The control of the API is terribly important to them, and they regard loss of control of the API as a risk (whereas it can be a circuit-breaker and new-market enabler.) This is one reason why all the large companies try to favour the member-based boutique standards bodies: W3C, OASIS, Ecma, because there is more chance that they can establish a beachhead and make participation at those bodies unattractive or futile for their competitors. The need for stability is sometimes stronger than the need for dominance: when you see calls for “equilibrium” to be maintained in a market, you know that is a buzzword for maintaining the status quo. (And it is not always the market leader: it can be a smaller player in fear of losing their share just as much.)

It goes in cycles. The wheel turns and sooner or later the big companies are forced to deal with ISO and national bodies, and they find this lack of control very unpleasant. Sooner or later they find some reason to split back to more dominatable bodies, and they jump ship.

It is not all venal (or even venial) or negative though: for example, look at SGML: Sun’s Jon Bosak (and many others) were unhappy with the way and speed that SGML maintenance was proceeding and we went to W3C as a forum for making a simple profile and addressing a lot of peripheral issues, and XML in turn became the foundation for the update of SGML. There is always an interplay between what the boutique, specialist bodies are interested in, and what the national-body-based regimes such as ISO are interested in: industry activity is actually really important, because it clarifies what the ISO groups should be doing.

The downside is that when these large, usually-US-based multinationals hop over to their boutique bodies, they have to try to justify their jump by slagging off at ISO/IEC. This is a predictable behaviour: it has happened in the past, it is happening now, and it will happen in the future. Some parts of the complaints are often reasonable, some parts are often merely self-serving, but it is not a new behaviour.

Ecma and Java

Now back to Java. Originally Sun put up Java to become an ISO standard using the PAS process (the fast-track process that ODF used) using the Object Management Group (another boutique group) as the submitter. Then Sun changed its mind and decided to submit it to become an Ecma standard (and thence to ISO on Fast-Track) because

“In examining our standardization options, our primary goal always has been to preserve the industry’s substantial investment in evolving and using the Java technology,” said Dr. Baratz. “By pairing the collaborative Java Community Process with ECMA’s proven standards process, we can achieve international standardization while preserving rapid innovation and cross-platform compatibility.”

According to this article, Sun chose to go with Ecma because it was flexible enough to allow maintenance to continue on through the Java Community Process as it stood then. Other articles suggest that one reason for Sun’s reluctance to be involved at ISO was their strong desire to keep effective control. One particularly interesting aspect of the article is that it mentions the potential danger from Sun’s point of view of HP, Microsoft and so on doing exactly what Sun had attempted to do with PWI: make up their own version of the standard and submit it to ISO!

Of course, what Sun was concerned about was Microsoft’s attempts to destroy Java’s Write-Once, Run Anywhere promise by grafting on their own graphics primitives into J++ and splitting the market. This is of course how IBM put a nail in Java’s coffin for the desktop, by doing exactly the same thing with their SWT graphics library, as used in Eclipse: it is not a part of standard Java and Java applications that use it are not WORA applications.

The fight between Sun, IBM and Microsoft over their effective graphics libraries shows a couple of things that are very instructive. For a start, it shows that they all try to use standards for their own competitive purposes. It is no news: the challenge is to try to use the standards process to channel them into behaviours that benefit society and the market.

It also shows the futility of non-layered standards. The WORA spiel is really compelling, and it is something that I bought into with my company Topologi, but all systems that have to grow need to support what I call Organic Plurality. Systems with modularity in the wrong spots die but can cause problems in their death throes: it seems that with Java, the graphics interface was exactly such a spot, unfortunately for the vision. (For another aspect of this, see The Software World of 2010: Its about the Suite.)

But thirdly it shows that the big players have been involved in these kinds of standards games for years. For a while, and under the noxious impact of the MPEG group, the large companies got excited by the idea that they could use standards bodies to become revenue-generators by standardizing on Royalty-bearing technologies.

Pigs at the trough

In the middle part of this decade, there were attempts at OASIS for this, and many of us spoke out against the large companies trying to do this, and we were successful. For people with short memories, the background of this was the attempts to get non-RAND-z technologies adopted for DRM proposals: the major pigs with their snouts in the trough at that time were ContentGuard (ex Xerox), Microsoft and IBM, all the usual suspects. (Readers may also be interested to note that Patrick Durusau got involved in the OASIS DRM effort, on the side of the angels: he has a very hard-headed attitude to all the large companies, and not one that endeared him to Microsoft or IBM.) By 2004, the OASIS DRM group wound up without getting this endorsement for the non-RAND-z technologies. RAND-z won!

David Berlind has quite a good article on why a non-RAND-z standards organization is a “patent shelter” and not open: it is great that OASIS has straightened up here, and I hope SC34 continues its long-standing RAND-z policy. But it is especially great that companies like Microsoft, IBM and Sun, which a few short years ago were all excessively concerned with trying to keep control and use standards as patent-shelters are behaving well now. However, just because Microsoft, IBM and Sun have little credibility in the world of standards for altruism’s sake, it does not mean that they should be blocked from participating legitimately in standards. To the contrary, we need to have institutions to allow these behemoths to act as good citizens: RAND-z standardization is a great vehicle for a behemoth!

The futility of monocultures

Back to our Java story. In late 1997, SC22’s Java study group had recommended that Sun should submit Java through the “more traditional” processes. Sun eventually did shift to use the Ecma route, but apparently out of fears it would lose control. Then

In another effort to block other companies and interests from developing Java platforms that do not meet its strict guidelines, Sun Microsystems on March 1, 2000, declined an offer from ECMA to standardise Java. ECMA, which is a standards organisation in Geneva, Switzerland, denounced Sun because the company refused the standardisation proposal. (TechRepublic)

Industry gossip was that Sun wanted to make their source code a normative part of the standard and they withdrew when they found it would not be possible through Ecma (or ISO or anywhere!): nice try fellows! I’d love to get some confirmation or another angle on this. But clearly the issue is one of control: integrity, interoperability are all nice side-effects. The trains always ran on time under Mussolini: we should not pretend that centralized control and monocultures do not have some benefits.

However, when we look at the way large companies act with respect to standards bodies, one very large question should arise: it is a variant on Adam Smith’s aphorism (or was it G.B. Shaw) that every profession turns into a conspiracy against the public interest. Monopolies, cartels and collusive behaviour are undesirable (I don’t use “wrong” here because it carries a moral implication which distracts people from the point and lets them drink the waters of Lethe from the sweet cup of self-righteousness) because they result in sub-optimal market operation.

So why are standards allowed: surely they are collusive, and interfere with the market?

Public policy

The traditional answer is that public policy encourages standards because, and in so far as, they create markets. When the Torx screwdriver company got its hexagonal screwdriver heads adopted as a standard, they may have been wanting to encourage a market in screws, not competitors in screwdrivers, but they were creating a market nonetheless. OASIS lawyer Andy Updegrove, who I criticize a lot for his flaky reporting and bias, has really good legal material at his website which quotes the (U.S.) Fifth Circuit Court of Appeals decision in Consolidated Metal Products v. American Petroleum Institute in 1988:

A trade association by its nature involves collective action by competitors. Nonetheless, a trade association is not by its nature a “walking conspiracy”, its every denial of some benefit amounting to an unreasonable restraint of trade. In particular, it has long been recognized that the establishment and monitoring of trade standards is a legitimate and beneficial function of trade associations.

One key aspect of the setting of standards is that they cannot be needlessly exclusionary: this is why there is always the need for multiple boutique bodies, because when a company is unable to get satisfactory inclusion of its technologies or requirements because existing members have “stacked” the process against it (and it should be noted that this is a negative stacking aimed at blocking: there seems to be no such thing as stacking a standards body in favour of a legitimate technology, quite the reverse: a standards body is there to foster agreements) then that company can go elsewhere. The need for a market in standard technologies requires a slew of supporting markets, including a competitive market for member-based standards organizations. (It’s turtles all the way down, as the joke says!)

When we get to ISO/IEC JTC1 we run out of competitive standards bodies. At the international level, there is quite a clear difference between the kinds of work that, for example, IEEE takes on and the work that ISO takes on. So if allowing plurality rather than blocking is at the very core for justifying standards (I mean voluntary technical standards used by industry, not regulations or which side of the road to drive on) as market-creators and preventing standards from being feet-in-the-door for cartels, what happens at the apex, at ISO/IEC JTC1 for example, when there are no competitor bodies?

The answer is simple: plurality. ISO/IEC cannot be in the business of allowing cartelization, since the only justification for standards is because they actually prevent cartelization by creating markets.

Trapping a bear

In this light, I hope my support for OOXML getting standardized, even though I recommend ODF for public government documents, becomes clearer. The need to support plurality goes to the very heart of the mission of international standards bodies. It is one thing to speak of technical issues, it is another to blanket state “We already have a standard that is good enough for us, therefore you don’t need the standard that you think would meet your needs”. Because that is just code for “We want to prevent your technology from operating in its market by limiting the market to our favoured technology”. That kind of blocking behaviour needs to be exposed and rejected.

The large US multinationals have always been trying to use standards bodies to compete, and they have always shopped around, and none of them like giving up control. The recent defection of some of the leading lights of the Open Document Foundation away from ODF springs out of exactly this issue: the charge that Sun has tried to keep too much control. They all try to play this game, it is not new.

So what can we do? We have to be like bear trappers. The bear is bigger than us, has an off-putting odour, and a taste for honey. But when the bear wanders into a cage, you don’t say “Oh, Mr Bear, you are too big” or “Oh, Mr Bear, you stink” or “Oh, Mr Bear, all you want is to raid the honeypot, such a naughty and greedy animal does not deserve to be trapped!” You close the trapdoor and jubilate. The history of these large companies is that they all try to find the route where they can maintain the maximum control, and very often they will get skittish at the amount of control they have to give up. Even Ecma, which is pilloried at the moment for being some kind of a rubber-stamp, would have required giving up too much control for Sun with their Java effort: and you would not want to think that Ecma were necessarily the most accommodating here.

A lot of the anti-OOXML material over the last year has been along the lines “Don’t you know how bad MS is” spouted by companies who have been playing exactly the same kinds of games. Think SWT, think DRM, and so on. But standardization can be a real game changer: one of the few game-changers on the horizon. The chance to capture a large mass technology into the review and influence of the international standards organizations comes very rarely and IMHO is not a chance that should be squandered on petty ideological or competitive points. Open Source millionaires and closed source search engine companies, all of them are in the same boat as the rival office suite developers: competitors with vested interests to block the development of multiple markets.

The thing is that competition between these kinds of standards is not just good, it is essential. I have just been looking at the new feature list for OpenOffice 3.0, due mid year, and it finally includes tables in Spreadsheets. Now it has been incredible to me that this has not been there before: I don’t know how you can make a presentation without tables. But tables in spreadsheets was not something encouraged by ODF before OOXML came on the scene. (It is not a feature suggested for spreadsheet applications in the informative feature table in ISO ODF, in particular.) And the recent changes in OOXML have surely occurred in part to catch up with ODF: it is not one sided. The competition is forcing each technology to be improved in places that their original champions did not consider important.

Given the utterly toxic relations between the various players at the moment, which makes any talk of sitting down at the same standards body ludicrous, what we need is a frog race. Rival technologies whose stakeholders are attempting to leapfrog each other, but with each jump taking them closer to the goals we have set: open standards, with better QA, harmonized and mappable where possible, supporting plurality, extension and adequate profiles, with decent validation and test suites. The anti-OOXML side tries to claim that the best way to openness is through enforcing a monoculture, but the experience of the last two years, and the substantial improvements in the ODF and OOXML technologies that have occurred and are pending, are clear indications that standards need to harness the competitive energies of the stakeholders rather than dissipate them in prolonged committee-room chicanery aimed at maintaining the current “equilibrium”.

M. David Peterson

Twitter / migueldeicaza

if you tweet and nobody is following you, did you tweet?

Kurt Cagle

I have recently accepted the position as Site Editor for the XML.com site, becoming responsible for the content appearing throughout the site as well as helping to guide functionality and look and feel for this particular portion (and to a certain extent the other sites in the O’Reilly Network). Having contributed to xml.com for several years, I feel honored to get a chance now to steer the editorial direction of the site, but I also need help doing it.

What I’m looking for right now, more than anything, are bloggers interested and passionate about XML and who would like the forum of XML.com to share these ideas. Given the breadth of the XML field at this point, what I’m looking for in terms of skills or expertise is equally broad; specialists (and generalists) in:

  • XML Data Technologies (XQuery, LINQ, XForms, etc.)
  • Semantic Web, both formal (RDF Stack) and informal (microformats, folksonomies, and so forth)
  • User Interface, User Experience and RIA Components (AJAX, XUL, Silverlight, Flex, CDF/WICD, etc.)
  • Publishing and Syndication (AtomPub, Office Formats, DocBook, DITA)
  • SOA Services (SOAP, WSDL, Messaging and Marshalling, ESB, etc.)
  • XML Data Modeling (Schema design, taxonomies, methodologies)

These are currently unpaid positions, though we’re working on plans to change that, but the site is widely recognized as being one of the pre-eminent authorities on XML technologies on the web, and we hope to provide as much editorial freedom as possible to all of our bloggers.

So if you are interested in writing a regular blog on the hottest trends in XML, give me a shout at kurt@oreilly.com with what you’d like to do and, if you have any, some samples of writings on the web.

Rick Jelliffe

IBM Vice President Bob Sutor is continuing the campaign against DIS29500 mark II on his blog entry Critical questions for national bodies considering OOXML/DIS 29500. I have tried to make this blog about his questions, and not in any way about him, though it is difficult to write in a way that keeps this distinction as sharp on the page as it is in my mind; nevertheless towards the end I found myself struggling not to spit the dummy.

Dr Sutor’s questions revolve around two premises that he wants us to buy into. The primary premise can be characterized as this: it is better to have no standard than an imperfect standard; unless DIS 29500 has no problems at all you should vote NO or ABSTAIN. That this is rubbish should be obvious to everyone: there are always trade-offs and room for improvement. His secondary premise is along the lines of: if you support it, you should prove why your support is not suspect. To say this is nasty crap is to denigrate nasty crap.

Primary

Let’s make it straightforward. Here are each of Dr Sutor’s questions, with my comments.



* Was this specification appropriate for the Fast Track process? If not, it should not be approved in such a process and you should ABSTAIN or vote NO.

No Bob. The question is “Is this revised draft good enough and useful enough to be accepted as an International standard, in the reasonable expectation that there will be a good, agile and aggressive maintenance program and in the knowledge that there is work currently being performed on harmonization and testing?”

The issue of the procedure used is irrelevant to the desirability and usefulness of the final draft standard. You can take up the issue of Fast-Track with JTC1. You can get your National Body to lobby JTC1 to not allow standards of more than 100 pages to be fast-tracked, or some other concrete proposal.

Think about it for a second! There has been an enormous amount of international scrutiny of the OOXML draft, far more than you would expect in a slow-tracked standard. You can look through an OOXML file and see by even the most trivial inspection that the standard does provide the lion’s share of documentation on the XML and notations in the OOXML file; and you can go through the Editor’s comments as accepted at the BRM and see that the lion’s share of National Body comments have been taken seriously. Objectively, this standard has been really well reviewed, and the process has not prevented this review. NASA sent a man to the moon in about 8 years, for goodness sake.

Saying that there is a lot more work to be done is saying nothing: there is always more work to be done, and the sooner OOXML is a standard and under maintenance, the sooner these things can be addressed. The BRM addressed the big picture issues that need to be right before a technology is on the books: the organization, the conformance classes, the normative status of schemas, the conformance language like should/shall, and the need to have clear slots for handling legacy/compatibility/deprecated kinds of issues going forward. The big picture is completely good enough for DIS 29500 mark II to be an ISO standard. The BRM also dealt with small-picture issues, such as typos and wrong examples and a lot of word-smithing issues that are also needed for a draft to be good enough quality as a standard. And the BRM also dealt with some key internationalization, accessibility and modularization issues that are also a reasonable bottom line for a standard.

Which leaves a myriad of middle-level issues. A number (hundreds!) have been identified and resolved through the BRM. I am sure that many of these issues require more work, which is why moving on to maintenance rather than farting around in draft-standard limbo is the best way to go.



* At each stage of this process, was sufficient time allowed to develop contradictions, completely review the specification in its entirety, generate all appropriate comments, review all proposed resolutions completely and explicitly, and fully review the updated document? If not, you should ABSTAIN or vote NO.

Yesterday I quoted Jim Melton, the editor of SQL, on the XML-DEV mailing list, and I will put the same quote, because unless people realize that the vote to become a standard is the start of getting a standard perfect, not the end they will be sitting targets for FUD: in particular the kind of disinformation that says “This is our only chance, it has to be perfect.” Here is what Jim Melton said:

Or perhaps most people were somewhat intimidated by the prospect of (thoroughly) reviewing a 6,000 page document. To put this in perspective for those who know SQL’s size and complexity, the sum of all nine parts of SQL is about 3950 pages. A ballot on SQL frequently receives several thousand comments, and we’ve been balloting versions of SQL for 20 years!

In fact, virtually every large spec I’ve ever had the “pleasure” to review leads to “thread-pulling”, in which every page yields at least “one more” bug, and following up on that one leads to more, and following up on those leads to still more, etc. I would personally be stunned if 30 dedicated, knowledgeable reviewers of a 6,000 page spec on its first public review were unable to find at least 3,000 unique significant problems and at least 40,000 minor and editorial problems. But that’s just me…

No Bob. The question is “At each stage of the process, was this draft treated fairly, and was it treated any differently from other ISO standards?”

And another question is “How can I encourage my National Body to participate in the continued improvement of this standard (and ODF!) at ISO/IEC JTC1 SC34? It is very important and worthwhile, and it is so great that we can proceed steadily on with no artificial deadlines using maintenance.”



* Have all your comments been fully and correctly addressed? Are the changes reflected correctly everywhere necessary in the specification? Have you verified this? If not, you should ABSTAIN or vote NO.

No Bob. The question that should be asked is “Have enough of your showstopping issues been addressed now to the necessary extent that the advantages of having a standard outweigh the disadvantages?” It is a balance, a tradeoff. And “Have the championing stakeholders (ECMA and Microsoft) demonstrated a preparedness to engage and make improvements?” for which the answer is objectively yes, based on the BRM at least.



* Is this high quality technology? If not, you should ABSTAIN or vote NO.

No Bob. I’ll visit the quality issue later. But the question is “Does this standard fairly attempt to meet its scope statements?” The purpose of DIS29500 is different to the purpose of IS26300 for example: the underlying technology is a fact at loose in the world: hundreds of millions of people use Excel and Word and Powerpoint and the scope of the standard is not to say “This is the best approach in the world” in some abstract way, or “This is the best thing for data interchange” but “This is the information that these applications produce and consume.” Bob wants the data formats that millions of people use to be kept proprietary, to be caught up in some limbo of committee work and red tape: by his actions, he wants the opposite of openness.



* Can you say that you completely understand the specification that emerged from the Ballot Resolution Meeting (BRM), with all its changes, and that it is now a very high quality specification? If not, you should ABSTAIN or vote NO.

No Bob. The question is “Does your national body believe that it would be better to have a standard using DIS29500 mark II than to not have a standard?”

Now I certainly agree that National Bodies should give a reasonable amount of technical scrutiny to any standard they vote on. But if you don’t have the technical ability to do a reasonable amount of scrutiny, then you ABSTAIN. And voting NO is completely the wrong vote: if you are ignorant, you don’t impose your view on other nations, you keep quiet. And you never, ever, ever try to block something merely because it does not interest you: you respect other National Bodies enough that if they want something but you don’t, they can have it. That is the way that consensus works at ISO.

I’ll deal with quality below.



* Are you fully confident that no additional problems were introduced at the BRM that your national body would insist must be addressed? If not, you should ABSTAIN or vote NO.

No Bob. The question is “Are you aware that your National Body can notify ITTF and SC34 of any perceived editing SNAFUs and that they will be discussed and, if true, fixed?”

The FUD over all these additional errors that will be introduced is premised on there being no procedures or maintenance to fix them. But it falls down, because in almost all cases the texts that are being introduced in the new draft are completely spelled out. For most of the Editor’s dispositions, people have had months to think about the text; indeed in many cases the Editor’s comments are just direct applications of the wording that the National Bodies originally suggested.

One of the big unknowns is the move to ISO-ese “should/shall” conformance language: however, since this move in no case changes the normative impact of a sentence (though in a few cases it might make the normative impact of a requirement more stark, it is to be hoped) it should not be a cause for any freak out. But it should be pointed out that ECMA did in fact make available at the BRM an unofficial version of the draft with all the Editor’s response changes already there: so any national body can certainly avail themselves of that.

(Now I do agree that the Fast Track procedure is sub-optimal after the BRM: a longer time limit for the editor to get his work done would be reasonable so that NBs can vote on the consolidated text, and to relieve ITTF of their overseeing role. But it is not in itself a reason to vote NO or ABSTAIN or YES.)



* As an international standard, does this specification inappropriately favor a single vendor and its products? If so, you should ABSTAIN or vote NO.

No Bob. The question is “Does this specification address more than the petty competitive rivalries of large US multinational corporations and give for the first time a voice to national bodies to provide the kind of documentation and maintenance openness that a market-dominating technology should have? Does this help the users in the world who have to, for whatever institutional or client reasons, use or integrate with Office and its files?”



* Are you 100% confident that there are no intellectual property problems that would prevent anyone from fully and completely implementing everything in the OOXML specification? Do you have this assurance from experts who are not from Microsoft or in their financial ecosystem? If not, you should ABSTAIN or vote NO.

No Bob. That would require that every person who voted had a law degree: no standard is voted for on that level. The question is “Are you reasonably certain that the intellectual property issues of OOXML are no different from those of other similar standards with similar licenses for which there has been no problem?”

Let’s get realistic here. The people who get sued for IPR infringements on software are the people who have made big enough money to be good targets. And they hire fancy lawyers and duke it out.

The mention of “fully and completely” implement is the giveaway here. Bob’s comment is not interested in end users, sitting there creating Excel formulas: these number in the tens of millions and have no voice and absolutely no chance of suffering IP attacks: they are consumers. And Bob is not interested in system integrators, who are faced with integrating into existing and future Office sites just as a fact of life. No, Bob’s comment is interested in making IBM’s niche requirements into something that should dominate other considerations. If Bob were remotely serious about this, why hasn’t he raised exactly the same issues about ODF? And IBM’s license is patently similar to Microsoft’s: it is a bravado performance to criticize something as inadequate that is almost identical to your own company’s license!



* Has the process as you have seen it been without undue and inappropriate influence by the supporters of OOXML? If not, you should ABSTAIN or vote NO.

No Bob. The question is “Has the process resulted in a standard that is good enough to start and get improved by maintenance at ISO? Have the normal ISO processes resulted in the various sides making their cases, problems being looked at, and serious attempts being made to resolve issues?”

I don’t think people are fools Bob. They can see that the readiness of the NOOOXML people (who Bob is happy to share a stage with and happy for his staff to link to) to label any setback for their cause, no matter how trivial, as a de facto sign of undue influence, process irregularity, corruption and bribery, makes these claims completely without credibility. Where is the smoking gun Bob? When you look at these claims, what do you find: ooh some business who would benefit by having the standard goes to a standards meeting and says “This would be useful for us” and that is a sign of corruption?

And Bob, how on earth can you talk about undue influence when you later write “Are you willing to stake your professional reputation on that action?” Are you really saying that people who dare to disagree with you can expect to have their professional reputations pilloried? I have on several occasions had experts say to me “I want to support it but I am scared that I will get slammed the way you have been”. They feel intimidated by the viciousness of the attacks. That these continue on, that they are repeated and amplified through the sockpuppets and into the community, makes it a reasonable expectation. But Bob are you really saying what it looks like you are saying: that people who hold a different opinion to you will be ruined professionally? Is this a warning or a threat? What is the difference between this and the old FUD “No-one ever got sacked for buying IBM”?



* Have the principles of balance and equilibrium in the standards setting process been violated to the benefit of OOXML? If so, you should ABSTAIN or vote NO.

No Bob. There is no principle of balance and equilibrium. You cannot object to a standard just because it might upset your company’s marketing plans. The key question for a standard is “Is there a market requirement for this standard?” which there clearly is; if there is not a requirement for the largest market-dominating application to be roped into the standards process and for allowing a non-corporate-dominated governance of the standard, then there is no need for almost any standard.

A standard is an agreement. It is an agreement that is encouraged by public policy because it creates a market. Where there is no universal agreement on a single standard, you have multiple standards. The Allied Tubemakers case is a clear example of a disruptive standard: the metal tube manufacturers tried to block the plastic tubes, but the other technology was ultimately allowed even though it disrupted the business plans of the existing participants.

Equilibrium my foot. There is absolutely no business for a standard to be turned down because it might disadvantage some commercial enterprise or sector, not Microsoft, not Sun, not Red Hat, not pinmakers, and certainly not IBM. That is anti-standards and pro-cartel thinking.

What is vitally important for the standards process is not “equilibrium” but “equity”: does every standard get treated the same regardless of its champions and opponents, and regardless of campaigns of FUD and vilification? The FUD and vilification campaigns are no more a reason to vote NO than they are a reason to vote YES. You vote to accept a standard because we would be better off with the standard on the books and under control than being proprietary and with no formal, neutral mechanism for public institutions and stakeholders to influence that technology.



* Were rules broken or changed during this process? If so, you should ABSTAIN or vote NO.

No Bob. Ah, the hanging chads. The question is “Is your National Body aware that if they think rules have been broken, they should raise this issue with JTC1? If a delegate is concerned, are they aware they can raise the issue with their National Body?” You vote for a standard on the technical issues.

There is no such thing as a protest vote on a standard, if your National Body is acting in good faith. The final draft text is adequate or it is not adequate: that is the question.

I was told that at the JTC1 meeting on the Gold Coast, Australia, late last year, the issue of irregularities came up, and no actual evidence for anything wrong was found. What you need is a bank account, or a confession, or something like that, to prove corruption.

I make a joke that a year ago there was only a handful of people in the world who were experts in JTC1 operations, and now there are hundreds of thousands! But when people who have never read the ISO and JTC1 Directives, and never participated in any meetings, opine on procedural irregularity, I hang my head in embarrassment for them.

The excessive interest in rules and procedure just shows a kind of win-at-all-costs mentality. The essence of an international standard is agreement and working through issues and problems, not trying to trump people by procedural vexations. The standards process is a kind of mediation, not a kind of court room or election.



Secondary

You cannot keep a pig out of muck. In we dive, with Bob demanding we swear on the lives of our children. I cannot read this in any other way than as a resounding thumbs up to the atmosphere of presumption of corruption and vilification of anyone who disagrees with his company’s policy.

Hmmm, Bob is IBM Vice President for Open Source and Standards. He sets the tone for his employees. It would be great to hear Bob say whether it is official IBM policy that, for instance, anyone who supports DIS 29500 mark II as a standard should be prepared to stake their professional reputations.



* If you voted YES on this, are you willing to stake your professional reputation on that action?

Is this a threat, Bob? (Cut to picture of de Niro from Taxi Driver: “Are you looking at me?”)

If anyone in the world has “staked their professional reputation” it is me, but it is not that I am particularly obsessed or thrilled by OOXML but because I refuse to be intimidated. If DIS 29500 is rejected, I will think it is a missed opportunity but I don’t see that it will affect my work or career at all. If DIS 29500 is accepted I’ll have a beer, but it is not the main game for me.

Perhaps I am lucky to have enough runs on the board that this is only a minor issue and not at all a main part of my career: and I do think it is great that the big boys are competing over standards for XML file formats! Something no-one would have predicted a decade ago, despite our hopes.

I recommend people who are worried about this ignore it. You will get a far greater reputation for integrity by standing up to bullies and intolerant people than by kowtowing to them. (I can understand discretion being the better part of valour of course.) You find out who your friends are and you have the courage of your convictions.

It would be interesting to know who exactly is going to cause our professional reputations to be dragged down if we do have our own angles? Is it bluster, or is Bob announcing the formation of some team of crack character assassins, poison pens at the ready: the Lotus Ninjas perhaps? :-)



* If you voted YES on this, can you personally attest to the high quality of the OOXML technology and the standards process it went through?

There is a great ISO standard on quality, ISO 9126: software product quality. I much prefer the old version to the 2000 version, it is more pithy, but standards do get maintained! What IS 9126 is about is making subcategories of “quality” which are concrete enough that you can rank their importance and figure out how to measure them. It gives three classes which can be used to classify these more concrete subcategories: internal quality, external quality, and quality in use.

Quality-in-use can only be determined over time, when the software (in this case, the standard) is deployed and used. We cannot say much about quality-in-use at this time.

Internal quality, in the case of a standard, would relate to wordsmithing and editorial issues, as well as how complete the schema was, whether the normative references were complete, how clear each sentence was, and so on. DIS 29500 mark II is not bad on this score.

External quality relates to how well it can be tested against external criteria. Now in the case of documents, this is where validation fits in. We can test documents using the schemas. We can test formulas against the BNF productions too, and even automate this. So all XML-based, schema-using standards provide a significant baseline of quality: that is why SC34 (and W3C) have spent such a big effort in developing the modern schema languages. They objectively allow a significant amount of testing that documents do conform to the specifications; and a standard that has clear, objective, automatable, verifiable conformance has reached a certain level of quality just by that. Now testing application conformance is a much more difficult proposition, which is one reason I (and some others) am not particularly keen on application conformance (this affects ODF and many other standards too): without a way of automating testing (or, at least, of expressing the constraints in a purely declarative form) it becomes a matter of human judgement whether the standard is of sufficient quality.
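To make the idea of automatable document conformance concrete, here is a minimal sketch in Python. It is not XSD, RELAX NG or Schematron, and it has nothing to do with the actual DIS 29500 schemas: the element names and the `CONTENT_MODEL` table are invented purely for illustration. It only shows the principle that a declarative description of allowed structure lets a machine, rather than a human, decide whether a document conforms.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> set of permitted child element names.
# A toy stand-in for a real schema language; the names are made up for this sketch.
CONTENT_MODEL = {
    "document": {"body"},
    "body": {"p"},
    "p": {"r"},
    "r": set(),
}

def find_violations(xml_text):
    """Return a list of readable violations; an empty list means the document conforms."""
    root = ET.fromstring(xml_text)
    violations = []
    stack = [root]
    while stack:
        element = stack.pop()
        allowed = CONTENT_MODEL.get(element.tag)
        if allowed is None:
            # The element itself is not described by the content model at all.
            violations.append(f"unknown element <{element.tag}>")
            continue
        for child in element:
            if child.tag not in allowed:
                violations.append(f"<{child.tag}> not allowed inside <{element.tag}>")
            else:
                stack.append(child)
    return violations

good = "<document><body><p><r>text</r></p></body></document>"
bad = "<document><body><table/></body></document>"

print(find_violations(good))
print(find_violations(bad))
```

The point of the sketch is only that the check is objective and repeatable: two reviewers running it on the same document get the same answer, which is exactly what is hard to achieve for application conformance.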

However, I say that with a sharp proviso: standards are not written for a layman or a novice: they are written so that a person who is aware of the application domain could read them. You don’t learn typesetting from DIS29500 or IS26300. You don’t learn the mathematical properties of the spline or the delights of postfix programming languages from the ISO standards for PDF.

When considering the “quality” of a standard in ISO 9126 terms, we have to consider it against its scope. The quality requirements for a standard that said “This standard is intended to express all information in all documents in the world” or “This standard is intended to express the most crystalline and pure forms for office documents” are different from a standard that said “This standard is intended to express the information concretely found in the documents made by a common application.” In this last case, “high quality” will relate to accuracy and completeness, regardless of the underlying technology. A “high quality” medical textbook will not say “Let us assume the body is symmetrical, because it would be offensive to left-handed people or go against aesthetics to allow a bias like that”: instead it will try to state what is in the body, asymmetry, naughty bits and all, because its purpose is to reveal. To speak of quality without relating quality to purpose and scope is to speak nonsense.



* If you voted YES on this, will you publicly explain why and also detail any current or planned commercial interests you have in common with the supporters of OOXML?

Would you like all our business plans too?

But seriously, I have been open, and it has gotten me into constant trouble from self-righteous liars and over-competitive boors and flamboyant trolls “with God on their side”.

But what about you Bob? Will you reveal your salary and bonuses if the anti-OOXML campaign succeeds? And also for your staff? And also for any lobbyists, for example European ones who may have written white papers for Lotus products but now hide this? What about discount rates for hotel rooms, and other largesse? Readers may not realize it, but it is actually really easy to find scandal and point the finger if you try. I don’t know how many blogs I have pulled before publication after deciding not to descend to that level: there is no shortage of ammunition but it just makes heat not light. And most of all, because it is wrong. (Fans of old things might consider here the ninth commandment: Thou shalt not bear false witness. And I Cor 6:10 on slanderers/gossips/revilers.)

What Dr Sutor is doing here is to reinforce this meme that anyone who is in favour of OOXML must be having secret business dealings with Microsoft. When you read through comments on his and his coterie’s blogs, it is a really common theme: because there is no possible way that anyone can reasonably be in favour of OOXML becoming a standard (which is not the same thing as requiring that anyone should use it!) therefore the only explanation is monetary gain, whether illicit or licit. It is so offensive and narrow-minded. I want to quote what ODF editor Patrick Durusau wrote recently:

Granted, I have a number of issues with the current OpenXML proposal but experts do disagree in good faith even within open standards development projects. If a proposal cannot progress until we all agree, then we risk proposals being held hostage to whim and caprice.

Dr Sutor’s questions are hardly attempting to discourage the a priori suspicion of deals: avarice, greed, corruption, business, whatever. Why require evidence? Thoughtcrime is enough: if there is someone who says anything in favour of accepting DIS29500 mark II, the onus should be on them to prove they are not involved in dealings, nefarious or otherwise.

When IBM’s Dr Sutor writes about “commercial interests” I do not believe he is seriously asking for more transparency from participants. I believe he is just trying to set up the expectation of deals (sinister or otherwise) to propagandize non-participants, so that if there is an unfavourable outcome the line can be spun in the popular imagination that “The participants had made deals”. There are fewer marketing opportunities in “We tried and we lost.”



* If you previously did not support OOXML but recently changed your mind, will you publicly and in detail explain why you did this?

Ah. Of course, because anyone who dares have a different POV is suspect…

I think it would be great for people to state why they think it would be good to have a standard. But it is just sick to want this to be in an environment of accusation and suspicion.



* Do you personally feel that OOXML helps the ISO and IEC “brands” related to quality of technology and process?

This is the game plan of “If I don’t get what I want I will take my bat and ball home”. If the JTC1 process doesn’t deliver what they want, then bringing the organization into disrepute will work instead. It is just like if an expert doesn’t say what you want, you try to bring them into disrepute. I am sorry, but it just seems so cynical to me…if that is the kind of dog-eat-dog world that big business corporate types have to live in, they should do us all a favour and keep to themselves.


Am I being too extreme here? Paranoid perhaps? No, I think I am being anti-paranoid: the paranoia that says that anyone who thinks that DIS29500 mark II is acceptable as a standard must have been got to, bought off, or have some kind of deal!

Well, the thing is that these spin doctors know exactly the toxic environment out there. They know when they make a little ripple it gets caught up into a tsunami. They know you shouldn’t falsely shout “fire” in a crowded theatre.

Knowing this, why doesn’t Dr Sutor write, for example:

* If you voted YES on this, will you publicly explain why and also detail any current or planned commercial interests you have in common with the supporters of OOXML? And if you voted NO on this, will you publicly explain why and also detail any current or planned commercial interests in rivalry to the supporters of OOXML? And of course, no-one should be so rude, so insecure, so crass, so toxic, so venal, so mean-spirited, so drunk on their own self-righteousness, as to assume without evidence that anyone who has a different position from theirs is guilty of anything other than having their own brain.

I’d be happy to co-sign an open letter that says that. How about it Bob?

Rick Jelliffe


IBM/Lotus’ Rob Weir has a timely blog up entitled How many defects remain in OOXML? Timely, because of course, the clock is ticking on the OOXML vote, so this is coming up to his last chance to throw some mud. This is a subject I am interested in, and have blogged on before, so I think it might be useful to make a comment.

The Set Up

First let’s look at the set-up material:

DIS 29500, Office Open XML, was submitted for Fast Track review by Ecma as 6,045 page specification. (After the BRM, it is now longer, maybe 7,500 pages or so. We don’t know for sure, since the post-BRM text is not yet available for inspection.)

Longer? Well what has happened is that

  1. Normative schemas (with structural improvements to run better on the open source XSD validators) that were in external files are now included in the text: there is no change in the amount of information in the standard despite the extra pages! In fact, because at the same time the schema fragments in the draft are now (post-BRM) informative, there has actually been a decrease in the amount of normative text.
  2. Non-normative material on accessibility has been added, again not requiring the kind of review or thought that normative text requires.
  3. Extra explanatory material requested by NBs has been added, but this text was specified in the Editor’s responses or explicitly by the BRM: it simply isn’t the case that NBs don’t know what this text is. See the BRM outcome documents.

I have blogged before against the simplistic use of page length: That diagram (Let me ring your bell), and I refer interested readers to that.

Next, comes:

Based on the original 6,045 page length, a 5-month review by JTC1 NB’s lead to 48 defect reports by NB’s, reporting a total of 3,522 defects.

Now what you might not realize from this is that the 5-month review is actually a title or nickname for one phase of the review, not the actual time limit. The initial text was released in December 2006, and national bodies didn’t actually submit their ballots until September 2007. So National Bodies had 9 months, not 5. (And interested parties could have participated for the prior year-long process at ECMA, which included a public draft.)

The total of 3,522 defects sounds impressive, except that most of them were duplicates, many just cut-and-paste duplicates by lazy or novice reviewers who somehow were under the misapprehension that in ISO process the squeaky wheels would get the most oil. ECMA grouped them into 1,027 unique issues; however, my estimate was that many more could be grouped together (this is borne out by the repetition of answers within the Editor’s disposition of comments) to about 750 really unique issues.

Next comes the material on a defect count per page. (To give an idea of why this is an area where simplistic use of numbers will be actively misleading: adding the extra pages of schema material will actually cause a reduction in the average number of errors per page, without decreasing the absolute number of problems.)
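A quick sketch of that dilution effect, using figures already mentioned above: the 1,027 unique issues ECMA counted, the original 6,045 pages, and the rough post-BRM guess of 7,500 pages. The 7,500 figure is only the estimate quoted earlier, not a confirmed page count, so treat the second rate as illustrative.

```python
def defects_per_page(defects, pages):
    """Average number of reported defects per page of text."""
    return defects / pages

unique_issues = 1027   # ECMA's grouping of the ballot comments
pages_before = 6045    # original DIS 29500 page count
pages_after = 7500     # rough post-BRM guess quoted above; an assumption here

rate_before = defects_per_page(unique_issues, pages_before)
rate_after = defects_per_page(unique_issues, pages_after)

# The number of problems is unchanged, yet the per-page rate falls
# simply because schema pages were moved into the text.
print(f"before: {rate_before:.3f} per page")
print(f"after:  {rate_after:.3f} per page")
```

Nothing about the text got better or worse between the two lines of output; only the denominator changed, which is why a per-page metric on its own says little.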

I have blogged before On error rates in drafts of standards and I refer interested readers to that. Note that I give an estimate of the number of errors that you would expect to be caught (in one pass) at about 1,000, which is exactly what we have. In particular, note (ISO SQL Editor’s) Jim Melton’s comments, which I will repeat:

Or perhaps most people were somewhat intimidated by the prospect of (thoroughly) reviewing a 6,000 page document. To put this in perspective for those who know SQL’s size and complexity, the sum of all nine parts of SQL is about 3950 pages. A ballot on SQL frequently receives several thousand comments, and we’ve been balloting versions of SQL for 20 years!

In fact, virtually every large spec I’ve ever had the “pleasure” to review leads to “thread-pulling”, in which every page yields at least “one more” bug, and following up on that one leads to more, and following up on those leads to still more, etc. I would personally be stunned if 30 dedicated, knowledgeable reviewers of a 6,000 page spec on its first public review were unable to find at least 3,000 unique significant problems and at least 40,000 minor and editorial problems. But that’s just me…

Under the kind of criteria that our Big Blue friend is proposing, the ISO SQL standard, which is one of the most widely implemented and important and mission-critical of all ISO IT standards, would not be of high enough quality to make the grade! Next Mr Weir says:

If we believed that the 5-month review represented a complete review of the text of DIS 29500, by those with relevant subject matter expertise, then we would have some confidence that all, or at least most, defects were detected, reported and repaired.

Did you see the sleight-of-hand there? The outcome “repaired” is not the only possible outcome! The big possibility that Weir misses is that a defect can be allocated to maintenance: the ballot to become a standard is not the end of the process but merely the start! But absolutely no reference to this. Why? To panic people into assuming this is the last and only chance to get things perfect.

(Weir does have another post Contra Durusau, notable for a really sleazy reference to Seattle. He takes an unrelenting anti-maintenance line, rather surprising in the light that the same arguments can apply to ODF which is his alternative. It does not suit his argument that there are many standards with successful maintenance.)

The Trick

One of the constant themes over the last year has been the theme of panic. QUICK: You only have one month to find contradictions. QUICK: You only have five months to find defects. You only have a few weeks to evaluate the Editor’s comments. Every person has to read or review the whole standard. Every national body needs to have an explicit detailed position on every issue. And so on. Always under the assumption that the current stage is the last and only chance for change.

In every case this panic has been unnecessary FUD-mongering, because at ISO there is always the scope for improving a standard. [The normal caveat that you want to get it as right as possible first time because you cannot bolt the stable door once the horse has bolted does not apply with the same strength as with a from-scratch standard because the horse has already bolted. In fact the horse has been off and running for the last 20 years! So “getting it right” relates to documentation and harmonization rather than the general shape.]

What happens when a draft gets accepted as a standard? It gets subjected to the normal committee maintenance procedures. There is indeed a special step which can be taken where a standard gets deemed stabilized and so not subject to maintenance, but there is absolutely no way that IS29500 (or IS26300) are candidates for that yet!

Maintenance sounds a dreary word, but what it means is that National Bodies (and liaison bodies) can submit defect reports to SC34. And I would hope there is a backlog of these issues: a trouble with a stretched out Fast-track such as we have had is that it means there is in effect six months where Defect Reports have to sit on the shelf waiting until the standard is accepted before being processed. That there have been more defects or improvements discovered since the ballot was taken is not a source of wonder or horror: of course there will be more issues discovered: how could it be otherwise?

But it is a complete mistake, and at worst disinformation, to think that defects must remain outstanding because the standard is set in stone at the time of voting. Indeed, ISO ODF is largely predicated on there being ongoing maintenance to fill in the gaps and fix problems that are found. The thing is that standards based on deployed technologies do not need reviews based on “is this technology bogus and unimplementable” in the way that blue-sky standards do: in the case of Open XML and ODF and PDF you can open up a file and look at it and see whether the big and middle picture is workable. (And you can go further and validate the XML with the schemas, for fine-grained and objective compliance testing, of course.)

At ISO/IEC JTC1, the rule is that the Editor has to handle defect reports “promptly”. (“Promptly” needs to be measured in quarters of years, it won’t be weeks. But it won’t be years or decades, which is how long some bugs have persisted in Office without the circuit-breaking of National Body scrutiny.) SC34 participants have been discussing many issues relating to getting maintenance agile and pro-active, and National Bodies who are interested in document standards need to get involved.

What you have in the ISO process is equivalent, if the NBs want it, to a Ballot Resolution Meeting every six months in perpetuity. Defect Reports can include detailed suggestions for change, and it is even possible to bundle them as Draft Amendments and get that fast-tracked.

There is a lot of talk about “ECMA should resubmit it for another fast-track” or “ECMA should resubmit it for slow-track” and so on. I regard a lot of this talk as disingenuous, because it is frequently suggested by commentators who you know are not interested in corralling OOXML into a standard no matter how technically excellent it can become. It looks like a compromise but it is intended to block progress not help it. Now I have no general objection to standards taking years to complete, but for a deployed technology the correct process is the maintenance process not the committee draft process.

Every standard that gets adequate review will have reams of defects reported. That is just as much a function of the intensity of review as the underlying quality of the standard. Indeed, you could use a reverse metric: any standard which does not have at least one defect per 6 pages reported (for example) should be suspected of having inadequate review. DIS 29500 has had thousands of people reading it and reviewing it. Thousands, not hundreds. A big swathe has been dealt with, a big swathe has been dealt with partially and can be improved further; and there is a big swathe of issues that are not defects at all but extra features which clearly belong to maintenance not initial review.

But the idea that this is it, this is Microsoft’s only accountability moment where they get a pass or fail is propaganda, not the ISO process. It is completely true that the maintenance procedure needs continued interest and continued pressure, but it is not true that this is the last chance to improve the standard as if it will be frozen for all time.



Update

In comments below, ISO SQL editor Jim Melton has clarified his comments. I was glad to see him say “Please note also that I have taken no position at all on the merits of standardizing the technology in the spec, nor even the merits of the technology itself.” What Jim says, however, is that he would expect a full multi-year review of a new 6,000 page spec to almost certainly reveal upwards of 5,000 unique issues.

I have three responses to that. First, that Ecma 376 already had a year of review before ISO, so it is inappropriate to count the number of issues as from a de novo standard: we should be open to the possibility that in fact we did not find thousands more problems because they are not there. (However, Jim’s original comment about pulling threads is really appropriate.)

Second, that the error rates in a standard have to be tied to the number of normative pages not just the raw page count: OOXML is unusual as a standard in having so much repeated and non-normative material: indeed, Patrick Durusau in 20 hours was able to condense the WordProcessingML material by 74% to 452 pages: assuming that the other parts have similar rates that gives us about 1500 normative pages, which by Jim’s metric should reveal only 1250 unique issues. Compare this to the approx 1,000 issues that were dealt with (and the large number of issues dealt with en masse such as fixing ISO-ese shalls and shoulds and fixing examples) and the review is actually looking pretty good even on Jim’s metrics, isn’t it!
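The arithmetic behind that second point can be laid out in a few lines. This is only a back-of-envelope sketch using the round figures quoted above (5,000 issues per 6,000 pages, a 74% condensation, roughly 1,500 normative pages); the variable names are mine and the assumption that every part condenses at the WordProcessingML rate is exactly that, an assumption.

```python
# Jim's rule of thumb as quoted: a full review of a new 6,000 page spec
# should reveal upwards of 5,000 unique issues.
full_review_issues = 5000
full_spec_pages = 6000

# Durusau condensed the WordProcessingML material by 74%; assume (as the
# text does) that the other parts would condense at a similar rate.
retained_percent = 100 - 74
estimated_normative_pages = full_spec_pages * retained_percent / 100

# The text rounds this down to "about 1500" normative pages.
normative_pages = 1500

# Scale the expected issue count by normative pages rather than raw pages.
expected_issues = full_review_issues * normative_pages / full_spec_pages

print(estimated_normative_pages)  # estimate before rounding to 1,500
print(expected_issues)            # issues expected on Jim's metric
```

Set against the roughly 1,000 issues actually dealt with, the scaled expectation is in the same ballpark, which is the whole of the claim being made.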

And my third point is the same one I have said elsewhere. The maintenance process is the best place to deal with remaining issues. If you look at some of the FUD lists floating around of new issues, you see an indiscriminate grab-bag of new feature requests, denials of the scope of OOXML which emphasizes legacy features, function changes, as well as (hopefully) some errors proper. These are not showstoppers, but they all should be dealt with sooner rather than later because of their importance. And sooner means by maintenance of the standard, not by pre-standardization faffing around and filibustering.

Update 2

A website picked up on this exchange and quoted Jim’s comment:

You’ve written 6000 pages of specification largely in secret (and, I understand, recently added over 1500 more pages) and given the world five months to read, absorb, understand, review, critique, and establish informed positions on it.

So I think it is useful to restate the problems with this.

  • 6,000 pages The pre-BRM draft standard (DIS 29500 mark I) had over 6,000 page plus several hundred more for schema files that were not printed in the text. However, the text of a standard has normative parts which state actual requirements and informative parts which give extra information to help users. Estimates from the editor of a “rival” standard is that about 75% of the content of DIS 29500 mark I was informative or could be condensed to that without loss. The additional pages (and I have seen no reliable count that it is 15000 pages: that seems just puff) is mainly due to taking schemas that currently are normative and putting them into the standard; however, at the same time repeated fragments of schemas in the draft text are being made informative, so actually there is net decrease in informative material.

    So really what we have is a standard of about 1500 normative pages (perhaps 2,000 pages including schemas) with about 4500 pages of additional information to help explain it. Attempts to use the blanket figure of 6000 both disguise the fact that the text has an enormous amount of material to aid understanding and allow inflated views of the amount of work needed to find errors in the normative sections. Furthermore, there is an enormous amount of repetition, so review comments from one section often apply without change to other sections.
  • Secret Actually, Ecma put out a public draft for comment.
  • Five months No, the “five month period” is the nickname, and it actually took nine months until the ballot. So not 5 months to review 6,000 normative pages, but 9 months to review effectively 1,500 normative pages. What is the difference? Well, let us remove 1 month for administrative palaver: the difference is between 6,000/4 = 1,500 pages per month and 1,500/8 ≈ 187 pages per month.
  • Five months No, actually there was an additional period after the ballot where National Bodies could look at each other’s comments and participate in the Ballot Resolution Meeting: which takes it to over a year in total, not including the previous year of development at Ecma.
  • read, absorb, understand, review, critique, and establish informed positions But every individual National Body does not need to have a definite opinion on each individual issue: abstain is fine on issues that are not of interest or are outside the expertise. I don’t know how the ISO SQL Steering Committee works, but in SC34 national bodies try hard not to act outside their competence and are careful to abstain rather than spoil the process: they find the best experts they can and encourage development of national expertise and awareness of their particular national interests: Japan on internationalization, fonts and formal schemas for example. The review happens not because everyone involved knows everything, but because collectively and cooperatively all the issues get adequate coverage. For example, there may only be three or four National Bodies with deep experts on maths, and several more with general experts who can get the drift pretty well, and a few more with industry contacts and other liaisons, and that is more than adequate for review.
  • Given the world SC34 has been operational in one form or another for almost 25 years. People who are interested in this area have had a long time to get involved, learn the procedures, get national committees going, participate in various standards to learn the ropes and make networks. Both when ODF and OOXML were first proposed for fast-tracking there were good signs for people who were interested to get involved. The idea that somehow DIS 29500 has been foisted on an unsuspecting and unready public shifts the responsibility away from the people who should have been participating and up-to-speed. If a National Body (or government or other stakeholder) ignores developing skills and experts who will be ready to participate when the time comes, of course they will not have enough time: but it is their fault! If you are running in a race, arrive late, and the starter’s gun goes off while you are still putting on your shoes, you cannot complain “I didn’t have enough time!”
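
The review-rate arithmetic in the “Five months” point can be checked with a few lines of Python. This is only a sketch of the calculation: the function name is made up for illustration, and the page counts and the one month set aside for administration are the figures used in the list above.

```python
def pages_per_month(pages: int, months: int) -> float:
    """Average number of pages that must be reviewed each month."""
    return pages / months


# Claimed workload: 6,000 pages in a 5-month window, less 1 month of administration.
claimed_rate = pages_per_month(6000, 5 - 1)

# Effective workload: about 1,500 normative pages in 9 months, less the same 1 month.
effective_rate = pages_per_month(1500, 9 - 1)

print(f"claimed:   {claimed_rate:.1f} pages per month")
print(f"effective: {effective_rate:.1f} pages per month")
print(f"ratio:     {claimed_rate / effective_rate:.0f}x")
```

On these figures the effective monthly load is one eighth of the claimed one.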

M. David Peterson


So I’ve been invited to attend the Microsoft Technology Summit in Redmond next week which, from what I understand, is focused as an interactive conversation between MSFT technology/product owners and a group of ~50 or so technologists from around the industry. As per a recent email I received regarding the event,

Please plan to openly discuss your views and opinions. While having respect and tolerance for others opinions, we encourage you to be vocal and open with your opinions. The MTS is a non-NDA summit, so we also encourage blogging, web posting, etc.

I’ll most definitely be blogging the experience as the week progresses and as such am definitely keen to hear from community members any questions you’d like to have answered, topics you’d like to see discussed, opinions you’d like to express, etc. In this regard, please feel free to leave either a comment below or email me directly.

Also, if you’re in Redmond next week and would like to get together for lunch and/or in the evening after any of the planned events (they end between 9:30 and 10:30 each night) please let me know! I arrive on Tuesday afternoon and leave Saturday afternoon. The event ends @ ~noon on Friday, so I’ve got a solid day to play. Will probably head downtown to hit some shows Friday night, so if you’re into that kind of thing and you’re free on Friday night, hit me up! :D

Update: In case any of you are interested in the topics/speakers such that you can gauge what type of questions/comments/etc. to make, the conference agenda follows (I wanted to gain verification that publishing this list was kosher, thus the reason I didn’t supply this until now.)

Rick Jelliffe


This is an open letter to all companies who achieved market success in the 1980s and 1990s with PC-based applications.

The recent controversy over ODF and Office Open XML at ISO shows both that there is substantial interest in document formats, and that there is also substantial commercial rivalry. I do not believe I am on my own in thinking that the writing is on the wall: the days of private proprietary formats, especially binary formats, are numbered and perhaps have already expired.

There are of course many millions of documents archived in these older formats, and it will be a major challenge for archivists to figure out workable and cost-effective strategies for maintaining or grandfathering these documents into newer formats, especially more-or-less lossy standard formats.

Corporations who were market leaders in the 1980s and 1990s for PC applications have a responsibility to make sure that documentation on their old formats is not lost. Especially for document formats before 1990, the benefits of the format as some kind of IP-embodying revenue generator will have lapsed now in 2008. However the responsibility for archiving remains.

So I call on companies in this situation, in particular Microsoft, IBM/Lotus, Corel, Computer Associates, Fujitsu, Philips, as well as the current owners of past names such as Wang, and so on, to submit your legacy binary format documentation for documents (particularly home and office documents) and media, to ISO/IEC JTC1 for acceptance as Technical Specifications.* Handing over the documentation to ISO care can shift the responsibility for archiving and making available old documentation from individual companies, provide good public relations, and allow old projects to be tidied up and closed.

The recent controversy over Office Open XML and ODF has occurred in part because both were submitted to become International Standards, which is appropriate for living formats. However, there is still a substantial public interest that would be served by existing documentation of legacy formats being submitted as Technical Specifications or Technical Reports, which, as classes of documents that are less than a standard, will be less controversial but still useful for putting this valuable information into the public arena. As publicly available specifications, ISO/IEC would make the material available free on their website: free access is a very important outcome.

For nations where the 17 year patent time applies, there seems little reason why formats from 1990 and before could not be quickly submitted and dealt with in this way. However, given the enormous benefits that openness brings in increasing the size of the pie, I suggest that even recent formats, for example formats before 2001, should also be submitted to ISO as Technical Specifications in this way with some appropriate RAND-z IP covenant or license.

Examples of these formats that spring to mind include:

  • All Microsoft Office binary and text and media formats, including RTF and Visio
  • All IBM/Lotus binary and text and media formats, including Visicalc
  • All Corel formats, including WordPerfect

Furthermore, I call on archiving and regulatory bodies to investigate encouraging and supporting this kind of activity. As well as office document formats, there are substantial legacy collections of financial and engineering documents which would also benefit from the same treatment. It should go without saying, but the Macintosh, Amiga, OS/2, and applications on the many different versions of UNIX may also have hosted popular applications whose documentation may be in danger of being lost unless it is lodged with a suitable formal international technical library, such as ISO/IEC.

The ISO/IEC Technical Specification is a good, low-fuss medium for making sure that older formats do not disappear, and without requiring costly rewrites or changes.

*Contact your local national standards body for advice on this, or your local SC34 committee member. Do not get too caught up in whether the document is a Technical Report or Technical Specification.

M. David Peterson


Digg - Lawrence Lessig needs your help

Having some personal experience with the benefits that come along for the ride when volunteering your time to the — and I’ll be blunt — the single most important mind of the 21st century, I can assure you of the fact that you will gain more in return than what you will give.

The above is a personal quote. Please forgive my follow-up comment to Jain Berlor. But to be quite honest, I don’t give a damn. Lawrence Lessig deserves greater respect than what Mr. Berlor seems to be willing to both give and recognize.

I won’t stand for it.

If you can help, please do. As per above, I can assure you of the fact that what you will gain in return is greater than what you will give.

Rick Jelliffe


Here are three questions: they are not the same:

  • Does being pro-ODF require you to be anti-OOXML?
  • Does being pro-ODF require you to be against DIS29500 mark II being accepted as an ISO/IEC standard?
  • Does being anti-Microsoft require you to be anti-DIS29500?

A lot of the FUD over the last year has been based on the idea that if you are anti-Microsoft you must be pro-ODF and if you are pro-ODF you must be anti-OOXML and if you are anti-OOXML you must be against the acceptance of DIS29500 mark II. It is a George Bush-like simplification (“Either you are with us, or you are with the terrorists”) that tries to exclude any middle ground.

However, you may find yourself in the position of wishing that Microsoft, since it does not seem to be going away, would behave itself; and you might believe that ODF is great because we need a good if sometimes lossy interchange format; and you might believe that it is a good turn of events that MS is documenting and opening up its formats even though you may not be necessarily convinced whether it is by baby steps or manful strides; and you may find yourself thinking that while you might not use an eventual IS29500 yourself, other people surely will, and they would benefit from the stability and openness that a continuingly maintained and reviewed International Standard would foster.

You are not an extremist in this, but actually are being mild, accommodating of others, and reasonable. However, people with your kind of middle-ground views are being accused of everything up to (and including) corruption. For example, here is a comment that is quite typical, on a well-known anti-OOXML marketing website, concerning someone well-known who is one of the biggest experts in the field (I am removing the name, because I don’t want to participate in repeating the slur):

You are of course very careful to avoid insinuations of foul play (almost); but I can’t think of any motivations for XXXXXX’s postings apart from naivety or subterfuge. And for someone as apparently experienced and knowledgeable as XXXXXX, naivety surely isn’t an option.

and later

I don’t know what XXXXXX has received in exchange for his “personal opinions”, but I hope he values it more than his (now irrevocably shattered) credibility.

The logical fallacy here is called the fallacy of the excluded middle: the idea that reasonable people might reasonably disagree is not allowed. And thence to the imputation of corruption. This fortnight I have seen four friends (and myself yet again, I must be so busy) separately accused in this way in various forums.

Now sometimes it is easy to write something that in retrospect reads in a different way to your intent. Everyone makes mistakes. But the responsible thing to do is to withdraw the comment and to take such personal attacks off any websites under your influence, and so on. And to refuse to link to sites which are not responsible in this kind of way.

The speed with which any differing opinions to the party line on OOXML are labeled corrupt should ring alarm bells. As my dear old Dad used to say about political speeches: “Argument weak: shout like hell!” Some of these guys will say anything.

Please guys, a bit of self control. If you are happy to see people’s careers ruined due to differences of opinion on a file format, where are your heads at? (Credit where it is due. I was pleased to see that the moderator of that website called the commenters into order to say no more personal attacks. Let me be Mr Glass Half Full and say: Well done. However, as Mr Glass Half Empty, let me say that it is irresponsible to keep such libels online. If it is so bad or libelous a subject that new posts will not be accepted, why are the existing posts still up?)

Rick Jelliffe


In the markup world, the jargon is that inline markup is the tags that delimit ranges of text in a document (e.g., Plain Old XML), while out-of-line markup is where the structures and labels are in one place but the subjects of the structures and labels are in another place (e.g., XLinks). Of course, you can have XPaths which drill down to some piece or bundle of information with inline markup, but where there is out-of-line markup there is potentially another XPath that can drill down through the out-of-line markup and end up labelling the same information.

What may not be obvious is that a web system that uses the PRESTO approach is in effect using URLs that act like XPaths on virtual out-of-line markup. “Virtual” because no actual tree is ever explicated (necessarily): notionally PRESTO uses resolver rewriting.

That good markup practice is to directly mark up the information without fluff and tricks and in as pleasant a way as possible is universally acknowledged; and that there are many kinds of information structure where the markup cannot be a neat model of the data such that all elements represent objects of the same analytical importance is also widely known and regretted. (Think of the distinction in XSD between the components (the objects of the schemas) and the tags used for each component, for example. Or the *Pr containers in OOXML.)

A PRESTO URL should give the view in terms of the (conceptual) components, not the specific tags used if the resource is stored as an XML document. And not necessarily every tag, certainly. But every concept (every significant concept) should have a URL, even if there is no representation available or only a pretty crappy one.

So if in PRESTO a URL represents a kind of XPath to a virtual out-of-line markup view of some data, then it is possible to have a virtual schema for that virtual markup: in effect, you could have a schema for the URL. For example, given the virtual schema (as RELAX NG compact syntax here):

  element address {
     element tent { text },
     element oasis  { text },
     element wadi { text },
     element desert { text }
  }

which would allow PRESTO URLs like

   http://www.eg.com/address
   http://www.eg.com/address/tent
   http://www.eg.com/address/oasis
   http://www.eg.com/address/wadi
   http://www.eg.com/address/desert

In PRESTO, these should be available regardless of how the data is stored, because the idea is to model the user’s conceptions. (And if an exact match is not available, to provide the best fit. This certainly creates a task allocation between front-end and back-end systems that may not be workable for some organizations or tasks. No sweat.)
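
To make the idea of a “schema for the URL” concrete, here is a minimal Python sketch that walks a virtual schema and lists the URLs it allows. The nested-dictionary representation and the function name are assumptions made for the illustration, not part of PRESTO or RELAX NG; only the element names and the base URL come from the example above.

```python
from typing import Iterator

# Hypothetical in-memory stand-in for the virtual schema:
# each key is a component name, each value holds its child components.
ADDRESS_SCHEMA = {
    "address": {"tent": {}, "oasis": {}, "wadi": {}, "desert": {}},
}


def presto_urls(schema: dict, base: str) -> Iterator[str]:
    """Yield one URL per component, parents before their children."""
    for name, children in schema.items():
        url = f"{base}/{name}"
        yield url
        yield from presto_urls(children, url)


urls = list(presto_urls(ADDRESS_SCHEMA, "http://www.eg.com"))
for url in urls:
    print(url)
```

Run against the address schema, this prints the five URLs listed above, in the same order. Nothing here says how the data is stored, which is the point: the URL space follows the conceptual schema, not the storage.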

But what about cardinality? Here is a schema more typical of literature:

   element law {
       element title { text },
       element part {
            element title { text },
            ( element p { text } |
              element list {
                  element item { text }+
              }
            )*
       }*
   }

The XPath for accessing a particular part’s title would be /law/part[2]/title so the PRESTO URLs would need some kind of convention.

In PRESTO we *might* have URLs for

     http://www.eg.com/law/
     http://www.eg.com/law/title
     http://www.eg.com/law/part
     http://www.eg.com/law/part2/title
     http://www.eg.com/law/part2/p3
     http://www.eg.com/law/part2/list4
     http://www.eg.com/law/part2/list3/item4

Now, I am not sure I understand the issues well enough to say which system for indexing is absolutely best. But I think the advantage of http://www.eg.com/law/part2/title over http://www.eg.com/law/part2/title is that it is probably a more common case that your system is interested in /law/part[2]/title rather than all titles of parts /law/part/title. But it is a matter of the particular use case and the consequent virtual schema.
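
One possible convention, sketched here in Python, is to fold the XPath positional predicate into the path segment, so that part[2] becomes part2. This is only an illustration under that assumption: the function name is hypothetical, it handles nothing beyond simple name[n] steps, and it would be ambiguous for element names that themselves end in a digit.

```python
import re

# One XPath step: an element name, optionally followed by a positional predicate [n].
STEP = re.compile(r"([A-Za-z_][\w.-]*)(?:\[(\d+)\])?")


def xpath_to_presto(xpath: str, base: str = "http://www.eg.com") -> str:
    """Map a simple absolute XPath such as /law/part[2]/title to a URL."""
    segments = []
    for step in xpath.strip("/").split("/"):
        match = STEP.fullmatch(step)
        if match is None:
            raise ValueError(f"unsupported XPath step: {step!r}")
        name, position = match.groups()
        # Append the position directly to the name: part[2] -> part2.
        segments.append(name + (position or ""))
    return base + "/" + "/".join(segments)


print(xpath_to_presto("/law/part[2]/title"))
print(xpath_to_presto("/law/part[2]/list[3]/item[4]"))
```

A step without a predicate, such as /law/part/title, passes through unchanged, so the same scheme still leaves room for an “all parts” view if the use case wants one.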

(Another possibility is just to bite the bullet and allow XPath syntax directly in the URLs, with appropriate percent escaping. For example http://www.eg.com/l/law/part%5B2%5D/title. Is this reinventing XPointer? Well, in a way, except that in XPointer you are locating a file then drilling down according to the actual markup: in PRESTO the information is merely hierarchically accessible and you are using the Use Case concepts to zero in on the information.)
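
The percent escaping itself is mechanical. A minimal sketch using Python's standard urllib.parse module follows; the helper name is hypothetical and the base is taken from the example URL above.

```python
from urllib.parse import quote, unquote


def xpath_url(xpath: str, base: str = "http://www.eg.com/l") -> str:
    """Append an XPath to a base URL, escaping everything except the slashes."""
    return base + quote(xpath, safe="/")


url = xpath_url("/law/part[2]/title")
print(url)

# The receiving side can recover the original XPath from the path portion.
recovered = unquote(url[len("http://www.eg.com/l"):])
print(recovered)
```

The square brackets come out as %5B and %5D, and unquoting the path on the server side gives back the XPath unchanged.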

M. David Peterson


… are *bound* for greatness.

Update: And for those who seemingly believe that their “power”, “wisdom”, “intellect”, and “character” enables them to overpower and therefore overcome the same collective power, wisdom, intellect, and character held by those of us who are listening, watching, communicating, and promoting that in which we both will and, more importantly at this moment in time, *WILL NOT* tolerate as a human culture, let the wrath that is the power of our tightly connected yet loosely coupled Internet communities take its toll upon you. [UPDATE: *PLEASE NOTE*: As per my follow-up to Len** below, “let the wrath that is the power of our tightly connected yet loosely coupled Internet communities take its toll upon you.” is *NOT* referring to wrath in the physical sense and instead in the social Internet networking sense. In other words, “Share this with your friends such that they can better understand the subject matter and judge for themselves how they might choose to react with their *alliance* to any given candidate.”

** Please see my SPECIAL NOTE below. Thanks![/UPDATE]

Please watch, listen to, and share the following with *EVERYONE* you might come into contact with. Thanks!


[Original Post cntd.]

So just added Barack Obama’s Twitter feed to my “following” list, soon thereafter to find the following in my inbox,

Hi, M. David Peterson.

Barack Obama (BarackObama) is now following your updates on Twitter.

Check out Barack Obama’s profile here:

http://twitter.com/BarackObama

Best,
Twitter

So two things,

1) He has either a bot, or an intern/staff member who is adding anybody who adds Barack Obama to their following list to his following list.
2) Who really gives a damn, because the bottom line is that even with an audience of around 15k followers on Twitter — or in other words, not even enough people to win him a spot as the mayor of a small city — he’s still paying attention to what’s happening out here in the *real* world.

As per point 2, that kind of stuff matters to me. Maybe it does to you too?

** SPECIAL NOTE: While I certainly disagree with a lot of the comments Len (Bullard) has on this particular subject matter, he also happens to be one of the smartest, most well versed individuals I know, and the one person I believe most capable of convincing me that my current line of thinking on any given subject matter we might be debating could very well be wrong. And at the very least his thoughts/comments require I step back and think things through a bit more. Whether you agree or disagree with his opinions, his blog is a wealth of knowledge and thought provoking dialogue. If you haven’t already, I would encourage you to subscribe.

Rick Jelliffe


One question that comes up really regularly when I have been yacking about the PRESTO approach with people over the last month, is that people don’t see how Objects fit into it. They get Persistent URIs, they get REST, but the Object part is not so obvious. (Actually, I have had several people email me that the approach is one they have been tending towards in their work too.)

One reason, of course, is that the term Object-Oriented is generic and used for a family of related ideas, rather than being a single neat idea. But the PRESTO idea is that the public URLs should reflect an object-oriented modeling of the data and systems, and that you should have URLs for every object in your system even if there is no satisfactory representation of that resource.

Wikipedia says that an object can be viewed as an independent little machine with a distinct role or responsibility which is a good start, but I have always thought a key value of objects was that they can help model the system according to concepts in the user’s/developer’s/domain’s mind or usage. The aspects of being an object that PRESTO is interested in are encapsulation (the idea that entities should be self contained, with data and methods tightly coupled) and introspection (the idea that you can ask an object about its contents: methods, children, etc.). [UPDATE: Oi! NOT INHERITANCE, NOT RPC, NOT INTERFACES, NOT COUPLING STATE, NOT POLYMORPHISM] Bjarne Stroustrup has commented recently that problems which can be composed into a hierarchy are good candidates for Object-Oriented solutions (sorry, no reference here: it was in a Linux magazine I was reading today, maybe Linux Developer…has a Sun Solaris distro on the DVD.)

In pattern terms, PRESTO is a Facade pattern applied to URLs. In terms of UML, we might see PRESTO as saying that public-facing URLs should be constructed based on some entity analysis such as Use Cases or Package Models.

But the key way to think about it is just basic object concepts. The PRESTO approach says to form URLs so that each “directory” in the URL is an object, and its contents are sub-objects, data or other resources. Methods are not expressed as queries, but declaratively by identifying their result: so you don’t say http://www.eg.com/document/?getGraphic but http://www.eg.com/documents/graphic which then allows you to say http://www.eg.com/documents/graphic/title and so on.

Of course there are often many alternative ways of organizing or categorizing data. Which is why you appeal to use cases to guide you in which the best form is. Indeed, you might have alternative PRESTO URLs for the same data resource.

One piece of software that is highly useful for implementing a PRESTO system is the Tuckey UrlRewriteFilter which is good for Java-based web servers. We are finding that regex-based URL mapping makes the whole thing quite easy and painless, in particular when retrofitting a PRESTO facade on top of an existing web site. The difficulty is largely where it belongs: in figuring out which objects are most interesting or obvious to the users. This is where modeling the particular Use Cases or even Configuration Items comes in.
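
To give a feel for what regex-based URL mapping involves, here is a small Python sketch of the general technique. It is not the UrlRewriteFilter's own configuration format, and every back-end path in it (the /servlet/... targets) is invented for the illustration; only the public-facing paths follow the examples in this post.

```python
import re
from typing import Optional

# Hypothetical rules: public PRESTO path pattern -> internal back-end path.
RULES = [
    (re.compile(r"/documents/graphic/title"), r"/servlet/doc?action=getGraphicTitle"),
    (re.compile(r"/documents/graphic"), r"/servlet/doc?action=getGraphic"),
    (re.compile(r"/law/part(\d+)/title"), r"/servlet/law?part=\1&field=title"),
]


def rewrite(path: str) -> Optional[str]:
    """Return the back-end path for a public path, or None if no rule applies."""
    for pattern, target in RULES:
        match = pattern.fullmatch(path)
        if match:
            # expand() substitutes captured groups such as \1 into the target.
            return match.expand(target)
    return None


print(rewrite("/documents/graphic"))
print(rewrite("/law/part2/title"))
print(rewrite("/no/such/object"))
```

The public URLs stay declarative (they name the result), while the query-style method calls are hidden behind the facade, which is why this retrofits easily onto an existing site.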

Rick Jelliffe


The story so far

  • In the 1990s and earlier, Microsoft was notoriously prominent in its desire to keep its binary formats proprietary: it provided RTF for text-based interoperability but RTF did not allow full round-tripping of data.
  • In 2000, Microsoft started providing XML data dumps for spreadsheet data and each subsequent version of MS Office has used XML more, with Office 2003 providing quite full support, to the extent where now the default save formats, on the Windows platform at least, are all XML-in-ZIP files, the latest generation with the name Office Open XML (which people often write as OOXML).
  • In 2004 a European Union agency recommended to MS that it should continue down the XML route and open up its formats by submitting them to some international standards body. (At the same time, a recommendation was issued for OASIS to submit ODF to ISO.)
  • In December 2005 Microsoft founded a technical committee at the ECMA standards body, TC45, which worked for a year and released ECMA 376 in December 2006; during this time the specification, which included much text based on documentation for the older binary formats, grew from about 2,000 pages to over 6,000 pages. A public draft was issued in mid 2006. (At the same time, around December 2005, OASIS submitted ODF 1.0 for ISO consideration using a variant fast-track procedure: it was accepted with scant National Body review in mid 2006.)
  • At this time (December 2006) ECMA 376 was submitted to ISO/IEC JTC1, the international standards organization, for “Fast-Track” adoption as a standard: the fast-track process is used for standards which have been drafted at other organizations, and enter the process as Final Draft International Standards. At this stage, National Bodies had about eight months to review the standard and come to an initial position. Many National Bodies invested significant effort in attempting various reviews, however this period was also characterized by the raising of many spurious issues. (In early 2007, an update to ODF called ODF 1.1 was released at OASIS but not resubmitted to ISO, with improved accessibility features.)
  • In September 2007, the initial ballot of National Bodies resulted in a significant number of “No with comment” votes, which triggered a Ballot Resolution Meeting (BRM). The BRM had been widely expected, due to the expected large number of comments. In the ISO process, a “No with comment” has also been called “Conditional Yes”, but many journalists and commentators at this stage preferred oversimplification to reality. Over 3,000 individual comments were received, however the majority of these were repeated form-letter comments that were part of an organized campaign, rather than coming from fresh National Body reviews.
  • In mid January 2008, the Editor for DIS 29500 released a promised Disposition of Comments document, containing suggested fixes from ECMA for addressing the National Bodies’ issues: these ranged from simple acceptance, to alternative approaches, to rejection of the issue, with their justification for these. ECMA had bundled the issues into about 1000 different responses. I wrote earlier, The Editor’s Disposition of Comments …is usually the starting point for comment resolution, and, given that most comments are uncontroversial, is often the end-point too.
  • In early 2008 Microsoft released the binary format documentation under its OSP covenant, and promised the mappings between the binaries and OOXML: this seems in direct response to requests for this from NBs, though the mappings are not in-scope for DIS29500’s text.
  • In late February 2008, a week-long Ballot Resolution Meeting was held in Geneva, Switzerland. It was attended by 120 individual delegates from about 34 different National Standards Bodies. The outcome of the meeting was a series of editor’s instructions to allow a new draft of the standard to be created: usually these instructions are completely specific though there may be some general ones, for example to use one term rather than another globally. (At time of writing, March 2008, OASIS has been working on ODF 1.2 which is slated to improve several important ODF weak spots, in particular relating to formulas and metadata. It is mooted for re-submission to ISO during 2008.)
  • The results of the BRM are available online and National Bodies now have one month (end of March 2008) to decide if the changed draft meets their requirements. For the new draft to pass, it will require 5 National Bodies (of the “P” class) to switch from Abstain or No votes (remembering that No with Comments may mean “Conditional Yes”).
  • Of the 1027 Editor’s responses, the BRM addressed 189 responses by specific resolutions and discussions of the BRM, and the rest using a paper ballot where each National Body in attendance voted: this accepted 825 of the Editor’s recommendations and rejected 13. (The issue of a paper ballot had been abstain on issues of lesser interest to them.
  • If the new draft is adopted as a standard, it does not remain static but can be “maintained” by the relevant ISO/IEC JTC1 committee, SC34, Document Processing and Description Languages. Procedures exist for National Bodies to submit Defect Reports, which again attract the Editor’s attention and National Body voting acceptance, so the kind of process seen at the BRM becomes an ongoing effort, if there is enough interest by National Bodies.

The upshot is that, if DIS29500 mark II and ODF 1.2 both get accepted as standards, by the end of 2008 we should have two standards which together can thoroughly cover the field of representing current and legacy office documents, each representing one of the two dominant commercial traditions, with both under active and significantly open maintenance to fill in the remaining gaps and to repair pending broken parts, with clear cross-mapping to allow interconversion, with an increasing level of modularity so that they can share their component parts, and at least with a feasible agenda of co-evolution and other kinds of convergence.

And if we play our cards well, both traditions will have significant competitive motivation to accommodate the technical requirements of their competitors. Viola, harmonization? (Voilà, harmonisation?)

The big picture changes

The “big picture” changes very often concern issues of conformance and modularity.

  • The draft is being split into 4 Standards,
    1. Fundamentals
    A large standard for the core of OOXML
    2. OPC
    Open Packaging Conventions: the details on using ZIP and referencing
    3. Markup Compatibility and Extensibility
    4. Transitional Migration Features
    Contains VML and features not recommended for new documents. Problematic terms like “legacy” and “deprecated” have now been avoided.
  • Six document conformance classes have been created: Core and Transitional classes for WordProcessing documents, Spreadsheet documents and Presentation documents.
  • Six application conformance classes have been created: Base and Full classes for word processors, spreadsheet and presentation applications.
  • The scope sections have been clarified.
  • Normative references are to be complete.
  • Use of standard formats for syntax: BNF
  • Use of standard measures for typesetting lengths
  • Use of standard format for dates
  • Use of IANA/ISO names for language and country codes
  • Development of a prefix mechanism for spreadsheet formulas, presaging a full namespace modularity system like Open Formula’s.
  • Encouragement for applications to save equations as MathML even if they also save in the OMML maths.
  • Many casual references to MS-tradition technology removed and replaced by references encouraging W3C technologies for interchange

The small picture changes

The small-picture changes frequently are aimed to make the draft more “ISO-ish” and therefore make maintenance and future development at ISO/IEC JTC1 easier.

  • All known typos will be fixed
  • All known errors in examples will be fixed
  • All schema fragments will be marked informative to prevent clashing
  • ISO standard conformance language will be used: shalls and shoulds

The middle picture changes

The changes from the BRM usually relate to either correcting bugs or better documentation. Additions to functionality tended to be limited to providing better accessibility and better internationalization, rather than completing or expanding the general feature set. The Editor’s Disposition of Comments clearly tried to reduce the amount of gratuitous breakage of documents or applications, and the explicit resolutions of the BRM continued this policy IMHO.

  • Accessibility features to support better tabbing (in the fashion of HTML’s tabinfo) and table labelling. An informative reference to guide developers in accessibility features is being added.
  • Multiple changes to support right-to-left writing, half-width character terminology and less US-centric artwork and measures
  • The schemas have been re-written to be more compatible with the frailties of various XSD implementations. The XSD schemas will be included in the text as annexes with line numbers. There will be both Strict and Transitional schemas, following the model of HTML. The RELAX NG schemas have been regenerated accordingly and much improved: many people may find them preferable to the XSD schemas.
  • Hundreds of clearer explanations of multiple elements and functions.
  • Almost all bitfields will be replaced by specific attributes. (The bitfield which accords with ISO Open Font remains.)
  • Fixes to the CONVERT() function and a mathematically proper ceiling function, ISO.CEILING() for spreadsheets
  • A mechanism to prevent applications from executing files with incorrect types, to prevent viruses
  • Strings may not have non-XML graphical characters in them
  • Different hashing algorithms

Plus hundreds more.
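
As a rough illustration of what a “mathematically proper” ceiling means for the spreadsheet fix listed above, here is a minimal Python sketch. It assumes the intended behaviour is to round towards positive infinity to a multiple of a step, for negative numbers as well as positive ones; the function name `ceiling_to_multiple` and its handling of the step are hypothetical and are not taken from the draft.

```python
import math


def ceiling_to_multiple(x, significance=1.0):
    """Round x up (towards +infinity) to a multiple of |significance|.

    Assumption for illustration only: the sign of the step is ignored and
    a zero step yields zero. The draft's exact rules are not quoted here.
    """
    step = abs(significance)
    if step == 0:
        return 0.0
    # math.ceil always rounds towards +infinity, so -2.5 goes to -2, not -3.
    return math.ceil(x / step) * step


if __name__ == "__main__":
    for value in (2.5, -2.5, 4.3):
        print(value, "->", ceiling_to_multiple(value))
```

The point of the sketch is the negative case: a ceiling that rounds away from zero gives a different answer from one that rounds towards positive infinity.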

Other Issues

Many other related issues were also discussed in the hallways at Geneva. For example, the German DIN standards body is preparing a cross-mapping list to match features in OOXML and ODF: there really is very little information on this currently, despite the confident assertions that ODF can/cannot handle everything that OOXML does and vice versa. The Italian standards body is seeking to work on conformance suites for testing: obviously the schemas and BNF grammars allow validation testing of instances for document conformance, so I presume the test suites will be more concerned with application conformance. ISO/IEC JTC1 SC34 has been making various preparations to establish an effective and responsive maintenance regime: ODF could also benefit from this effort.

With over 1,000 changes, I certainly will have missed out some items of interest. Will these be enough to sway the necessary five National Bodies? The changes certainly provide objective extra information favourable to DIS29500 supporters, and the sheer number of changes suggests that ECMA is not going for a first-past-the-post strategy but trying to demonstrate a broader commitment to improvements even from antagonistic National Bodies. But though the anti-OOXML faction doesn’t have any new information to provide a counterbalance (discarding the frantic and self-justifying posturings over the BRM) I expect that they will try to explain their longstanding objections more carefully and acutely, since they do raise many good points.

Impressions

I thought the BRM went very smoothly, for a large high-stakes meeting, and I was happy to make some old and new friendships. In substance, the BRM was a typical ISO meeting of this kind: collegiality, druthers, voting, discussion, corridor meetings, rounding up supporters for measures, trying to track down definitive answers on technical issues, and so on. In accidents, it was very unusual due to size, content and ramifications not to mention the new blood pool.

I think we did pretty well in the Australian delegation, in getting many of our issues addressed completely and most of our issues addressed in part, but (like any standard!) the more you look the more holes you see. There are so many improvements that can and should be made by pro-active maintenance. At various times we had particular help from CA, MY, JP, UK, CZ, FI, US, and several others, so an unofficial thanks to those delegates from this delegate.

Rick Jelliffe

I’ve been trying to think of the best way of characterizing the basic classes of typesetting engines. Here’s roughly where I am up to.

There are basically three approaches used by typesetting systems:

Grids
The oldest approach. The page is divided up into grids, and paragraphs get injected line by line to fit between various gridlines. Further gridlines may be placed relative to positions in the paragraph (e.g. the end). In a grid system, tables and lists are really just an arrangement of paragraphs with particular grid relationships rather than being objects in their own right. Troff and Word 1.0 and XSL-FO regions are examples of this kind of approach.
Frames
The page is divided into linked (typically rectangular) areas and the text is poured into them. A table would be considered a frame of frames. Adobe FrameMaker and ISO DSSSL are examples of this approach.
Cells
Cells are objects which have certain fixed and variable properties, such as size etc, and have various relationships between other cells: TeX’s box and glue metaphor is a good example, but ideas of gravity or magnetism are also appropriate. Typesetting involves finding an optimal solution from a system or subsystem of cells. Cells may contain other cells, allowing hierarchical properties. The cell approach can allow very dynamic typesetting.

Each kind of typesetting engine has different ways to get the same kind of effect. Take the example of how a system knows when to break a paragraph at the bottom of the page, or move it to the next. A primitive grid system would have some kind of “requires” attribute on the paragraph, for example to say “This paragraph requires at least two lines free at the bottom of the page, otherwise cast off the page and start the paragraph on a new one.” A primitive frame system might have “widow and orphan” controls, which looked at how the text was spread between the frames. A cell system might have “keep with next” and “keep with previous” properties for each paragraph, and sort out which kind of breaking resulted in the least penalty.
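
To make the contrast concrete, here is a small Python sketch of two of these strategies: a grid-style “requires” test and a penalty-based search of the kind a cell system might run. Everything in it is an assumption for illustration: the function names, the penalty values, and the use of orphan and widow costs as the penalties are mine and are not taken from any particular engine.

```python
def grid_requires(lines_free, requires=2):
    """Grid-style rule: start the paragraph on this page only if at least
    `requires` lines are free; otherwise cast off the page."""
    return lines_free >= requires


def best_split(para_lines, lines_free,
               orphan_penalty=100, widow_penalty=100, blank_penalty=10):
    """Penalty-style rule: try every possible split of a paragraph across
    the page boundary and keep the cheapest one.

    Returns how many lines of the paragraph go on the current page.
    """
    best_cost, best_here = None, 0
    for here in range(0, min(para_lines, lines_free) + 1):
        rest = para_lines - here  # lines pushed to the next page
        cost = 0
        if rest > 0:
            # Unused lines left at the bottom of the current page.
            cost += blank_penalty * (lines_free - here)
            if here == 1:
                cost += orphan_penalty  # single first line stranded at page bottom
            if rest == 1 and here > 0:
                cost += widow_penalty  # single last line stranded on next page
        if best_cost is None or cost < best_cost:
            best_cost, best_here = cost, here
    return best_here


if __name__ == "__main__":
    print(grid_requires(lines_free=1))               # too little room: cast off
    print(best_split(para_lines=5, lines_free=4))    # break so that 2 lines carry over
```

With these illustrative numbers, a five-line paragraph facing four free lines is broken after its third line, because filling the page would strand a single line on the next page.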

Modern typesetting systems are rarely pure versions of each, of course: the needs for extra features, convenience and interoperability lead developers to graft or cherry pick approaches. For example, a copy-fitting system might be basically grid-based, but use a penalty system and feedback to rejiggle the grid settings for better fit. The extent (how many paragraphs, columns, pages, etc) and granularity (which objects, frames or grids can be rejiggled) play a large role in determining how much human intervention will be required to achieve high quality typesetting. Think of a Yellow Pages directory: to get good results for these, you need to look beyond what is on the immediate spread to previous (and therefore following) spreads as well, for the optimal placement of floating display material that keeps in sync with the current running heads.

And even within the same approach of system, there are many possible variations, which page designers will be very aware of. For example, when a paragraph says “Keep 1cm space after me” and the next paragraph says “Keep 2 cm space before me” some systems will work by adopting the greater (2cm) while others will adopt the sum (3cm). We might imagine that primitive grid systems could tend to the latter, while frame systems could tend to the former (and cell systems might do some negotiation or compromise: 1.5cm?) But at this level, it is every man for himself.
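
The three outcomes in that example can be written down as a tiny Python sketch. The policy names (`max`, `sum`, `average`) and the function `resolve_gap` are hypothetical labels for this post, not settings from any real system.

```python
def resolve_gap(space_after_cm, space_before_cm, policy):
    """Vertical gap between two adjacent paragraphs under a given policy."""
    if policy == "max":
        # Adopt the greater of the two requests.
        return max(space_after_cm, space_before_cm)
    if policy == "sum":
        # Honour both requests in full.
        return space_after_cm + space_before_cm
    if policy == "average":
        # One possible compromise between the two.
        return (space_after_cm + space_before_cm) / 2
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("max", "sum", "average"):
        print(policy, resolve_gap(1, 2, policy), "cm")
```

Running it on the 1cm and 2cm requests gives 2cm, 3cm and 1.5cm respectively, which is why the same stylesheet can look different on two engines.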

One feature of typesetting systems that dominates their design and capabilities is whether they are streaming or in-memory. A streaming implementation has very little lookahead (and probably very little memory of recent pages), and complicated typesetting will be performed by mixes of diversions (where text perhaps in some semi-processed state is stored for later use) or by multiple passes or by checkpoints (a range is read in-memory to allow various typesetting options to be tried and the optimal one put out, the range being discarded: to overcome the limitations of stream-based processing). It is quite rare to find systems that have typesetting rules allowing or using very significant lookahead: even cell-based systems try to localize properties to being object-properties (e.g. paragraph properties) or immediate-location properties (e.g. frame or page properties).

[UPDATE: I am removing any comments not on the topic of typesetting engines. Though of course I really appreciate the readers who defend me, please don’t post comments about individuals. There may be malicious hypocrites at loose in the world, but they can be exposed on other blog items! ]

Simon St. Laurent

I spend a fair amount of time providing technical support for friends, family, and the occasional local political campaign. Looking back over the past few years, it seems clear that I’m spending a lot less time helping people with Windows (thank you, Macintosh) but a lot more time helping out with various wireless network problems. Most of those problems seem to be caused by dying routers.

Hari K. Gottipati

Well, some people argue that the iPhone is not a smartphone because it lacks enterprise email service. I don’t want to debate whether the iPhone is a smartphone or not, but leaving enterprise email aside, it was much better than a smartphone because of its ultimate browsing experience and the world’s best touch interface. And to answer those who say the iPhone is not a smartphone, Apple today added enterprise connectivity to the iPhone software stack and sent a strong message to RIM that it intends to compete with Blackberry for smartphone market share. Shares of RIM dipped 3% following Apple’s announcement, to $98.71 (March 6th, 2008).

Finally, iPhone lovers can check their enterprise email on their favorite toy. Thanks to Steve Jobs and his team for bringing enterprise connectivity to the iPhone. The iPhone is reaching out to the enterprise community with push email, push calendar, push contacts, global address list, Cisco IPsec VPN, auth and certs, enterprise class WiFi (WPA2 / 802.1x), security policies, enterprise configuration tools, and remote wipe. Indeed these are very good features for the enterprise, and they are the ones the iPhone was missing (from the enterprise point of view) in the past. When the iPhone was announced, looking at the price tag, I wrote that the price was in the enterprise range without the enterprise features, but I was wrong. Without enterprise features it attracted the crowd and surpassed expectations. Now with enterprise connectivity, it is going to go beyond expectations.

The only question that I have is: is it going to meet or beat the expectations of the Blackberry audience, or is it going to tumble like the Motorola Q? When Motorola launched the Q, they had big expectations of taking over from Blackberry. But we know what happened. Blackberry uses its own push technology, which is robust and secure, but Motorola relied on Microsoft Exchange email push (ActiveSync) technology and even bought Good Technology to achieve this. I am not sure whether the failure was on the Exchange side or the Q side, but it failed miserably to capture the Blackberry market. In fact Blackberry is adding a significant number of new customers every quarter.

In today’s press conference, Phil Schiller, Apple SVP said:
“Our customers have asked us to build in MS Exchange right into the iPhone — we have licensed ActiveSync for the iPhone.
Microsoft has come up with a much more advanced architecture, where the iPhone can work directly with the Exchange server in a more reliable and affordable way. We’re building Exchange support so you get push email, push calendaring, push contacts, global address lists, and the ability to remote wipe it.”

So the iPhone uses the same Exchange push technology that Motorola used for the Q. But knowing Apple’s state-of-the-art software/hardware strategies, I believe that they will do a lot better than Motorola. At the same time, there is no indication that Lotus Notes email will be added to the iPhone software stack in the near future. Though Microsoft Exchange leads Lotus Notes in enterprise email, Lotus Notes has its presence. Blackberry supports both email systems and targets the whole enterprise. With just Exchange email, the iPhone may not beat Blackberry completely, but it will definitely shake Blackberry.

What do you think? Will it beat Blackberry?

Thanks to Dr. Kiran Mudiam for pointing out the Good Technology acquisition.

Rick Jelliffe

I had a letter from a reader, R, today, which I thought I would share with you. R is a person seemingly paralyzed by fears, assailed by the uncertainties in life, and prone to the most violent of doubts. Anything I can do to help R out of this unfortunate state of mind, I think it is my duty to do.

R writes:

Dear Rick, I have been reading your blog now for about a year, and am a great fan. We do not always see eye to eye, but your continued moderation is admirable and an inspiration to all right-thinking people. However, I am really worried about this Geneva meeting: I cannot sleep due to worry that there has been no attempt to provide any extra security against viruses in Open XML. I thought I could trust my local delegation to pursue this, but I cannot see any progress

Well, R, I can report that the BRM has in fact accepted a proposal to reduce the scope for viruses in OOXML files: I am not sure whether I can talk about individual issues yet or not, but it is issue AU-9 or Response 12.

Rick Jelliffe

Imagine, if you will, that you are part of the delegation from a small country, let’s call it Freedonia, to some standards organization, let’s call it INSANE (Inter-National Standards Associations ‘N Experts), and you are thinking about how to get the best result on a meeting on a new XML format of an agricultural nature, perhaps concerning bovine methane: let’s call it DOOF (Daisy Open Orifice Format).

Now you know you don’t have enough slots to present all your issues. So, Keanu, what do you do?

If I were placed in such a hypothetical situation, here is what I might suggest, secure in the knowledge that mine was not the only opinion or approach.

First, I would say that there should be some sense of the priority and urgency of issues from the discussions by technical committees at the hypothetical national standards body, let’s call it Standards Freedonia. That will give the basic shape of any ranking.

Second, I would say that you would need to consider gaming aspects. You would figure out which of your issues are also probably the high priority issues of other National Bodies, and adjust their rank down. You don’t want to waste your slot on an issue that will clearly be brought up five minutes later. As part of this, you would also think about which of your issues can be piggybacked on other issues that you expect another National Body to bring up.

But the third factor, I think, is the most interesting. It is the people factor.

Let’s divide users of the DOOF standard into three classes:

1) Implementers of DOOF office suites. These number in their hundreds. They typically are very smart, and often working with large code bases. They will not so much be interested in basic functional details, but the arcana that is difficult to get from reverse engineering. Their key requirement is completeness.

2) Developers of DOOF file back-end systems. These number in their tens or hundreds of thousands. They are not necessarily the brightest bulb in the drawer, but are clearly the most handsome and charming. They need good documentation on the basics, and are less interested in the arcana unless their incoming documents happened to have significant use of arcana. Their key requirement is clarity.

3) Users of DOOF suite applications. This is a class that can be easily forgotten, but it may number in the millions or even hundreds of hypothetical millions. These are not necessarily technically savvy people at all, and they tend to be underrepresented at the standards level, especially because of the predominance of INSANE experts and, in a full moon, various influxes of suits. But some parts of the DOOF standard may in fact be directly targeted at them: you might think of a hypothetical formula language for doing calculations, to pick an example at random. Without being patronizing, “just folks” use these to make important decisions. Companies use them for planning and product calculation. Builders use them for calculating parts and material. Err, I mean, on the farm. Perhaps a wind farm. Their key requirements are correctness and usability.

This third category has requirements which deserve to be taken with the utmost seriousness: the largest group, the most unrepresented group, and in a sense the most vulnerable group. The first group is well-organized and capable, but a niche group and you may not have many of such people in Freedonia, not even hypothetically. The second is not so well-organized, but larger numerically and their needs are worthy of consideration too. But numerically they pale by comparison with the just folks.

Based on these thoughts, you might look among your DOOF issues for one that has a high priority from Standards Freedonia, is not one you expect to be raised early by other INSANE national bodies, and which helps ordinary people most. I am sure other people would have other useful strategies too and ways of divvying up the population. For example, your fellow INSANE member from Snowdunderia might want to put the needs for capturing methane for heating as one of their priorities, measured in Bovine Thermal Units or BTUCOW, but be waiting for someone else to bring up an opportunity to pounce.

Rick Jelliffe

Many people don’t find abstention easy. Some don’t have the habit, some don’t see the point, some people are irrepressible, some people are used to having their way, and others think it is an attack on their rights and duties. Having hung around a few different standards bodies, it seems to me that one of the distinctives about ISO/IEC JTC1 is the role that voting abstain plays. Other standards bodies have it, but there seems sometimes a stigma or idea that abstaining from a particular vote represents a failure in expertise: a loss of face and an insult to pride. The worry that you need to be on top of everything, perhaps coupled with the paranoia that people are trying to scam you. But, as Clint Eastwood says, a man’s got to know his limitations.

Let’s review the Fast-Track procedure. The JTC1 Directives (which have sway here) allow National Bodies three kinds of reply on a standard: see s 9.8 (bold added by me; DIS means Draft International Standard, DAM means Draft Amendment, NB means National Body):

Approval of the technical content of the DIS as presented (editorial or other comments may be appended);

Disapproval of the DIS (or DAM) for technical reasons to be stated, with proposals for change that would make the document acceptable (acceptance of the proposals shall be referred back to the NB concerned for confirmation that the vote can be changed to approve);

Abstention

Note that the only criterion countenanced under these JTC1 rules for approving or disapproving a fast-track standard is the technical content: it is or isn’t up to scratch. Editorial issues alone are not enough. However, any significant comments, even editorial ones, will trigger a Ballot Resolution Meeting, where these things can get looked at: they don’t disappear into a black hole. Under the JTC1 rules, non-technical and non-editorial issues just don’t seem to be legitimate grounds for acceptance or rejection: the only slot for a National Body that wishes to act in good faith to the JTC1 Directives but has significant non-technical and non-editorial concerns is to abstain.

Now, a National Body that votes disapprove has a duty (JTC Directives s13.7) to participate at a Ballot Resolution Meeting (BRM). A Ballot Resolution Meeting has to be open to representation from all affected interests, convened in a timely manner, keeping in mind the spirit of the fast-track process. (JTC Directives s13.1) “The spirit of the fast-track process” does not seem to be a defined term.

Issues from National Bodies that arise after the deadline for the initial ballot (or after the BRM, or where the BRM did not go far enough in some desired direction or went the wrong way in the NB’s opinion, etc.) get handled by the NB raising defect notices with the Steering Committee looking after the standard (in this case, SC34) after the fast-track gets standardized. As well, NBs (and ECMA or other liaison bodies) can raise an immediate draft amendment, which can itself go through the fast-track procedure! (If an NB thinks the editor’s instructions have not been followed, they can raise the matter with the ITTF (the body responsible to make sure that the BRM’s instructions have been followed) who, as I am sure is expected of them, will respond with a service-oriented attitude of “Whoops! Thanks!”)

A Ballot Resolution Meeting for a fast-tracked draft is unusual because what comes out of the meeting is a set of editor’s instructions. I have read some incompetent reporting on other websites that somehow a BRM’s result is an approval or disapproval of the standard in question. Never let the truth get in the way of a good story, I suppose.

My experience of ISO/IEC JTC1 is only through Steering Committees, Working Groups and a certain recent Ballot Resolution Meeting, on and off since the mid-90s. However I have also participated in multiple groups at W3C and observed OASIS and IETF. The thing that is interesting in JTC1 meetings, from what I have seen, is that there is usually a really strong idea that you do not block the minority interests of another national body, just because you have no interest. (I have seen a committee basically fall apart because one NB dominated and tried to block the legitimate and specific interests of another NB: what happens when NBs attempt this kind of selfish trick can be that the parties who were stymied lose faith and simply go to another standards body.)

An effective delegation at a meeting who have niche requirements will take care to remind other NBs’ delegations that unless they have technical expertise in that area, they should abstain. Or if the niche requirement may be significant for broader concerns, an effective delegation will try to explain in or outside their meeting what the technical issue is. However, it is part of the gentleman’s agreement that you vote on the issues: a delegation with particular issues shouldn’t have to make a specific request for other NBs to abstain on issues that they do not have an actual technical opinion on; any good faith delegation will attempt to do that anyway (though sometimes they may get lost amid all the other tasks.)

I have found that in the ISO meetings I have experienced, the contributions of the individual are really important. In SC34 you think of the contribution of James Clark for example. This was a theme of Martin Bryan’s memorable phrase standardization by corporation (e.g. see farewell report as chairman of SC34 WG1.) The system is geared to having deep experts who are highly sceptical, but who very willingly defer to others in areas outside their expertise. In fact, the ISO Directives (part 1 s 1.11.1, a splendid number) define a Working Group as comprising a restricted number of experts who act in a personal capacity and not as the representative of the…organization…by which they have been appointed; however, the JTC1 Directives nuance this (s2.6.1.2): WG members shall, where possible, make contributions in tune with their respective NB positions (which does not in any way stifle individual contributions, as long as the status is clear.) There is an interesting example in JTC1 Directives Annex J3.1, concerning the development of standards for APIs, which explicitly mentions that multiple kinds of experts are required. I am not saying that generalists or observers are not important in technical meetings; however, the meetings are technical and need technical people: governments wishing to participate more in standards need to be asking themselves what programs they have in place to develop and encourage the necessary range of deep expertise in order to be effective at this level. (And one of the best ways is to start to send experts to meetings, and getting them to review standards of different sorts, and to expose them to standards practices of different organizations to help them to be critical and functional.)

Technical experts are frequently ratbags, a (nowadays quite fond and) useful Australianism.

Macquarie dictionary (1991):
n. colloq. 1. a rascal; rogue. 2. a person of eccentric or nonconforming ideas or behaviour. 3. a person whose preoccupation with a particular theory or belief is seen as obsessive or discreditable: that Marxist ratbag. -ratbaggery, n. -ratbaggy, adj.

but the ease of abstinence at ISO tames this tendency. I have read more than once that new people coming to the SC34 meetings are surprised at the level of helpfulness and collegiality that usually can be seen (and I think Ken Holman had a lot to do with achieving this tone.)

JTC1 groups try to act by consensus. But consensus is not unanimity, but is defined in part as a general agreement, characterized by the absence of sustained opposition to substantial issues…. To understand the role that abstention plays in ISO, I think you have to see how it dovetails into this definition of consensus: consensus is not an issue of achieving an absolute positive majority of all parties! In fact, JTC1’s view of consensus demands the ready availability of the option to abstain, otherwise NBs and participants will be forced to make decisions they don’t wish to or are not competent to or are not briefed to.

Voting “abstain” on issues at ISO is not a failure. Indeed, sometimes the briefs for delegations have instructions that require them to abstain. But experts who have to abstain can still be critically valuable to the process. Because of this, and because of the mutual spirit of accommodation and collegiality that usually prevails, abstention is easy and a more frequently used option than people used to other standards systems may feel comfortable with initially. But it is not for no reason.

Rick Jelliffe

I’m writing this sitting in the sun looking at the pool, somewhere tropical, en route from the exhausting ISO/IEC JTC1 SC34 DIS29500 BRM meeting (hoping for my lost bags to appear and with every flight delayed by up to 12 hours). And not an acronym in sight here!

Apologies to readers; I took down the rest of the article, because it was proper for me to report back to Standards Australia first. This is quite reasonable, I think. But several sites copied the following from caches:

I’ll blog some more, but the BRM clearly has succeeded in its formal aim, which is to produce a better text. Every response by the editor was formally voted on. The big picture issues were given extra time for detailed discussion, and the NBs had opportunity to raise their highest priority issue, in turn. It would have been great to have had more time to deal with more of the middling issues: where we would have preferred some variant or augmentation of the Editor’s response to our issue or where we didn’t like his answer.

The context of this was that the meeting was productive and calm:

The BRM went pretty much the way I expected: grinding through the issues, politeness, assertiveness, corridor sessions, strange bedfellows, a lot of newbies who made up for it with articulateness, candour and brains. In substance, it was a typical ISO meeting: issues, votes, different personalities and cultures interacting, some people happy, some people pissed off about individual results, limited time, stimulation, mind-numbing alterations to resolutions, convivial dinners with fascinating techoes, late-night study sessions and early morning drafting gallops. But in accidents it was very odd indeed: not just the size of the meeting and the size of the draft and the sewerage farm of disinformation surrounding it…what is atypical is the large number of non-technical delegates and that a few delegates seemed surprised that their delegations would have to figure out a position on each issue by the end of the week (which could be “abstain - we have no position”.) It is not as if they hadn’t been told!

And after that quote was material emphasizing that there is a maintenance process to fix outstanding issues and new ones that get discovered:

There are a lot of those, and they will have to go to maintenance, which really is the big issue: will MS continue these baby steps to openness or will it go soggy once out of the spotlight, which is not unprecedented among other standards stakeholders? Even after the final vote (assuming an acceptance vote, as seems likely) governments will need to keep the pressure on Ecma to continue working with SC34 and to get these outstanding issues addressed ASAP; it is not the case that unaddressed issues need to disappear down a black hole, but SC34’s only power comes from having strong government and user backing to give this maintenance the steroids it needs: this not only means monstering MS to continue through maintenance, but also (for governments) to provide adequate resources: staffing, delegates, and long-term support for participation at standards meetings.

I have more details at What is in the new draft of OOXML?. Brian Jones has a fairly detailed Narrative of the ISO/IEC DIS 29500 BRM Meeting that is very factual. I recommend readers take a lot of the other material on the web about the BRM with a large grain of salt.
