August 2007 Archives

Simon St. Laurent

I mentioned this a month ago, but that was, well, a month ago, and the deadline is tomorrow. The XML 2007 Call for Papers ends tomorrow.

Proposals need to include speaker information, a short abstract, and a suggestion for its track. We have four tracks this year:

  • Documents and Publishing

  • XML on the Web (I’m chairing that track.)

  • Enterprise XML

  • XML Training

Lauren Wood (the previous chair of this conference) has posted advice for proposal submissions that I heartily recommend.

Rick Jelliffe

Here is my free advice to headline writers: please use “Maybe” for the countries that vote “No with comments” on DIS 29500 (Office Open XML).

Those are effectively the four major votes that can be given on an ISO standard by a national body. As always, the best place for disinformation on votes is headlines.

A vote by a national body of “No with comments” is a “Maybe”, and not an absolute “No”. Looking at it more, I wouldn’t now go as far as Jon Bosak’s comment that “No with comments” is the same as “Conditional approval”, however. What really matters is the particular comments: if they are doable or reasonable and in line with the goals of the standard and the proposer’s conception of the standard (and if no-one’s hair is on fire), then No means Yes. But if the comments are undoable or unreasonable or out-of-scope for the standard’s goals or depart from what is acceptable to the proposer, then No means No.

As in “New Zealand says Maybe!”, “India says Maybe!”, “Japan says Maybe”, “China says Maybe”, “Brazil says Maybe”, and so on. It is not so difficult, is it? (Now even then there is scope for variation: “New Zealand says Maybe but probably not” or “Japan says Maybe, but probably” for example. But that would require actual research.)

And for journalists struggling to write the story well, here is another big tip: the votes are on particular drafts and the technical and editorial issues in them. So when there is a “No with comments” vote, that is a vote on the particular draft — a book in progress — not on the underlying technology. A careful writer will distinguish between DIS 29500 (the book being voted on) and Office Open XML (the technology.) Sometimes this distinction does not make a difference, but sometimes it really does, especially in the case of “No with comments” where you may be in favour of having a standard for the technology but want some improvements in the draft. In that situation, treating “No with comments” as the same as “No” misrepresents the process.

Rick Jelliffe

You’ve probably seen it: IBM’s Rob Weir’s 2006 diagram comparing the number of pages of various standards with the time they spent in committee. It regularly makes its appearance unchallenged: indeed Bob Sutor of IBM (a business rival of Microsoft) gave the diagram a prominent place in his blog this week, in a post which, presumably at this last stage, contains the essence of IBM’s argument against DIS 29500 and Office Open XML.

At the Standards Australia meeting, the diagram was brought out again, and I protested that it was misleading, but seeing Bob’s blog makes me want to explain my criticism more. Here is the scary diagram:

spec-speed2.jpg

Digression

The issue of page count and book size is prone to publicity stunts. If you look at this web page, for example, you can see two different printouts of the Open XML spec. The first manages to fit in boxes under a man’s arms (and we don’t know how full the boxes are) while the second manages to be taller than a man! What can account for this doubling of size? Perhaps it is the magic of single-sided printing and thick paper :-) (In the 1990s I was discussing a book with a publisher who said “it has to be 1.5″ thick, but if you don’t have enough material we will use thicker stock”!) Say we have 6500 pages, and we print it at the maximum common paper weight of 105 weight Bond ledger: that gives us almost 3 metres of printout (10′)! But if we print it at the minimum common weight of 16 weight, that gives us a tad over 50 cm (20″). On average paper weights, this should give about 64 cm (25″).
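The stack heights above can be sanity-checked by working backwards to the per-sheet thickness they imply. This is only a rough sketch: it assumes single-sided printing (so 6500 pages means 6500 sheets), it treats the rounded heights quoted above as exact, and the function name is made up for illustration.

```python
# Back out the sheet thickness implied by the stack heights quoted in the text.
# Assumption: single-sided printing, so 6500 pages means 6500 sheets of paper.
PAGES = 6500


def implied_sheet_thickness_mm(stack_height_cm: float, sheets: int = PAGES) -> float:
    """Return the thickness of one sheet, in millimetres, for a stack of the given height."""
    return stack_height_cm * 10.0 / sheets


# Heights are the rounded figures from the paragraph above, in centimetres.
stacks_cm = {
    "heaviest common stock": 300.0,
    "lightest common stock": 50.0,
    "average stock": 64.0,
}

for label, height_cm in stacks_cm.items():
    thickness = implied_sheet_thickness_mm(height_cm)
    print(f"{label}: {thickness:.3f} mm per sheet")
```

On these rounded figures the heaviest stock comes out about six times as thick as the lightest, which is the whole trick behind a printout that is "taller than a man".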

But back to the main story. I’ll deal with the issues I have with the diagram in reverse order of their seriousness.

Apples and Oranges

If you are using page count to compare documents, you really should make sure the documents are typeset the same. I moved the Open XML spec down from its extravagant 11pt body font and large heading spacing to follow the ISO standard 10pt.

Voilà, I estimate that the page count can be reduced by about 1,000 this way. (Added: I estimate this because I tried it. I saved 800 pages on Part 4 alone just by moving to 10pt and more typical ISO clause spacing. Technically, this is because there is so much display content and so many two-line paragraphs that get pushed to the next page, cascading with many paragraphs taking one line fewer.)

spec-speed3.jpg

Difficulty of Review

The diagram uses page size as a unit of preparation and review. However, not all pages are equal. A page that contains normative text requires much more review than a page of informative text. A page that contains auto-generated text requires almost no review at all: you sample enough instances to have confidence in the autogeneration and then skip the rest.

Now this is especially relevant for DIS 29500, because it contains enormous amounts of non-normative/tutorial text and of autogenerated boilerplate. ODF editor Patrick Durusau this week tried an experiment where he removed this fluff, and he reduced the WordprocessingML specification from about 1880 pages to about 600 pages (and he thought it could go a few hundred pages more!) Most standards avoid tutorial and non-normative material because it increases the tedium of the review process and confuses readers. A good tutorial is usually a bad standard, and vice versa. DIS 29500 is a really extreme example of this.

So let’s say that only a quarter of the text is normative and non-autogenerated (based on Patrick’s results, and considering the impact of the normative Part 3 and so on), and that the non-normative text and autogenerated text takes about 1/3 of the review effort. That means that, effectively for review purposes, the document requires only half the effort for the number of pages.

So divide the effective page size in half. (The legend “Number of pages” becomes “Review effort expressed in terms of equivalent number of normative pages”)

spec-speed4.jpg
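The "half the effort" figure follows directly from the two estimates above: a quarter of the pages at full effort, and the remaining three quarters at a third of the effort. A minimal sketch of that weighting, with a hypothetical function name and the post's rough estimates as inputs:

```python
from fractions import Fraction


def effective_review_fraction(normative_share: Fraction, other_effort: Fraction) -> Fraction:
    """Fraction of the raw page count that counts as full-effort review pages.

    normative_share: share of pages that are normative and not autogenerated.
    other_effort: relative review effort for all remaining pages (1 = full effort).
    """
    full_effort_pages = normative_share * 1
    reduced_effort_pages = (1 - normative_share) * other_effort
    return full_effort_pages + reduced_effort_pages


# The estimates from the text: a quarter normative, the rest at one third effort.
weight = effective_review_fraction(Fraction(1, 4), Fraction(1, 3))
print(weight)  # 1/2
```

Changing either estimate moves the result: with the rest at half effort instead of a third, for instance, the weighting rises to 5/8.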

Time spent in Review

Now let’s look at the other axis. Weir’s numbers here seem to be based on the time spent in committee before coming up for a vote. That might have been interesting a year ago, but it is positively misleading a year later. Why is it still being bandied about like this?

In the case of ISO fast track standards, there is the whole review process by ISO that is omitted: the informal discussions with SC34 before submission, the 1 month administrative review period, the 1 month contradictions response period, the 5 month technical review period just coming to an end, and the ongoing review where each national body looks at each other’s comments over the next five and a half months before the Ballot Resolution Meeting in Geneva, which I expect to happen. That is a full year.

So add an extra 370 days there.

spec-speed5.jpg

Nature of Review

The work that a committee does in compiling a standard for a pre-existing technology is very different from the work that a committee does in creating or augmenting a standard. When the proprietary Torx screws became an ISO standard, one can imagine that the committee had little to do. By contrast, the committee that produced the ISO PDF/X standard had a bit more to do, but still nowhere near what they would have had to do if they were developing a standard from scratch.

The work is review and discussions of policy, relieved of what-ifs and who-needs-this? As a completely conservative estimate, let’s say that development of new material takes half the time, and review takes half the time.

Since we are measuring this in pages, let’s be conservative and say that this relieves the committee process of 25% of its workload, and express that in effective pages.

spec-speed6.jpg

Since we are looking at the workload of a committee, what about where a committee doesn’t have to author much, but is presented with a selection of workable drafts from the pre-existing documentation of a product? That is obviously a lot less work than writing from scratch, especially for the editor.

So let’s say that this makes a committee 25% more effective, and express it in effective pages as before.

spec-speed7.jpg
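The page-count adjustments made so far compound, and a short sketch makes the running total explicit. Several things here are assumptions rather than figures from the diagrams: the 6500-page starting point is borrowed from the digression above, "25% more effective" is modelled as a further 25% cut in effective pages, and the names are made up for illustration.

```python
# Assumed starting point: the 6500-page figure used earlier in this post.
RAW_PAGES = 6500

# Each step is (label, function from page count to adjusted page count).
# The factors are the rough estimates from the text, not measured values.
ADJUSTMENTS = [
    ("retypeset at 10pt with ISO spacing", lambda pages: pages - 1000),
    ("discount non-normative and autogenerated text", lambda pages: pages * 0.5),
    ("no new technology to develop", lambda pages: pages * 0.75),
    ("pre-existing drafts to edit from (assumed a further 25% cut)", lambda pages: pages * 0.75),
]


def apply_adjustments(pages, adjustments):
    """Apply each adjustment in order and return the running totals."""
    history = [("raw page count", pages)]
    for label, adjust in adjustments:
        pages = adjust(pages)
        history.append((label, pages))
    return history


for label, pages in apply_adjustments(RAW_PAGES, ADJUSTMENTS):
    print(f"{pages:8.1f}  {label}")
```

Under these assumptions the effective size ends up at a little under a quarter of the raw count. Reading "25% more effective" as a division by 1.25 instead would give a slightly larger final figure.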

The other standards

Now, of course, to compare apples with apples, we would have to do the same procedure to the other standards, and they would move in the same kind of direction to a greater or lesser extent. But none of their shifts would be anywhere near as much as Open XML’s because it has the quintuple whammy of typesetting, fluff, the BRM, the lack of need of development, and pre-existing editorial material.

Furthermore, these other standards are not standing still. ODF has moved to ODF 1.1, with 1.2 in the works.

I have two other additional reasons why I think the diagram (or, at least, the way it is used) is misleading.

Ex nihilo?

The first reason is related to the last segments above. It is really not fair to compare a markup language for an old technology with a markup language for a new technology merely on the basis of the committee time. Microsoft moved into documenting text formats for its standards when it purchased RTF from DEC around 1990. A lot of the documentation in Open XML is adapted directly from the RTF and DOC documentation. Its basic strengths and weaknesses are well-known and long documented.

There have been perhaps fifty different versions of the .DOC format, on six different operating systems, over the last twenty or more years. To ignore this history and just use committee time as the metric seems to me to miss out something important. A new standard does not come with all this prior work (and baggage).

I am not sure how to diagram this. Perhaps a line indicating the time the technology and documentation was in development before the start of the committee process? Let’s date that from the advent of RTF rather than from the first .DOC format.

spec-speed8.jpg

VML is a particular issue here: it was introduced into IE 5.5 and presented to the W3C committee. To ignore that early development and attempted standardization work seems to miss something important, which again is why I think we have to be careful not to be misled by the diagram.

Separate Technologies

Finally, my other problem with the diagram is that people use it to say “this is so big it cannot be reviewed”. However, Open XML is made from five or more completely distinct sublanguages: OPC, WordprocessingML, SpreadsheetML, PresentationML, DrawingML, VML, and then the extensions mechanism of Part 5. One person is not expected to review a whole standard: it is done in co-operation with a committee. India is a good example here: they had separate task forces working on each of the three major application schemas.

So while the size of the draft in total is large, it can be decomposed into smaller sections and reviewed. There have been over 2200 people involved in national standards bodies reviews, I am told: that is a lot. If I was being as free with numbers as some people are, I would say that this represents about three pages per person! But of course, that would be just as flawed logic as accepting Rob’s diagram at face value.
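For what it is worth, the deliberately naive "pages per person" figure above is just a division. The sketch below assumes the 6500-page total used earlier in this post; the reviewer count is the one quoted in the paragraph.

```python
# Naive division of labour: total pages spread evenly over all reviewers.
# TOTAL_PAGES is an assumption carried over from the earlier digression.
TOTAL_PAGES = 6500
REVIEWERS = 2200

pages_per_person = TOTAL_PAGES / REVIEWERS
print(round(pages_per_person, 1))  # 3.0
```

The number is meaningless as a measure of review effort, which is exactly the point: it takes no account of who reviewed what, or how carefully.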

So let’s divide up the specification into its parts, and see where they fit on the chart. I’ll take into account the extra time for review, but just use the current raw page count for OPC (Part 2), and the individual languages of Parts 4 and 5. We get a diagram showing the size of each distinct (and therefore separately reviewable) sublanguage in pages of the current draft.

(If you select “View Image” or the equivalent in your browser, you will be able to see this a bit more clearly: the OReilly formatting system may get in the way here.)

spec-speed8.jpg

And finally, let’s have a look at what happens when we look at these separate languages, but get rid of the fluff as I suggested in the submission I sent to my national body for their consideration on the Australian vote. For WordprocessingML we will use the number that Patrick Durusau found when he stripped out the fluff: about 800. For the other largest four, we will just say that half is fluff, being conservative. (Actually, in my submission I want to remove some lists of examples such as border art to another part, but border art is hardly taxing on the reader.)

So this is a diagram of the estimated count of normative pages in the component language standards of Open XML, against the time spent in Ecma and ISO development and review (and assuming a Ballot Resolution Meeting).

spec-speed11.jpg

Note that this diagram does not include the “effective size” considerations above, so the position of the new items can be compared directly with the other pieces of data on the page, as apples to apples. To the extent that the other issues raised above apply to each language, their star would move left (and up); however, for a good comparison the other standards mentioned would also have to have their position adjusted in accordance to the same factors: however, as I mentioned, because the other technologies consist largely of normative material, the adjustment would not be as great; the other technologies might also need to have ISO process time added too, I don’t know whether Rob’s numbers include that or not (the effect would be add six to twelve months in an upward direction to some of the blue points.)

Bottom Line

So that is seven reasons why I think the diagram is misleading. Or, at least, why the diagram itself does not give data that is particularly useful for anything other than mindless sloganeering.

What I don’t understand is why people are not on to these kinds of tricks. Big standard, ooh scary. Have people never heard of Adam Smith and the division of labour? Have people never changed font size and had a different sized document as a result? Do people think that all text is equally taxing for review? Do people think that adapting a standard from pre-existing text is not easier than writing (and indeed developing) the standard from scratch? I suspect that many people see that on the original graph the OOXML point lies so far to the right, and because pages are easily countable, they don’t have any alarm bells ring.

So let me ring your bell, if I may: what the original diagram tells us is that the standard has a lot of text, and that one stage of its life in a committee took about a year in 2006. Both those things are such a partial piece of the picture (where is 2007?) that while they are of some sensational value, the diagram can be misleading.

M. David Peterson

Fedora Commons - About - News

Fedora Commons today announced the award of a four year, $4.9M grant from the Gordon and Betty Moore Foundation to develop the organizational and technical frameworks necessary to effect revolutionary change in how scientists, scholars, museums, libraries, and educators collaborate to produce, share, and preserve their digital intellectual creations. Fedora Commons is a new non-profit organization that will continue the mission of the Fedora Project, the successful open-source software collaboration between Cornell University and the University of Virginia. The Fedora Project evolved from the Flexible Extensible Digital Object Repository Architecture (Fedora) developed by researchers at Cornell Computing and Information Science.

Nice! Congratulations, Fedora Commons!

The press release continues,

Rick Jelliffe

Vote “No”? But aren’t I supposed to be Microsoft’s biggest fanboy? Well, what I mean is a conditional approval, not a rejection. There are some things that can be fixed and should be fixed, and an ISO Ballot Resolution Meeting is the best forum to make sure it happens.

I’ve been quite active in the debate on adopting Office Open XML as a standard,* and this blog has frittered away many bits on explaining why (because it would be useful in my industry, which is industrial publishing and markup, and we have been demanding it for a long time) and why many of the specific reasons given against OOXML are flimsy (how many self-assured people have raised “autoSpaceLikeWord95” who have no idea what a fullwidth character is, for example?) But not all was plain sailing: along the way I have pointed out several flaws that I thought needed to be corrected. A mild diversion has been to look at the various claims of bribery or faulty procedure bandied about.

On my travels, when I have been asked about how National Bodies should vote, I have always said that there is nothing wrong with a “No with Comments” vote, if the comments were doable. Indeed, this is exactly the vote that I have recommended to my national body, Standards Australia.

The actual list of comments I sent is here. Please note that these are just one person’s comments, not the official position. I have no idea how Standards Australia will vote, but I strongly urge them to vote “No with Comments”, specifically with my comments. I have tried in the comments to address many of the issues that people raised, and to limit the comments to issues that are relevant to Australia (which Standards Australia is quite keen on.)

Now when reading these comments, please realize that the intent is to state the technical and editorial position as clearly as possible. (When I say something is unacceptable, that is only in the context of the suggested fix to make it acceptable, not any claim that something cannot be fixed by the normal BRM process.) The whole point of these comments is that IMHO the big flaws in the standards are fixable (and fixable by the current processes) and that the edge-cases are not critical and can be left to maintenance.

In my comments I have attempted to expose the principles behind the comment, and to limit them to comments relevant to Australian industry. I definitely concentrate on getting the high-level issues right: the name of the standard, the organization of it, the conformance section, the over-abundance of non-normative text, the need to allow standard notations, and a future-proofing issue. My view is that getting these high-level issues right takes the sting out of the tail of many individual problems and edge-cases, and addresses many of the technical issues that people have raised piecemeal.

Rick Jelliffe

The deadline for the National Bodies to vote on DIS 29500 Office Open XML (fast-tracked from Ecma 376) is coming up on September 2: this is the vote that comes at the end of the 5 month review, which is 7 months since the draft was submitted, and probably not the end of the line. Here is some of the news, as a companion to my next blog item, which is about what I recommended to Standards Australia for our vote.

There are not many informed ideas of where the votes will go. The US body looks like it will vote Yes (I predicted an abstention there) as seemingly will Germany (I would have predicted a “No”). India looks to be voting “No with Comments”, which I am pretty happy about since I commended that vote to them when I was there. (Note that that news story gets it wrong about the impact of a “No” vote: a simple “No” and “No with Comments” are utterly different beasts, as Jon Bosak has prudently pointed out. But even a “No with Comments” may not, in effect, be a conditional yes if the comments are impossible to fulfill. Obviously no National Body will put in merely vexatious comments, since they don’t want to waste time; rather, they will state their requirements in a clear way that allows rapid resolution of issues.)

It seems likely that there will be a Ballot Resolution Meeting: if there are not enough “Yes” votes and enough “No with Comments” (i.e. ‘conditional yes’) votes, then a meeting is scheduled to see what technical changes need to be made to satisfy enough national bodies’ requirements. A meeting has been scheduled for Feb 25-29 in Geneva, with UK’s Alex Brown appointed as convenor.

There is a last minute frenzy on the contra side: IBM’s spokesman is claiming lots of alarming shenanigans without actually giving us the benefit of any details: names of countries, parties, dates, anything tangible. Stephane Rodriguez is complaining that he has to look at the schema and documentation when editing Open XML files; his new blog is notable for the number of times it says that there is a problem with OOXML but actually refers to some implementation issue in Office 2007.

Behind the scenes, Patrick Durusau (the ISO ODF editor) has been working on a really interesting and useful project. While he is not keen that people use ISO Open XML, he is keen that the quality of ISO standards should be maintained and he sees OOXML as a way to get MS’ technical requirements on the table to help future ODF improvement (whether by cherry-picking, mix-n-match or knowing what to avoid.) I suggested to him some time ago that one approach to fixing DIS 29500 would be radical surgery: removing all the explanatory and non-normative material. At the moment it is far too tutorial. That is fine for the Ecma version, but gets in the way of an ISO-quality standard. I had also suggested that the schema fragments were otiose too, and that the 11pt body text should be 10pt. So Patrick has gone ahead and stripped out the fluff from the WordprocessingML chapter, and with tighter formatting he was able to go from 1874 pages to 607 pages without altering the technical content!

Knowing Patrick, I expect he would not release his Open-XML-Lite, because having extra drafts floating about in public just provides more fodder for the lunatic fringe; however, he sent me a copy and I think it is great. It is a real proof of concept that there is indeed a workable spec lurking inside DIS 29500. I would really urge National Bodies to include comments that request or require that the non-normative material in DIS 29500 be removed. That will make maintenance, editing and use much clearer. The Ecma TC45 got it wrong here; or, at least, they went with the “friendly” view of a standard, whereas it is best that a large ISO standard avoids being too tutorial.

I am hoping that once the vote is over, the PR considerations of the big boys will take a back seat: with a couple of “Yes” votes from some large countries, MS has its marketing material to say that Open XML is credible; with a couple of “No” votes IBM has its marketing material to say that ODF is the way forward; and with enough good comments and a sharply run Ballot Resolution Meeting, the baying mobs will lose interest, and the nerds can get down to improving the shortcomings exposed in both OOXML and ODF. If things go as the process is geared to make them go, IS 29500 OOXML and IS 26300 ODF should ultimately provide a really useful pair of technologies.

Kurt Cagle

It was perhaps inevitable - having turned the geospatial Earth into an animated, zoomable extravaganza, Google has turned its gaze skyward. With Google Sky, the tens of thousands of Hubble based images (as well as those of more prosaic Earth-bound telescopes) have been knitted into a seamless fabric that lets you explore the universe in myriads of ways - from zooming in on the Pinwheel nebula to charting the luminescent clouds of the Eagle hatchery.

Rick Jelliffe

This decade has seen a tectonic shift in technology: the new information applications which are succeeding are those in which information is based on simple topics; the new major document formats are those which allow the packaging of a topic.

The organization of information into simple interlinked topics, typically something that can be described in a single phrase, is the common factor between such seemingly disparate but succeeding technologies as the web-based Wikipedia, Amazon, Google, Ebay, Flickr, MySpace, YouTube, blogs and RSS, but it has also had a strong impact in non-WWW areas: the ITIL Configuration Item, the SCORM Learning Object, the S1000D Descriptive Module, and integrated UML systems, for example.

The difference from the WWW in general is that though web technologies indeed encourage small pages, there is no necessity that pages are about one topic in particular. So the WWW is an excellent basis for implementing topic-based systems, but not itself one. Similarly, RDF may allow resources to be linked, but these are not necessarily at the level of topics. Another way of looking at topics is that a lack of topicality is what makes a poor index item poor.

There has been a decade-long process at ISO SC34 to make and develop a series of standards based on topics, for example the Topic Map standard, IS 13250. This is good technology (like XLink and RDF) to look at when considering how to implement a topic-based system.

The rise of Topics represents a great challenge to operating system and desktop suite vendors. When we look at Windows, or Mac or Linux window managers, we see that they really interact with the user at the wrong level. They say that the topic the user is interested in is applications and files. But how many people nowadays start their computer interaction with a web browser pointed to Google? There are still people whose organizing topic of interest in their computer interaction is the file or application, of course, but they have been swamped by people who are interested in the topic.

There are interfaces which organize the user’s interaction around different topics: most notably the Sugar interface of the One Laptop Per Child ($100 computers) in which the primary metaphors are the person (and their private activities and journal), the neighborhood, and the group (and group activities and bulletin board.) The interaction topics are “people, places, objects, actions”. But as with the desktop, these are not topics in general, just the topics of one domain (a fairly compelling domain, that of children and communities).

Indeed, we can see the large successful web applications as being topic-based interfaces each for particular domains and scopes. A lot of the talk about Web 2.0 or Social Interface systems focuses on the human or social or write-able web aspects; my question is this: should we think of Topics as the “how” and the social aspect as the “why”, or should we think of Topics as the “why” and the social aspects as the “how”?

Moreover, should Linux, Windows, Mac and all seriously respond to the rise of Topical Interfaces by ditching the desktop metaphor? I tend to think yes: in terms of my support/runner/plug-in model, topic interfaces belong at the “suite” level, and a desktop interface is just another suite.

One reason I found (and still do find) the Windows desktop so cumbersome to use compared to a UNIX shell or the old Mac desktop was that it never seemed to provide me with the topics I was interested in. When the topic was “Installed programs” it let me look at a menu from the start button, but not all programs are there; I have to switch to a completely different system, the file explorer, and look in Program Files and figure out from the files and directories what applications are there. We have to fight with the army we have, not the army we want, but we won’t win unless we have the army we need.

Topical Interfaces have eclipsed the Desktop Interface and are severely challenging the central position of the file, because increasingly the value of some information is in its linked-in-ness to some larger system. From this point of view, the recent trend (JAR, WAR, EAR, ODF, Open XML, SCORM, etc) to use ZIP and therefore package together all the files needed for one application session can be seen as an attempt to turn documents themselves into a container for a bounded topic. OOXML’s Open Packaging Conventions (OPC) represent the high-point (though not the state of the art, for which see RDF and ISO Topic Maps) in this trend, adding a linking and typing mechanism (relationships) within the ZIP package. However, the moves to make a platform out of the office suite and out of the Web browser (and the various Java Rich Client Platforms such as Eclipse, NetBeans, and so on) fall short of providing the integrated, topic-based interfaces.
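To make the "linking and typing mechanism" concrete, here is a minimal sketch that reads the package-level relationships out of an OPC-style ZIP file using only the Python standard library. The demo package is a made-up, stripped-down illustration (a real package also needs a `[Content_Types].xml` part, omitted here), and the helper names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by OPC relationship parts.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Hypothetical minimal root relationships part, for illustration only.
ROOT_RELS = f"""<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId1"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
    Target="word/document.xml"/>
</Relationships>"""


def build_demo_package() -> bytes:
    """Build a tiny in-memory ZIP that mimics the layout of an OPC package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


def read_relationships(package_bytes: bytes, rels_part: str = "_rels/.rels"):
    """Return (Id, Type, Target) tuples from one relationships part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read(rels_part))
    return [
        (rel.get("Id"), rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{{{RELS_NS}}}Relationship")
    ]


for rel_id, rel_type, target in read_relationships(build_demo_package()):
    print(rel_id, target, rel_type.rsplit("/", 1)[-1])
```

Each relationship gives a typed link from the package (or from a part) to a target, which is what lets a consumer find the main document without guessing file names.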

The two worlds need to converge: we need Topical Interfaces which let us navigate between and within topics and perform transactions, but which also allow each Topic to be bundled and shipped around as a document.

Kurt Cagle

I don’t normally like using this column for promoting my other projects, but I’m weighing this against the fact that I actually have some interesting news to pass on. Thus, my apologies for the self-aggrandizements - I think you may find it worth it.

First, I have recently significantly upgraded the XForms.org portal. While I still support the forum, the role of the portal has expanded to become a general resource for anyone working within the XSLT, XForms, or XQuery space, and I’m expanding this into the Semantic Web realm as well. From XForms.org, you can find relevant blogs from the web, news articles, job listings, and linked resources, and I shall soon be adding calendar listings of conferences and other events. I’ve also simplified the interface, such that commonly requested features such as the most recent aggregate blogs are available with one click, and specialized listings are no more than two clicks away.

Uche Ogbuji

I’ve heard it 1000 times since ‘97. “XML, it’s just plumbing”. Maybe, but it hasn’t really felt that way in past years. Too much was still unsettled, and there were too many people who were not interested in letting things settle (including me). On the rebound from 2 very enjoyable XML conferences, XML Prague and Extreme Markup Languages, it does finally feel to me that the era of absent-minded XML pipe-laying is upon us. I think that’s a good thing, especially now that XML is well enough established that few people choose to build edifices without it. This does mean that we have established what I’ve always characterized as a basic writing system for data integration, and now the really fun stuff can begin as the philosophers and politicians work on libraries to suit their schools (yeah, I know I’m starting to pile up the metaphors, and why not?)

Mike Hendrickson

Boston and Cambridge

Summer is flying by and as we usher in fall, we wanted to give all New Englanders a heads-up that we are having a second Ignite Boston. The second Ignite Boston will take place on Thursday, September 6, from 6 to 10pm at Hurricane O’Reillys. Yes that is right, Hurricane O’Reillys. No, it’s not Tim’s office after FOO Camp. We’ve picked a venue that is more acoustically-oriented and should allow everyone to hear what is going on.

And we are planning to mix up the format a little bit. There will be some short “launches,” followed by lightning talks, and a couple of other ideas that we will inform you of in the coming weeks. Let’s show our tech colleagues around the country that Boston/Cambridge have a vibrant tech community that gets involved in talking about cool new technologies and ideas. Not to mention that it is a social event to get to know other developers in the area.

If you plan to attend, email IgniteBoston at oreilly dot com for the chance to win $300 worth of O’Reilly books of your choosing. You must be present to win.

If you are interested in connecting with some of the folks who attended the first Ignite Boston, we have a social network set up for this purpose. You can reach our Crowdvine network here.

Another reason we wanted to announce this event this early is so that those of you who would like to speak for five minutes on something cool, new, or exciting can get into the queue sooner rather than later. Please submit your idea/s here:

Presentation Guidelines

  • Be no longer than 5 minutes.
  • Be on an innovative topic (no sales pitches, please!).
  • Be viewable on a PC [a MacBook Pro with PowerPoint, Keynote (with remote control), and PDF] with standard AV equipment.

To submit a proposal.

For anyone that’s never been to Ignite, you may find it useful to see a talk or two. Here’s a link to a good example [but poor audio quality] from the first Ignite Boston talks.



M. David Peterson


Update: *EXCELLENT* follow-up post from Wladimir in which he closes with the following,

I guess I need to thank Danny for so many great articles in such a short time. On the other hand, maybe instead I should remind him that denial-of-service attacks are illegal, even in the USA.

I’ll let you come to your own conclusions as to what that last sentence is referring to, though I will point out the fact that no matter who you are or what you believe justifies your actions, while blocking ads is not a crime, DOS attacks and other forms of Internet harassment and vandalism most certainly are.

If you are guilty of any such crimes, please don’t turn yourself in to the authorities (our prisons are filled with too many people who shouldn’t be there in the first place), but please stop, think, and then find ways to get over whatever it is you are hung up on in a peaceful manner.

Thanks! Our Internet will be a better place if you are willing to consider the above request.

Update: Wladimir Palant, the *WONDERFUL* developer behind the *WONDERFUL* tool AdBlock Plus recently left the following comment that I thought the rest of you would find interesting,

Thank you for this article, it is real fun to read it. Btw, the numbers you were asking about - I don’t have exact numbers either but it seems that no more than 2% of Firefox users have Adblock Plus installed. Which makes this campaign as ridiculous as ever.

Of course one can only assume that after all of this attention, the number of AdBlock Plus users has increased, but not so much as to drastically change the above percentage to the point where any of the legitimate sites on the net that use ad revenue as their primary support are going to be noticeably affected. In fact if you think about it, it’s quite possible that, while ever-so-slightly, the bandwidth savings from those who have no interest in the ads being displayed will *more* than offset any potential loss in ad revenue.

In fact, if you *really* think about it, if all of the people who had neither the desire nor the willingness to click on the ads presented on your site were to install AdBlock Plus there’s an ever-so-slighter (is slighter a word? Probably not, but today let’s make it an honorary word just for fun ;-) possibility that the net result will be that of increasing your cash flow instead of decreasing it.

Okay, maybe that’s a bit of a stretch, but if nothing else it’s definitely something to consider. Of course if it turns out this theory were to actually hold any water you would have none other than Wladimir Palant to thank for your decreased cost structure and therefore increase in monthly revenue. And according to the following forum entry from about this time last year (which was in response to a question regarding Wladimir’s preferred charity), here’s how you can thank him for your newfound cash cow, ;-)

I don’t favor any organization, feel free to choose the one you like

Edit: On the other hand… I do favor one organization: http://www.mozilla.org/foundation/donate.html

Seems reasonable to me. :D

Thanks, Wladimir!

Update: NOTE: For those of you who first read this update at the top of my last post, here it is again but this time at the top of the correct post! ;-)


I *LOVE* this comment from an article linked to from Yours Truly (a handle, not a self reference ;-),

Upon clicking the link to http://whyfirefoxisblocked.com/ I was met with a blank page. Interesting, I thought to myself. Let’s check this out in more detail… I bet they want me to wipe the dust off my Internet Explorer and access their site that way. Admit defeat? Go back to using Internet Explorer? Hardly. I simply opened a new tab in Firefox and went to Google. In the Google search field I entered the search term: site:whyfirefoxisblocked.com and then loaded the conveniently offered “cached” version of the page in question. It loaded smoothly in my AdBlockPlus-enabled copy of Firefox.

Absolutely *CLASSIC*! :D Thanks for the laugh, Yours Truly! Of course the real test would be to do the same for the site that you would have been redirected from, but two things,

1) Why waste any more of your valuable time.
2) The spirit of your hack is most certainly in place, which leads to one very important observation,

As mentioned already: Don’t Fight the Internet! There’s fame (the good kind) and fortune and good times for all who find ways to embrace the way the web *truly* works, not the way you think it should work. And if anything this is the point of the entire post.

Update: Based on the evidence that has been mounting up in my inbox and in comments I’ve done a quick research project and have come to the same obvious conclusion that everyone else has: That the content that follows that now has a strike through is more than likely a completely bogus attempt at justification. My apologies to each of you that were simply following Digg, Slashdot, Reddit, and other links for proliferating the garbage that is being fed from this guy.

Oh, and Danny, (AKA Jack Lewis),

You know what, never mind. Why even waste any more of my time?

No wait, I’m sorry, I do have something else to say: You are not a victim of terrorism. You’re a victim of yourself.

Best of luck to you.

Oh, and one other thing: If you are bothered by the ads on this or any other site and would rather read this or any other *FREE* content without being bothered by ads you find annoying: I’ve heard that Ad Block Plus is pretty good. Of course you’ll need Firefox if you don’t already have it, but if you’re interested in my opinion, Firefox is as good as a browser gets.

In fact, maybe even better.

Enjoy your ad free Firefox browsing days, everyone! The content here on O’ReillyNet is free to read however you might choose in whatever browser you might choose. If you choose to reprint it (beyond that which can be considered fair use) please do so under the terms of the Creative Commons by-nc-sa. Otherwise, do what you want. That’s your right.

And as always, thanks for reading! :D

Update: via a comment from Danny Carlton,

It’s my site, and if i want to control how people view it, I’m not letting a bunch of terrorists force me into changing that–and when you attempt to change someone’s behavior by threat of harm, you are a terrorist. The vile, obscene emails and phone calls, they attempts to shut down my server with DOS attacks and bandwidth eating programs, are all acts of terrorism, and it’s really interesting how many people who seem to get offended at being called “thieves” have no problems acting like terrorists.

Folks, I don’t care who you are or what it is you think you’re accomplishing, as far as I’m concerned anyone who involves themselves in this type of activity is absolutely as Danny specifies,

A criminal.

It’s absolutely shameful to do that kind of crap. You might not be a criminal for blocking ads placed in the content you read, but you’re certainly a criminal if you take part in any of the crimes mentioned above.

Whoever is involved with the above: STOP!

It’s not funny. It’s not cool. And it certainly isn’t justified. It’s stupid. It’s illegal. And it needs to stop.

[Original Post]

Don’t fight the Internet! I promise, you’ll lose.

Why FireFox is Blocked

The Mozilla Foundation and its Commercial arm, the Mozilla Corporation, has allowed and endorsed Ad Block Plus, a plug-in that blocks advertisement on web sites and also prevents site owners from blocking people using it. Software that blocks all advertisement is an infringement of the rights of web site owners and developers. Numerous web sites exist in order to provide quality content in exchange for displaying ads. Accessing the content while blocking the ads, therefore would be no less than stealing. Millions of hard working people are being robbed of their time and effort by this type of software. Many site owners therefore install scripts that prevent people using ad blocking software from accessing their site. That is their right as the site owner to insist that the use of their resources accompanies the presence of the ads.

Here’s the thing: If people are going out of their way to block ads via Ad Block Plus do you honestly believe they represent a significant percentage of the +/-2.5% of the people who actually ever click on web ads in the first place? Wait, hold up, I think you answer your own question in the next paragraph down, but first let me take a quick moment to point something out,

M. David Peterson


Don’t you just love Jeffrey Zeldman? I know I do for the simple fact that he has no problem saying it like it is and in many cases he’s right on the money,

Jeffrey Zeldman Presents : What crisis?

The glacial pace of the W3C has given browser makers time to understand and more correctly implement existing standards. It has also given designers and developers time to understand, fall in love with, and add new abilities to existing standards.

So the glacial pace can’t be the crisis. Maybe the problem is lack of leadership. One worries about the declining relevance of The Web Standards Project. (Note the capital “T” in “The”–people who believe in standards should also believe in and follow style guides.) One has worried about the declining relevance of The Web Standards Project since 2002.

Nicely stated! Of course, just a paragraph or two above Jeffrey asks the question,

M. David Peterson


As per a comment I made to a post from Eric Larson to the internal Vibe* mailing list regarding the usage of Mercurial instead of Subversion for our RCS,

Of course maybe someone will come along and create a BitTorrent-based Darcs or Mercurial plug-in. Now *THAT* would be cool! :D

My point was in relation to the fact that with a decentralized RCS (which in most cases creates an exact copy of the repository with each checkout), as the size of the repository increases so does the cost of hosting that repository with each new checkout. But if a BitTorrent plugin were to suddenly surface?

Like I said, “Now *THAT* would be cool! :D”

Anybody care to become the *WORLD’S BIGGEST ROCKSTAR CODER*? This would certainly be one way of becoming just that. :D

M. David Peterson


Dare Obasanjo aka Carnage4Life - Google Working on Social Network Aggregator

What I find more interesting is being able to bridge these communities instead of worrying about the 1% of users who hop from community to community like crack addled humming birds skipping from flower to flower.

Rick Jelliffe


Schematron is an ISO standard (ISO/IEC IS 19757-3) schema language for expressing assertions about the presence or absence of patterns in a document, usually using XPath. ISO standards are supposed to contain verifiable statements about some technology. And there is a schema for ISO standards (refer to How to write your own ISO Standard). So why not combine them? Executable specifications may provide the best form of verifiability!

I’ve made a little stylesheet that converts Schematron schemas into ISO Standard annexes. Each pattern becomes a separate clause; assertions are treated as constraints, and report statements are treated as errors that must be reported. The stylesheet handles abstract rules and abstract patterns (though these are starting to go into XPath territory and so are borderline ugly), and the @see attribute. Phases are treated as conformance profiles. Diagnostics are stripped out; they might perhaps have some use in application standards rather than document standards.

As well as its assertions, Schematron allows quite a bit of rich text and titles. The stylesheet handles bullet and numbered lists, and most kinds of inline styling. The output is validated against the RELAX NG Compact schema from the draft TR that I was using. (I had to clean up numbered lists a little: the draft stylesheet provided its own autonumbering when using <ol>.)
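As a rough illustration of the mapping just described (this is not the XSLT itself, only a minimal Python sketch of the same idea), the following turns each pattern into a numbered annex clause, each assert into a constraint and each report into an error to be reported. The example schema, the `A.n` clause numbering and the output wording are all made up for illustration; abstract rules, phases and @see are ignored here.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace
NS = {"sch": SCH_NS}

# A tiny, hypothetical schema used only for illustration.
EXAMPLE_SCHEMA = """\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern id="titles">
    <title>Titles</title>
    <rule context="chapter">
      <assert test="title">A chapter shall have a title.</assert>
      <report test="count(title) &gt; 1">A chapter has more than one title.</report>
    </rule>
  </pattern>
</schema>
"""


def text_of(element):
    """Return the element's text content with whitespace normalized."""
    return " ".join("".join(element.itertext()).split())


def schema_to_clauses(schema_xml):
    """Render each pattern as a clause heading followed by its statements."""
    root = ET.fromstring(schema_xml)
    lines = []
    for number, pattern in enumerate(root.findall("sch:pattern", NS), start=1):
        title = pattern.find("sch:title", NS)
        heading = text_of(title) if title is not None else pattern.get("id", "")
        lines.append(f"A.{number} {heading}")  # one clause per pattern
        for rule in pattern.findall("sch:rule", NS):
            context = rule.get("context")
            for child in rule:
                if child.tag == f"{{{SCH_NS}}}assert":
                    lines.append(f"  Constraint (context {context}): {text_of(child)}")
                elif child.tag == f"{{{SCH_NS}}}report":
                    lines.append(f"  Error to be reported (context {context}): {text_of(child)}")
    return lines


if __name__ == "__main__":
    print("\n".join(schema_to_clauses(EXAMPLE_SCHEMA)))
```

The real stylesheet emits markup for the ISO standards schema rather than plain text, so treat the strings above as placeholders for that output.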

So is this a serious idea? Actually, yes. Schematron was developed with the human aspect of schemas as a very high priority, unlike any other schema language that I am aware of. By design, it is intended to be useful for generating documentation suitable for domain experts rather than XPath developers. (I am working on a commercial product that provides this as part of a collaborative schema development environment; the betas look good.)

So I hope that as more organizations take up Schematron to specify part or all of their standards, they will adopt this kind of approach, so that they end up with standards with no gaps between what is required and what is validatable. Note that you can still make Schematron assertions even when there is no XPath to check it: so Schematron does not back you into the corner that other schema languages do, where you have no high level constructs to document constraints beyond the capability of the validation expression language: refer to Expressing untested and untestable constraints in Schematron.

The stylesheet and an example

Schematron Validation Reporting Language (SVRL) is a small language specified as part of ISO Schematron for representing the output of a validation. It can then be transformed for many other uses.
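To give a feel for what "transformed" can mean in practice, here is a minimal Python sketch that pulls the failed assertions out of an SVRL report. The sample report is hand-written for illustration (it is not output from any of the downloads below), and the function name is my own.

```python
import xml.etree.ElementTree as ET

SVRL_NS = {"svrl": "http://purl.oclc.org/dsdl/svrl"}  # SVRL namespace

# A small hand-written SVRL report, for illustration only.
EXAMPLE_SVRL = """\
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:active-pattern id="titles"/>
  <svrl:fired-rule context="chapter"/>
  <svrl:failed-assert test="title" location="/book/chapter[2]">
    <svrl:text>A chapter shall have a title.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>
"""


def failed_assertions(svrl_xml):
    """Return (location, test, message) for every svrl:failed-assert."""
    root = ET.fromstring(svrl_xml)
    results = []
    for failed in root.findall("svrl:failed-assert", SVRL_NS):
        message = failed.findtext("svrl:text", default="", namespaces=SVRL_NS)
        results.append((failed.get("location"), failed.get("test"), message.strip()))
    return results


if __name__ == "__main__":
    for location, test, message in failed_assertions(EXAMPLE_SVRL):
        print(f"{location}: {message} (test: {test})")
```

The same loop over `svrl:successful-report` elements would collect the report statements.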

First: here is the Schematron schema for SVRL, unchanged from the ISO standard except I added three IDs that were missing (the XSLT expects patterns to have IDs): Download file

Next, here is the XSLT script: Download file

Here is the output from the script, using the SC34 schema: Download file

And here is that output then converted to HTML, using the draft previewing script from ISO. (The SourceForge project has an XSL-FO generator): Download file

As a bonus, here is a blank XSLT template with all the Schematron elements exposed, for anyone who wants to make their own complex pretty-printer/transformer for Schematron schemas:
Download file

The annex generated is, I think, pretty acceptable as a draft standard, especially since the schema was written as a real schema and not as text in a standard per se. Obviously some things can be improved, such as being consistent with ’should’ and ‘is’, but I think this is a viable, useful and efficient approach to improving the quality of standards for XML vocabularies and document types.

Uche Ogbuji


One reason I’m looking forward to Leopard is that unfortunately I’m a victim of the bug where my MacBook Pro 17″ occasionally reboots when I close the lid. Most of the time things are OK, but once a month or so I close the lid and I hear the “bong” chime of the computer restarting. When I open it back up (either right away or after a while) it starts back up as if I’d powered it on. Needless to say I lose any unsaved work, which has caused me to be even more annoyed at software that does not auto-save, such as TextMate. It seems to happen in clusters: a few times in a few days, then fine again for another few weeks or so. Anyway, here’s hoping Apple has a handle on this one either in Leopard, or in the hardware update to the MBP line that came out a couple of months ago. I’m provisionally happy enough with mine that I’m irrationally eyeing the 1920×1200 and 4GB RAM options in the latest (though the high res is apparently not available with the glossy screen. What’s up with that?).

Anyway, other references to the closed lid reboot bug:

* MacBook restarts when closing the lid
* MacBook Restarts when put to sleep

Update: s/Tiger/Leopard/g. Can’t keep the big cats straight.

Rick Jelliffe


You too can write your own ISO standard! Here are the steps:

1) Download the ISO/IEC Directives Part 2 Rules for the structure and drafting of International Standards. These give the general editorial guidelines. Read it all.

2) Download the documentation for the XML schema for ISO Standards, which is in Technical Report 9573-11. A good draft is available from SC34 Website. Read it all.

3) Download the Open Source schemas and stylesheets, which are available at SourceForge and embody a lot of the rules of the ISO/IEC Directives Part 2. They have been contributed to over the years by such people as Murata Makoto, Martin Bryan, Ken Holman and James Clark and used in many standards: I used them for ISO Schematron, for example. (If you want to use Word templates or whatever, these are available from ISO, but this is an XML list so it doesn’t deal with that.) Install and configure your production environment to use them.

4) Try to follow these writing guidelines:

  • When writing, think about clarity. A good rule of thumb is “Will this sentence be easily translatable into a language that does not have the words “the”, “a” and “it” or which does not have the future or past tense available?” and “Can a recent graduate understand this?” Note in particular that you must use “shall”, “should”, “must” in very particular ways, that you need to use the definitions section as much as possible, that you need to clearly distinguish normative text from informative text (which is not the same as required and optional/discretionary, and different again from the legal “Required Parts”), you need to be clear about different levels of conformance, and that you need to be careful with normative and non-normative references (see the Directives!)
  • Download any other standards in a similar domain, and try to re-use the phrasing and declarations from them. When writing, try to use the standard vocabulary that ISO suggests in standards such as IS 2382. If you use terminology that differs from these, make sure it is in your definitions section. Note that there are some trick words that have specialized meanings: so “define” is what you do, but “declare” is how you do it (loosely).
  • A standard should only contain verifiable statements. That rules out most adjectives, unless they are defined, and is why standards tend to have Germanic agglomerations of nouns. Where possible, try to specify the requirement in an executable form, such as a schema language, then use the text to fill in the gaps. Where possible, try to specify the requirement using a formalism, such as predicate logic or BNF or UML, especially if there is an unambiguous notation or a standard for these. Where possible use diagrams; however, only use them if there is a common or standard diagramming type for which a reference is available.
  • When writing, avoid dependencies on other standards. Reference the most general version of other standards possible. Unless there is a good reason, allow the other standards to be maintained without this then making your standard outdated. Avoid specifying or summarizing other standards: completely in normative text, and as little as possible in informative text unless the other standard is not freely available.

5) Write your draft

6) Track down IP issues to the best of your ability. Also, try to have it reviewed for Internationalization, Security and Accessibility issues: the more that these are designed in from the beginning, the smoother things will be downstream. Most importantly, you need to show that there is some market (users) for this standard, that it is not some crackpot technology. One important thing that will influence reviewers is whether there is developer buy-in: is there an open-source implementation, is there some company willing to produce products that use the specification, and so on. If you want commercial buy-in, think about the carrots (an economic case why it would benefit vendors) and sticks (getting regulators or procurement departments to require it.)

7) Decide whether it should be an ISO/IEC International Standard, an ISO/IEC International Standard through fast-track, a Publicly Available Specification, an ISO/IEC Technical Report, a National Standard, a Consortium Standard, or just something on your own website. If you decide to take it through ISO you have to find or become a champion: you can go to your local national standards body and get them to propose it (or adopt it as a national standard first), you can find a friendly committee person on the relevant committee and get them to propose it from their Working Group, or you can find some boutique standards body that has liaison with ISO (such as OASIS or W3C) and put it through their processes. You need to find an editor who is participating on the committees and can travel to enough meetings. (See if your national body offers any travel subsidies; demand that the ISO working group use teleconferencing.) You should expect that your draft may be substantially changed, especially if you have not written it according to step 4). At this stage, remember that you are not alone: there will be other committee people and interested people around the world who can provide advice, only rarely crazy, and you cannot be too proprietorial: some parts of the standard will improve in your eyes, some parts will get worse in your eyes, but that is all OK because it becomes a collective effort. Especially remember that a really stupid comment from someone is undoubtedly a sign that your deathless prose is crap and needs to be fixed. Don’t take criticisms of the draft personally, and learn committee skills: how to challenge clearly, take the stated requirements of others seriously, and acquiesce gracefully. Not understanding something or losing an argument does not involve a loss of face, but you have to give face when winning on an issue too. Don’t “play to win”; instead “play to win/win” (I am embarrassed to write that!)

8) When a draft is produced, contact the various technical committees around the world to help answer questions. Actually, the ISO committee process itself provides a good forum for this; if you are fast-tracking you may need to do extra work to explain the draft.

9) Ask the committee to ask ISO to get the standard added to ISO’s free list. A standard that is not on the WWW is at a total disadvantage.

10) Assuming the vote on the Final Draft was “yes”, you now have your standard! Congratulations, that has only taken three years or so. Now you have to commit a little time over the next few years to maintain it and make corrections as they come up, and to try to get buy-in from the public. If you have a “grass-roots” standard like ISO DSDL (RELAX NG, Schematron etc.), which does not fit into the plans of the military-industrial complex, then your expectations need to be modest and you need to think about how to encourage activity in the Open Source eco-system. Remember a good standard is one that meets its particular users’ needs, not one that takes over the world.

However, your name won’t be in the standard (unlike W3C or OASIS), or in the bibliographic entries. So don’t do it, or participate on committees, if you want to see your name on Amazon.

Rick Jelliffe


Over the last month I have been collecting, for fun, examples from the web where scuttlebutt on the websites of well-known commentators has claimed procedural or other irregularities at standards bodies or by participants. I started this off on the luridly titled “Bribery Watch” page, but it is more “Innuendo Watch.”

Here is a little map (drawn dynamically) with the countries mentioned in red.



Some of the claims have a French farce aspect. For example a mistranslation of “seat” and “chair” caused a great flurry.

However, one persistent theme is the idea that the industry people who actually want a standard should not participate in the standards process. Sometimes there seems to be some idea of neutrality floated, sometimes some idea that people who come late have less legitimate opinions than people who come early, other times that the process is flawed unless people are allowed in late. But the basic idea is that if you agree with MS on anything or have had any business connection with them, they own you, perhaps even bribed you, and your every opinion is inappropriate. But never an acknowledgment that standards are community self-help efforts participated in, for the most part, by the parties who want to use the standard; and that the standards process is not a tool for cartelization.

Jim Alateras


James Snell has just published an article on developerWorks, which illustrates how to use the Atom Publishing Protocol (APP) to publish Common Alerting Protocol (CAP) alerts. CAP defines an XML data model for specifying hazard alerts and notifications. The article uses the Apache Abdera implementation of APP to show how to publish, modify and delete CAP alert documents.
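The article's own code is Java and uses Abdera. As a rough sketch of the kind of document involved, the Python below builds an Atom entry that carries a minimal CAP 1.1 alert as XML content. How the article actually packages the alert is not shown here, so the embedding as `application/xml` content, the `urn:example:` id scheme, the function name and the fixed status, msgType and scope values are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
CAP = "urn:oasis:names:tc:emergency:cap:1.1"  # CAP 1.1 namespace
ET.register_namespace("atom", ATOM)
ET.register_namespace("cap", CAP)


def build_alert_entry(identifier, sender, sent, headline):
    """Build an Atom entry wrapping a minimal CAP alert (hypothetical layout)."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = f"urn:example:{identifier}"
    ET.SubElement(entry, f"{{{ATOM}}}title").text = headline
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = sent
    author = ET.SubElement(entry, f"{{{ATOM}}}author")
    ET.SubElement(author, f"{{{ATOM}}}name").text = sender

    # Assumption: the alert travels inline as XML content of the entry.
    content = ET.SubElement(entry, f"{{{ATOM}}}content", {"type": "application/xml"})
    alert = ET.SubElement(content, f"{{{CAP}}}alert")
    fields = [
        ("identifier", identifier),
        ("sender", sender),
        ("sent", sent),
        ("status", "Exercise"),  # illustrative values only
        ("msgType", "Alert"),
        ("scope", "Public"),
    ]
    for name, value in fields:
        ET.SubElement(alert, f"{{{CAP}}}{name}").text = value
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    print(build_alert_entry("alert-1", "ops@example.org",
                            "2007-08-20T12:00:00-04:00", "Test alert"))
```

With APP, publishing would then be an HTTP POST of this entry to the collection URI, while modifying and deleting would be a PUT and a DELETE against the entry's edit URI. The network side is left out of the sketch.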

Rick Jelliffe


The licensing of IP for standards has four aspects: what the (case and statute) law says, what the standards bodies require, what the IP owner grants, and how the developer (adopter) is acting. Standards themselves never seem to have useful information about patent IP, and even their copyright boilerplate needs to be checked against licenses given by the copyright holder: W3C and ISO don’t like you copying their standards, Ecma does, for example.


For an introduction to the legal aspects, see ConsortiumInfo.org, which is by a lawyer for OASIS. The Dell case is pertinent.

For an introduction to the standards body aspects, see Standards Law, which is by a lawyer for Microsoft. It has a reference to the ISO requirements. For the boutique standards bodies: OASIS, Ecma, W3C

For examples of the kind of grants that companies make, see the Microsoft Open Specification Promise, IBM Open Source Portal, and Sun’s OpenDocument Patent Statement. Adobe has not put their equivalent online if it has been finalized, as far as I can see. (Microsoft also has a “Covenant not to sue”; however, this seems to have disappeared from its website in a rearrangement of links. They need to get it put back online.)

So what does the user have to do with it? Some licenses provide particular conditions relating to private or not-for-sale use: the GNU licenses for example. Other times licenses are revoked if you try to sue the IP owner: these defensive patents are bargaining chips in legal wrangling.

One key term to understand is RAND: Reasonable and Non-Discriminatory Licensing. It is pretty much the bottom line for standards organizations. However, RAND licenses are controversial, and in the views of many of us, something that should be avoided by modern standards bodies in the age of Open Source and Free Software which, like standards, have strong counter-monopolistic and even communitarian aspects.

Another concept to understand is the Open Standard. Not all standards from standards organizations are Open Standards under anyone’s definition, especially older standards and standards which involve semi-scientific research and development (compression patents, for example) where the IP holder would only license a vital technology under RAND or not at all. (There is some creep on what an Open Standard is, to conflate it with Open Source or free implementations.)

And it should go without saying that someone cannot grant a license to IP they do not themselves hold. So all covenants and licenses only extend as far as the material in question. This is important for extensible formats such as ODF and Open XML, because the ZIP container allows any kind of media or binary file.

See the IBM material for a definition of Necessary Claims and Required Portions.

Uche Ogbuji