June 2008 Archives

Hari K. Gottipati

My saga of problems with GMail continues. Despite the negative feedback (“GMail is working fine”, “GMail is awesome”, “Not sure why you are complaining about GMail?” etc.) to my posts, I continue to see problems with GMail. I am not alone on the planet; a lot of people are in the same boat (you can read about the problems with GMail here, here, here, here, here, here, here, here and here). The problems are frequent, particularly when new features are released. Sometimes I feel that Gmail is rushing to release features without proper testing. Maybe they think that it is OK to roll out features with bugs because it is in beta. Until now this was only my guess, but it turned out to be a fact. Sergey Solyanik, who worked on GMail, revealed some interesting facts about Google products and culture after leaving Google.

In the last year, and slick as it is, there’s just too much of it that is regularly broken. It seems like every week 10% of all the features are broken in one or the other browser. And it’s a different 10% every week - the old bugs are getting fixed, the new ones introduced. This across Blogger, Gmail, Google Docs, Maps, and more.

It seems Google's culture is focused on introducing cool features, not on quality. Does Google think that since its products are free for the user, quality does not matter? Well, they may be free to use, but Google is making money off them by placing ads.

The culture part is very important here - you can spend more time fixing bugs, you can introduce processes to improve things, but it is very, very hard to change the culture. And the culture at Google values “coolness” tremendously, and the quality of service not as much. At least in the places where I worked.

Incidentally, his move from Microsoft to Google did not turn out as well as he had expected, and he took a U-turn back to Microsoft. He also explained why Microsoft is better than Google for progressing in one's career.

The Google Manager is a very interesting phenomenon. On one hand, they usually have a LOT of people from different businesses reporting to them, and are perennially very busy.
On the other hand, in my year at Google, I could not figure out what was it they were doing. The better manager that I had collected feedback from my peers and gave it to me. There was no other (observable by me) impact on Google. The worse manager that I had did not do even that, so for me as a manager he was a complete no-op. I asked quite a few other engineers from senior to senior staff levels that had spent far more time at Google than I, and they didn’t know either. I am not making this up!
At Microsoft, the role of a manager is far more obvious. A dev lead is responsible for the success of the feature and the health of the feature team. A dev manager is responsible for the success of the product and the culture of the dev team. A PUM is responsible for the success of the business, and interoperation of the three teams that work on the product.

Isn’t it bad for a company like Google not to focus on quality?

Update: Slashdot is also discussing this from a different perspective in “Some Developers Leaving Google For Microsoft” and on:

Everything is pretty much run by [engineering] — PMs and testers are conspicuously absent from the process. Google as an organization is not geared — culturally — to delivering enterprise class reliability to its user applications.

Erik Wilde

During the recent discussion of the OAI-ORE drafts (which use RDF), the claim was made that RDF is serialized in RDF/XML and thus could be considered an XML representation of the underlying data model. My response was that the RDF model is different from XML, and that it is thus pretty hard to process RDF/XML using XML tools, in particular when considering all the constructs allowed by RDF/XML, and maybe even the possibility of updating RDF/XML data using XML tools alone.

I tried for some time to find a general-purpose RDF/XML parser written in XSLT, but so far could not find one. But Google is imperfect and I might not know the best places to look. So here is my question: Is there a general-purpose RDF/XML parser written in XSLT? It has to support all the fun stuff allowed by XML and RDF/XML, such as weird uses of namespace declarations, XML Base, rdf:ID and RDF/XML syntactic sugar. It must accept anything that is valid RDF/XML. As a result, it should produce some form of normalized RDF/XML, but I really don’t care that much about the exact format (ideally, it should be XPath-friendly). The parser must be robust enough to produce the exact same normalized result for inputs that look radically different because of XML and RDF/XML syntax variations.

I am really interested to see whether such a beast exists, and if so, how big it is. My guess is that it’s not trivial to write such a parser, but it definitely is possible. After finding out whether such a beast exists, my follow-up question will be whether there is an associated function library that can then work on the parsed RDF model, so that the data can be traversed, queried, updated, and serialized.

Eric Larson

It is interesting to see the progression of free software alongside the proliferation of the web. When I first started programming, I got involved with a web CMS I used in my contract work. I would write a new plugin or feature while migrating a design to the software, and afterwards I would try to contribute it back. One time, the designer I was working with asked me to remove some of the project branding as well as a GPL notice on the login page. Some of the community found out and a rather long dialog started regarding whether or not I had violated the licensing. In the end, I came to a compromise with the original author and we all moved on.

I still think I was right. I contacted the FSF about the issue and they confirmed my evaluation. My argument was that users of the site were not the same as users of the software. My clients had access to the source code and they were free to change it however they wished. I considered the clients the actual users of the software. Removing the GPL licensing information from the visible login HTML did not limit or restrict the users’ freedom in any way, and I continued to include the same copyright notices in the source of the page. Needless to say, while I disagreed with the author’s perspective, I had no problem coming to a compromise. The whole situation was far from heated and I personally think it was a healthy dialog for the community.

My situation made it clear there were still questions to be answered in terms of free software on the web. It seems the AGPL is one answer for this kind of problem, and there are some folks actively promoting it as a means of distributing web-based free software. While I think the terms of the AGPL (as I understand them) would not work for something like a CMS, the concept of providing software over a network and considering the output as licensed has its value.

The best thing about the AGPL is that it provides a licensing option to help provide free software on the web in a way similar to how the GPL affected desktop software. I don’t think many web developers have thought about web applications in the same light as desktop applications. Installation challenges, database dependencies and server requirements have all made running web applications something left for developers and advanced users. Fortunately, server software is becoming commonplace, and programming languages like Ruby and Python are helping smooth over the issues with distribution and deployment. It is good to know that as web applications continue to evolve, free software will evolve with them.

Erik Wilde

The W3C just published a new TAG Finding called Associating Resources with Namespaces. Here’s the abstract:

This Finding addresses the question of how ancillary information (schemas, stylesheets, documentation, etc.) can be associated with a namespace.

I don’t quite understand why the TAG findings are hidden on some badly named Web page. Some of them are pretty interesting documents, and yet they are not published on the W3C Technical Reports page, and the W3C Home Page does not link to them or publish news snippets about new findings. I think these documents should be easier to find.

Technically speaking, the finding talks about how to create namespace description documents, so that namespace names can point to helpful resources, rather than being abstract identifiers. The TAG finding briefly describes possible languages for namespace description documents (RDDL 1.0 and 2.0 and GRDDL), and describes a vocabulary of terms for describing the nature of the resources being linked to in a namespace description, and what the purposes of these resources are. The definitions of these terms, though, are one-liners with little guidance as to what each concept is supposed to represent.

What I am missing most (and what we were concentrating on when we were defining our own format for namespace descriptions in an e-government scenario) is the ability to associate namespace descriptions themselves, and make assertions such as namespace x depends on namespace y. Or rather simple but really helpful pieces of information (in particular for developers) such as namespace x is usually associated with one of these two namespace prefixes, here is where you can find test data, or here is where you can find some example data.

At the Semantic Technologies conference in San Jose I attended an interesting presentation entitled “persistent identifiers for the real web”. XML often uses URLs for identifying schema namespaces, and I suppose could be credited with influencing RDF’s practice of using URLs for identifying resources. In using RDF to describe and annotate things, a problem arises: are you describing the web page, or the thing the web page is talking about? For example, if I assert that:

<http://tcowan.myopenid.com> :likes <http://www.myspace.com/lettucefunk>

Does that mean I like the web page or the band the page is about? As you’re traversing the semantic web it’s going to be advantageous to distinguish between content assets and the real world entities they may represent. Their proposed solution involves PURLs (http://purl.org for example). Normally a permanent URL redirects you to the best representation of the resource via a 302 response. They propose that when the PURL represents a real world entity, the response be given as a 303 (See Other). The computer agent can then understand that the “thing” is a real world entity, and that the redirect is not to the real thing, but to another web resource about the thing.
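How an agent might act on that distinction can be sketched in a few lines of Python. This is only an illustration of the proposal as described here: the function name `classify_purl_response` and the labels it returns are made up for this example, and nothing is fetched from the network.

```python
def classify_purl_response(status: int, location: str) -> dict:
    """Interpret a PURL redirect along the lines of the proposal (hypothetical helper)."""
    if status == 303:
        # 303: the PURL names a real-world thing; Location is a page *about* it.
        return {"identifies": "real-world entity", "description_at": location}
    if status == 302:
        # 302: the PURL names a web resource; Location is where it currently lives.
        return {"identifies": "web resource", "representation_at": location}
    # Any other status says nothing about the distinction in this sketch.
    return {"identifies": "unknown", "location": location}


band = classify_purl_response(303, "http://www.myspace.com/lettucefunk")
page = classify_purl_response(302, "http://www.myspace.com/lettucefunk")
print(band["identifies"])  # real-world entity
print(page["identifies"])  # web resource
```

The same target URL yields two different readings, which is the whole point: the status code, not the URL, carries the "thing versus page" signal.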

I’m very much in favor of permanent URLs. Otherwise all our assertions will become disjointed as links break, or we’ll have to keep our own “archives” of dead links and sites. I also appreciate the simplicity of Dave and Eric’s proposal; however, I’m not so sure this is really the best way to solve identifiers for real world things. Consider books, for example: what would be the best way to represent a book, its URL on Amazon or its ISBN as a URN? If we use the Amazon URL we can’t be sure it’s a book; it might be binoculars or a coffee table. The URN however makes it clear:

URN:ISBN:0-395-36341-1

The urn namespace indicates that it’s a book, without a doubt. If PURL were to host a “see also” permanent URL scheme for each declared URN namespace we’d be able to visit that URL to find out more…

http://purl.org/urn/isbn/0-395-36341-1
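The mapping from URN to such a URL is mechanical. A minimal sketch in Python, assuming the hypothetical `http://purl.org/urn/<namespace>/<value>` layout shown above (no such service is claimed to exist; `urn_to_purl` is a name invented for this example):

```python
def urn_to_purl(urn: str, base: str = "http://purl.org/urn") -> str:
    """Map urn:<namespace>:<value> onto the hypothetical PURL layout."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    _, namespace, value = parts
    # The namespace identifier is case-insensitive, so lower-case it for the path.
    return f"{base}/{namespace.lower()}/{value}"


purl = urn_to_purl("URN:ISBN:0-395-36341-1")
print(purl)  # http://purl.org/urn/isbn/0-395-36341-1
```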

But on the practical web, we don’t use PURLs or URNs for books; we use the Amazon.com URL. I think in practical terms things are going to be represented on the web by the domain that has the best collection with the best open content. Perhaps the best approach in the end is to take advantage of blank nodes.

<http://tcowan.myopenid.com> :likes _:a
<http://www.myspace.com/lettucefunk> :describes _:a
_:a a :funkBand

In English, http://tcowan.myopenid.com likes the funk band described by http://www.myspace.com/lettucefunk. Now we’ve made it clear, and without the use of PURLs or some new PURL redirection strategy.
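To see that the blank node really does carry the distinction, here is a small sketch that stores the three statements as plain Python tuples and walks them. It uses no RDF library; the helper `liked_things` is invented for this example.

```python
# The three statements from above as (subject, predicate, object) tuples.
triples = [
    ("http://tcowan.myopenid.com", ":likes", "_:a"),
    ("http://www.myspace.com/lettucefunk", ":describes", "_:a"),
    ("_:a", "a", ":funkBand"),
]


def liked_things(person, triples):
    """For each thing the person likes, collect its types and the pages describing it."""
    results = []
    for subject, predicate, obj in triples:
        if subject == person and predicate == ":likes":
            types = [o for (s, p, o) in triples if s == obj and p == "a"]
            pages = [s for (s, p, o) in triples if p == ":describes" and o == obj]
            results.append({"node": obj, "types": types, "described_by": pages})
    return results


likes = liked_things("http://tcowan.myopenid.com", triples)
print(likes)
```

The page URL only ever appears as the subject of `:describes`, never as the object of `:likes`, so there is no way to read the data as "likes the web page".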

Rick Jelliffe

First some jargon (from the Glossary of Typesetting Terms or Harrod’s Librarians’ Glossary; full props to Google). Castoff: The calculation of the number of typeset pages a manuscript will make, based on a character count. Proof: An impression made from type before being finally prepared for printing. Proofs are made on long sheets of normal page width… Galley proof: Proof of text before it is made up into pages…just as long as can be conveniently photocopied – usually 13 inches. Compose: To set type-matter ready for printing.

Deciding on breaks

I am ancient enough to have used galley proofs, the long pages of the text of a book before it had been made up into the final pages and run off on a printer (or rather, by a printery). It still exists in the draft modes of some modern word processors, I suppose. There has always been a chicken-and-egg problem in documents which contain dynamic forward references that expand to section or page numbers (e.g. “See page 99”): how do you know how much space to reserve for the page number? A reference on a tightly-set line or full page may cause different page breaks depending on whether it is a two- or three-digit number, for example. A traditional way to deal with this was to allow a lot of space around page references (to reduce the impact) and to take two passes over the document, the first to estimate the pages and the space required for each reference, and the second to actually compose the document using the calculated space as fixed and squeezing the generated text if necessary.
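A minimal sketch of that two-pass approach, in Python. It assumes a toy model in which a page simply holds a fixed number of characters and each block carries at most one page reference; `PAGE_CAPACITY`, `paginate` and `two_pass` are names invented for this example, not anything a real typesetting system provides.

```python
PAGE_CAPACITY = 1000  # characters per page; an arbitrary figure for this sketch


def paginate(blocks, ref_widths):
    """Assign each block to a page and return {label: page_number}.

    blocks: list of (label, text_length, referenced_label_or_None)
    ref_widths: {label: characters reserved for a reference to that label}
    """
    pages = {}
    page, used = 1, 0
    for label, length, ref in blocks:
        size = length + (ref_widths[ref] if ref is not None else 0)
        if used > 0 and used + size > PAGE_CAPACITY:
            page, used = page + 1, 0  # block does not fit: start a new page
        used += size
        pages[label] = page
    return pages


def two_pass(blocks, guess=2):
    labels = [label for label, _, _ in blocks]
    # Pass 1: compose with a guessed width for every page reference.
    estimate = paginate(blocks, {label: guess for label in labels})
    # Freeze the space for each reference from the estimated page numbers.
    reserved = {label: len(str(estimate[label])) for label in labels}
    # Pass 2: compose for real, treating the reserved widths as fixed.
    final = paginate(blocks, reserved)
    # A number that outgrew its reserved space has to be squeezed in.
    squeezed = [label for label in labels if len(str(final[label])) > reserved[label]]
    return final, reserved, squeezed


blocks = [("intro", 600, "appendix"), ("body", 500, None), ("appendix", 300, None)]
final, reserved, squeezed = two_pass(blocks)
print(final["appendix"], reserved["appendix"], squeezed)  # 2 1 []
```

The second pass can still move a target across a page boundary, which is why the traditional method falls back on squeezing rather than iterating until nothing changes.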

The idea that you could divide the same text into different length pages is obvious, and quite early on even the electronic typesetting programs allowed draft modes (or provided alternative macros) for producing proofs. Given the requirement of some publishers for double-spaced manuscripts, the idea of separating structure and presentation, ascribed to Charles Goldfarb and (independently) Brian Reid, does not seem a big leap to us nowadays. Multi-publishing and retargeting became commonplace in the SGML arena, with the advent of declarative stylesheets looming for a long while, but the next really big step was the advent of the WWW and the impact of resizable windows on formatting.

One of the most important ideas following from the separation of presentation (into stylesheets) and content has been the formalization of the page-flow model (frames), which was championed by Frame Corporation’s FrameMaker though the simpler concept of regions was of course older. The idea is you “pour” the text into the frames and they flow, break and cause new pages where they will.

Loose

In my blog yesterday, I mentioned that the transformational approach of stylesheets in XML (the DSSSL, XSL-FO streams) is only loosely coupled with the typesetting engine (or formatting engine…some people think that word processors don’t do typesetting; I don’t want to get hung up on terminology), so there are some kinds of page design rules that are impossible, if only because the developers cannot be aware of every design rule anyone might want to make.

The separation also impacts another area: document interoperability. I have written several blogs referring to Markup’s Dirty Little Secret, which is that because everyone’s system and each system’s algorithms and resources and capabilities are different, you cannot expect perfect fidelity to the extent of the same line and page breaks when exchanging XML+stylesheet documents (such as OOXML, ODF, DocBook, you name them). This goes quite against the expectations of some users (though I think people are much more realistic about this now than two years ago) and quite against the hard requirements of others (for example, people who need fixed page numbering for legal requirements).

In yesterday’s blog, Standardization as a collective loss of imagination?, I suggested that users may need to assert themselves to keep the standardization of the current round of office application formats from falling into a particular pitfall: losing sight of the centrality of page (and document and information) design, of how to help people communicate rather than how to add the latest pet feature from some vendor. Not that pets are not fun and valuable.

Hinting at our priorities

The tie-in between that suggestion and the page-fidelity problem (which is really an interoperability issue) is that I think we need some more imagination about whether our current re-pour-each-time model of formatting is actually good enough if we genuinely want substitutability of office applications. People don’t want to be sold a turkey.

Now SGML did provide processing instructions, a kind of markup that still exists in XML, for applications to add extra information that belonged to formatting, for example. The ArborText Publisher program used them very successfully, with processing instructions that let you force page and line breaks in certain places, for example. That is one way of integrating page markup, but it is not what I am suggesting (for various reasons).

At the moment, I think that a much better approach would be to add a kind of cast off hint as an attribute to each block-level object (paragraph, list item, table cell, etc). This would be added to the XML markup by the formatting engine as a hint, to enable a subsequent formatter to try to get the same results.

The first time data came into a document, the normal composition mechanisms would apply. But the document’s block structures would also be decorated with these hints at save time, and subsequent opens of the document would use these hints as well when composing the pages. For example, the castoff hint might be as simple as giving the bounding box of the block on the page. The composing system would use the differences between these bounding boxes and the bounding boxes it wanted to use as penalties to adjust line feathering (or even margins, padding, breakpoints, spacing, text size).
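To make the idea concrete, here is a rough Python sketch of writing such a hint onto a block element and scoring a proposed layout against it. Everything specific here is an assumption made for illustration: the attribute name `castoff-hint`, its "page x y width height" encoding, and the simple penalty formula are not part of ODF, OOXML or any other format.

```python
import xml.etree.ElementTree as ET

HINT_ATTR = "castoff-hint"  # hypothetical attribute name


def add_hint(block, page, x, y, width, height):
    """Record where the block landed when the document was last composed."""
    block.set(HINT_ATTR, f"{page} {x} {y} {width} {height}")


def read_hint(block):
    """Return (page, (x, y, width, height)), or None if the block has no hint."""
    raw = block.get(HINT_ATTR)
    if raw is None:
        return None
    page, *box = raw.split()
    return int(page), tuple(float(value) for value in box)


def penalty(hint, page, box, page_weight=1000.0):
    """Score how far a proposed placement drifts from the hinted one."""
    hinted_page, hinted_box = hint
    drift = sum(abs(a - b) for a, b in zip(hinted_box, box))
    # Landing on a different page is weighted far more heavily than a small shift.
    return drift + page_weight * abs(hinted_page - page)


doc = ET.fromstring("<doc><para>First.</para><para>Second.</para></doc>")
first = doc[0]
add_hint(first, 1, 72, 72, 451, 28)            # saved by the first formatter
hint = read_hint(first)                        # read by a later formatter
score = penalty(hint, 1, (72, 72, 451, 42))    # its own layout is 14 units taller
print(ET.tostring(first, encoding="unicode"))
print(score)  # 14.0
```

A formatter that does not understand the attribute can simply ignore it, which is what keeps the hint from interfering with minimal implementations.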

Auto-sizing is not completely unknown: WordPerfect had a patent on automatically adjusting various page parameters to make sure some range of text fitted on a single page. And many people are aware of the behaviour of some page-oriented systems, such as presentation programs, which automatically resize text (including nested text lists) to fit into the available space.

It could be user selectable whether to freeze the page according to the block hints or just use them as hints, or ignore them. As a hint, it wouldn’t interfere with minimal implementations.

Erik Wilde

have you ever heard of tree trauma, infoset ignorance, model myopia, or RDF rage? if not, and you are interested in these and other XML-related ailments, you might want to read about XML fevers:

The Extensible Markup Language (XML), which just celebrated its 10th birthday, is one of the big success stories of the Web. Apart from basic Web technologies (URIs, HTTP, and HTML) and the advanced scripting driving the Web 2.0 wave, XML is by far the most successful and ubiquitous Web technology. With great power, however, comes great responsibility, so while XML’s success is well earned as the first truly universal standard for structured data, it must now deal with numerous problems that have grown up around it. These are not entirely the fault of XML itself, but instead can be attributed to exaggerated claims and ideas of what XML is and what it can do.

if you are using XML or think about using XML or work with people who are using XML or think about working with people who are using XML, you might be interested in our XML Fever article in the current issue of the Communications of the ACM (CACM). here are your options:

the official citation for this article is Erik Wilde and Robert J. Glushko. XML Fever. Communications of the ACM, 51(7):40-46, July 2008.

One of the areas of web design that is often neglected is the accessibility of your content by impaired users. Because various technologies are used to aid those users who are impaired, you should make sure that your content is usable / readable if it’s ever read aloud.

The developers of the BBC site Programmes have supported semantically marked up data (in the form of Microformats) from day one. Now comes word that because of certain decisions made during the design of hCalendar and its use of the abbr element, they are removing hCalendar support from the Programmes web site. Other Microformats being used will remain (rel & hCard). However, developer Michael Smethurst has hinted that the Programmes team might migrate over to RDFa and remove all Microformats. This is the first instance that I have heard of where a team will be moving away from Microformats and possibly embracing RDFa.
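For readers who have not met the pattern: hCalendar puts a machine-readable date-time in the title attribute of an abbr element, and assistive technology that expands abbreviations may read that title aloud instead of the visible text. The short Python sketch below pulls the two apart; the sample markup and the `AbbrCollector` class are made up for this illustration and are not taken from the Programmes site.

```python
from html.parser import HTMLParser


class AbbrCollector(HTMLParser):
    """Collect (title, visible text) pairs for every abbr element that has a title."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._title = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "abbr" and self._title is not None:
            self.found.append((self._title, "".join(self._text)))
            self._title = None


markup = '<abbr class="dtstart" title="2008-06-23T20:00:00">8pm on Monday</abbr>'
collector = AbbrCollector()
collector.feed(markup)
collector.close()
print(collector.found)  # [('2008-06-23T20:00:00', '8pm on Monday')]
```

The first item of the pair is what a parser wants; the second is what a person wants to hear.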

I wonder if this will become more and more of a common occurrence. As companies begin to look at technologies to apply semantics to their data, I doubt that they will want to choose a technology that limits their audience.

Now, the Microformats community could change hCalendar. However, I’m not sure I have enough faith in the Microformats community to come to an agreement on this topic. In my short time following the various Microformats mailing lists, I quickly became disillusioned with the community and administrators. I witnessed several instances of heavy-handed administration, including the banning of users. Frequently, no real reason was given and I was left with the impression that it wasn’t much of a community after all.

I was an early fan of Microformats, but cases like this certainly make a compelling argument for the use of RDFa. Perhaps the most interesting point in Michael’s post was that this decision was made by the developers themselves and not sent down via some edict:

And probably also best to note that this is not a decision that has come down from on high by the BBC equivalent of suits. The /programmes team has been concerned about this issue for a few months now and it’s good to get some clarity here.

Rick Jelliffe

Regularly as clockwork, every five years another group attempts to make a new standard language for typesetting. FOSI, DSSSL, XSL-FO, and ODF (plus the less grandiose scopes of CSS (styling) and OOXML (legacy).) I predict that in a decade we will see the same thing. In the past, these efforts came from the user side rather than the vendor side, and were driven by user requirements rather than vendor requirements. But requirements for standards now predominately come from questions about “Our product X supports feature Y and therefore the standard should support it” rather than “Our document A uses typesetting feature B therefore the standard should support it”: the cart is driving the horse. There is more vendor buy-in because the new standards demand and achieve so little.

In part it is understandable: the catch-up mentality does not necessarily encourage imagination.

Comparison Matrix

One very common tool for organized standards groups is a feature matrix: rather than just ad hoc consideration of this feature or that feature as proposed by vendors, the idea is to make a list of the general features required by the users’ document sets, or by the technologies being evaluated, or the products chosen to get first-class support. Traditionally, standards groups for typesetting and publishing have included actual typesetters (at ISO, Martin Bryan actually worked in type, for example).

A really good example of this can be seen in a document from a decade ago, Final DSSSL Survey and Assessment Report for the DOD CALS IDE Project (Kidwell, Richman). This is a good introduction both to the Output Specification (FOSI) formatting language used by US military typesetting in the 1990s, and to the ISO Document Style and Semantics Specification Language (DSSSL), which has been available standard on many Linux systems using James Clark’s open source JADE program.

The feature matrix can be found in Comparison Matrix which shows how well the standards support the document requirements: we learn that the US military requires both cartoons and running feet. This is the kind of table that I think should be driving requirements for ODF (and OOXML); preferable to the approach of feature- (or vendor- or product-) centric comparison matrix and much preferable to ad hoc feature requests.

FOSI and DSSSL

The US military adopted FOSI because it was under consideration by (what is now) ISO/IEC JTC1 SC34; however, SC34 ultimately went with an extended version of Scheme under the (terrible) name of DSSSL. FOSI was deeply unlovable and never flourished outside its early adopters, who were locked in; DSSSL was like the other power-user oriented standards from SC34 of the time and never found much commercial adoption, but had uptake in the publishing industry that SC34 catered to. James Clark, the DSSSL editor, later merged it with CSS ideas and split it into XSLT and XSL-FO at W3C, using an XML+XPath syntax rather than the S-expression syntax.

Where DSSSL and FOSI (MIL-PRF-28001 Output Specification) differed in particular was that DSSSL adopted a strict transformation approach: this is of course a UNIX-ism since the days of nroff, and the idea was that you could output to particular page description languages (RTF, MIF, etc.). Consequently there was no way for the DSSSL processor to make decisions based on typesetting metrics on the fly; instead the race was on for a set of abstract properties that could describe common cases. This fits in well with the checkbox mentality of desktop publishing tools, but was entirely counter to the typesetting-as-programming approach of the 1970s and 1980s generation of systems. Systems such as troff and TeX used macro facilities so that creating a typesetting system for a document could involve all sorts of custom smarts to capture the design and fit in with the data; very high-end systems such as Interleaf even had full-blown LISP available for processing. Some of these systems are still around in their niches, XyWrite and 3B2 for example; however, they face a rising tide where quality and power are increasingly mysterious to the market.

FOSI did allow or require some kind of interrogation of the pages while they were being typeset: while this can certainly allow much more expert typesetting and decision-making, it also must be tightly coupled to the formatting engine, which effectively prevents any network effects.

What do I mean by expert typesetting?

To give an idea of what I mean by expert (also known as “quality” or “industrial”) typesetting and decision-making, consider the case of typesetting a Yellow Pages (a phone directory for businesses, categorized by type of business). Imagine you have to produce a Yellow Pages document using your favorite tool. The page designer and sales force come up with a design and timetable. The layout will be five columns. Entries may not span pages. Some entries take up part of a column and should be put as near to alphabetical order as possible, but rather than break they can be placed before or after their alphabetical position, with previous or subsequent entries swapped before them. And there may be two, three, four or five column display ads, which also have this arrangement. And there can even be ads that take a half page but span over two pages.

And it is important that ads should not be orphaned or widowed, with one ad on a previous page by itself or on a subsequent page. And there are 6,000 pages of this. And you get the final data 24 hours before you have to deliver it.

Now how would you do that in ODF, or OOXML, or any of the standard declarative languages? You simply cannot: there is always an extra rule or concept that will not fit. (There are a few modern systems that do allow this kind of flexibility: using JavaScript in Adobe’s InDesign, for example. The program uses XPaths to locate information, but can also access the page model.)

Declarative abstractions are worthy replacements for programs and scripts but have different coverage

Now the history of (SGML and) XML is the effort to key presentation cues from structural information: the benefit of marking up “invisible” containers is that they are often not invisible. The current approach of both ODF and OOXML, allowing foreign container elements (in different syntaxes) but not providing facilities to format based on them, is the worst of all worlds: in QUASIWYG systems, users will be loath to do anything (well) which does not have a resulting visual/stylistic result in the on-screen draft. And (as was pioneered in pre-Adobe FrameMaker and taken up in CSS) the abstraction of frames (floating or relative, linked boundaries into which text can be flowed) also provides many hooks for making declarative properties that otherwise might require programming.

The way that standards for public declarative publishing formats (whether HTML or ODF) should go, in my opinion, is by progressively asking the question: how can we make it easier for users to do what they want to do? In the old days, this was easy: you had physical paper (from mechanical typesetting, for example) or device-independent page designs (the Yellow Pages, for example) and you then programmed it by inserting commands in with the text. SGML and generalized markup came along and said: describe the data in markup, then move the processing out to a presentation system, except for Processing Instructions where you still need specific overrides inline. After this re-factoring came libraries, where common code or functions were provided with the base system, and then consolidation, where the code for the libraries was hidden from the user, and then exposure, where the programming capabilities were removed and only the declarative portions left. RTF and MIF are examples, but so are OOXML and ODF.

At this point, users of transformation systems (such as XML with XSLT) have a lot of capabilities, even for overcoming the differences between the underlying typesetting engines of systems (see Different classes of typesetting engines and Markup’s Dirty Little Secret), but they have none for the kind of page-based calculations required by the Yellow Pages.

Now you could continue to make abstractions: nested keeps with partial float for re-ordering, for example. And in the past, there was a hope we might progress there, because the driving factor for markup languages and style languages was to cope with the kinds of designs which simple word processors failed at. But as I said, the cart seems to be driving the horse: I have no objection to document formats for existing and legacy applications (nor, obviously, to having them as voluntary standards, readers will not be surprised to read).

Universal pretensions without an assertively inclusive process merely disenfranchise the weak and the foreign

However, and this was something that I saw as a flaw in the XML Schemas process, the more that you claim your format as a universal format, the more that you need to cope with cases that may be “niche” to vendors (i.e. that didn’t fit in with their development or profit model) but which are significant in their own right. When a technology, standard or not, mandated or not, does not provide a capability needed for a job, it will not (because it cannot) be used.

Let’s take a concrete example. In about 1999 I spent a year looking at the various requirements for Chinese and XML, at Academia Sinica in Taiwan. As part of this, I looked at how Chinese typesetting was actually done before the advent of computerization. I first made a collection (example below) of some interesting, but not at all atypical, tables, some of which have visual structures that Japanese readers will recognize. (In effect, when you have the equivalent of a very small word size, there are other graphical possibilities that don’t go well in Western text.)

t-b2.png

Then I suggested a possible structure that could be used to reconcile them. I was surprised at the reaction: Westerners universally made comments like “Oh, but those are *bad* tables and bad practise” and “They show confusion and unstructuredness”. Microsoft did add diagonal headers to Word 2000, but the SGML (and pre-SGML) idea that you should look at the artifacts and let design lead, rather than merely let vendors’ developers lead, had by the start of this decade died a rather sad death, it seemed to me.

Since then, the Chinese have gone their own way with a fork of ODF called UOF which features, as far as I can make out, Chinese element names (yah!) and extra markup for Chinese-specific requirements that other systems didn’t support. In April 2007, a request came in for ODF/OpenOffice to add it: Diagonal Header Specification (which has a particularly wonderful and mad table example). I don’t know what the status is at OASIS though, or if Sun has even passed it on: as I mentioned before, they are still discussing 2005 and 2006 user requests, which is what set my alarm bells off. (And in July 2007 Bert Bos raised the related issue of text rotation, only to dismiss it again for CSS at W3C: theoretically not all diagonal splits in tables require rotation or typesetting along the diagonal path, but the requirements for diagonal splits and for rotated headers spring from the same grapheme/glyph qualities of ideographic scripts.)

Putting page design back at the centre

The only way to put the horse back in front of the cart is to put page design (in all its detailed aspects) at the centre of the process. Get stakeholders involved who are prepared to contribute (many will have them already) the kind of checklists that the Comparison Matrix has.

However, I fear that this may only push the issue back without changing it, if the external stakeholders themselves have their opinions formed by what commercial pre-fabbed systems such as Office provide. In Different classes of typesetting engines I mention how the different implementation approaches lend themselves to different declarative properties. People realize that Office has pretty minimal keep-together control, but instead merely substitute some other product’s capabilities. We are quite lucky that countries like China are now getting fed up with the lack of imagination and responsiveness by Western developers and standards makers: it provides one of the few chinks in the protective armour of vendors, who only want change when it is driven by them.

So this has been a rather codgery item; are there any good signs? Well, I have praised Office’s Smart Art before and it is exactly an example of what goes on the page driving the technology: it is not making the format the driver, but inventing a new class of page object (just as a table is a page object). Now Smart Art actually has crappy arbitrary structure, but the direction it can take is clear, and ODF could leapfrog it, if they could be bothered. There are hundreds of thousands of pages with simple diagrams, and once you decide to support what people actually do (and look at where people find things tedious), that is putting the documents first.

So the only driver I see for this, again, is for large user-side organizations to participate and dominate all the standards bodies, to work out their checklists, and force through the changes to ODF/OOXML/CSS/HTML that are required to conform to how people make documents when their focus is on good or natural page design (usability) rather than on incrementalism and conservatism.

Rick Jelliffe

I had an interesting discussion today with a key player in the development of a large, quite successful industry-specific standard by an industry consortium with representation from all the key stakeholders. I was surprised that he was less than sanguine about the standard: a common vocabulary was being used by multiple groups each making a schema for their particular sectoral use case, so it looked quite healthy.

But my contact had two particular gripes. The first was that the standardization process was addicted to making new vocabulary items, to the extent that talking about standardizing other things had never worked: the consortium was for making schemas not solving problems! In particular, while there was a lot of attention paid to describing what each field meant, there was no facility for comparison or identification: to say that “this address is that address” or “this person is that person” or “this agent is that agent” except by accidental string matching. So electronic forms using these schemas could be filled out, but data could never be integrated.
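The difference between accidental string matching and real identification can be shown with a minimal Python sketch. Everything in it is an illustrative assumption: the record fields, the sample values and the `party_id` identifier are invented here and are not taken from the consortium’s schemas.

```python
from collections import defaultdict

# Hypothetical form submissions. Field names and the party_id scheme are
# illustrative assumptions, not part of any real schema.
forms = [
    {"form": "A", "party_id": "P-001", "name": "J. Smith", "address": "1 Main St."},
    {"form": "B", "party_id": "P-001", "name": "John Smith", "address": "1 Main Street"},
    {"form": "C", "party_id": "P-002", "name": "J. Smith", "address": "1 Main St."},
]


def group_by_string(records):
    """Integrate records only where name and address strings happen to match."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["name"], record["address"])].append(record["form"])
    return dict(groups)


def group_by_identity(records):
    """Integrate records by an explicit identifier: 'this person is that person'."""
    groups = defaultdict(list)
    for record in records:
        groups[record["party_id"]].append(record["form"])
    return dict(groups)


by_string = group_by_string(forms)
by_identity = group_by_identity(forms)
```

With this sample data, string matching merges forms A and C (two different parties who happen to share a name and address string) and misses that A and B describe the same party, while grouping on the identifier gets both cases right.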

The second gripe comes out of the first. Because of the lack of ability to integrate and identify data, it all had to be kept together or messaged around in a bunch. So the schema for a complex process has to include fields for everything in that process except for trade secret fields, which wouldn’t be interchanged anyway: the consortium is made up of fierce competitors with a religious belief that their internal processes will be different from the internal processes of any other company in the same industry. Originally many participants would not even disclose the field names in their databases, so important did they regard them, only to find that their field for address was not so interestingly different from their competitor’s equivalent field.

So the result: kitchen sink standards that include so many optional or process-particular fields that the consortium is now having a problem that not enough vendors are able to implement the whole thing. However, underlying this is the problem that without even a simple process model, where each stage of the process could have a fat-trimmed or specific schema, one size has to fit all.

So my contact actually saw the standard as dying rather than thriving: the mania for new elements and structures bloating the project in the direction of unworkability coupled with a refusal to look at standardizing even basic process models or identity/tracking/aggregation capabilities.

Rick Jelliffe

The era of closed formats is dead is a friendly interview with South African standards activist Bob Jolliffe. I enjoy being in the same room as Bob, not least because for once someone else gets their name constantly mispronounced: I think I counted three different mispronunciations from the same person in one day! I believe that both our names come from the Chaucerian English for jolly: fat and happy.

What I particularly like about Bob is that, if you read the interview, he is concerned with establishing requirements for interoperability and substitutability, and encouraging ODF, rather than slagging off MS or OOXML. I do tend to categorize people as “enablers” and “disablers” (not that these are permanent or unqualified vocations), and I certainly classify Bob as an enabler, even though we have different opinions on OOXML. But I don’t think we have particularly different opinions on ODF. Bob (who has been representing South African standards for the last couple of years at SC34) is now participating on OASIS ODF TC, and I think it is really important for government stakeholders to get intimately involved. I have repeatedly called for more government and stakeholder participation in standards groups, and I think Bob’s involvement should be a model for other governments who are wanting to make open standards mission critical.

It is clearly just the next step, that when a government starts adopting open standards, it also needs to develop expertise (Bob’s comment that there is an issue of scattered expertise is interesting), in particular in order to be able to make hard-nosed evaluations about the state of the art in implementation and on profiles.

I would have titled this “Bob Jolliffe gets it” (like Neelie Kroes gets it: “Standards are the foundation of interoperability.” and The Norwegians get it ) because I agree with pretty much everything in Bob’s answers (to the extent that I feel I could have written some of it!) but for the paragraph

One of the big dangers I see is the proliferation of backend office software which is so tightly coupled with single vendor’s office products. The promotion of open standards-based procurement of electronic document management systems is an urgent challenge.

Which goes further than I do, at the moment. I certainly agree that public government documents (in and out) should be in open standard formats, and that for that use OOXML is extraneous given the availability of ODF: however I think it would be better to think in terms of a hierarchy HTML, PDF, ODF with ODF as the last resort for publishing government material at least. (And I don’t see any harm in multiple formats being provided including the original native format of a file, for example OOXML or SVG, as long as the broad-reach standard was also available.)

However, for internal and specialist document systems, to the extent that I have a formed opinion it is that I suspect that functionality still has to trump standards support, until it can be proven that the standards meet the functional requirements. This is not to say that open systems will have to have a higher standard of scrutiny or QA than the old closed proprietary systems, but rather that functional-compliance requirements do not go away merely by deciding that you need standards-compliance, unless there is specific objective evidence that the one is fulfilled by the other.

(Oh, and I think Bob is technically wrong that IS29500 is not now an ISO standard. It has been approved by ballot so it is a standard; its publication has been delayed. The result of a successful appeal would be for it to be withdrawn as a standard (and still not published).)

UPDATE: Bob mentions the South African government’s Minimum Interoperability Standards (MIOS). It is available here (PDF). This extends the normal definition of open standard to include a requirement for multiple implementations: I think this is a mistake of naming (a standard is not open because of its implementations) but the correct requirement for procurement (a technology is open if it allows substitutions): what they should say is “open and mature” standards, where multiple implementation is a property of maturity not openness. I think that is just fuzzy thinking that causes unnecessary squabbles: confusing issues doesn’t help thinking them through clearly.

I would severely criticize it for being entirely W3C Schema centred, including support for WSDL, and consequently a tool of one set of vendors. No explicit mention is made of ISO Schematron, for example. How on earth does the requirement that XML Schemas should be used for data interoperability square with the fact that ODF has a RELAX NG schema, not a W3C XML Schema? The trouble with making unnecessary restrictions is that then you have to turn a blind eye to wherever they are impractical, and turning a blind eye introduces an element of arbitrariness that goes against good government.

However, by the time you get to section 2.7, it turns out that RELAX NG is allowed. And it requires GML which has some Schematron schemas IIRC. Perhaps Schematron can creep in as a kind of XSLT? (Obviously because this is a minimal guideline, it is not exhaustive, so my criticism is unfair to that extent!)

I see that no versions of XSLT or XSD or XML (or most things) are mentioned: it would be interesting to have some idea about why versions don’t matter. And I see the list includes MPEG and ZIP. How do they fit in, given the definition of openness? Anyway, these are all the practical issues that Bob will be grappling with.

M. David Peterson

I’ve known, loved, and respected David Carlisle for quite some time now. As of today, I’ve known him for 24 hours longer than I did yesterday, yet love and respect him twice as much as I did the day before (and that’s saying a *TON*),

Teaching XSLT vs. Teaching XQuery - O’Reilly XML Blog

If you know XQuery and want to learn XSLT you need to learn about template matching, if you know XSLT and want to learn XQuery, you just need to learn a new syntax.

*YES*! The world needs more straight shooters like David (who, like David, have the credentials to back up every word that comes from their general direction), don’t ya think? (NOTE: If you think differently, that’s just because you have no clue what you’re talking about. But that’s okay, I’ll still luv ya. Mmmwwahh! :D)

Is it easier to teach XSLT or XQuery to an experienced SQL developer? My recent training experience indicates that XQuery is easier to learn.

For the last six years I have been building metadata management systems using a diverse set of XML-centric technologies. These languages include XML Schemas, XSLT, Schematron, XHTML, XForms and most recently XQuery. And to be honest, I really do enjoy XQuery. My job as a consultant is to develop feature-rich and highly customizable metadata management systems for my customers and also transfer the skills needed to maintain and extend these systems to my customers through formal training classes as well as one-on-one mentorship.

I have found that it has been very difficult to teach XSLT to an average support person who is only doing occasional XSLT development. But XQuery has been much easier for me to teach, considering that most of my students have had some exposure to SQL. Looking back on my own learning process, I recall it took me about five months of almost continuous study to really feel comfortable with XSLT. Most of this learning curve was because I had not done production XSLT development. But I picked up XQuery in just a few weeks. Perhaps this is because I was already familiar with SQL and XPath. But perhaps this is because XQuery is a little bit more approachable.

I want to note that this does not necessarily imply a poor design of XSLT or say anything against the merits of functional programming. After I did learn XSLT I became a real evangelist of its elegance and beauty. At first I was frustrated by not being able to change a “variable”. Later I realized that this restriction is what made XSLT beautiful, simple and elegant. These features keep the transforms free of side-effects. Once XSLT scripts were deployed I seldom found problems. I became enthralled by the fact that the simplicity of the language implied that XSLT custom-hardware could allow transforms to be orders of magnitude faster than software-only solutions. XSLT may always have a place in CPU-intensive applications.

The difference between my learning time and that of my students reflects the state of our existing knowledge base: most of us are already familiar with SQL. Anyone who knows SQL can take a crash course in XQuery designed specifically for SQL developers. Priscilla Walmsley’s excellent book on XQuery (O’Reilly 2007) includes a single chapter targeted at SQL developers making the transition to XQuery that I use in many of my classes. And 90% of the small support and maintenance tasks that many support and maintenance people need to perform do not require them to ever use the more complex functions and modules features of XQuery.

So with this in mind, I have moved most of my metadata registry tools away from XSLT and toward XQuery. This seems consistent with what other metadata managers are doing today. There are several people now working on open source metadata management systems.

What about you? Do you have experience teaching both XSLT and XQuery to SQL developers? What is your experience on the learning times?

Erik Wilde

Last week, the Open Archives Initiative (OAI) published a set of beta-stage recommendations for compound documents, called Object Reuse and Exchange (ORE). This set of specifications has been published as version 0.9 and has been released for public review and comments (ironically, the press release is a PDF blob).

The problem of compound documents (how to specify that a set of URI-identified resources together form one compound resource) has been around for a while, and never has been solved properly. There are various proposals from different application areas, such as XLink (not quite for compound documents, but it could be used for this purpose as well), METS (using and extending XLink), and DIDL. I am certainly missing some other technologies here, please let me know what they are. The problem is that none of these languages ever caught on, mostly because none of them tried to be general. XLink focused on navigation, METS on libraries, and DIDL on multimedia.

However, it would be good to have a general and simple language for compound documents. If designed well, it could even be easily extended to be used for application-specific scenarios such as those covered by XLink, METS, and DIDL.

The problem is, OAI-ORE will not be it. Instead of designing a simple data model and a simple language for it, they settled for RDF. None of the documents contains any explanation as to why RDF was chosen over a simpler XML-based model. There even is a document that talks about how to implement OAI-ORE in Atom, and all it does is show how to embed RDF into Atom. Which means that for processing such an Atom feed you need an Atom toolkit as well as an RDF toolkit. As a side note: the terms in the Atom categories are URIs, which does not really follow Atom’s idea of terms as strings.

Generally, it is disappointing to see that a problem as important and manageable as compound documents, which still is an open problem looking for a good solution, has been approached on the wrong level. It is of course possible to come up with an RDF-based solution for that problem, but this unnecessarily introduces technology layers which for this particular problem are not required.

This means that the quest for a general and XML-based format for compound document descriptions is still on, and OAI-ORE is not a real contender in this race. Well, maybe it still could be one if the abstract data model also got a representation in plain XML. Unfortunately, the model is not as abstract as its name implies: it is a rather concrete definition of an RDF vocabulary, which will make it quite a bit harder to come up with a good and isomorphic XML representation. The effort might be worth it, however, since the installed base of XML is significantly bigger than that of RDF.

Rick Jelliffe

European Commissioner for Competition Policy Neelie Kroes gave a really interesting talk this week at OpenForum Europe: sounds like a breakfast that would have been very stimulating.

There was some very obvious tough talk directed at Microsoft (she is the person with the stick rather than the carrots, after all), and most of the commentary I have seen has focussed on that. But there were a few other points that I found interesting with respect to comments I have been making.

Standards for market dominating technologies

Readers may remember that I have been pushing that All Interface Technologies by Market Dominators should be QA-ed, RAND-z Standards! By interface technologies I mean the boundary or exposed technologies: protocols, APIs, file formats.

Dr Kroes writes about so-called de facto standards:

First, the de facto standard could be subject to the same requirements as more formal standards:

* ensuring the disclosure of necessary information allowing interoperability with the standard;

* ensuring that other market participants get some assurance that the information is complete and accurate, and providing them with some means of redress if it is not;

* ensuring that the rates charged for such information are fair, and are based on the inherent value of the interoperability information (rather than the information’s value as a gatekeeper).

The process of subjecting a standard to the same requirements as a formal standard is called, err, standardization.

Note, I strictly use “standard” in the sense of the offered voluntary standard: standardization means being documented, QA-ed, RAND-z, etc and on the books; it certainly does not mean (in my usage) that it is mandated for use (from the demand side of the standards market). If I can fend off some flames before they arrive, at ISO/IEC JTC1 there are types of lesser standards, such as Technical Reports, that may have less scary implications for the panic-ridden and are certainly more appropriate than full standards in some cases: I include these as “standards”.

So I don’t see any difference in what Dr Kroes has suggested and my comment; indeed I think it is a very welcome and logical step forward. Indeed, she mentions it in the context of what competition authorities may be obliged to do!

When a market develops in such a way that a particular proprietary technology becomes a de facto standard, then the owner of that technology may have such power over the market that it can lock-in its customers and exclude its competitors.

Where a technology owner exploits that power, then a competition authority or a regulator may need to intervene. It is far from an ideal situation, but that it is less than ideal does not absolve a competition authority of its obligations to protect the competitive process and consumers.

Dr Kroes does however earlier use “standardization” in a loose way, though I don’t imagine it would cause anyone to choke on their croissants. While I agree with “It is simplistic to assume that because standardisation sometimes brings benefits, more standardisation will bring more benefits” on the vacuous lines that too much of anything is bad, the two different meanings of “standardization” should not be lumped together. For standardization in the sense of “putting a technology on the books ready for voluntary use or voluntary disdain”, I don’t see that we are anywhere near the point of having too many standards, nor that they are complete enough or updated enough (and I think Dr Kroes may not mean this, given the comments quoted above). However, standardization in the sense of adopting or mandating a standard is an entirely different question, and I certainly agree with her for that meaning.

In case people were wondering about MS’s increasing embrace of ODF, the writing is on the wall. Dr Kroes says:

In addition, where equivalent open standards exist, we could also consider requiring the dominant company to support those too.

I certainly support that: see The Norwegians get it!

Cartels

Sometimes I feel like I am the only voice, peeping out “cartelization is a dominating regulatory issue” for standards bodies. Standards organizations have little and perhaps no obligation (or, at least, capability) to redress monopoly positions of technologies in a market, and indeed, as the previous section mentions, standardization (if RAND-z and proper) can actually ameliorate monopoly positions (and they may have a duty to assist in making voluntary standards for that technology); however standards bodies must be careful not to operate as cartels of any kind.

Dr Kroes mentions cartels early, in her opening:

Credible competition policy requires competition law enforcement. Cartel cases, merger cases, abuse of dominance cases.

and cuts to the chase later:

…standardisation agreements should be based on the merits of the technologies involved. Allowing companies to sit around a table and agree technical developments for their industry is not something that the competition rules would usually allow. So when it is allowed we have to look carefully at how it is done.

If voting in the standard-setting context is influenced less by the technical merits of the technology but rather by side agreements, inducements, package deals, reciprocal agreements, or commercial pressure … then these risk falling foul of the competition rules.

Now this brings up an interesting question. I raised the issue of cartelization, in particular the aspect of vendor collusion of a majority against their dominant competitor in Is our idea of open standards good enough?

The question may seem provocative to even ask, but sooner or later it must be asked. Are standards made by organizations where vendor stakeholders can and do outnumber non-corporate stakeholders acceptable or sound?

We can take OASIS, ECMA, W3C or any of the boutique consortia that allow corporate members (or their individual proxies). Why should we believe that a standard is sound enough to mandate merely on the absence of discovered side agreements, inducements, etc, if it has been made by a committee dominated by vendors (at the quorum level of real participation)?

It seems to me that only the various international standards bodies, which have direct voting by National Bodies, not by individual stakeholders (in particular, vendors), provide the workable immunity from direct control by vendors (singly or in collusion) that needs to be required for mandatory standards. It can certainly be argued that the boutique consortia may have standards approved ultimately by a larger member vote than the working group that created the standard, and that the membership was not dominated by vendors; but that is something that requires certification or monitoring, whereas with ISO it is manifestly the default case because of National Body voting.

So the National Body system prevents “cartelization-in-the-large”, where the final votes have a good measure of independence. However, no system I have seen completely prevents “cartelization-in-the-small”: this is where the small working groups that prepare the drafts initially have vendor domination. Again, it is not always the case: but look at the composition of the ODF TC at OASIS and the OOXML ECMA TC45 over the last two years and you can catch my drift.

Furthermore, in practice not all members are equal: government members of committees are very likely to be there to advance a particular government agenda (accessibility, say) rather than as providers of alternative technical solutions to those the vendors come up with: a working group may have effective vendor domination at the technology selection level even though the vendors do not control the requirements.

There are some other possible approaches too. For example, some standards bodies allocate chairs in working groups by a fixed number of representatives per sector: some academics, some government, some industry, which has some merit.

All this is why I wrote

But the issue of public and archival formats for government and agency documents is clearly one where governments have a vital interest: the customer is always right. This is why I believe governments need to look beyond the current academic definitions of “open standards” and re-frame the issue as “How do we achieve verifiably vendor-neutral standards?”

Maintenance

There is one part where some implications need to be thought through a little more, perhaps. In the sentence after the When a market section quoted above, Dr Kroes says

In essence the competition authority has to recreate the conditions of competition that would have emerged from a properly carried out standardisation process.

Dr Kroes uses process but means a terminating process, I think. But standardization of a technology is a continuing process, not a one-off event: standards have lifecycles, and waving a magic wand of standardization on a market dominating technology to give it some number or status will do little to help it unless there is an ongoing process of development, correction, evolution, convergence, and so on.

And an ongoing process requires an organization. A standards organization. So when the competition authority “recreates” the conditions of competition that would have emerged from a properly carried out standardization process (she says this in the context of de facto standards that have had no official process, by the way) this must ultimately involve passing the maintenance on to a standards body and verifying it where there is some concern. (There is certainly scope for Competition Commission action here: if governments and user groups and academia do not participate in standards bodies, say out of some mix of sloth, underinvestment, underskilling, and lack of vision (rather than just because of being poor), it would be great if the Competition Commission could compel or encourage at least matching participation by non-vendors in standards groups of interest. But that is just pure fancy, I know!)

And, of course, this maintenance has to be done with some openness. And openness means not only openness to the needs of stakeholders, but a responsiveness to outside requests. A prioritization of vendor requirements for new features over external user requests for corrections should be taken as ipso facto evidence of vendor domination of the standards group, and/or a failure in openness. Andy Updegrove has recently been talking up the need for metrics for judging the effective operation of standards bodies, a good idea, and metrics for openness and lack of vendor domination in quorums should certainly be one objective measure of this. Despite how it sounds, actually people in almost every standards body are keen for more participation.

Rick Jelliffe

There is a new avenue for participation in the ODF effort at OASIS: ODF Implementation, interoperability and conformance which I commend.

Conventionally, people speak of syntactical conformance and semantic conformance, where the first is easy and the second is hard. In fact, because computers can only deal in symbols, the second is impossible. So the issue for automated conformance testing becomes “how can we reflect the semantic operations into syntactical artifacts: into symbols we can investigate.”

So the semantic conformance problem then resolves into just another validation issue. And we have lots of nice schema languages, notably Schematron, which can help out there. (And using general purpose languages at a pinch, no worries!)

To put it another way, it is an issue of data capture.

For ODF, I would recommend they adopt a strategy of progressive but complete verification.

For ODF import and export, this is easy: have a good RELAX NG schema (make it quite forgiving), use NVDL and DSRL if needed, then use Schematron phases to allow various levels of validity to be detected. The trouble with the monolithic valid/invalid distinction is that there may easily be invalidities in things you don’t care about. An implementation of a word processor may have problems in its support for spreadsheets, but that should be a minor issue, not flagged as a showstopper. Schematron’s phase mechanism groups patterns of assertions so that you can have a much more useful chunked view of the strengths and weaknesses of a system.
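The idea of a chunked report can be sketched in a few lines of Python. This is not Schematron and does not parse ODF: the document is a stand-in dictionary, and the check names and phase names are hypothetical, chosen only to show how grouping assertions into phases replaces a single valid/invalid verdict with a per-area view.

```python
# Stand-in for a parsed document: a plain dict, NOT real ODF.
# All field names and values here are illustrative assumptions.
doc = {
    "paragraphs": 12,
    "tables": [{"rows": 3, "cells": 9}],
    "sheets": [{"formulas": 4, "evaluated": 3}],
}


def has_paragraphs(d):
    """The text body contains at least one paragraph."""
    return d["paragraphs"] > 0


def tables_rectangular(d):
    """Every table has a cell count that divides evenly into its rows."""
    return all(t["cells"] % t["rows"] == 0 for t in d["tables"])


def formulas_evaluated(d):
    """Every spreadsheet formula has an evaluated value."""
    return all(s["evaluated"] == s["formulas"] for s in d["sheets"])


# Phases group related assertions, loosely mimicking Schematron phases.
PHASES = {
    "text": [has_paragraphs, tables_rectangular],
    "spreadsheet": [formulas_evaluated],
}


def run_phases(d, phases):
    """Return the names of the failed checks for each phase."""
    return {
        name: [check.__name__ for check in checks if not check(d)]
        for name, checks in phases.items()
    }


report = run_phases(doc, PHASES)
```

For this sample document the text phase reports no failures while the spreadsheet phase reports one, so a word processor implementer can see that the core is sound without the spreadsheet weakness being flagged as a showstopper.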

But this leaves the issue of screen display. How can that be tested? Given my characterization of the issue as being one of data capture, the answer is that ODF needs to specify a page dump format, which can then be tested with automated tests. What would this format look like? Think PDF in XML: tiny-SVG may be good enough—anything where you can get the page position of each character (or string) and graphic on a page.

For example, let us suppose we want to test a table implementation. Now we can use RELAX NG to say that there should be tables, rows, cells etc. And we can use Schematron to say that various numeric constraints should hold. And that gets us a long way into validating that good ODF is being generated and accepted. And we can have tests for whether bad ODF is accepted, and so on.

But what about the graphical component? Having a simple page object dump allows testing that, for example, if you have a string a and a string b in two adjacent cells of the same row in a table (in the same script and of the same metrics, etc), then the (X,Y) co-ordinates of their base points conform to (Xa < Xb) and (Ya ~= Yb)
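As a rough illustration of that constraint, here is a minimal Python sketch. The page dump format does not exist, so the `PlacedString` record, its fields, the sample co-ordinates and the tolerance are all assumptions made for the example. It also assumes a left-to-right script with cells laid out in column order.

```python
from dataclasses import dataclass


@dataclass
class PlacedString:
    """One string from a hypothetical page dump: table position plus base point."""
    text: str
    row: int
    col: int
    x: float
    y: float


def adjacent_cells_ok(a, b, tol=0.5):
    """Check (Xa < Xb) and (Ya ~= Yb) for a cell a to the left of cell b."""
    return a.x < b.x and abs(a.y - b.y) <= tol


def check_rows(items, tol=0.5):
    """Return (row, left text, right text) for each adjacent pair that fails."""
    by_row = {}
    for item in items:
        by_row.setdefault(item.row, []).append(item)
    failures = []
    for row, cells in by_row.items():
        cells.sort(key=lambda c: c.col)
        for a, b in zip(cells, cells[1:]):
            if not adjacent_cells_ok(a, b, tol):
                failures.append((row, a.text, b.text))
    return failures


# Sample dump: row 0 is laid out correctly, row 1 has its second cell
# placed to the left of the first.
dump = [
    PlacedString("a", 0, 0, 72.0, 700.0),
    PlacedString("b", 0, 1, 200.0, 700.2),
    PlacedString("c", 1, 0, 72.0, 680.0),
    PlacedString("d", 1, 1, 60.0, 680.0),
]
failures = check_rows(dump)
```

The tolerance stands in for the “~=” in the constraint: base points in the same row only need to agree to within whatever precision the dump format guarantees.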

And you can use Schematron for that kind of validation. The advantage of having this built into the spec is that the ODF spec can then use mathematical properties and constraints rather than just natural language. The disadvantage of this approach is that it imposes a burden on the implementer, in particular if the graphic library cannot be trapped conveniently to provide the information; however, it certainly should be possible to generate this information from the PDF (in a reverse of the Magellan software!), especially if using a nice PDF subset like PDF/A.

Rick Jelliffe

I would like to propose a new test which you can use to see whether your favoured spout* of technical information is biased (or possibly just a re-printer of press releases, if there is a difference) or not. Here it is:

  1. They reported that the UK Unix Users group had taken the British Standards Institution to the UK High Court, and
  2. They didn’t report in the same detail the outcome: that the High Court utterly rejected it.

Surprisingly, the Inquirer gets the guernsey here, in the marvelously titled UK unix beardies appeal for $cash. No sign of it so far on CNET, ComputerWorld, ConsortiumInfo, Slashdot (references welcome). (Groklaw perhaps did not have space for this, given that it has two interesting posts in its news about IBM’s RoadRunner supercomputer, which is “to ensure the safety and reliability of the nation’s nuclear weapons stockpile.” Terrific boxes! Perhaps the High Court needs to put out its findings disguised as product press releases in order to get into independent media?)

Quoting from the Inquirer:

Mr Justice Lloyd Jones rejected the UKUUG’s application for a judicial review last Thursday, giving the group until the break of dawn this Friday to raise a legal fund for an appeal.

“This application does not disclose any arguable breach of the procedures of BSI or of rules of procedural fairness,” said Justice Jones on Thursday.

“In any event, the application is academic in light of the adoption of the new standard by ISO,” he added.

A note on terminology: in JTC1, a standard is accepted by a ballot and consequently published. This general process is called adoption. So IS29500 has been accepted as an ISO standard, but not yet published. The UKUUG’s reported comment that

OOXML had not been ratified as a standard, it had merely been put on the fast track to certification.

is mumbo-jumbo.

Have you ever wondered if the laws of evolution apply to computer languages? When you walk down the aisle at your favorite bookstore, does it seem like there are actually more computer languages than last year? What forces are driving each of these new languages to evolve?

In 1835 Charles Darwin visited the Galapagos Islands. There he collected what he thought were about a dozen distinct species of birds. Upon returning to England he discovered that each of these species had evolved from a single species of finch. On the various Galapagos Islands the requirements for food gathering were different, but consistent over hundreds of thousands of years: enough time for a single species to adapt to meet consistent requirements.

Consider the raccoon: an omnivore that has proved to be one of the most adaptable mammals on Earth. The raccoon’s range has rapidly expanded into urban areas thanks to its ability to adapt quickly to new requirements, before other animals have had time for the wheels of evolution to turn.

So it goes with computer languages. Some procedural languages can be quickly adapted to fill the needs of a new niche. When the web was young, procedural languages like Java and JavaScript quickly filled the need for a variety of tasks. As the requirements for building web applications stabilized, declarative systems like CSS, XForms and XQuery started to push procedural languages back into niche areas. As these declarative languages stabilize and become worldwide standards, graphical tools are being created to allow non-programmers to create, manipulate and extend these systems.

This is why many of us believe there will always be some need for procedural programming, but certainly not for building standard web applications that are controlled by style sheets and user interaction forms. Like the finch, declarative languages need a little longer to evolve. It sometimes takes years for a small vocabulary of functional specification patterns to emerge and be given labels. Additionally, it can take years for the standards bodies to agree on the best way to deliver these new languages in a set of semantically precise data elements that have unambiguous interpretations. Finally, it may take another few years for IT managers to realize that they really do lower costs if they avoid vendor-specific implementations and adopt worldwide standards.

When CSS first came out you may have been a little reluctant to let web designers play with a rules engine. As XForms becomes ubiquitous you may be resisting change because you have invested so much time and energy learning how to debug JavaScript (without a debugger). You cannot hold back the forces of evolution…and now we all need to adapt to the declarative world or risk our own extinction.

If you are interested in more on this topic, see my presentation from the 2007 Semantic Technology Conference, The Semantics of Declarative Systems.