Recently I passed both my 44th birthday and my 15th wedding anniversary, just signed my daughter up for high school and was told by my doctor that my HDL was soundly thrashing my LDL. My beard, which I’ve worn since my early twenties, is now streaked with gray (a curse of red hair, I fear), and I notice that lately the stairs seem to have mysteriously begun to grow from one trip to the next. T.S. Eliot is beginning to become … relevant … to me. All signs, perhaps, that I am no longer the young spring chicken I once was.
As I was thinking about things to write for this particular column, this realization about age began to extend to the standard that I’ve spent the last decade writing about. A decade is a long time in computer circles, especially when you figure that there have only been five or six of them in the whole history of computing. XML has gone from being a “standard” that perhaps a couple dozen people worldwide knew about to a pervasive technology that is so well entrenched that many people don’t really even think much about it any more. We argue about the XMLification of word processing and spreadsheet programs, we debate whether Atom or RSS 2.0 will predominate, we shake our heads at the whole notion of web services and how the dominant web services protocol was designed largely by bloggers to let people know about their websites.
In short, while XML is not exactly doddering off to the rest home, its angle-bracket knees are no longer as flexible as they used to be. If it were a person you’d expect it to be muttering about those damn JSON punks and how property taxes and inflation are eating up its standard of living. It no longer is as flashy a technology as it used to be (even as Flash has been migrating to an XML format), and more than once I’ve run into twenty-something AJAX hot-shots who declare XML so yesterday (even as they write applications that bind AJAX objects to XML structures). It’s become the establishment, though in many respects I suspect that while its glory days are behind it, XML is becoming more integrated into the fabric of computing.
To that end, I wanted to offer up an assessment of where XML itself is going. As always, this is written by a guy in a coffee-shop, so take it with the usual assortment of saline condiments:
- Hello, we’re from the government and we’re here to help. XML has become the lingua franca of a surprisingly large number of government agencies, ministries and departments. Whichever divide you fall into on the ODF vs. OOXML debate, the reality here is that both of these are XML formats and they would not have emerged if the demand for an XML based word processing format did not exist, largely from this sector. An XML word document format and a couple of transformations will give you the foundation of any number of CMS systems, and that in turn is now making it easier to actually turn that mountain of documents into useful repositories of information rather than leaving them as largely locked-in disk drive filler.
- The Marriage of XQuery and REST. I’ve written about this fairly extensively in this venue, but I think it is worth reiterating here. Combine an XQuery based system with a server objects namespace and you have the foundation for a remarkably powerful server environment comparable to ASP.NET, PHP or Ruby, and what’s more, such a system is remarkably neutral in its deployment (you could deploy such solutions from ASP.NET, PHP, JSP or Ruby).
- Add XForms and Stir. XQuery is effective because it reduces the middleware “translation” layer to practically nothing; combine it with UI components that can in turn consume that XML (either via standalone model instance islands a la XForms or via other XML aware toolkits) and you have a remarkably powerful combination where you are shuttling XML back and forth without ever having to worry about the underlying implementations. Such a solution isn’t a “total” solution, but then again it doesn’t need to be - you can effectively wrap services such as sendmail calls or image manipulation in an XQuery module that lets you stay at the XML abstraction layer.
- Keep It Simply Semantically-neutral, Stupid. There’s an interesting trend developing at the enterprise level. XML by itself isn’t enough … a system also has to both be comparatively simple and push the semantics as late into the process as possible. If you have a system with 1000 elements, you have about 950 too many. I think one of the things that has proved a limiting factor in the adoption of XAML (given the amount of resources you’d expect could be poured into it by Microsoft) is that XAML is a reflection of the .NET DOM, and as such has literally thousands of potential elements. The solutions that in general seem to have staying power tend to be “reasonably modular”, with a clear mechanism for managing such modularization. What’s more, many of the most robust solutions work best when combined with a transformation pipeline, and as such remain semantically neutral until they become “actualized” in an appropriate viewer (such as a browser).
- Mobile Technology Pushing Standards. If you want to see where the real action is in the XML sphere, get away from the doddering browsers and take a look at the mobile market. Declarative programming works better in such an environment where you can define explicit pre-defined behaviors for given elements in firmware rather than deal with the vagaries of scripting, and as such these devices are rapidly emerging as the forefront of XML implementations. Most web browsers are only just now inching up to SVG 1.1, but SVG 1.2 Tiny (such as Ikivo’s implementation) has been a staple on many phones and other mobile devices for quite some time, and even in places where the W3C standards aren’t being used, the implementations that are being used are XML based. On the XForms front, picoforms is the player to watch in this space.
- Semantics isn’t just for kids anymore. The rise of folksonomies has brought the terms “taxonomist” and “ontologist” out of the domains of library science and religion respectively, and turned them into remarkably high paying jobs. We’re now discovering that the process of defining schemas is remarkably difficult, and that meaning is similarly difficult to hold and describe. While I am still not sure that RDF is the best language for describing such semantics, I see much of the borders of what used to be called AI and cybernetics increasingly described in angle-bracket terms, and declarative languages in general enjoying a renaissance as we push our awareness of meaning to the next level. If I were entering IT as a newly minted college graduate, I’d be looking at semantic systems and knowledge management as the “hot” fields to be getting into.
- AJAX. Okay, I touched on this one before, but I think it’s worth making a few more comments here. AJAX is here to stay, though I see it being most predominant in the desktop/laptop presentation side more than anything, and the predominant form is going to be Firefox flavoured. Why? At least for the next few years, Firefox has developer momentum behind it, though it’s running into the complexity conundrum that’s making it harder and harder to push the envelope. I don’t think that a branding change is going to make Silverlight any more palatable as a technology - those who are firmly in the MS camp will use it, but there is a fair degree of hostility in the marketplace for anything Microsoft right now from the developer side, and at least for a while, those who are most heavily into AJAX development are powering up Mozilla first then maybe thinking about Microsoft as an afterthought browser (I suppose we have to support it …). Microsoft would do itself a world of good to swallow its Not-Invented-Here pride and adopt the Mozilla API - even though I believe that technologically Microsoft’s is probably better written, programmers are just as political as anyone else.
- Pipelines and Work-flows. The W3C mandate is slowly moving up the stack, recognizing that in an increasingly distributed world, document management becomes an infrastructure issue. There’ve been a lot of enterprise level process flow, orchestration and work-flow management schemas developed by OASIS, WS-* and a number of individual companies or OSS projects, yet none of them have really managed to click. I believe that this is because work-flow management and orchestration are ultimately atomic processes that need to be intrinsic to the web infrastructure, and that all of the solutions presented thus far fail because they are working too high in the stack. For simple pipeline management, pay attention to XProc, which I see as being a low level specification that works in the same space as ANT (though perhaps lower in the stack). Work-flow management schemas might similarly need to be brought into the W3C (or at least OASIS); if XML development patterns hold true here, a simple workflow management schema would likely succeed where complex ones fail.
- Schematron. XPath is an interesting language - where it plays an integral part in other languages, those other languages eventually do quite well even though they may get a slow start. I see Schematron as falling into that camp. Schematron can be implemented in a number of different ways, but provides a mechanism for associating more complex constraints than can be expressed in XSD in an easy to use format. While its validation aspect is its primary goal, the conditional nature of validation also opens up the possibility of introducing other constraint options (such as calculations or relevancy constraints) into systems. There is similarly an effort at the W3C level for business rule encapsulation, though my suspicion is that in the long run it will look a lot like Schematron.
- Public Repositories and Feeds. XML based data repositories are becoming common, and any time you provide that raw material you will see innovation. On the XForms.org site (a Drupal based site) I run a number of aggregated news feeds that listen in for XForms related content on both the general search sites and specialized technical news feeds and display them to the users; a related concept is to read (and filter) job sites to display relevant jobs in the field to members of the site. I suspect that this feed model will increasingly end up replacing the sometimes more cumbersome repositories in various verticals (such as Geographic Information Systems, or GIS).
- Atom and the Atom Publishing Protocol should make it big here. Consider a “mashup” of Atom and the human genome database, for instance, and extrapolate from there. Google’s adoption of Atom and the APP will likely add considerable weight to that specification, and I have already moved a significant amount of my own development efforts to support Atom as a general transport protocol and the APP as a general access format. Tim Bray’s work in creating a mod_atom module for Apache should also have significant impact in the next couple of years - providing a more efficient, usable and secure layer than WebDAV by itself has been able to provide.
- XML Databases and the plateauing of SQL. This will take a while longer, but I see SQL-based server systems plateauing in the next few years - they won’t go away, but those databases will likely begin to look increasingly like XML databases, while existing XML based database systems will continue to gain market share. Part of this will likely be due to XQuery (plus some form of XUpdate, which will likely be introduced shortly). I think that XUpdate has been slower out of the gate because the entrenched SQL vendors realize that once both query and update are available, a big rationale for their SQL products as separate, self-contained systems goes away, but customer demand, hungry commercial upstarts and OSS projects will likely force the issue to be readdressed.
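To make the feed filtering idea from the Public Repositories and Feeds item a little more concrete, here is a minimal Python sketch of what "listening in" for a topic might look like. Treat it as an illustration only: the sample feed, the `filter_entries` function and the simple keyword match are all assumptions made up for this example, and nothing here reflects how the XForms.org aggregator is actually built. It uses only the standard library and reads an Atom document from a string rather than fetching anything over the network.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 elements (RFC 4287).
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical aggregated feed</title>
  <entry><title>XForms 1.1 tips</title><summary>Binding instance data.</summary></entry>
  <entry><title>Gardening news</title><summary>Tomatoes.</summary></entry>
  <entry><title>Server notes</title><summary>Pairing XQuery with XForms clients.</summary></entry>
</feed>"""


def filter_entries(feed_xml, keyword):
    """Return the titles of Atom entries whose title or summary mentions keyword."""
    root = ET.fromstring(feed_xml)
    keyword = keyword.lower()
    matches = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        summary = entry.findtext(ATOM + "summary", default="")
        # Case-insensitive substring match; a real aggregator would want more.
        if keyword in (title + " " + summary).lower():
            matches.append(title)
    return matches


if __name__ == "__main__":
    for title in filter_entries(SAMPLE_FEED, "xforms"):
        print(title)
```

The point of the sketch is how little machinery sits between the raw feed and something useful: because the source is already XML, the "aggregator" is little more than a namespace-aware walk plus a filter.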
I find that I like getting older, despite those treacherous stairs. The problem domains have become larger, more complex, and more political. I think this is true of XML as well; the issues involved in its use are no longer those of adoption but of scope, of using the language as a tool for achieving consensus and smoothing the barriers to entry. I see XML coming to an equilibrium position with AJAX and another one with SQL - not wiping out these technologies, but providing a bridge between technical domains. It is playing a huge part in the emerging semantic web, and in hybrid areas (such as bioinformatics) it has become the exchange medium of choice. Maybe, just maybe, XML is beginning to grow up.
Kurt Cagle is an author and information architect living in Victoria, British Columbia, Canada. He will be at the O’Reilly Open Source conference in Portland, Oregon, giving a paper on XQuery/XForms systems and REST Objectified XML (ROX).