Recently I passed both my 44th birthday and my 15th wedding anniversary, just signed my daughter up for high school and was told by my doctor that my HDL was soundly thrashing my LDL. My beard, which I’ve worn since my early twenties, is now streaked with gray (a curse of red hair, I fear), and I notice that lately the stairs seem to have mysteriously begun to grow from one trip to the next. T.S. Eliot is beginning to become … relevant … to me. All signs, perhaps, that I am no longer the young spring chicken I once was.
As I was thinking about things to write for this particular column, this realization about age began to extend to the standard that I’ve spent the last decade writing about. A decade is a long time in computer circles, especially when you figure that there have only been five or six of them in the whole history of computing. XML has gone from being a “standard” that perhaps a couple dozen people worldwide knew about to a pervasive technology that is so well entrenched that many people don’t really even think much about it any more. We argue about the XMLification of word processing and spreadsheet programs, we debate whether Atom or RSS 2.0 will predominate, we shake our heads at the whole notion of web services and how the dominant web services protocol was designed largely by bloggers to let people know about their websites.
In short, while XML is not exactly doddering off to the rest home, its angle-bracket knees are no longer as flexible as they used to be. If it were a person you’d expect it to be muttering about those damn JSON punks and how property taxes and inflation are eating up its standard of living. It no longer is as flashy a technology as it used to be (even as Flash has been migrating to an XML format), and more than once I’ve run into twenty-something AJAX hot-shots who declare XML so yesterday (even as they write applications that bind AJAX objects to XML structures). It’s become the establishment, though in many respects I suspect that while its glory days are behind it, XML is becoming more integrated into the fabric of computing.
To that end, I wanted to offer up an assessment of where XML itself is going. As always, this is written by a guy in a coffee-shop, so take it with the usual assortment of saline condiments:
- Hello, we’re from the government and we’re here to help. XML has become the lingua franca of a surprisingly large number of government agencies, ministries and departments. Whichever divide you fall into on the ODF vs. OOXML debate, the reality here is that both of these are XML formats and they would not have emerged if the demand for an XML based word processing format did not exist, largely from this sector. An XML word document format and a couple of transformations will give you the foundation of any number of CMS systems, and that in turn is now making it easier to actually turn that mountain of documents into useful repositories of information rather than largely locked in disk drive filler.
- Enterprise 2.0 is not JSON-based. Transaction validity, robustness of content, minimal semantics, proven tools, data store integration, component management … all of these factors need to be met when businesses adopt a technology, and after a decade of development, XML is increasingly supplying all of these needs. It’s worth noting that, despite the AJAX term showing up in 2004, the technology for doing message transport via JavaScript has been around since 1999 (longer if you consider security hole hacks into IFrame), and for all that it is “hot” now, I am hearing from many of my enterprise level clients that they distrust the security of AJAX, the rather poor performance of many JavaScript tools and its inability to play nicely with others. I’ve been working with JavaScript since its inception and work with it daily even now, but it works best when it can work in conjunction with an (increasingly XML based) Document Object Model.
- The Marriage of XQuery and REST. I’ve written about this fairly extensively in this venue, but think it is worth reiterating here. Combine an XQuery based system with a server objects namespace and you have the foundation for a remarkably powerful server environment comparable to ASP.NET, PHP or Ruby, and what’s more, such a system is remarkably neutral in its deployment (you could deploy such solutions from ASP.NET, PHP, JSP or Ruby).
- Add XForms and Stir. XQuery is effective because it reduces the middleware “translation” layer to practically nothing; if you work with UI components that can in turn consume that XML (either via standalone model instance islands a la XForms or via other XML aware toolkits), you have a remarkably powerful combination where you are shuttling XML back and forth without ever having to worry about the underlying implementations. Such a solution isn’t a “total” solution, but then again it doesn’t need to be - you can effectively wrap services such as sendmail calls or image manipulation in an XQuery module that lets you stay at the XML abstraction layer.
- Keep It Simply Semantically-neutral, Stupid. There’s an interesting trend developing at the enterprise level. XML by itself isn’t enough … a system also has to be both comparatively simple and push the semantics as late into the process as possible. If you have a system with 1000 elements, you have about 950 too many. I think one of the things that has proved a limiting case for the adoption of XAML (given the amount of resources you’d expect could be poured into it by Microsoft) is that XAML is a reflection of the .NET DOM, and as such has literally thousands of potential elements. The solutions that in general seem to have staying power tend to be “reasonably modular”, and with a clear mechanism for managing such modularization. What’s more, many of the most robust solutions work best when combined with a transformation pipeline, and as such remain semantically neutral until they become “actualized” in an appropriate viewer (such as a browser).
- Mobile Technology Pushing Standards. If you want to see where the real action is in the XML sphere, get away from the doddering browsers and take a look at the mobile market. Declarative programming works better in such an environment where you can define explicit pre-defined behaviors for given elements in firmware rather than deal with the vagaries of scripting, and as such these devices are rapidly emerging as the forefront of XML implementations. Most web browsers are only just now inching up to SVG 1.1, but SVG 1.2 Tiny (such as Ikivo’s implementation) has been a staple on many phones and other mobile devices for quite some time, and even in places where the W3C standards aren’t being used, the implementations that are being used are XML based. On the XForms front, picoforms is the player to watch in this space.
- Semantics isn’t just for kids anymore. The rise of folksonomies has brought the terms “taxonomist” and “ontologist” out of the domains of library science and religion respectively, and turned them into remarkably high paying jobs. We’re now discovering that the process of defining schemas is remarkably difficult, and that meaning is similarly difficult to hold and describe. While I am still not sure that RDF is the best language for describing such semantics, I see much of the borders of what used to be called AI and cybernetics increasingly described in angle-bracket terms, and declarative languages in general enjoying a renaissance as we push our awareness of meaning to the next level. If I were entering IT as a newly minted college graduate, I’d be looking at semantic systems and knowledge management as the “hot” fields to be getting into.
- AJAX. Okay, I touched on this one before, but I think it’s worth making a few more comments here. AJAX is here to stay, though I see it being most predominant in the desktop/laptop presentation side more than anything, and the predominant form is going to be Firefox flavoured. Why? At least for the next few years, Firefox has developer momentum behind it, though it’s running into the complexity conundrum that’s making it harder and harder to push the envelope. I don’t think that a branding change is going to make Silverlight any more palatable as a technology - those who are firmly in the MS camp will use it, but there is a fair degree of hostility in the marketplace for anything Microsoft right now from the developer side, and at least for a while, those who are most heavily into AJAX development are powering up Mozilla first then maybe thinking about Microsoft as an afterthought browser (I suppose we have to support it …). Microsoft would do itself a world of good to swallow its Not-Invented-Here pride and adopt the Mozilla API - even though I believe that technologically Microsoft’s APIs are probably better written, programmers are just as political as anyone else.
- Pipelines and Work-flows. The W3C mandate is slowly moving up the stack, recognizing that in an increasingly distributed world, document management becomes an infrastructure issue. There’ve been a lot of enterprise level process flow, orchestration and work-flow management schemas developed by OASIS, WS-* and a number of individual companies or OSS projects, yet none of them have really managed to click. I believe that this is because work-flow management and orchestration are ultimately atomic processes that need to be intrinsic to the web infrastructure, and that all of the solutions presented thus far fail because they are working too high in the stack. For simple pipeline management, pay attention to XProc, which I see as being a low level specification that works in the same space as ANT (though perhaps lower in the stack). Work-flow management schemas might similarly need to be brought into the W3C (or at least OASIS); if XML development patterns hold true here, a simple workflow management schema would likely succeed where complex ones fail.
- Schematron. XPath is an interesting language - where it plays an integral part in other languages, those other languages eventually do quite well even though they may get a slow start. I see Schematron as falling into that camp. Schematron can be implemented in a number of different ways, but provides a mechanism for associating more complex constraints than can be expressed in XSD in an easy to use format. While its validation aspect is its primary goal, the conditional nature of validation also opens up the possibility of introducing other constraint options (such as calculations or relevancy constraints) into systems. There is similarly an effort at the W3C level for business rule encapsulation, though my suspicion is that in the long run it will look a lot like Schematron.
- HTML 5 and Bindings. Note to the HTML camp - HTML 5.0 will be XML based, it’s just a question of how much core technology will separate it from XHTML 2. There is no valid reason for HTML not to close its tags, quote its attributes, and respect namespaces. I think the bigger debates are going to be around issues like XForms vs. HTML5 Forms, which I see as the question of whether the language should be component-centric or data model centric (and there are valid arguments on both sides of that one), and the degree to which CSS and JavaScript should control things. XBL (and XBL2) bindings bring a lot to the table, including a reasonably comprehensive mechanism for mixing the structure that tags bring with the fluidity of JavaScript to manipulate those tags. Certainly, I see user defined semantics for tags as being the hallmark of the next five years just as user defined activities are defining most of the next leg of the web. It also provides a formal mechanism for building mashups (god, that term is beginning to seem antiquated!) without breaking the integrity of an XHTML structure.
- Public Repositories and Feeds. XML based data repositories are becoming common, and any time you provide that raw material you will see innovation. On the XForms.org site (a Drupal based site) I run a number of aggregated news feeds that listen in for XForms related content on both the general search sites and specialized technical news feeds and display them to the users; a related concept is to read (and filter) job sites to display relevant jobs in the field to members of the site. I suspect that this feed model will increasingly end up replacing the sometimes more cumbersome repositories in various verticals (such as Geographic Information Systems, or GIS).
- Atom and the Atom Publishing Protocol should make it big here. Consider a “mashup” of Atom and the human genome database, for instance, and extrapolate from there. Google’s adoption of Atom and the APP will likely add considerable weight to that specification, and I have already moved a significant amount of my own development efforts to support Atom as a general transport protocol, and the APP as a general access format. Tim Bray’s work in creating a mod_atom module for Apache should have significant impact in the next couple of years - providing a more efficient, usable and secure layer than WebDAV by itself has been able to provide.
- XML Databases and the plateauing of SQL. This will take a while longer, but I see SQL-based server systems plateauing in the next few years - they won’t go away, but those databases will likely begin to look increasingly like XML databases, while existing XML based database systems will continue to gain market share. Part of this will likely be due to XQuery (plus some form of XUpdate, which will likely be introduced shortly). I think that XUpdate has been slower out of the gate because the entrenched SQL vendors realize that once both query and update are available, a big rationale for their SQL products as separate, self-contained systems goes away, but customer demand, hungry commercial upstarts and OSS projects will likely drive towards a need to readdress this issue.
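To make the Schematron point above a little more concrete, here is a minimal sketch of the kind of constraint I have in mind: one that relates several elements to each other, which XSD 1.0 grammar rules do not cover. This is not Schematron itself. Real Schematron expresses each test as an XPath expression inside a rule; the sketch below only borrows the shape (a context, a test, a message) and writes the tests as plain Python functions over the standard library's ElementTree. The order document, its element names, and the helper names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; the element names are invented for this sketch.
ORDER_XML = """
<order>
  <item><price>10.00</price><qty>2</qty></item>
  <item><price>5.50</price><qty>1</qty></item>
  <total>24.00</total>
</order>
"""


def total_matches_items(order):
    """Cross-element constraint: the stated total must equal sum(price * qty)."""
    computed = sum(
        float(item.findtext("price")) * int(item.findtext("qty"))
        for item in order.findall("item")
    )
    return abs(computed - float(order.findtext("total"))) < 0.005


def qty_is_positive(item):
    """Per-item constraint: quantity must be a positive integer."""
    return int(item.findtext("qty")) > 0


# (context path, test, message) triples, loosely mirroring Schematron's
# rule/assert pairing. Real Schematron writes the test as an XPath expression.
RULES = [
    (".", total_matches_items, "total must equal the sum of price * qty"),
    ("item", qty_is_positive, "qty must be positive"),
]


def validate(root, rules):
    """Return the message of every rule that fails on some context node."""
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            if not test(node):
                failures.append(message)
    return failures


if __name__ == "__main__":
    for message in validate(ET.fromstring(ORDER_XML), RULES):
        print("FAILED:", message)
```

Because each rule is just a condition evaluated against a context node, the same structure could in principle carry calculations or relevancy checks instead of pass/fail assertions, which is the extension I was hinting at.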
I find that I like getting older, even despite those treacherous stairs. The problem domains have become larger, more complex, and more political. I think this is true of XML as well; the issues involved in its use are no longer those of adoption but of scope, of using the language as a tool for achieving consensus and smoothing the barriers to entry. I see XML coming to an equilibrium position with AJAX and another one with SQL - not wiping out these technologies, but providing a bridge between technical domains. It is playing a huge part in the emerging semantic web, and in hybrid areas (such as bioinformatics) it has become the exchange medium of choice. Maybe, just maybe, XML is beginning to grow up.
Kurt Cagle is an author and information architect living in Victoria, British Columbia, Canada. He will be at the O’Reilly Open Source conference in Portland, Oregon, giving a paper on XQuery/XForms systems and REST Objectified XML (ROX).


> (...) those damn JSON punks (...)
Priceless Kurt! :D
Everything we told the HTML punks would happen did happen, and everything the JSON punks tell us will happen will happen.
My son turned 18 yesterday. I can describe almost every minute of the day 18 years ago but only snippets of what has happened since. On the other hand, it was all *inevitable*. So JSON supporters take heart, because we float on a sea of relentlessly churning technology which is short-cycle unpredictable and long-cycle boringly predictable, your days to whine shine and then recline will come too.
By the time you finally kick back, you won't be asking yourself if the technology wins matter. You will be looking quite personally and locally at what you did with it and how that affects your 18 year old kid. You will measure your success in the ambition in his eyes to best you, and you will be satisfied or dismayed to the degree you made that possible. Then it will be his day or her day.
Don't weep for the stuff. Cheer for the mammals.
Len,
True, all too true.
I think that AJAX and JSON will have a huge impact in the long term, but I suspect that their wins will likely be in areas far outside of their current focus - over time technologies that work together tend to reach a balance. For JSON, I see its role being that of RNA to XML's DNA: both are necessary, and RNA does not have quite enough structure and integrity to serve as a foundation by itself for stateful storage of content, but it makes a compelling envelope. Get a lightweight parser like E4X in place, and you have a powerful package.
As to your comments about kids besting their old man, I could not agree more. Raising girls I think the dynamics are a little different, but not dramatically so (my oldest has nearly no interest in computing, but she's going to be a hell of an artist, and the competitive spirit with her dad is alive and well there, thankfully ... of course, my youngest has figured out the game editor on one of my Linux games at the age of seven completely by herself, so yeah, the punks are alive and well ...).
Thanks for the perspective.
Love the Insight...
I am a young punk (28) who really sees the value of XML. As more 4G languages are speaking directly to web services (which are spitting out MBs of XML), I have the urge to get my new 'XML' tattoo right on my neck. Underneath all of this Web 2.0 buzz is the power of XML (portable DBs) flying through cyberspace... I wish more of my generation would look a little deeper into the data, rather than CFing everything and swearing they know the way.
Thanks for the Insight.
Brett
So, I'll admit to being something of an XML newbie, being that I've been an unemployed stay-at-home dad since January 2001, right around the time when XML seemed to be getting hot.
I am, however, a fifty year old former IT professional with almost 20 years experience who is currently enrolled in an XML Programming class (something that gets mildly griped about a tad on my blog), and I must say that your post is exceptionally full of perspective that intrigues and excites.
Indeed, much in your post is beyond what I know about the technology at the moment, but that won't stop me from digging deeper into any of that. My Advisor at Saint Paul College told me XML was one of the topics I should embrace and it seems he's quite correct.
Great post. I'll be back . . .
My daughter scares me worse. My son is level headed and wants a career in computer science and game design. My daughter wants to rule the world and says she has to knock off the old man first because he is smart and will foil her. She's right on item 2. The daughter will break the heart worse because we have absolutely no defense.
XML is here to enable a system to interoperate with the most liberal contract possible. Ever since its release, most of the churn has been in tightening up the contracts and that leads to some insanely complex and inanely simple ideas. I still consider it a minimal amount of control where a minimal amount is the best thing possible across the domains, and local control strength is left to the local voters.
No size fits all. Some sizes fit most uncomfortably. What I'm seeing done to Rick Jelliffe (arguably one of the most level headed guys in the markup business) demonstrates the costs of being civil when the 'wisdom of crowds' turns into a lynchmob. That twig was bent a long time ago. There are times when the crowd should get what it wants and see if it can live with it.
The sad thing is, if he were a girl and those posts were being put up, a lot of folks you and I know would be out there blogging to defend him instead of benefitting by it. I weep for the web when I see this.
Or an apple user... it's sad that the web should and can enable us to communicate better with one another, and yet people still take an iota of information about a person and extrapolate the rest from there, whether the topic is iso standards or the ipod vs. zune debate. It puts me in mind of the secret life of walter mitty: "Your small minds are musclebound with suspicion. That's because the only exercise you ever get is jumping to conclusions."
XML has failed in many places. JSON is a children's toy. AJAX is for 'inventors of wheels'.
Nothing here shows that XML has managed to integrate and interoperate X with Y in some great engineering fashion. To do those you have to do that kind of work every day, which sadly means it failed to do it. Simple.
A,
I would not agree with that assertion at all. XML is the worst of all technologies, except for everything else. It's a generalist solution, and like many generalist solutions it is often not as efficient at accomplishing a task as a specialist solution, but nonetheless because it provides a tool for abstraction it makes solving a broader range of similar problems possible in a given domain.
Consider HTML for a second. Many people considered HTML too simple when it first debuted, that it was "a mere toy", yet HTML is now more widely used than any other computer language by a wide margin. There is more JavaScript (another toy language) in use now than there is C++, Java, COBOL, Perl, Python and Ruby combined. Yet because neither one of them came with the mind-numbing complexity that is so common with strongly typed imperative languages, they've long been dismissed as being too primitive to be useful. If, as I do, you see HTML as an early form of XML, then what this tells me is that, far from being a failure, XML has in fact largely replaced a fairly significant swath of the tools written in those other languages, and is well on its way to replacing the rest within a couple of decades.
Simplicity seems to go hand in hand with elegance, I've noticed, and that holds as true for computer languages as it does most other endeavors.
I like JSon and I like XML . Both have their places. JSon is simple and compact and works beautifully with Java script (not surprisingly as it is Java script ...). XML has the advantage of Schema and specifications and hence is preferable for interoperability. The biggest threat to XML is not from JSon but from DSLs which offer to be both more compact and descriptive than XML and still easy to validate for correctness.
Whether I prefer a set of different DSLs to the verbose but always similar beast called XML is however a different question ...
it is nice article. I like it ! Thank you !
Today I use XML when I am designing a document page that is also a web page that is also an application page that is also transformed into a printed form... yadda. The change most completely wrought by the HTML/XML/XSLT trinity is that when we design an application today, we are designing a document. It is so obvious we seldom mention it but the triumph of the hypertext community over almost all of the rest of computer interface designs would be what I would consider most notable were I to have slept from 1985 to the present day.
Straight out, I am using ASP 2.0 with those seductive data binding providers. So I can create some fast XML instances given I can type elements and atts faster than I can design a table, bind to them, get a notional screen up and running and move on to the next one. Is it a good way to work? Possibly not given I will eventually bind to a final table design, but it more nearly resembles what I want the customer to work with than when I start from the tables. Then I throw away the XML or keep it for the on-the-wire designs.
How many of you craft views/XML instances and then come back to the table design later?
I liked today's topic, lately I've been tempted to think of XML as "old" but that's not the case. xschema seems like it's still in infancy, judging from the tutorials, and people (myself included) are only just now being made aware of schematron.
Len,
I bind to XML in various ways; the ASP.NET bindings are certainly one of the features to like about the ASP.NET 2.0 framework. I use XForms some for this task - a table in XForms is a handful of elements for explicitly mapped bindings and a couple of elements for a generic mapping, though of course this isn't assuming CSS maps. I find I'm using XSLT less and less for this particular task, though I'll typically resort to that route if I can wrap the XSLT in an XBL or similar structure.
Taylor,
Yup. My take is that a lot of the W3C technologies in particular are finally JUST beginning to percolate into public consciousness - XForms, XQuery, RDF, SVG, etc, all of which are likely to cause some very interesting changes if they get widely adopted.
A nod to XML as a native data type seems appropriate. E4X (as already mentioned) Scala and Xlinq come to mind. There was a proposal to add this to Java 7 but I haven't seen any activity on that lately (no surprise, it was controversial and with everything else queued up for v7 it may not stand a chance anyway). The native XML plus pattern matching in Scala has been eye-opening for me. It's almost a guilty pleasure; like I'm eating candy when I should be "coding". Nothing a little java/DOM can't cure;-)
Native XML certainly isn't the "right thing" for every language but E4X and Scala/XML hit a sweet spot for me. On the other hand, flying pigs will be ancient history before Python has native XML and I agree with the Python community's thinking on this.
Do you see native XML becoming mainstream? What about the rise of functional languages and the builder pattern implementations in ruby and Groovy etc?
It depends on what you mean by 'native XML'. Isn't a DOM 'native XML'? If an XML file is supported by an XML data provider, isn't that 'native XML'? X3D, XAML, XUL, SVG, aren't these 'native XML'?
XML is a syntax, not an application. That distinction makes a world of difference in how or why 'native XML' is supported. One can create a 'native XML' database but effectively, it is just a hierarchical database with special provisions for XML syntax and the oddity of mixed structures.
len,
I'm using native XML to mean that literal XML is recognized as a core data type by the language compiler without explicitly importing libraries or invoking another compiler/processor. My intent was not to make a hard distinction but to recognize what's different in the context of "Where is XML Going". It's still XML but a native data type does change how developers interface with it.
Ah. No disagreement with that. Back in the 1980s when discussing SGML at a design meeting, Charlie Sorgi from what was then Mentor Context made the statement that one day SGML would simply be a checkmark in a list of product features. A generation later that is pretty much the case. Maybe the answer to 'where is XML going' is 'nowhere'. It's just there.
Kurt,
What leads you to say that,
"HTML 5.0 will be XML based, it's just a question of how much core technology will separate it from XHTML 2. There is no valid reason for HTML not to close its tags, quote its attributes, and respect namespaces..."
It seems that some members of the HTML 5 working group are approaching the issue from the opposite angle, which is that they are saying there is no valid reason for HTML to close its tags, quote its attributes, and respect namespaces.
Thanks.
Aaron,
I think the key to that statement is "Some members of the HTML 5 working group ...". There are two or three individuals that seem wedded to an extraordinarily conservative approach to web development working with the W3C, usually invoking either some mythical great aunt who works with HTML or some pre-existing customer base that would face incredible hardship if XHTML was used. However, while this argument is in fact fairly compelling in the face of the HTML 4.0 specification, it's an asinine argument for HTML 5.0; the changes involved will necessitate that a new DTD be established, will necessitate changes in both HTML interpreters and renderers, and for the most part works against three significant changes since HTML4.
First, the idiot HTML coder theorem, to which I would reply that a number of alternative markup schemes have appeared in the wild since HTML4, such as BBCode or WIKI code, that assume some form of preprocessor interpretation. Most such schemes include the notion of explicit closure and require a sufficiently sophisticated understanding of attributes that the notion of quoting such attributes pales into absurdity in comparison. Given that these are actually used quite successfully, especially in client facing text entry mechanisms, the notion that sloppy validation is a core aspect of any HTML specification for adoption is absurd on the face of it.
The second facet is that increasingly rich text content generators on the client often hide the production of that HTML code (the number of sites that incorporate some form of JavaScript-based Rich Editor is growing dramatically) so that those people who do not know/cannot understand HTML are not in fact in a position where they are explicitly generating that code. This does place a requirement on the developers of such functionality to support XHTML based encodings, but this is happening anyway for other reasons (largely feed syndication).
The notion of dealing with code for an existing customer base also does not apply here. HTML 5 is a new standard. In order to support it, vendors will need to change their code anyway in order to work with new HTML declarations. In most cases, when the 4.0 specification rolled around, most vendors were only just beginning to become aware of XML and were still assessing the role that the specification would have in their operations, whereas today, XML has become an integral part of the web in more ways than even the original designers anticipated. Those vendors' customers want XML based solutions to work with their increasingly XML based document workflow systems, and they are much more aware of the hazards of splitting core technologies down two competing paths.
Finally, I would hazard (from my own experience with those individuals) that many people within the W3C have become leery of working with them and view their efforts as being frankly counterproductive. The WHATWG effort was a direct effort to wrench control of HTML from the W3C standards process, in part because of the belief that the W3C solutions were too esoteric and not sufficiently responsive to the needs of web developers. While I think there is some validity to that argument, those same individuals were involved in both the W3C and WHATWG, and the point can be made that the efforts made by those individuals often amounted to arguing points in order to keep the W3C from coming to resolutions that would have actually pushed those technology standards out the door.
I think it can be argued that much of the WHATWG "standard" has far more to do with what has increasingly been seen as AJAX related work - data stores and persistence, graphics in 2D and (theoretically) 3D, programmatic bindings and behaviors, enhanced validators in HTML fields and so forth, things that have traditionally sat above the HTML stack. I see HTML 5.0 as recognizing the need to embrace that higher stack, but to do so in a way that is consistent with related W3C specifications. HTML 4.0 is not compatible with most of these beyond CSS and (to a certain degree) DOM. However, if HTML 5.0 is predicated on a minimal XML basis, it becomes far easier for the W3C to assert compatibility with its other technologies, from XPath to RDFa.
It's for these reasons that I just do not see the looser validation requirements of HTML 4.0 making their way into 5.0. The adoption of these may be in the interests of a small number of individuals who do not wish to see XML succeed, but given that XML has long since established itself this action can only be seen as grandstanding - it works against the interests of the W3C, against organizations that use XML for content management and against web developers who increasingly see XHTML as a critical piece of pipeline architectures for moving data through their systems. Great Aunt Bernice frankly doesn't care - she uses a WIKI - and the vendors should realistically see an XML-aware HTML 5 as being an attractive bullet point for new products.
I'd personally like to see a legitimate argument FOR the non-adoption of XML notation ... I haven't to date, and frankly I think it would have to be an incredibly compelling argument to make up for all the negatives that it introduces.