Norm Walsh recently provided an update about XProc, a generalized XML pipeline language being worked on by the W3C. The idea behind XProc is simple enough: you create an XML document that provides “glue” or conditional bindings for different processes that can occur in an application. One “standard” project for such a language already exists - Ant - and I find it interesting that Ant has been slowly replacing the cryptic and awkward make syntax in an increasing number of applications, only a small portion of which are XML-based.
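To make this concrete, here is a minimal sketch of what such a pipeline might look like, using step names from the W3C XProc draft; the schema and stylesheet URIs are placeholders of my own rather than anything taken from the spec:

    <p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
      <!-- expand XInclude references in whatever document arrives on the source port -->
      <p:xinclude/>

      <!-- validate the expanded document (schema.xsd is a placeholder) -->
      <p:validate-with-xml-schema>
        <p:input port="schema">
          <p:document href="schema.xsd"/>
        </p:input>
      </p:validate-with-xml-schema>

      <!-- transform the validated document to HTML (to-html.xsl is a placeholder) -->
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="to-html.xsl"/>
        </p:input>
      </p:xslt>
    </p:pipeline>

The output of each step feeds the input of the next; the pipeline document as a whole simply declares that flow.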
There have been more than a few arguments about XML-based “programming languages” (both make and Ant are examples of what used to be called Job Control Languages, way back in the days of mainframes): that they are extremely verbose, that they put an undue requirement upon the use of XML (which, for some reason, procedural developers of both the C++ and Java stripe tend to embrace only very reluctantly), that XML, as the “new kid on the block,” is being used solely for novelty’s sake without any other benefit, and so forth.
The future of programming is distributed and asynchronous (and by extension compartmentalized and localized). A mashup is a comparatively simple concept, but in some respects it represents a huge jump forward compared to almost all previous forms of distributed programming (DCOM, CORBA, RMI, etc.). Provide a common language for abstraction, a common meta-language for encoding structure, and a common lexical and linking framework, make the processors for handling this meta-language ubiquitous, and you get distributed programming very nearly for free. Because you are using an abstraction mechanism for the data, it becomes the responsibility of the individual language or processor to provide the specific back-end functionality to properly interpret the form of the data (rather than the data itself) - something which has been accomplished with remarkable alacrity in the XML programming world.
As an aside, I think this is one of the key benefits that XML has over potential rival technologies such as JSON. JSON is topologically similar to SimpleXML, a notion that keeps popping up now and then but has always remained below the threshold of critical change. JSON requires that a certain level of data-type abstraction be specifically asserted within the data structures themselves - fundamentally, the notion that an element and an attribute are simply representations of the same thing. However, in working extensively with both, I’ve found that attributes - qualities that describe a given element, rather than substructures associated with it - do tend to occur naturally, and unscrambling them in JSON forces the introduction of artificial (and generally unstandardized) vocabulary labels.
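To illustrate with a trivial (and entirely invented) record: in XML, those qualities can simply ride along as attributes,

    <price currency="CAD" effective="2007-05-01">42.50</price>

while a JSON encoding of the same record has to invent labels such as “@currency” or “#text” (the conventions vary from library to library) to keep those qualities distinct from the element’s content:

    { "price": { "@currency": "CAD", "@effective": "2007-05-01", "#text": "42.50" } }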
Indeed, it may very well be that the rise of JSON is in fact simply another expression of the SimpleXML debate, except that it is occurring outside the XML orthodoxy and as such has neither suffered under the disapproval of those who have made investments in XML nor faced the scrutiny that SimpleXML has repeatedly faced over the last decade.
Nonetheless, the challenge of a distributed environment is that centralization attempts in general are far more difficult to coordinate. This has been especially true of the pure pull model (aka the Web client/server model), where the only recourse for confirming or acting further upon a transaction is an out-of-band process (typically e-mail). However, AJAX is rewriting the rules here, and one of the most immediate effects of this rewriting is that in-band processes (receiving asynchronous confirmation of a transaction, for instance, or receiving a bundle of information from a transaction necessary to perform additional processing) now become feasible. Any form of resource management application, from stock trading to hotel room scheduling to medical management systems, can now treat the browser not so much as an end-point but as simply another processing node in a (potentially cyclic) tree of processes.
Something needs to coordinate those processors, however. Pipelines are the logical mechanism to do that. By encoding them in XML, you can pass the process descriptors to the necessary processors without having to worry about what processing language those processors are using. Pipelines by themselves are fundamentally acyclic - they have definitive endpoints - but a good process-flow architect also realizes that by placing two pipelines together, what you end up creating is a circuit. If you have a pipeline language that can realistically handle asynchronous invocation over the short term (the lack of which is a fundamental flaw in Ant, as it is (I believe) a synchronous application), then ultimately the only synchronous points you need come when one pipeline hands off its results to another.
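As a rough sketch of what that hand-off might look like in XProc terms - the ex:enrich and ex:publish steps and the steps.xpl library here are hypothetical, not part of the draft - the primary output of one step simply becomes the default input binding of the next:

    <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                    xmlns:ex="http://example.org/steps"
                    version="1.0">
      <p:input port="source"/>
      <p:output port="result"/>

      <!-- steps.xpl is a hypothetical library declaring the ex:enrich
           and ex:publish pipelines -->
      <p:import href="steps.xpl"/>

      <!-- the result of ex:enrich flows implicitly into ex:publish; this
           hand-off is the only synchronous point between the two pipelines -->
      <ex:enrich/>
      <ex:publish/>
    </p:declare-step>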
I don’t find it at all accidental that pipeline architectures have recently gained interest (the W3C has been wrestling with this problem for about a year, and orchestration has long been the holy grail of companies such as Microsoft). AJAX programming is redefining the role of the world’s most common user interface - the web. Asynchronous methodologies on the client are now pushing for a significant re-evaluation of asynchronous methodologies on the server, with the attendant realization that existing solutions (including the SOAP/WSDL/UDDI stack) are frequently too complex for people to feel comfortable using, because they presuppose that it is the nodes, rather than the conduits, that are the critical pieces of the network.
Pipelines, on the other hand, are fundamentally RESTian in nature - they concentrate on the interactions of “molecular” conduits. While the characteristics of the individual “atomic” servers are important, without some formal overriding governor at the molecular level, more complex structures become ever harder to build and maintain, and the atoms themselves remain largely isolated.
I’m not sure which pipeline architecture will ultimately prove dominant, though if history is any indication, it is likely that for a while we may have several in play before any one of them becomes sufficiently compelling. I personally am intrigued by the pipeline architecture presented by the W3C, both because of the esteemed efforts of Mr. Walsh and because the design itself has a great deal of merit and integrity. I’d recommend checking it out at http://www.w3.org/TR/xproc/.
Kurt Cagle is an author, industry analyst and software developer living in Victoria, British Columbia, which is, like much of the Pacific Northwest, in significant danger of floating away.