I recently wrote a blog post about the directions that I saw with XML, and while it has proved to be fairly popular, it has also generated a fair number of comments that really need their own more detailed examination. One of these, and one that I’ve been planning to write about for a while anyway, has to do with my comments about XSLT 2.0 increasingly being used as a “router” language, replacing such applications as Microsoft’s BizTalk Server.

This is not a disparagement of BizTalk - it’s one of the Microsoft technologies that I have endorsed on a regular basis, because it solves one of the thornier issues involved in creating complex data systems: how do you handle the intermediation of data coming from different data sources? While I have some quibbles about the interface, I think BizTalk does its job admirably. It also served as a bridge technology for quite some time between the SQL and XML worlds, and it will continue to serve in that role for quite some time to come.

However, I also think that in the long run it is a bridge technology, and that as the world moves increasingly to the use of XML as the preferred data transport story, other technologies, such as XSLT 2.0, will likely end up making most of its functionality redundant and useful at best on the edge cases. Such statements presuppose a lot, of course, most notably that XML is in fact becoming the dominant form for expression of data content. Before digging into XSLT2, I’d like to address this issue head-on, as it ties into a number of other things I’ve been looking at of late.

SQL is very good technology, and a fairly remarkable standard. SQL is not going to go away, nor should it - SQL, and related standards such as LDAP or OLAP, are specifically defined to provide rapid access to relational data in a way that XML will never be able to do by itself, because the assumptions made with XML include the tacit assumption that you are trading efficiency for flexibility.

A system that is too flexible becomes unwieldy because it fails to constrain the problem domain sufficiently - I can make a program do just about anything at the beginning, but the moment that I start defining what that “anything” is and how I can accomplish it, I perforce also reduce the ability of that application to do other things as well.

This is the sculptor’s dilemma - a block of stone can in fact contain a near infinite number of things before the first chip is made, but by more clearly defining what the sculpture is you also cut down dramatically on what it is not. XML assumes a fairly broad definition of data structures and as such is nearly infinitely malleable, but once you start defining the data structure, this places sizable constraints upon what the structure isn’t.

This manifests itself in the fact that XML-heavy applications also require considerably more up-front design, even though they tend to be easier to develop and in the long run they are more maintainable than SQL-oriented ones are.

However, data transport is a somewhat different issue. Most contemporary databases now produce XML, either directly through specialized extensions or via an XQuery layer. Given that databases are also increasingly being sequestered behind data abstraction layers, this also means that from the standpoint of an external application such databases are simply another vector for supplying XML in a given format (and increasingly for consuming XML being sent to them).

SQL fails in one very notable way - it does not in fact clearly articulate the serialization of content. Database vendors used this to their advantage, wrapping access to such databases in their own internal APIs, and for the most part, even in cases such as MySQL, the primary serialization format is either an explicit API wrapper or a text output that is highly vendor dependent.

This has resulted in a rather remarkable services industry built almost exclusively around “translating” between SQL output (or input) and some formal presentation layer. I’ve personally built any number of such systems for clients, until I began to realize that such work basically keeps development locked into a three-tier database/middleware/client architecture that is remarkably inflexible in an age where most resources are increasingly networked rather than hierarchical.

It can of course be argued that XSLT itself is simply another example of a translation layer, and to be honest, if the language is used improperly, that statement is actually quite true.  Here’s that particular scenario. I write a PHP or ASP.NET (or whatever) script that pulls up an XML file from some resource (a file or web service), loads in a hard-wired XSLT document, shoves the XML in to be transformed, performs the action, and sends the result down the wire to the client.

Now, in this particular case you may manage to get some performance benefits in building the transformations via XSLT compared to doing it in script, but you haven’t really managed to take much advantage of the powers of XSLT, and the costs involved in loading and processing the initial XSLT are usually high compared to doing it in native script. You’re still talking about a transformational “box” that does only one thing.

However, XSLT has a rather paradigm-altering function called document(), as well as another interesting capability called parameters. The document function can work on static XML content, but it can also use HTTP GET requests (through query strings) to retrieve content from web services. Parameters can be set from the hosting language to determine these web services invocations, and additionally parameters can be calculated from within the XSLT and passed in the same manner.

However, there are several problems with this approach. For starters, creating such query strings in the first place and passing them in is a pain in the butt, because you have to use the rather cumbersome call-template syntax, wrap the results in a variable, then pass the variable into the document() function. There are few checks to handle error conditions, and once you create the output, you can’t necessarily use that output as the input to some other action, because the output is a result tree fragment rather than a node-set.

Thus, while it has certainly been possible to use XSLT in this fashion, you were all too often forced to rely upon inconsistently implemented extensions such as the node-set() function. Indeed, most of the really interesting things you can do with XSLT 1.0 come down to these self-same extensions, which again raised questions as to whether there really was that much benefit in using XSLT 1 in the first place.
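To make the XSLT 1.0 pattern described above concrete, here is a minimal sketch of it. The service URL, the parameter names and the item elements in the response are all hypothetical, and the exsl:node-set() call assumes a processor that supports the EXSLT common module.

```xml
<?xml version="1.0"?>
<!-- XSLT 1.0 sketch. The service URL, the parameter names and the
     <item> elements in the response are hypothetical. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Normally set by the hosting language (PHP, ASP.NET, ...). -->
  <xsl:param name="service" select="'http://example.org/lookup'"/>
  <xsl:param name="customer" select="'42'"/>

  <!-- A named template that assembles the query string. -->
  <xsl:template name="build-url">
    <xsl:param name="base"/>
    <xsl:param name="id"/>
    <xsl:value-of select="concat($base, '?id=', $id)"/>
  </xsl:template>

  <xsl:template match="/">
    <!-- The URL has to be captured in a variable via call-template ... -->
    <xsl:variable name="url">
      <xsl:call-template name="build-url">
        <xsl:with-param name="base" select="$service"/>
        <xsl:with-param name="id" select="$customer"/>
      </xsl:call-template>
    </xsl:variable>

    <!-- ... and only then handed to document(), which issues the GET. -->
    <xsl:variable name="filtered">
      <xsl:copy-of select="document($url)//item"/>
    </xsl:variable>

    <!-- $filtered is a result tree fragment, so $filtered/item is an
         error in XSLT 1.0. The node-set() extension converts it back
         into something a path expression can walk. -->
    <results count="{count(exsl:node-set($filtered)/item)}"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that nothing here checks whether the URL actually returned anything, and the last step only works where the extension function is available.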

However, much, if not most, of this concern evaporates with XSLT 2.0, which I see as significantly advancing the state of the art for XSLT transformations. While there are a number of features of XSLT2 that are major upgrades, with regard to the thesis of this article the following are of special importance:
  • xsl:function. This element makes it possible to create XSLT functions that can then be placed in special namespaces and invoked from within XPath expressions. This is a huge piece of functionality, as it makes it far easier to modularize XSLT functionality, to turn XSLT into a formal “programming language”. This is complemented by …
  • formal XPath extension mechanism. In other words, XSLT now has a formal (and consistent) means of invoking outside methods written in other languages (Java, PHP, .NET, XSLT, XQuery, whatever) from within an XSLT expression. Again, XSLT1 implementations often extended XSLT in this manner, but each did it in its own way … and few treated XSLT as being “just another” extension language.
  • unparsed-text() and unparsed-text-available(). The unparsed-text-available() function solves one of the biggest problems involved with working with the document function - dealing with situations where the URL is unable to retrieve content - it can essentially “pre-check” a URL to ensure that it is in fact capable of retrieving something. The unparsed-text() function on the other hand fills another hole - being able to load non-XML content into an XSLT transformation. This works on any content - I actually loaded in JPEG images at one point and output them as part of XML content, but this also readily handles any type of process that generates text which could be parsed via regexes to retrieve the relevant content (such as CSV files, or position-oriented data files). I need to experiment with this a bit, but it occurs to me that it may also be possible to retrieve header information from a resource, though this can also be handled by passing the header bundles directly into the transformations as parameters in the first place.

    Indeed, that point alone raises some interesting possibilities - SOAP enabled web services actually pass a fair amount of information in the headers, and this header content should be passable as a bundle to the transformation; this means that XSLT can in fact be used to process SOAP-based services … more on that point in a bit. This function also makes it possible to import HTML into an XML document without that HTML needing to be parsed or otherwise manipulated, which is VERY useful for user input.
  • sequences. One of the reasons why XSLT 2.0 took so long to get out the door came from the realization that the XPath model as it existed was too limited - it turns out that it is in fact not possible given the data-model to legally create an internal node-set() function. After considerable effort, what emerged was the decision to support general “sequences” of objects - lists, in other words - that could be either atomic data types or XML objects. This change had ramifications that worked their way through the entire model, significantly expanding what could be done while opening up the door to eliminate node-set() and related functions altogether. This in turn has enabled more sophisticated grouping, set operations (union, intersection, difference) and collapsing lists. It’s also enabled …
  • numeric iterations. You can now write expressions such as for $i in (1 to 10) return $i, which return increasing iterative values, reducing the need for recursive expressions dramatically and consequently simplifying the code base for any number of different operations.
  • regular expressions. XSLT 2.0 and XPath 2.0 both contain support for regular expressions, as well as a number of string functions for taking advantage of regexes. For instance, the tokenize() function can split a string into a sequence based upon a regular expression (or straight text), making it much easier to split apart lines and fields in CSV files, extract data from irregular phone number formats, perform actions if two words are within a given number of characters of one another and so forth. This also makes it generally possible to use XSLTs for general schema validation, and gives a considerable leg up in the generation of rich Schematron output.
  • result-document and output. The xsl:result-document element makes it possible to send content (and not necessarily just XML content) to a file or web service, independent of the final output mechanism used by the transformation itself. Simple examples of this include taking a single large XML document and splitting it up into a large number of smaller XML documents that can be saved to disk, creating a SOAP message that can then be sent asynchronously as a POSTed entity, or writing documents to URL-based REST services. The two limitations that result-document faces are that these are asynchronous POST events (if you send a SOAP message, the returning call will not occur in process to the transformation) and that you can control only a very limited number of HTTP headers (depending upon the implementation). In general, neither of these is a show stopper - such calls SHOULD be asynchronous, and if you are doing WebDAV type transactions with headers you’ll probably want to do that out of process anyway. (Certain SOAP transactions have a specific requirement for such header invocations, which I’ve argued against for some time, for precisely this reason.) It also may be possible to pass the server Response object in as a function registry … I’m blathering now, but will let you know the results of this later.

    It is also possible now to have multiple named xsl:output elements, so that the result-document’s output headers CAN be controlled to a certain extent. This combination of being able to generate multiple streams of content and controlling the serialization of that content becomes essential in working with XSLT 2.0 as a more generalized router entity, as I’ll discuss in greater detail below.
  • inline control keywords. XML is great for certain things, but there are times when the element-based structures for looping and conditionals (such as xsl:for-each and xsl:choose) were essentially overkill, forcing a lot of excess verbiage in places where simpler structures would have been more useful. XSLT 2 now supports a number of XQuery extensions (not the entire set, but a fair number) for doing things such as iterating with a for loop or performing various actions based upon conditional statements, directly within XPath. This again can reduce file sizes considerably, and generally makes for somewhat easier code to read.
  • character maps. Entities are a holdover from the days of SGML; as XML has become increasingly focused on data-centric rather than document-centric modalities, entities can be more of a pain than they are worth. Fortunately, with XSLT 2.0 you can now create character maps that let you map individual characters to some output string. This serves a number of purposes. Character maps replace the rather cumbersome (and often poorly used) disable-output-escaping to ensure that specific characters (such as the less than “<” symbol) stay preserved properly in output. This proves very useful for creating intermediate XML structures that can nonetheless be processed through other XSLT calls, and is even more useful for generating output files that resemble XML but are not quite identical (such as JSP pages, which might have inline <% %> elements), for example:

    <jsp:setProperty name="user" property="id" value="<%= "id" + idValue %>">
    </jsp:setProperty>

    Personally, I think this feature is actually a profoundly useful one, because it effectively opens up XSLT to the world of generating processing logic in most web server languages - PHP, JSP, ASP.NET, and so forth. Such customization is still a rarefied area - code generators require that you can effectively process information both at the higher level XSLT layer and the lower level server language layer, but it has obvious benefits compared to the often awkward process of trying to keep a large custom PHP or ASP.NET base running and maintainable.
  • tunnel parameters. Tunneling in XSLT 2.0 seems a fairly obscure feature until you use it, but after the first time, you’ll wonder how you ever managed to get along without it. Parameterization has always been at odds with the recursive nature of XSLT, and has often proven an impediment to modularization. In general, if you passed parameters down through a chain of templates, every called template would have to declare those parameters, even if the only reason was to in turn pass the parameters on to some other called template further down the recursive descent. However, in 2.0, you can now use xsl:with-param with the tunnel attribute set to yes. When this happens, only the template that actually needs the parameter value needs to declare the parameter - not any of the intervening templates. This process, known as tunneling, again reduces unnecessary verbiage and results in far cleaner code. There’s also a failsafe at play here - the invoked template specifically needs to indicate that such a parameter is declared as tunneling - otherwise, the parameter must be passed directly by the calling apply-templates invocation, and any tunneling parameters are ignored.
  • data-types. For die-hard XSLT programmers, data-types are something of a mixed blessing. You can specify that certain variables should be considered to be of a specific data-type (with the whole XSD simple type set supported), and the operations done on these will then reflect the data-type in question. Thus, if you declare a variable as being of type xs:integer, then trying to set it to a non-integer value will generate an error. Additionally, if your XSLT processor is schema-aware, then such types are automatically assigned into the infoset and all operations work on the presupposition that the operands have known types. This tends to work best for data-centric processing, and from experience can cause more than a little bit of a headache when trying to debug why a given operation refuses to work in more text-oriented systems, but the option exists nonetheless.
  • Stand-alone processing. It is also now possible to invoke an XSLT transformation without needing an additional XML file on which to operate. This isn’t that big of an issue - you could create a stub XML and transform on that, but it did by necessity place some limitations on how such transformations were created, and stand-alone processing actually works quite effectively in routing systems.
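Several of the features in the list above can be seen working together in one small stylesheet. The following is a sketch only: the my: namespace, the orders.csv file, its three-column layout and the output file names are hypothetical, and the comma split is deliberately naive (no quoted fields). It would be run in stand-alone mode, with "main" given to the processor as the initial template.

```xml
<?xml version="1.0"?>
<!-- XSLT 2.0 sketch. The my: namespace, orders.csv, its column layout
     and the output file names are all hypothetical. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.org/my"
    exclude-result-prefixes="xs my">

  <xsl:param name="csv-uri" as="xs:string" select="'orders.csv'"/>

  <!-- Named output definition, referenced by result-document below. -->
  <xsl:output name="order-file" method="xml" indent="yes"/>

  <!-- xsl:function: callable from any XPath expression as my:fields(). -->
  <xsl:function name="my:fields" as="xs:string*">
    <xsl:param name="line" as="xs:string"/>
    <xsl:sequence
        select="for $f in tokenize($line, ',') return normalize-space($f)"/>
  </xsl:function>

  <!-- Stand-alone entry point: no source document is required. -->
  <xsl:template name="main">
    <xsl:choose>
      <!-- Pre-check the resource before trying to load it. -->
      <xsl:when test="unparsed-text-available($csv-uri)">
        <xsl:variable name="lines" as="xs:string*"
            select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space(.)]"/>
        <!-- Numeric iteration: one pass per line, no recursion. -->
        <xsl:for-each select="1 to count($lines)">
          <xsl:variable name="n" as="xs:integer" select="."/>
          <xsl:variable name="f" as="xs:string*" select="my:fields($lines[$n])"/>
          <!-- Each line becomes its own secondary result document. -->
          <xsl:result-document href="order-{$n}.xml" format="order-file">
            <order id="{$f[1]}">
              <customer><xsl:value-of select="$f[2]"/></customer>
              <total><xsl:value-of select="$f[3]"/></total>
            </order>
          </xsl:result-document>
        </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
        <error>Could not read <xsl:value-of select="$csv-uri"/></error>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The same job in XSLT 1.0 would have needed a recursive template to split the text, an extension function to load it in the first place, and an extension element to write more than one file.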

The New Roles of XSLT 2.0

With all of these changes, XSLT 2.0 is able to assume a much more extensive “work-horse” mode than it has previously. Most of these modes have already been explored with older extended XSLT 1.0 processors, but because such implementations tended to differ in critical areas, developers and IT managers tended to shy away from them for all but very specialized applications.

XSLT 2.0 as Router

An XML message enters the system to be processed in some manner. One of the more fundamental distinctions in programming models on the web has to do with the question of where intent is located - where does the responsibility for indicating what should be done with a message reside? In REST mode the intent resides solely within the URL - the message itself contains the associated data, but doesn’t by itself contain the relevant processing intent. Typically these models are most strongly associated with publishing type systems. In RPC mode, on the other hand, the responsibility for processing resides primarily within the envelope (typically a SOAP message) which may also include parameters, all of which is intended to invoke a method in some other language such as Java or C#.

Now, suppose, for a moment, that you created an XSLT2 transformation and bound it to one or more external objects under appropriate namespaces. It is not, in general, possible to invoke an XSLT transformation that is created by another transformation in one pass, regardless of the version (nor should it be, for security reasons). However, it is certainly possible to create a dual pass system, the first of which constructs from the incoming message one (or, more significantly, more than one) transformations that invoke the appropriate external class method calls in stand-alone mode, which would then in turn either pass the results to the output stream or generate secondary streams that pass newly created XML to different places.

Why do this? Well, there are actually a number of benefits that derive from this approach. For starters, what you are passing initially is XML, to be transformed AS XML. This means that while you are doing this transformation you can also be applying tests (schematron comes to mind for this) that will determine whether in fact the incoming data is not only well formed but business valid, and that protects the system from potentially serious attacks - and if the schematron itself is also generating XML then you can also send messages back up the pipe (if such communication exists) to outline what exactly has gone wrong in a user friendly form. It also makes it easier to stop potentially expensive server operations from being invoked if such calls exceed some parameter (such as the rate of automated requests coming from a given client). Since you’re processing the information as XML, you’re not putting the integrity of your system at risk from insertion attacks.

The first process also makes it possible to create “batches” consisting of multiple jobs - once a given job (a command call, for instance, or some additional processing) is completed, that job is removed from the stack, and the XSLT can then choose to process the remainder of the job stack based upon some conditional expression emerging from the previous job’s processing. In other words, the XSLT at that point acts as a job control language, based not only upon the incoming data but also upon the results of processing that data.

In an AJAX oriented system, such invocations could (and generally should) be done asynchronously - the first XSLT passes the initially processed XML to a second asynchronous transformation using result-document, which would then be retrieved as a message from a set of queued messages. In this particular case, the effective routing could be done solely within the first XSLT, with little need to create multiple synchronous chains of transformations (though it’s likely that the resulting transformation would need to include a link to the message queue to be queried for responses, possibly with a transaction identifier - likely using the new XPath current-dateTime()).

Such a system is a routing system - you are using the XSLT as a router for XML messages to be sent to the appropriate internal services, where each of those services in turn exists either as a URL or as a named extension. That it also can serve as a validation system is not accidental - one of the powers of XML is that you can check for the validity of XML without the danger of instantiating the object in live form.
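A very small version of such a router might look like the sketch below. The message vocabulary (order and invoice elements), the queue URLs and the single validation rule are hypothetical, and whether result-document can actually write to an http: address depends on the processor and how it has been configured.

```xml
<?xml version="1.0"?>
<!-- XSLT 2.0 routing sketch. The message vocabulary, the queue URLs
     and the validation rule are hypothetical. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output name="message" method="xml" indent="no"/>

  <xsl:template match="/">
    <!-- A crude transaction identifier built from current-dateTime(). -->
    <xsl:variable name="txn" as="xs:string"
        select="concat('txn-', replace(string(current-dateTime()), '[^0-9]', ''))"/>
    <receipt transaction="{$txn}">
      <xsl:apply-templates select="*">
        <xsl:with-param name="txn" select="$txn" tunnel="yes"/>
      </xsl:apply-templates>
    </receipt>
  </xsl:template>

  <!-- Validation happens on the XML itself, before anything is invoked. -->
  <xsl:template match="order[not(@id castable as xs:integer)]" priority="2">
    <rejected reason="order id must be an integer"/>
  </xsl:template>

  <xsl:template match="order">
    <xsl:call-template name="forward">
      <xsl:with-param name="to" select="'http://example.org/queues/orders'"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="invoice">
    <xsl:call-template name="forward">
      <xsl:with-param name="to" select="'http://example.org/queues/invoices'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Anything unrecognized is reported back rather than routed. -->
  <xsl:template match="*">
    <rejected reason="unknown message type: {name()}"/>
  </xsl:template>

  <!-- Only this template declares the tunnel parameter; the templates
       in between never mention it. -->
  <xsl:template name="forward">
    <xsl:param name="to" as="xs:string"/>
    <xsl:param name="txn" as="xs:string" tunnel="yes"/>
    <xsl:result-document href="{$to}/{$txn}.xml" format="message">
      <xsl:copy-of select="."/>
    </xsl:result-document>
    <routed to="{$to}"/>
  </xsl:template>
</xsl:stylesheet>
```

The primary output is a small receipt that can go back up the pipe to the caller, while the message itself travels on a secondary stream. Adding a new destination is a matter of adding one more template rule.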

XSLT 2.0 as Code Generator and Code Parser

I’m one of those people that’s long been fascinated by self-generating code and code generators, which is probably one of the things that attracted me to XSLT in the first place. The ability to map from an XML document (typically some form of site map) to a set of one or more XHTML pages has of course long been one of the major uses for XSLT, and I do not see that changing significantly with XSLT2 (other than the fact that it will be much easier to do so than with XSLT1). However, with the increasing prominence of other XML-based application frameworks, I see XSLT2’s role in this space also increasing fairly dramatically.

Beyond the obvious “XSLT generates XML” side of the argument is a considerably more subtle one: these frameworks are for the most part fairly complex, necessitating sophisticated interrelated components and document object models that make XHTML look absurdly simple in comparison. You can in general create fairly basic applications in technologies such as XUL, XAML, FLEX, Boxely, and so forth, but once you reach a certain level of complexity, such applications become both difficult to work with and even more difficult to maintain. I see XSLT2 as a way of helping to tame that complexity by making it possible to build the intermediate components, initially statically (as XSLT2 processors are likely to only slowly make their way into such toolkits) and later dynamically.

I also see XSLT2 beginning to make its way into client side development through the use of AJAX. Increasing availability of high bandwidth makes components that perform the requisite processing of client-side data through an XSLT2 transformation on the server much more feasible. Indeed, what I see here is a set of staged transactions - the client XSLT1 processor (now becoming prevalent in most browsers) would make the requests for content and generate the skeletal results, while the XSLT2 processor sitting on the server would do the heavy lifting and integration of server-based data resources. This solves two problems - minimizing the amount of unnecessary (and potentially insecure) data transport to the client and getting past one of the bigger restrictions facing the client XMLHttpRequest objects - the challenge of integrating web services that aren’t directly in the client/server communication path - as the server-side XSLT does not face the same sandbox restrictions.

However, it’s also worth considering the flipside to this: the use of XSLT2 as a parser of both XML and non-XML content. A SOAP message ultimately requires a certain degree of parsing, for instance; the conversion of an XML object to some external action involves interpreting the facets of that XML object and translating that into some immediate action. However, with the addition of regular expressions and XSD type support, it now becomes possible for an XSLT2 transformer to read something like a Java or C# header file, parse it, and generate XML (or other output) from it. It becomes possible to parse non-conformant HTML files and convert them into valid XHTML ones. Indeed, this facet of XSLT2 makes it fairly attractive as a generalized parser in the spirit of m4 or similar resources - you can create very sophisticated replacement macros (including macros that are able to incorporate external resources), can handle conditional processing, and can otherwise perform just about all of the actions that you would expect of a general parser system.
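As a rough illustration of the parsing side, the sketch below pulls method signatures out of a Java source file and emits them as XML. The Account.java file name is an assumption, and the signature pattern is deliberately simplified: it misses static methods, constructors, generics and modifiers such as final, so treat it as a starting point rather than a Java parser.

```xml
<?xml version="1.0"?>
<!-- XSLT 2.0 sketch. The Account.java file name and the simplified
     signature pattern are assumptions for illustration only. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="source-uri" as="xs:string" select="'Account.java'"/>

  <!-- Stand-alone entry point: run with "main" as the initial template. -->
  <xsl:template name="main">
    <class source="{$source-uri}">
      <!-- Groups: 1 = visibility, 2 = return type, 3 = name, 4 = arguments.
           The regex attribute is an attribute value template, so any
           curly braces in a pattern would need to be doubled. -->
      <xsl:analyze-string select="unparsed-text($source-uri)"
          regex="(public|protected|private)\s+([\w.\[\]]+)\s+(\w+)\s*\(([^)]*)\)">
        <xsl:matching-substring>
          <method name="{regex-group(3)}" returns="{regex-group(2)}"
                  visibility="{regex-group(1)}">
            <!-- Split the argument list on commas, then each argument
                 on whitespace into a type and a name. -->
            <xsl:for-each
                select="tokenize(regex-group(4), '\s*,\s*')[normalize-space(.)]">
              <xsl:variable name="parts" as="xs:string*"
                  select="tokenize(normalize-space(.), '\s+')"/>
              <param type="{$parts[1]}" name="{$parts[2]}"/>
            </xsl:for-each>
          </method>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </class>
  </xsl:template>
</xsl:stylesheet>
```

Text that does not match the pattern is simply dropped here, since there is no xsl:non-matching-substring branch.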

The question as to what the parsing is parsing “to” should be addressed as well. One of the more challenging aspects of any project management system is the generation of API documentation. I can readily envision an XSLT2 transformation on a Java, C# or even JavaScript file that would parse out the structure and build an intermediate XML form that could then be viewed and edited by an XForms system. This not only makes it much easier to document legacy code (creating a usable, browsable interface in the process) but also makes it easier to actually design certain types of code. I’ll try to address this in greater detail in a subsequent column.

XSLT 2.0, XQuery and XForms

I would of course be remiss if I didn’t touch on these topics. Over the years I’ve written a great deal about XQuery, and will have to admit that I was not originally that taken with it and saw it as being a somewhat awkward “replacement” for XSLT. One aspect that I had thought about was the idea of using XQuery to retrieve content that could be transformed by XSLT into the appropriate format, but until I saw the eXist database I had assumed that these would occur in separate processes. With eXist, however, you can make an XQuery call to the transform:transform() extensions to invoke a transformation on an XML node. With a quick download and a little fiddling with some of the configuration files, you can switch from the default Xalan transformer to Saxon 8.9, enabling XSLT2, XPath2 and XQuery all in the same system … and you can turn your XQueries directly into web services.

This makes for an incredible combination, in part because you never leave the XML context. You don’t have to spend a large amount of time writing different configurations of transformer objects, XML Document resources, pipes, parameters or the like - you basically end up working with XML itself pretty much through the entire process. The combination of this with the ability to work with the various server objects (request, response, session, etc.) essentially gives you the entire application context in a single (compilable) XQuery.

This becomes especially important when working with XForms. I find it increasingly difficult NOT to work with XForms, to be honest, even given some of the complexities involved in different implementations. With XForms, you can build the data model XML on the client side, send it up to an XQuery that will validate and process it, then this object can in turn be passed off to a transformation to generate either another XForms instance, an XHTML “report”, or an SVG chart of some sort. XSLT2 works well in building such “input templates”, again giving you fine-grained conditional control and establishment of interface capabilities. I think that the “X” model - XQuery + XSLT2 + XHTML + XForms - will likely prove a potent one in the future, and one that’s already gaining the attention of various industries, especially in the medical, insurance, government and education sectors.

The Future of XSLT …

Predicting the future of any technology, especially one as esoteric as XSLT, is an exercise fraught with chances to make you look like a fool. XSLT 2.0 is easier to learn, in general, than its predecessor, is considerably more powerful, makes most of the right moves with regard to extensibility, and already has some first class implementations in place. Microsoft recently announced that they will be producing an XSLT 2.0 processor, and I wouldn’t be surprised if other XSLT implementation owners are at least evaluating the option. It does what a good second version is supposed to do, in that it solves most of the problems that the first version had without introducing a whole raft of new ones.

Overall, I see it as becoming far more heavily used within the next couple of years as implementations proliferate, especially when you consider that XSLT 1.0 implementations now exist for very nearly every platform currently in use today, making it a remarkably successful “cross-platform” solution. As someone who has wrestled with run-away recursive stacks, clunky called template invocations and implementation headaches, it couldn’t come soon enough.

Kurt Cagle is an author and web technologist specializing in XML-based technologies. He writes understandingXML.com and is the webmaster of XForms.org. He lives in Victoria, British Columbia with his wife and two daughters, and is coming to terms (barely) with having a teenager in the house.