ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


Developing, Applying and Optimizing XSLT with Java Servlets

by Eric M. Burke
12/15/2000

This article explains the basic patterns and programming techniques commonly used when XSL Transformations (XSLT), Java Servlets, and XML are combined to create sophisticated web applications. The types of applications that benefit from this approach include

It is hard to imagine a situation where XML and XSLT would not work, often more elegantly than JSP or pure servlet approaches.

XSLT basics

XSL stands for Extensible Stylesheet Language, a two-part specification from the World Wide Web Consortium (W3C). The first part, XSL Formatting Objects, which is not covered by this article, is an XML language for specifying the formatting of documents, such as fonts, colors, and alignments. At the time of this writing, XSL Formatting Objects is a Working Draft and is subject to change. Current web browsers do not support XSL Formatting Objects, so this is not a viable technology for the foreseeable future.

The second part of XSL is XSLT, which was designed to transform well-formed XML documents into XSL Formatting Objects documents. XSLT is a W3C Recommendation, which is equivalent to saying that it is a standard. Although XSLT was designed to support XSL Formatting Objects, it works well as a general-purpose XML transformation language. When people talk about XSL, they are usually referring to XSLT, which is much more widely used today.

The following are the basic elements of XSLT.


Figure 1: XSLT Transformation Process.

As Figure 1 illustrates, the XML input is transformed into something called a result tree. The result tree could be another XML file, an HTML web page, or even a plain text file. Nothing binds the XML input to the XSLT stylesheet, so there is a very clear separation between data and formatting. This is a boon for servlet programmers, because you can easily target multiple incompatible browsers by simply supplying two different stylesheets. You could also target Wireless Markup Language (WML), again via another stylesheet. This is much more attractive than traditional approaches which require changes to programming logic to support multiple targets.
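The idea of one XML input with several stylesheets can be sketched in a few lines of Java. This sketch assumes the standard JAXP API (javax.xml.transform) that ships with current JDKs, not the Xalan-specific API used later in this article. The two inline stylesheets and the class name MultiTargetDemo are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class MultiTargetDemo {
    // The same XML data feeds every target.
    static final String XML =
        "<creditInfo><name>John Q. Public</name><type>Visa</type></creditInfo>";

    // Hypothetical stylesheet for an HTML browser.
    static final String HTML_XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/creditInfo'>"
      + "<p><b><xsl:value-of select='name'/></b></p>"
      + "</xsl:template></xsl:stylesheet>";

    // Hypothetical stylesheet for a plain text client.
    static final String TEXT_XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/creditInfo'>"
      + "<xsl:value-of select='name'/> (<xsl:value-of select='type'/>)"
      + "</xsl:template></xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result.
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException ex) {
            throw new IllegalStateException("Transformation failed", ex);
        }
    }

    public static void main(String[] args) {
        // Only the stylesheet changes between targets; the data does not.
        System.out.println(transform(XML, HTML_XSLT));
        System.out.println(transform(XML, TEXT_XSLT));
    }
}
```

Supporting another client type, such as WML, would then mean adding one more stylesheet rather than changing the Java code.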

Example 1: Basic XSLT Stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="creditInfo"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="creditInfo">
    Name: <xsl:value-of select="name"/>
    <br/>
    Type: <xsl:value-of select="type"/>
    <br/>
    Number: <xsl:value-of select="number"/>
    <br/>
    Expires: <xsl:value-of select="expires"/>
  </xsl:template>
</xsl:stylesheet>

The stylesheet shown in Example 1 will be used later in this article to transform an XML document using a servlet. The root element of an XSLT stylesheet specifies the version and namespace. The only version available at this time is 1.0, and the namespace sets up the xsl: prefix for all XSLT elements. The next element specifies that the output document is HTML, although this line is optional in this case.


The remainder of the stylesheet consists of two <xsl:template ...> elements. The job of a template is to match patterns in the XML input. The first template typically matches /, which represents the XML document itself. The content of the template is then copied to the result tree, except for <xsl:apply-templates ...>, which is an instruction to the processor rather than literal output.

The apply-templates element causes additional pattern matching to occur in the source document, this time relative to the node that the current template matched. Since XML documents form a tree data structure, XSLT processing is recursive: <xsl:template> matches a pattern, and <xsl:apply-templates> then searches further down the tree for more matches. In our example above, the root element of the XML is <creditInfo>, which contains <name>, <type>, <number>, and <expires> elements.

Of course, a complete explanation of XSLT is well beyond the scope of this article, but this should give you a flavor of what stylesheets generally look like (see the side bar for links to more detailed information). You may be wondering why the HTML <br/> tag has the / character in it. This is required because the XSLT stylesheet must be well-formed XML. Omitting the / character would cause a parsing error. Since the output method is set to HTML, Xalan removes the extra character, formatting <br> as normal HTML.
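The effect of the output method on <br/> can be checked with a short test. This is a minimal sketch, assuming the JAXP transformer built into current JDKs rather than the Xalan release discussed in this article. The class name BrOutputDemo and the one-template stylesheet are made up for the demonstration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class BrOutputDemo {
    // Runs the same template with the requested output method
    // ("html" or "xml") and returns the serialized result.
    static String render(String method) {
        String xslt =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='" + method + "' omit-xml-declaration='yes'/>"
          // <br/> must be written as well-formed XML inside the stylesheet
          + "<xsl:template match='/'><p>Name<br/>Type</p></xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<creditInfo/>")),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException ex) {
            throw new IllegalStateException("Transformation failed", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("html: " + render("html"));
        System.out.println("xml:  " + render("xml"));
    }
}
```

With the html method the serializer writes the empty element in HTML style, while the xml method keeps the self-closing form.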

Problems with typical servlet approaches

Separation of data, presentation, and programming logic has always been an issue for servlet developers. The first generation of servlets were deployed as replacements for CGI programs, and typically contained code that looked like this:

Example 2: Tedious HTML Generation Code

PrintWriter pw = response.getWriter();
pw.println("<html>");
pw.println("<head><title>Home Page</title></head>");
pw.println("<body>");
pw.println("<h1>Welcome to our web site</h1>");
// ...remaining tedious code omitted!

This approach is tedious, error prone, and hard to maintain. It completely locks out non-programmers from web authoring, and it requires significant effort to change the look of a web site generated this way. Although you generally will write the code in modular, reusable fragments of HTML, the approach still requires source code changes for any new requirements.

JavaServer Pages (JSP) provides a solution to this problem. Unfortunately, JSP only reverses the problem. Instead of embedding HTML code into servlets, you typically end up embedding Java code into your HTML pages. JSP custom tags do allow you to remove all Java code in theory, but this is not without the difficulty of first creating the tags or else finding an existing tag library that meets your needs. The following example shows how JSP custom tags are used within a JSP page.

Example 3: JSP Using Custom Tags

<%@ taglib uri="/my_taglib" prefix="abc" %>
<html>
<head>
<title>JSP Tag Library Demonstration</title>
</head>
<body>
  <abc:standardHeader/>
  <abc:companyLogo/>

  <h1>Recent Announcements</h1>
  <abc:announcements filter="recent"/>

  <h1>Job Openings</h1>
  <abc:jobOpenings department="hr"/>
  <abc:standardFooter/>
</body>
</html>

Even with custom tags, it is virtually impossible to remove all Java code in a pure JSP solution. Specifically, the pure JSP approach breaks down when complex HTML forms are submitted and the JSP has to perform validation. For this reason, most modern web architectures stress a hybrid solution, in which servlets are responsible for intercepting requests and validating form data, and JSP is used for sending the response back to the browser.


Figure 2: Hybrid servlet and JSP approach.

As Figure 2 illustrates, an interface called RequestDispatcher is responsible for coordinating between the servlet and JSP. This interface is part of the javax.servlet package; it can either forward the request as shown above or include the output of another web resource in the current page.

This approach is sometimes referred to as "Model 2" and is very easy to implement. Its primary limitations manifest themselves when multiple client browser types must be supported or when Wireless Markup Language clients are desired. Additionally, providing a raw XML data feed to a non-browser client is increasingly popular, and JSP does little to address that need.

The XSLT + servlet approach

The design process for an XSLT-backed web site can be driven from the HTML or the database. When your requirements dictate a certain HTML page layout, the first step is to create HTML prototypes and then analyze which pieces of the HTML are dynamic and which are static. The static content will become part of one or more XSLT stylesheets, while the dynamic content must come from an XML data source. For example, an application that prompts the user to enter his or her credit card information could use the following XML document.

Example 4: XML Document

<?xml version="1.0"?>
<creditInfo>
  <name>John Q. Public</name>
  <type>Visa</type>
  <number>111-222-333</number>
  <expires>05/2000</expires>
</creditInfo>

The web page will obviously contain a lot more information such as text fields, graphics, captions for each field, and tables for layout. Since that presentation data is the same for every user, however, it can easily be placed in a stylesheet. The stylesheet shown in Example 1 was written for this document.

Related:

Style-free XSLT Style Sheets - Building web sites with XSLT sometimes raises architectural issues. This article presents a pattern for maintaining a clear separation between style, logic, and content in XSLT-produced websites. [From XML.com]

If your design is dictated by an existing database, then instead of creating HTML prototypes, you generally start with the XML specification. In either approach, you will probably find yourself iterating through several potential schemas before finalizing your XML. One thing that can easily be overlooked is that XML must sometimes contain items that are not pure data. For instance, if your web site contains a navigation bar along the top of each page, then the current page will typically be highlighted somehow. It may simulate a folder tab, where the current page is the active tab. In this situation, the XML must specify which tab is current, although that data would certainly not reside in your database. The ultimate solution is to extract "pure" data from the database and add on any GUI specific items as you are generating the XML.

Once the XML has been clearly defined, perhaps with an associated Document Type Definition (DTD) or schema for validation, the XSLT stylesheets that create the HTML can be written. This is another advantage of this architecture: the database does not need to be present in order to begin working on this task. The XSLT can be developed completely independently of any servlet or database code by using static XML files. An XSLT processor can be invoked from the command line for testing, or integrated tools like XML Spy can be used to speed up the process.

While some of your team is developing the XSLT stylesheets, work can begin on two other fronts. First, a basic servlet framework needs to be implemented. Although you may want to look into existing XSLT frameworks, the model is so simple that it works well enough to write your own servlet code.

Second, someone else can be writing JDOM code to generate the XML dynamically. This may involve creating some sort of abstraction to a relational database, or perhaps some interfaces to EJB components in a multi-tier application. In either case, the first step is to write Java classes that know how to convert a data object, such as CreditInfo, to the XML representation you defined earlier. In the example shown below, the CreditInfo class uses JDOM to convert itself to the <creditInfo> document shown above. In case you are wondering, JDOM is an open source Java API for dealing with XML. It is available at jdom.org.

Example 5: XML Generation Using JDOM

import java.io.*;
import org.jdom.*;
import org.jdom.output.*;

/**
 * Example of an object that knows how to represent itself as
 * XML using JDOM.
 *
 * @author Eric M. Burke
 */
public class CreditInfo implements Serializable {
    private String name;
    private String type;
    private String number;
    private String expires;

    // transient fields are not serialized.  This prevents the potential
    // overhead of sending too much data between an EJB and the web tier
    private transient Document doc = null;
    private transient Element elem = null;

    /**
     * Construct a new data object.
     */
    public CreditInfo(String name, String type, String number,
            String expires) {
        this.name = name;
        this.type = type;
        this.number = number;
        this.expires = expires;
    }

    /**
     * @return the contents of this object as an XML document.
     */
    public Document getDocument() {
        if (this.doc == null) {
            this.doc = new Document(getElement());
        }
        return this.doc;
    }

    /**
     * This method makes it possible to easily embed the output from
     * this data object into some other larger XML document.
     *
     * @return the contents of this object as an Element, which is just
     * the root element without the XML declaration.
     */
    public Element getElement() {
        if (this.elem == null) {
            this.elem = new Element("creditInfo");
            this.elem.addContent(new Element("name").setText(this.name))
                .addContent(new Element("type").setText(this.type))
                .addContent(new Element("number").setText(this.number))
                .addContent(new Element("expires").setText(this.expires));
        }
        return this.elem;
    }

    /**
     * A simple test program.
     */
    public static void main(String[] args) throws IOException {
        // create an object
        CreditInfo ci = new CreditInfo("John Q. Public",
                "Visa", "111-222-333", "05/2000");

        // convert to XML, then format with two space indent
        Document doc = ci.getDocument();
        new XMLOutputter("  ", true).output(doc, System.out);
    }
}

You may have heard of Document Object Model (DOM), the standard XML object model from the W3C. If you have worked with DOM, you will immediately see from the example above that JDOM is significantly easier for Java programmers to work with. This is actually the intent of JDOM. DOM is specified using CORBA IDL, so it can be mapped to many different languages. Unfortunately, the Java mapping does not take advantage of Java's unique language capabilities, resulting in a lot more work for programmers.
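For comparison, here is a sketch of the same <creditInfo> document built with the W3C DOM API. It assumes the JAXP DocumentBuilderFactory bundled with current JDKs as the way to obtain an empty DOM document. The class name CreditInfoDom and its helper method are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CreditInfoDom {
    // Builds the <creditInfo> document using only W3C DOM calls.
    static Document build(String name, String type, String number,
            String expires) {
        Document doc;
        try {
            // DOM documents come from a factory, not a constructor
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException("No DOM implementation", ex);
        }
        Element root = doc.createElement("creditInfo");
        doc.appendChild(root);
        appendChild(doc, root, "name", name);
        appendChild(doc, root, "type", type);
        appendChild(doc, root, "number", number);
        appendChild(doc, root, "expires", expires);
        return doc;
    }

    // Every element and every text node must be created through the
    // owning document and then attached in a separate step.
    private static void appendChild(Document doc, Element parent,
            String tag, String text) {
        Element child = doc.createElement(tag);
        child.appendChild(doc.createTextNode(text));
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        Document doc = build("John Q. Public", "Visa",
                "111-222-333", "05/2000");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Each child takes three DOM calls (create the element, create the text node, attach both), where JDOM needs a single chained expression.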

The missing link at this point is the servlet framework, which has the following design.


Figure 3: Servlet and XSLT architecture.

This may seem overwhelming at first glance, but it's about everything that is needed for a basic XSLT driven web site. The entire design is very modular, so different programmers can work on different pieces in parallel.

First of all, a single servlet is used rather than many. This means that you only have one point of entry into the application, making security, logging, and deployment easier. Since a single servlet is responsible for the entire application, classes called RequestHandler are used for validation logic. All the servlet does is analyze the request to figure out which RequestHandler subclass to invoke. This keeps the servlet small, even if your application has hundreds of pages.

There are two basic approaches to locating request handlers. You can either add an additional parameter to the HTTP request or add extra path information. For example, this URL could invoke the SubmitCustomer request handler:

http://hostname/webapp/xsltservlet?requestHandler=SubmitCustomer

With extra path information, the URL looks a little cleaner:

http://hostname/webapp/xsltservlet/SubmitCustomer

In either approach, your servlet will extract the text "SubmitCustomer" and convert that into a class name for the request handler. This can be accomplished via reflection or via a lookup table. In either case, once the object is located, it is then responsible for the remainder of this request.
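The lookup table variant might look like the following sketch. To stay runnable without a servlet container it works on the plain string that HttpServletRequest.getPathInfo() would return. The RequestHandler interface shown here, its handle method, and the HandlerRegistry class are assumptions made for illustration, not the design behind Figure 3.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical handler contract; a real one would receive the
// servlet request and response objects.
interface RequestHandler {
    String handle(Map<String, String> params);
}

final class HandlerRegistry {
    private final Map<String, RequestHandler> handlers = new HashMap<>();

    void register(String name, RequestHandler handler) {
        handlers.put(name, handler);
    }

    // Extracts "SubmitCustomer" from extra path info such as
    // "/SubmitCustomer" or "/SubmitCustomer/more".
    static String handlerName(String pathInfo) {
        if (pathInfo == null) {
            return "";
        }
        String trimmed = pathInfo.startsWith("/")
                ? pathInfo.substring(1) : pathInfo;
        int slash = trimmed.indexOf('/');
        return slash < 0 ? trimmed : trimmed.substring(0, slash);
    }

    // Only names registered in advance can be resolved.
    RequestHandler lookup(String pathInfo) {
        RequestHandler handler = handlers.get(handlerName(pathInfo));
        if (handler == null) {
            throw new IllegalArgumentException(
                    "No request handler for " + pathInfo);
        }
        return handler;
    }

    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        registry.register("SubmitCustomer", params -> "customer submitted");
        RequestHandler handler = registry.lookup("/SubmitCustomer");
        System.out.println(
                handler.handle(Collections.<String, String>emptyMap()));
    }
}
```

A table has one advantage over reflection: the client can only reach handlers that were registered explicitly, whereas a class name taken straight from the URL would need to be validated before it is loaded.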

The RequestHandler object may then execute methods on EJB components that return data objects back to the web tier. These data objects are converted to XML, which is then passed on to Apache's Xalan or some other XSLT processor. The output from the XSLT processor is finally delivered to the client browser.

Optimization techniques

The primary disadvantage of the XSLT approach may be runtime performance. Every request to the servlet requires an XSLT transformation. For most applications, performance is fast enough; a few optimization techniques can improve the situation where maximum throughput is required. First of all, the XSLT stylesheets are just static XML files. These can be loaded into memory once and reused over and over again. This substantially improves performance because the XML only has to be parsed once.
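In the standard JAXP API the parse-once idea maps onto the Templates interface, which represents a compiled stylesheet that can be shared between threads. The sketch below assumes JAXP as shipped with current JDKs, not the Xalan API used in Example 7. The class name StylesheetCache is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class StylesheetCache {
    // Compiled once; Templates instances are safe to share across threads.
    private final Templates templates;

    StylesheetCache(String xslt) {
        try {
            this.templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException ex) {
            throw new IllegalArgumentException("Invalid stylesheet", ex);
        }
    }

    // Called once per request. A Transformer is cheap to create from
    // Templates but must not be shared between concurrent requests.
    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException ex) {
            throw new IllegalStateException("Transformation failed", ex);
        }
    }

    public static void main(String[] args) {
        StylesheetCache cache = new StylesheetCache(
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:value-of select='creditInfo/name'/>"
          + "</xsl:template></xsl:stylesheet>");
        // The stylesheet is parsed once and reused for both inputs.
        System.out.println(cache.apply(
            "<creditInfo><name>John Q. Public</name></creditInfo>"));
        System.out.println(cache.apply(
            "<creditInfo><name>Jane Doe</name></creditInfo>"));
    }
}
```

In a servlet, an object like this would typically be created in init() and used from doGet().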

The second major improvement is to use the Document Object Model (DOM) for your data objects, instead of converting those to XML text files first. For example, the following code takes the JDOM output from the CreditInfo class and converts it to DOM, without first writing it to a text XML file.

Example 6: Converting JDOM to DOM

DOMOutputter domOutputter = new DOMOutputter();
org.w3c.dom.Document domDoc = domOutputter.output(jdomDoc);

The domDoc object can then be passed to Xalan directly, which should be much faster than parsing in the raw XML text from a file or input stream. The following example is a simple test servlet that creates some XML using JDOM, then applies an XSLT stylesheet. The stylesheet is parsed once and then cached in memory, and the JDOM is converted into a DOM tree that is passed to Xalan directly. Comments in the code explain each step.

Example 7: A Servlet Example

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import org.apache.xalan.xslt.*;
import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;

public class JDomServlet extends HttpServlet {
    // reuse the same processor over and over
    private XSLTProcessor processor = XSLTProcessorFactory.getProcessor(
            new org.apache.xalan.xpath.xdom.XercesLiaison());

    // initialize the Servlet.  This code is executed once.
    public void init(ServletConfig config) throws ServletException {
        super.init(config);

        ServletContext context = config.getServletContext();

        try {
            // read the XSLT stylesheet and cache it for future reuse
            InputStream xsltStream = context.getResourceAsStream(
                    "/WEB-INF/viewCreditInfo.xslt");
            StylesheetRoot parsedStylesheet = processor.processStylesheet(
                    new XSLTInputSource(xsltStream));
            processor.setStylesheet(parsedStylesheet);
        } catch (Exception ex) {
            throw new UnavailableException(ex.toString());
        }
    }

    // handle a single request from the client
    public void doGet(HttpServletRequest request,
            HttpServletResponse response) throws IOException,
            ServletException {
        try {
            response.setContentType("text/html");

            // in a real app, the CreditInfo object would be retrieved
            // from an EJB component
            CreditInfo ci = new CreditInfo("John Q. Public",
                    "Visa", "111-222-333", "05/2000");

            // convert the JDOM into DOM
            Document jdomDoc = ci.getDocument();
            org.w3c.dom.Document domDoc = 
                    new DOMOutputter().output(jdomDoc);
            
            // transform the XML into HTML
            processor.process(new XSLTInputSource(domDoc),
                              null,  // use pre-compiled stylesheet
                              new XSLTResultTarget(response.getWriter()));
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
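The same two optimizations can also be expressed with the vendor-neutral JAXP API, which may be useful if the Xalan-specific classes above are not available. This is a sketch under that assumption: it builds the DOM tree in memory with the JDK's DOM implementation instead of JDOM, and the class name DomTransformDemo and its inline stylesheet are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DomTransformDemo {
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/creditInfo'>"
      + "Name: <xsl:value-of select='name'/>"
      + "</xsl:template></xsl:stylesheet>";

    // Optimization 1: the stylesheet is compiled a single time.
    private static final Templates TEMPLATES = compile(XSLT);

    private static Templates compile(String xslt) {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (Exception ex) {
            throw new IllegalStateException("Stylesheet failed to compile", ex);
        }
    }

    // Optimization 2: the data is handed over as a DOM tree, so no XML
    // text is written out and parsed back in.
    static String render(String name) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            // Namespace-aware creation with a null namespace gives each
            // node a local name, which XSLT pattern matching relies on.
            Element root = dom.createElementNS(null, "creditInfo");
            Element nameElem = dom.createElementNS(null, "name");
            nameElem.setTextContent(name);
            root.appendChild(nameElem);
            dom.appendChild(root);

            StringWriter out = new StringWriter();
            TEMPLATES.newTransformer().transform(
                    new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Transformation failed", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("John Q. Public"));
    }
}
```

In a servlet, the StreamResult would wrap response.getWriter() instead of a StringWriter.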

Summary

Related information from XML.com

What is XSLT?

Transforming XML

Style-free XSLT Style Sheets

The XSLT approach to web development has clear advantages over traditional approaches. It does not lock you in to any vendor-specific APIs or frameworks. Perhaps the biggest roadblock to widespread acceptance of XSLT is programmer resistance. Many people have the impression that XSLT is simply too difficult to use, although much of this opinion is probably based on lack of experience rather than the complexity of XSLT. The main reason for the perceived complexity of XSLT is its syntax, which is a direct result of XML. Trying to express a programming language as XML is challenging and can lead to somewhat quirky syntax.

Beyond training and acceptance issues, an XML/Java/XSLT approach has the distinct advantage that its constituent parts are clearly segregated. From a project management perspective, this allows every team member to work in parallel, rather than overloading the servlet guru while the remainder of the team struggles to contribute.

Eric M. Burke is an O'Reilly author and a principal software engineer with Object Computing, Inc. in St. Louis, MO.



Copyright © 2009 O'Reilly Media, Inc.