Java and XML: Interested Parties Apply Here

by Brett McLaughlin
06/01/2000

Java and XML: The pairing of the two is the developer's holy grail in the 21st century. Java is about as attractive as a programming language gets, and XML offers an equally attractive solution for data representation. While Java and XML are certainly useful solutions in their own right, the pairing of the two has application developers foaming at the mouth and drooling on their keyboards. However, these developers are often found rubbing their temples in frustration when it comes to putting that coupling to work.

In this article, we look at why XML has not been as easy to use from Java as many expected, and how recent offerings in the Java and XML space are changing things, making XML usage from Java available to all who are interested.

eXpert Markup Language?

Excited and enamored with XML, Java developers rushed to the World Wide Web Consortium's website to read the XML specification, their Java editor open, their fingers ready to begin typing. What so many of these developers found, though, was a flurry of complicated phrasings and shadowy concepts. XML wasn't quite so simple as expected. For example, the specification contained definitions that looked like this.

Certainly not what the average Java developer is used to looking at! So the developers waited; they bided their time, hoping that with the flurry of additional XML vocabularies and specifications being developed (XSL, XML namespaces, XPath, etc.), the markup language would become accessible. They waited ... for an API.

Evolution of an API

As with so many new technologies, a dedicated few made XML available to the many. In other words, developers from the Java, C, Perl, and assorted other programming worlds plowed through the specification and developed application programming interfaces (APIs) to help their fellow programmers (the ones who wanted to sleep every once in a while) use XML. The fruits of their labor were the first two APIs for the Java platform (as well as C and some other languages): SAX and DOM.

SAX

The Simple API for XML (SAX) was developed in the XML-dev mailing list, a sort of lion's den of XML gurus. It provides a sequential look at an XML document, and defines a number of events that occur in the parsing lifecycle of a document. When elements start and end, when character data is encountered, when a DTD defines an entity, and on a host of other occasions, the SAX-compliant parser invokes a callback method. The developer has the ability to define behavior for these callbacks, reacting to the events, and thereby allowing an application to utilize the XML data as it is reported. Developers downloaded the API en masse from David Megginson's Web site, and started coding. Suddenly, XML was being put to work!
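The callback style described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the SAX 2 `DefaultHandler` helper class and a parser obtained through the `javax.xml.parsers` factory bundled with the JDK, and the names `SaxEventLogger` and `LoggingHandler` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: logs SAX events in the order the parser reports them.
final class SaxEventLogger {

    /** Callback implementation; the parser invokes these methods as it reads the document. */
    static final class LoggingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            events.add("start " + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end " + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver one run of text in several chunks;
            // this sketch logs each non-blank chunk separately.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text " + text);
            }
        }
    }

    /** Parses the XML string sequentially and returns the recorded events. */
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        LoggingHandler handler = new LoggingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<book><title>Java and XML</title></book>")) {
            System.out.println(event);
        }
    }
}
```

Note that the handler never sees the document as a whole. It only reacts to events as they arrive, which is why nothing has to be held in memory.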

However, the promise was yet to be fully realized. While SAX is blazingly fast (as it never needs to store the entire XML document in memory), it didn't allow modification of the underlying XML data. Additionally, the event-based, callback methodology was unfamiliar to object-oriented developers, and when employed, was often being used inefficiently. Many developers went searching again, and found DOM.

DOM

The Document Object Model (DOM), and in particular the Java language binding for DOM, provided a very different look at an XML document. Based on a tree model, DOM allows random access to an XML document, and also gives users the ability to modify the document's contents. The sequential, event-based model that SAX presented was balanced by the object-based approach of DOM: Every item in the tree is a specific type of Node, and all extend a common interface. Common Node types include Document, Element, Text, and Attr (an attribute); nodes such as Document and Element can have children, and when representing an XML document, together they make up a complete DOM tree. Again, Java developers fired up their browsers and downloaded the specification and API from the W3C. And again, XML was being used in Java applications. Developers could understand this API, and used it heavily.
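To make the tree model concrete, here is a minimal sketch of random access and in-place modification using the `org.w3c.dom` interfaces. It assumes a parser obtained through the `javax.xml.parsers` factory bundled with the JDK; the class name `DomEditExample`, the `author` element, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class: reads a document into a DOM tree, then edits it.
final class DomEditExample {

    /** Builds the complete in-memory tree for the given XML string. */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Appends an <author> element containing a Text node to the root element. */
    static void addAuthor(Document doc, String name) {
        Element author = doc.createElement("author");
        author.appendChild(doc.createTextNode(name));
        doc.getDocumentElement().appendChild(author);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<book><title>Java and XML</title></book>");

        // Random access: jump straight to the element of interest.
        Element title = (Element) doc.getElementsByTagName("title").item(0);
        title.setAttribute("edition", "1");

        // Modification: the tree can be changed after parsing.
        addAuthor(doc, "Brett McLaughlin");
        System.out.println("children of root: "
                + doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

Unlike the SAX approach, the whole document lives in memory here, which is what makes both the lookup and the edit possible.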


Still, DOM required understanding of every last nook and cranny of XML to be used effectively. A Java developer couldn't get by with an overview of XML, but instead found himself wading through the XML specification again (albeit an annotated one at www.xml.com), often for the sake of understanding concepts never used in common applications, such as entities, namespaces, and processing instructions. Additionally, the heavier DOM API caused performance problems in Java applications, and many developers either went back to puzzling out SAX, or decided that XML's time still had not come.

JAXP

With JAXP, the Java API for XML Parsing (java.sun.com/xml), Java APIs for XML finally began to turn the corner: The creators and designers of Java APIs were beginning to realize that Java developers didn't want to have to be XML gurus to use XML; they wanted to be...well...Java developers. JAXP provides a means to obtain a DOM Document or SAX-compliant parser through a simple factory class. This meant that developers were no longer responsible for the idiosyncrasies of different vendors' parser implementations. JAXP also intends to provide the ability to interchange parsers with minimal code changes. This was the sort of API Java developers were used to working with.
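A minimal sketch of the factory approach follows, assuming the `javax.xml.parsers` package as bundled with the JDK. The application code names no vendor class; the class `JaxpParserLookup` and its methods are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class: obtains parsers through JAXP factories only.
final class JaxpParserLookup {

    /** Returns the root element's tag name, using whichever DOM parser JAXP locates. */
    static String rootElementName(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTagName();
    }

    /** Reports which implementation classes sit behind the two factories. */
    static String describeImplementations() {
        return "DOM: " + DocumentBuilderFactory.newInstance().getClass().getName()
                + ", SAX: " + SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootElementName("<catalog/>"));
        System.out.println(describeImplementations());
    }
}
```

Because the vendor class appears only in the output of `describeImplementations()`, a different parser can be selected (for example through the `javax.xml.parsers.DocumentBuilderFactory` system property) without touching the parsing code.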

However, JAXP only attempts to make parsers interchangeable, and still leaves developers in the not-so-friendly hands of DOM and SAX for manipulating XML data. It also supports only older versions of these APIs (SAX 1.0 and DOM Level 1), and is therefore limited in its usefulness until a new revision of the API is released.

JDOM

JDOM, the Java Document Object Model, is a new API for handling XML from Java. Additionally, it is the first offering for manipulating XML built expressly for Java developers. This means it is based on an average Java programmer's expectations, use-patterns, and desires. Instead of having to deal with a strict tree model (as in DOM), the API exposes, for example, an Element's value directly; this is in contrast to having to iterate through an Element Node's children, determining if the child is a Text Node, and extracting its textual value.
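For comparison, this is roughly what the DOM side of that contrast looks like: a sketch of the child-by-child iteration needed to read an element's text, which is the step JDOM is described as exposing directly. It uses only the `org.w3c.dom` interfaces; the class name `DomTextExtraction` and the helpers `textOf` and `titleOf` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: extracts an element's text the DOM way.
final class DomTextExtraction {

    /** Concatenates the values of the element's direct Text children. */
    static String textOf(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Only Text nodes carry character data; skip child elements, comments, etc.
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    /** Parses the document and returns the text of its first <title> element. */
    static String titleOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element title = (Element) doc.getElementsByTagName("title").item(0);
        return textOf(title);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titleOf("<book><title>Java and XML</title></book>"));
    }
}
```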

Additionally, Java collection classes are returned from method operations, as opposed to API-specific constructs (NamedNodeMap in DOM, Attributes in SAX, for example). This makes the API arguably more usable than the less-intuitive DOM, and because it is written to be Java-optimized, it performs on par with SAX. You can find out more on JDOM (an open-source project with a license in the style of Apache) on the Web at www.jdom.org, or read the definitive reference on the API in Chapter 8 of my upcoming book, Java and XML.
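To illustrate the difference between an API-specific construct and a Java collection, the sketch below copies a DOM `NamedNodeMap` into a `java.util.Map` by hand. This is not JDOM code; it only shows the kind of conversion a collections-based API makes unnecessary. The class name `AttributesAsMap`, the helper methods, and the sample attributes are hypothetical.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class: adapts DOM attributes to a standard Java Map.
final class AttributesAsMap {

    /** Copies the element's attributes into a Map sorted by attribute name. */
    static Map<String, String> attributesOf(Element element) {
        // NamedNodeMap does not promise any ordering, so a TreeMap gives stable output.
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            result.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        return result;
    }

    /** Parses the XML string and returns the root element's attributes. */
    static Map<String, String> rootAttributes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return attributesOf(doc.getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootAttributes("<book format=\"print\" language=\"en\"/>"));
    }
}
```

Once the data is in a `Map`, the usual iteration, lookup, and sorting idioms apply without learning another container type.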

Perhaps more important than the JDOM API itself is the shift in paradigm that it represents; instead of forcing Java developers to become XML gurus, it allows developers to leverage their existing Java expertise, and brings XML to them. This pattern of catering to developers instead of expecting developers to support a technology or API because of its inherent "value" is an important one, and it may finally bring XML into millions of applications instead of just an elite and bleeding-edge few.

It Only Gets Better From Here....

With the obvious evolution of Java APIs for XML moving from simple to advanced, XML-centric to Java-centric, and voted-on-by-a-few to contributed-to-by-many, XML is finally becoming a viable solution for any Java developer. While understanding the darkest corners of XML is a worthy and noble goal, it shouldn't have to be a cross that must be borne by any programmer who wants to use XML. In fact, I would argue that the converse is true: XML gurus, and Java developers who do live and breathe XML, have a responsibility to leave their preconceptions and high standards at the door; and they need to work to make XML easy to use, simple to understand, and exciting to implement.

While JDOM is a terrific start toward this goal, there is much work to be done. Both DOM and SAX will hopefully contribute to this movement. And all three APIs, as well as JAXP (which can also take up the call by supporting the newest standards), will hopefully commit to aiding the developer even more, resulting in better offerings for Java developers who want to, and even must, use XML.

Can you help? Absolutely. Read up on SAX and DOM, and join the JDOM mailing lists, where discussions about improving these APIs are occurring daily. Download the APIs, and let their authors know what you want, what you expect, and what you need in your applications. Finally, don't let XML dictate your job; let your job dictate how you use XML!


O'Reilly & Associates will release Java and XML in June 2000.


Brett McLaughlin works as an Enterprise Java consultant at Metro Information Services, and specializes in distributed systems architecture. He is author of the upcoming Java and XML (O'Reilly). He is involved in technologies such as Java servlets, Enterprise JavaBeans, XML, and business-to-business applications. He is an active developer on the Apache Cocoon project, EJBoss EJB server, and a co-founder of the Apache Turbine project.