Published on
O'Reilly (http://www.oreilly.com/)
http://java.oreilly.com/news/javaxml_0500.html
Java and XML: Interested Parties Apply Here
by Brett McLaughlin
06/01/2000
Java and XML: The pairing of the two is the developer's holy grail in the 21st
century. Java is as attractive a programming language as any, and XML offers
just as attractive a solution for data representation. While Java and XML are
certainly useful solutions in their own right, the pairing of the two has
application developers foaming at the mouth and drooling on their
keyboards. However, these developers are often found rubbing their
temples in frustration when it comes to putting that coupling to work.
In this article, we look at why XML has not been as easy to use from Java
as many expected, and how recent offerings in the Java and XML space are
changing things, making XML usage from Java available to all who are
interested.
eXpert Markup Language?
Excited and enamored with XML, Java developers rushed to the World Wide
Web Consortium's website to read
the XML specification, their Java editor open, their fingers
ready to begin typing. What so many of these developers found, though,
was a flurry of complicated phrasings and shadowy concepts. XML wasn't
quite so simple as expected. For example, the specification contained
definitions that looked
like this.
Certainly not what the average Java developer is used to looking at! So the
developers waited; they bided their time, hoping that with the flurry of
additional XML vocabularies and specifications being developed (XSL, XML
namespaces, XPath, etc.), the markup language would become accessible.
They waited ... for an API.
Evolution of an API
As with so many new concepts, a dedicated few made XML available to the
many. In other words, developers from the Java, C, Perl, and assorted
other programming worlds plowed through the specification, and developed
an application programming interface (API) to help their fellow
programmers (the ones who wanted to sleep every once in a while) use
XML. The fruits of their labor were the first two APIs for the Java platform
(as well as C and some other languages): SAX and DOM.
SAX
The Simple API for XML (SAX) was developed in the XML-dev mailing list, a
sort of lion's den of XML gurus. It provides a sequential look at an XML
document, and defines a number of events that occur in the parsing
lifecycle of a document. When elements start and end, when character data
is encountered, when a DTD defines an entity, and on a host of other
occasions, the SAX-compliant parser invokes a callback method. The
developer has the ability to define behavior for these callbacks, reacting to
the events, and thereby allowing an application to utilize the XML data as it
is reported. Developers downloaded the API en masse from
David Megginson's Web site, and started coding. Suddenly, XML was
being put to work!
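To make the callback model concrete, here is a minimal sketch of a handler that simply records each event as the parser reports it. It assumes the SAX 2 DefaultHandler class and the JAXP factory that ship with current JDKs (both newer than the SAX 1.0 API of the time); the class name SaxEventLog and its events() method are hypothetical names invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// SaxEventLog and events() are hypothetical names used only for this sketch.
final class SaxEventLog {

    /** Parses the XML string and returns the callbacks the parser fired, in order. */
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // A parser may deliver one run of text in several chunks, so buffer it.
            private final StringBuilder text = new StringBuilder();

            private void flushText() {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    log.add("text:" + value);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                flushText();
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            // The document is streamed through the handler; no tree is ever built.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return log;
    }
}
```

Calling SaxEventLog.events() on a small document returns the start, text, and end events in document order. Notice that the handler sees each piece of data exactly once and has no way to go back and change it.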
However, the promise was yet to be fully realized. While SAX is blazingly fast
(as it never needs to store the entire XML document in memory), it didn't
allow modification of the underlying XML data. Additionally, the
event-based, callback methodology was unfamiliar to object-oriented developers,
and when employed, was often being used inefficiently. Many developers
went searching again, and found DOM.
DOM
The Document Object Model (DOM), and in particular the Java language
binding for DOM, provided a very different look at an XML document. Based
on a tree model, DOM allows random access to an XML document, and also
gives users the ability to modify the document's contents. The sequential,
event-based model that SAX presented was balanced by the object-based
approach of DOM: Every item in the tree is a specific type of Node, and all
extend a common interface. Common Node types include Document, Element,
Text, and Attr (an attribute); every Node exposes the same child-access
methods, although leaf types such as Text never actually hold children.
Together they make up a complete DOM tree when representing an
XML document. Again, Java developers
fired up their browsers and downloaded the specification and API from the
W3C. And again,
XML was being used in Java applications. Developers could understand this
API, and used it heavily.
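The random access and in-place modification described above might look something like the following sketch. It uses the org.w3c.dom interfaces bundled with current JDKs and, as an assumption made for convenience, obtains the Document through the JAXP factory; the class name DomSketch, its method names, and the title and author element names are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// DomSketch and its methods are hypothetical names used only for this sketch.
final class DomSketch {

    /** Reads the whole document into an in-memory tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Random access: jump straight to the first title element and replace its text. */
    static void retitle(Document doc, String newTitle) {
        Node title = doc.getElementsByTagName("title").item(0);
        if (title == null) {
            throw new IllegalArgumentException("document has no title element");
        }
        // The old text lives in child nodes, so remove them before adding new text.
        while (title.hasChildNodes()) {
            title.removeChild(title.getFirstChild());
        }
        title.appendChild(doc.createTextNode(newTitle));
    }

    /** Modification: append a new author element to the root and return it. */
    static Element addAuthor(Document doc, String name) {
        Element author = doc.createElement("author");
        author.appendChild(doc.createTextNode(name));
        doc.getDocumentElement().appendChild(author);
        return author;
    }
}
```

Even in this small example, the text of an element is a separate Text node hanging beneath it rather than a property of the element itself.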
Still, DOM required understanding of every last nook and cranny of XML to
be used effectively. A Java developer couldn't get by with an overview of
XML, but instead found himself wading through the XML specification again
(albeit an annotated one at www.xml.com), often for the sake of understanding
concepts never used in common applications, such as entities, namespaces, and
processing instructions. Additionally, the heavier DOM API caused
performance problems in Java applications, and many developers either
went back to puzzling out SAX, or decided that XML's time still had not come.
JAXP
With JAXP, the Java API for XML Parsing
(java.sun.com/xml),
Java APIs for XML finally began to turn
the corner: The creators and designers of Java APIs were beginning to realize
that Java developers didn't want to have to be XML gurus to use XML; they
wanted to be...well...Java developers. JAXP provides a means to obtain a DOM
Document or SAX-compliant parser through a simple factory class. This meant
that developers were no longer responsible for the idiosyncrasies of different
vendors' parser implementations. JAXP also intends to provide the ability to
interchange parsers with minimal code changes. This was the sort
of API Java developers were used to working with.
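As a rough sketch of the factory approach, the code below obtains both kinds of parser without naming any vendor's class. It is written against the javax.xml.parsers package as it ships with current JDKs, which is newer than the release discussed in this article; the class name JaxpFactories and its method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;

// JaxpFactories and its methods are hypothetical names used only for this sketch.
final class JaxpFactories {

    /** Asks the factory for whichever DOM implementation is configured. */
    static DocumentBuilder newDomBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException("no usable DOM parser", e);
        }
    }

    /** Asks the factory for whichever SAX implementation is configured. */
    static SAXParser newSaxParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (Exception e) {
            throw new IllegalStateException("no usable SAX parser", e);
        }
    }

    /** Parses a string with the factory-supplied DOM parser and returns the root element's name. */
    static String rootName(String xml) {
        try {
            return newDomBuilder().parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Nothing here imports a vendor package. The implementation behind each factory can be chosen outside the code, for example through the javax.xml.parsers.DocumentBuilderFactory and javax.xml.parsers.SAXParserFactory system properties, which is what makes swapping parsers a configuration change rather than a rewrite.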
However, JAXP only attempts to make parsers interchangeable, and still
leaves developers in the not-so-friendly hands of DOM and SAX for
manipulating XML data. It also supports only older versions of these APIs (SAX
1.0 and DOM Level 1), and is therefore limited in its usefulness until a new
revision of the API is released.
JDOM
JDOM, the Java Document Object Model, is a new API for handling XML
from Java. Additionally, it is the first offering for manipulating XML built
expressly for Java developers. This means it is based on an average Java
programmer's expectations, use-patterns, and desires. Instead of having to
deal with a strict tree model (as in DOM), the API exposes, for example, an
Element's value directly; this is in contrast to having to iterate through an
Element Node's children, determining if the child is a Text Node, and
extracting its textual value.
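For comparison, the DOM-side chore just described can be sketched with the org.w3c.dom classes bundled with the JDK. The class name DomTextValue and its methods are hypothetical, and this simple version assumes the text of interest sits in plain Text children (it ignores CDATA sections and nested elements). The JDOM counterpart is not shown, since it needs the JDOM library on the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// DomTextValue and its methods are hypothetical names used only for this sketch.
final class DomTextValue {

    /** Collects the text directly inside an element by walking its child nodes. */
    static String textOf(Element element) {
        StringBuilder value = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Only Text children contribute; child elements and comments are skipped.
            if (child.getNodeType() == Node.TEXT_NODE) {
                value.append(child.getNodeValue());
            }
        }
        return value.toString();
    }

    /** Convenience helper: parses a string and returns its root element. */
    static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

A loop, a type check, and a buffer are needed just to read one value. That is the kind of busywork an API aimed at Java developers can fold into a single call on the element.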
Additionally, Java collection classes are returned from method operations,
as opposed to API-specific constructs (NamedNodeMap in DOM, Attributes
in SAX, for example). This makes the API arguably more usable than the
less-intuitive DOM, and because it is written to be Java-optimized, it
performs on par with SAX. You can find out more on JDOM (an open-source
project with a license in the style of Apache) on the Web at
www.jdom.org,
or read the definitive reference on the API in Chapter 8 of my upcoming book,
Java and XML.
Perhaps more important than the JDOM API itself is the shift in paradigm
that it represents; instead of forcing Java developers to become XML
gurus, it allows developers to leverage their existing Java expertise, and
brings XML to them. This pattern of catering to developers instead of
expecting developers to support a technology or API because of its
inherent "value" is an important one, and it may finally bring XML into
millions of applications instead of just an elite and bleeding-edge few.
It Only Gets Better From Here....
With the obvious evolution of Java APIs for XML moving from simple to
advanced, XML-centric to Java-centric, and voted-on-by-a-few to
contributed-to-by-many, XML is finally becoming a viable solution for
any Java developer. While understanding the darkest corners of
XML is a worthy and noble goal, it shouldn't have to be a cross that must be
borne by any programmer who wants to use XML. In fact, I would argue
that the converse is true: XML gurus, and Java developers who do
live and breathe XML, have a responsibility to leave their preconceptions
and high standards at the door; and they need to work to make XML easy
to use, simple to understand, and exciting to implement.
While JDOM is a terrific start toward this goal, there is much work to be done.
Both DOM and SAX will hopefully contribute to this movement. And all three
APIs, as well as JAXP (which can also take up the call by supporting the
newest standards), will commit to aiding the developer even more, and
result in better offerings for Java developers who want to, and even must,
use XML.
Can you help? Absolutely. Read up on SAX and DOM, and join
the JDOM mailing lists, where discussions about improving these
APIs are occurring daily. Download the APIs, and let their authors know what
you want, what you expect, and what you need in your applications. Finally,
don't let XML dictate your job; let your job dictate how you use XML!
O'Reilly & Associates will release
Java and XML in June 2000.
Brett McLaughlin works as an Enterprise Java consultant at Metro
Information Services, and specializes in distributed systems
architecture. He is author of the upcoming
Java and XML
(O'Reilly). He is involved in technologies such as Java servlets,
Enterprise JavaBeans, XML, and business-to-business applications. He is
an active developer on the Apache Cocoon project and the EJBoss EJB server,
and a co-founder of the Apache Turbine project.
Copyright © 2007 O'Reilly Media, Inc.