Brett McLaughlin worried that his presentation at the O'Reilly Conference on Java [March 2000] would be a dud. He thought his session's vague title, "Uncoupling Applications: Modular Application Architecture with J2EE," didn't adequately convey the importance of his message. On top of that, he had to compete with a Sun Microsystems presentation on J2EE (Java 2 Enterprise Edition) next door.
But to his pleasant surprise, McLaughlin ended up with a standing-room-only audience. And the frustrations developers expressed in his session confirmed his thinking: Developers are tired of application components so tightly coupled they can't be used anywhere else. Developers want to be able to uncouple their applications from the data. And they want tools to enable their software components to talk to each other through common contracts and still handle a wide variety of different implementations.
XML affords all of this and more, according to McLaughlin, who is the author of O'Reilly's upcoming book Java and XML. As a longtime Java developer now working extensively with XML, he felt compelled to write this book. O'Reilly talked with McLaughlin about XML's rising popularity, XML as the link to fulfilling Java's "write once, run anywhere" promise, and his new book.
The excitement around XML stems from one of its most important features: It doesn't say a whole lot about itself. By contrast, HTML (which is another markup language) has a very specific set of tools--tags and attributes--that are recognized and processed only one way. XML says, "We're not intelligent enough to think of every possible way you use your data, so we'll let you use it in whatever way makes sense." You have to let your document users know what your tags mean, but XML provides for this with DTDs (document type definitions), which give tags concrete definitions, attributes, and all kinds of parameters.
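As a rough illustration of how a DTD pins down made-up tags, the sketch below validates a small document against an internal DTD using the JDK's standard JAXP parser. The `order` and `item` elements and the `sku` attribute are hypothetical names invented for this example, not anything from the book.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdCheck {
    // Hypothetical vocabulary: an <order> holds one or more <item> elements,
    // and every <item> must carry a "sku" attribute.
    static final String DTD =
        "<!DOCTYPE order [\n" +
        "  <!ELEMENT order (item+)>\n" +
        "  <!ELEMENT item (#PCDATA)>\n" +
        "  <!ATTLIST item sku CDATA #REQUIRED>\n" +
        "]>\n";

    // Returns true when the body is well-formed and valid against the DTD above.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // By default validity errors are only reported; rethrow them instead.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or not valid against the DTD
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order><item sku=\"A1\">Widget</item></order>"));
        System.out.println(isValid("<order><item>Widget</item></order>"));
    }
}
```

The parser knows nothing about orders or items in advance; the DTD alone tells it that the second document is missing a required attribute.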
XML lets you work a little smarter by defining a standard way for any data to be defined, without defining the particular semantics. What to call attributes and elements is left up to you to define for your particular project's need. XML creates portable code and portable data, allowing for that third-party instance of business-to-business, e-commerce, and e-business--all those e's being promised right now. For the first time we really have complete application portability, something we've all been claiming for years. We're finally actually getting to a standard way to exchange data and write code.
Somebody finally realized people are going to come up with their own slang and meanings, so why not provide a format for that? In 1996, work began on the Extensible Markup Language (the name was suggested by James Clark) to accomplish this. XML finally caught on because of its ability to accommodate many disparate creations and to find shortcuts without breaking specifications.
But I'm certainly not from the XML school of thought. I'm really a Java developer at heart who sees XML as something useful. That's an important distinction. One of the reasons I've gotten into Java and XML--and one reason I think this book, Java and XML, is very different from many other books on XML--is that it's from a Java point of view.
The problem with this is twofold. First, there is the longstanding problem of handling changes. The marketing people want to change a logo, for example, so everything has to be red instead of blue. Every developer has dealt with this. Maybe you can do a global search-and-replace. But if some data had the word "blue" in it, now you've got this wonderfully formatted page with incorrect data because everywhere it said "blue" it now says "red." People have gotten sick and tired of that. They want to be able to separate their data from their presentation. They want fifty lines of data in one file and the HTML or markup in another file, and they want to merge those when necessary.
The second problem, more prevalent lately, is the advent of the Wireless Markup Language (WML) for phones, Palm Pilots, and handheld devices with Internet connectivity and pure Java browsers. Suddenly developers can no longer assume HTML clients. There may be clients that support only a subset of HTML. The knee-jerk response is to code to the lowest common denominator so that every device can display the pages. But company Web pages look horrible with only ten tags to work with. The reaction at the other end of the spectrum is to build completely different sites for different clients, incurring huge maintenance overhead.
I can write an XML document using nothing but data with my made-up element and attribute names. I can then create an XSL (Extensible Stylesheet Language) stylesheet. XSL is another offshoot or specification of XML that follows the same rules as the core XML language. It lets you specify the markup you want and provides a pattern-matching approach with instructions. An XSLT processor then runs the document and stylesheet together to produce output.
This means I can create an XSL stylesheet for an HTML client, one for my Wireless Markup Language client, and one for my Palm client. And I can go even further. Perhaps I want to use some fancy DHTML for Internet Explorer 5. I can use a different stylesheet for that one versus the Netscape stylesheet, since Netscape doesn't support DHTML as much. I can end up with an array of different presentations, all of which can be applied to the same underlying document.
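To make the idea of one document with several presentations concrete, here is a minimal sketch using the JDK's standard `javax.xml.transform` API. The `product` data and both stylesheets are invented for illustration, and the WML markup is deliberately simplified rather than a complete, deployable deck.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class MultiClientRender {
    // Pure data with made-up element names (hypothetical example).
    static final String DATA =
        "<product><name>Widget</name><price>9.95</price></product>";

    // Presentation for an HTML browser.
    static final String HTML_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
        "<xsl:output method=\"html\" indent=\"no\"/>" +
        "<xsl:template match=\"/product\">" +
        "<html><body><h1><xsl:value-of select=\"name\"/></h1>" +
        "<p>Price: <xsl:value-of select=\"price\"/></p></body></html>" +
        "</xsl:template></xsl:stylesheet>";

    // Presentation for a wireless client (simplified WML).
    static final String WML_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
        "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>" +
        "<xsl:template match=\"/product\">" +
        "<wml><card id=\"p\"><p><xsl:value-of select=\"name\"/>: " +
        "<xsl:value-of select=\"price\"/></p></card></wml>" +
        "</xsl:template></xsl:stylesheet>";

    // Runs the document and a stylesheet together and returns the output.
    static String render(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(DATA, HTML_XSL));
        System.out.println(render(DATA, WML_XSL));
    }
}
```

Supporting another client means adding another stylesheet; the data document and the Java code stay untouched.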
When we get out of the mode of thinking of people as clients and everything else as the application, we start building these very loose contracts between components. If I want to expose my EJB container and business logic to an entirely different company, that's fine. The XML data I'm sending back and forth is application neutral. Perhaps the client doesn't want to use all of the data, or they want the data to conform to some filter. In XML, the developer doesn't have to require any presentation-specific markup. Clients are not forced to deal with the developer's application paradigm.
Within the next year, people will really begin to understand that it's all just data that they can manipulate differently using XML. Right now we're still writing XML like it's HTML or some presentation model. We're still thinking linearly. But you can avoid locking yourself into a presentation model in XML. You can group all your tags at the top of the file. The stylesheet can deal with those tags at the beginning of the output for parsing document information, regardless of presentation specifics. In an XML-centric world, we can model our thoughts toward the best representation of data so that any program can use that data.
Ask anyone doing enterprise application development, and I'd wager they're spending ten to twenty-five percent of their time coding validation. For example, when a user enters a user name that is invalid or already taken, or forgets to type in a password, a form comes back and alerts the user. It may come back several times, declining subsequent tries and suggesting available user names or passwords, most of which usually don't make much sense to the user. Someone has to code all this logic, and it's a real hassle. This is a fundamental necessity, yet it's just like the proprietary data format. Developers go from company to company re-coding validation into business logic with very tight coupling. A simple change like requiring eight-character user names can have a huge impact on a very simple application because business logic is embedded into the code.
A companion article by Brett McLaughlin, Java and XML: Interested Parties Apply Here, looks at why XML has not been so easy to use from Java and tells how recent Java and XML offerings are changing things, making XML usage from Java available to all who are interested.
I believe a close marriage between things like XML Schema and the Java Virtual Machine--the actual Java interpreter--is possible. Instead of having to code this validation explicitly, you define your Java program and interface. Then you also define a separate XML Schema to make available to that interface. Now you've got very portable code, and changing something like the length of a user name requires simply going into the Schema rather than recompiling code. The XML Schema model is very close to how Java is modeled. It's very object-oriented. There are even things in the XML Schema that allow you to do inheritance and equivalency so you can extend elements for new attributes without having to redefine a new type.
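A rough approximation of this idea is possible with the `javax.xml.validation` API, which was added to the JDK well after this interview; it is not the JVM-level integration described above, only a sketch of moving a rule out of compiled code. The `user` and `username` elements and the eight-character minimum are assumptions chosen to match the user-name example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaDrivenValidation {
    // Hypothetical schema: a <user> with a <username> of at least 8 characters.
    // In practice this would live in its own file, outside the compiled code.
    static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">" +
        "<xs:element name=\"user\"><xs:complexType><xs:sequence>" +
        "<xs:element name=\"username\"><xs:simpleType>" +
        "<xs:restriction base=\"xs:string\"><xs:minLength value=\"8\"/></xs:restriction>" +
        "</xs:simpleType></xs:element>" +
        "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns true when userXml is valid against schemaXml.
    // A malformed schema also yields false in this simplified sketch.
    static boolean accepts(String schemaXml, String userXml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(schemaXml)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(userXml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handler throws on validity errors
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String shortName = "<user><username>brett</username></user>";
        System.out.println(accepts(SCHEMA, shortName));
        // Relaxing the rule is an edit to the schema text, not to the Java code.
        String relaxed = SCHEMA.replace("value=\"8\"", "value=\"5\"");
        System.out.println(accepts(relaxed, shortName));
    }
}
```

The Java class contains no length check at all; the constraint lives entirely in the schema, so changing it does not require recompiling.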
When developers first started using Java, they were writing a wealth of non-standard extensions for Java. Due to mounting problems with this, developers had to start shipping all those extensions with their code. Then JDBC (Java Database Connectivity) came out to standardize Java for databases. The same thing has happened with servlets and EJB. If developers want to do distributed computing, they need a standard to make their code portable. As soon as somebody realizes their need is a common one, they should work to develop a common standard to keep data portable.
XML is following the same philosophy. Developers want to be able to handle presentation, so XSL, the Extensible Stylesheet Language, was developed. Other things are cropping up, like XQL, the XML Query Language, which handles database access in XML by defining a standard way to represent SQL queries and return results. XML is changing at an incredible pace. The XML 1.0 spec was finalized in February 1998, and here in early 2000 the number of extensions is starting to get almost humorous: XML, XSL, XSLT, XPath, XLink, XPointer, and XSP--all of which have become important specifications in about two years' time.
Most XML APIs (Application Programming Interfaces) are coming out of the XML community. The two most popular XML APIs for Java are SAX, the Simple API for XML, and DOM, the Document Object Model, which is actually used for lots of other things. David Megginson and others from the XML-DEV list, one of the largest XML communities, hammered out SAX. The W3C formalized the DOM.
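For readers who have not used either API, the sketch below reads the same small document both ways with the parsers bundled in the JDK. SAX pushes events at a handler as it reads, while DOM builds a whole tree to query afterward. The `books` document is made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDom {
    // Hypothetical sample data.
    static final String XML =
        "<books><book title=\"Java and XML\"/>" +
        "<book title=\"Java Servlet Programming\"/></books>";

    // SAX: the parser calls back into the handler for each element it meets.
    static List<String> titlesWithSax(String xml) {
        final List<String> titles = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if ("book".equals(qName)) {
                            titles.add(atts.getValue("title"));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return titles;
    }

    // DOM: the parser builds an in-memory tree, which is then navigated.
    static List<String> titlesWithDom(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                titles.add(((Element) books.item(i)).getAttribute("title"));
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println(titlesWithSax(XML));
        System.out.println(titlesWithDom(XML));
    }
}
```

Both methods return the same list, but note the callback handler in one and the `NodeList` index loop with a cast in the other, neither of which resembles ordinary Java collection code.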
SAX came out of the XML-DEV mailing list, and it's great for XML-based specifications. But for Java APIs, I'm not convinced that either SAX or DOM is a good solution. They require learning new constructs when Java already has perfectly good alternatives. One of the things I'm proudest of in Java and XML is introducing JDOM, a completely new, pure Java interface. Jason Hunter, a Java developer who is using many of the new Java-XML tools (and whose Java Servlet Programming book is the best there is), worked with me on developing this. JDOM lets you work with XML in a Java-centric world without having to learn new constructs. JDOM also approaches XML data without any intention of porting it to C.
Chapter 8 in the book covers all the JDOM classes, and there is a complete appendix with an API reference. JDOM is also available at http://www.jdom.org. It's a rather revolutionary approach, but I believe people do not like dealing with some of the weird things required to talk to the DOM. I go a step further by taking every example in the rest of the book and demonstrating how to rewrite it using JDOM. I am also putting my money where my mouth is and actually using a beta form of JDOM in production already.
The JDOM API will be open source, and by the time the book is released, there may be another completely native implementation not built on DOM or SAX. The book's appendix includes JDOM in the API reference along with SAX and DOM. I think JDOM is tremendously helpful for Java developers learning XML because it doesn't force them to be XML gurus. They can learn about XML through a Java worldview in which data is portable.
When we sent the book out for technical review, we gave the draft to two people with no XML knowledge at all--and one of them had never written a servlet or anything but some stand-alone Java code. In their feedback, they indicated no trouble. Both of them understood all of it. And this is not a "Java and XML in a Nutshell" level book! Not only does it teach the fundamentals and how to apply them, but it also tells you why something is important. Even for someone familiar with SAX and DOM who has been doing XML in production for years, this is still a great book, because the last six chapters cover topics they would have had to figure out on their own.
The book approaches all these topics as an application developer by showing how to download and use existing, standards-based tools, then building on top of this to create really good applications instead of mediocre applications built from scratch. That's an approach I haven't seen in other XML books, particularly those with a Java focus. Java and XML is for anyone who is doing enterprise application development, using Apache Cocoon, or who wants to know XML-RPC or how to write Perl applications that talk XML and spit it back out to Java. It definitely covers a wide breadth of subjects.
XML-RPC is important because it provides an alternative to RMI (Remote Method Invocation) in Java. RMI is a very useful but heavy-duty means of accessing objects on remote servers and invoking their methods remotely. XML-RPC is easier to use, making it a good entry point for distributed systems. The book teaches the basic concepts. XML-RPC allows you to talk XML with a server that can't talk RMI but still needs to handle calls. You send your data and requests in XML, and the server executes them. It also tells you what happened, speaking XML back to you whether you're using Perl, CORBA, C, or anything else.
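To show what sending a request in XML looks like, here is a minimal sketch that builds an XML-RPC `methodCall` payload by hand. It covers only a few scalar types and does not send anything over the network; the method name `inventory.lookup` and its arguments are hypothetical, and a real application would more likely use an existing XML-RPC library.

```java
class XmlRpcRequest {
    // Escapes the characters that are not allowed in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the XML body of an XML-RPC call for a method name and its parameters.
    // Only int, boolean, double, and string values are handled in this sketch.
    static String methodCall(String methodName, Object... params) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>");
        xml.append("<methodCall><methodName>")
           .append(escape(methodName))
           .append("</methodName><params>");
        for (Object param : params) {
            xml.append("<param><value>");
            if (param instanceof Integer) {
                xml.append("<int>").append(param).append("</int>");
            } else if (param instanceof Boolean) {
                // XML-RPC encodes booleans as 1 or 0.
                xml.append("<boolean>").append(((Boolean) param) ? 1 : 0).append("</boolean>");
            } else if (param instanceof Double) {
                xml.append("<double>").append(param).append("</double>");
            } else {
                xml.append("<string>").append(escape(String.valueOf(param))).append("</string>");
            }
            xml.append("</value></param>");
        }
        return xml.append("</params></methodCall>").toString();
    }

    public static void main(String[] args) {
        // Hypothetical remote method and arguments.
        System.out.println(methodCall("inventory.lookup", "widget", 3));
    }
}
```

Because the payload is plain XML sent in an HTTP POST, the server on the other end can be written in Perl, C, or anything else that can parse it.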
Covering XML-RPC is intrinsically important due to a kind of rebirth. RPC was popular ten years ago and died out because everyone was writing proprietary data formats. RPC provides a way to make calls across the network, but it never had a good way to represent data uniformly. That's what XML does. XML-RPC also forces developers into a new paradigm.
I've tried to write both these books with the real world in mind. The cool thing is that I didn't have to make up any examples for the book. Everything is based on valid ideas that I've already put in production. All the code in both of these books--the examples, the JDOM and other interfaces--is going to be made available to the public as open source and maintained on a Web site. I'm committed to this stuff being usable. I want these to be the kinds of books that don't just sit on your bookshelf but actually lie open on your desk. [Editor's Note: O'Reilly doesn't yet have a title for Brett's new book, but we plan to release it in the winter of 2000.]
O'Reilly & Associates will release Java and XML in June 2000.