
Take My Advice: Don't Learn XML -- Footnotes



  1. [ Back to article ]

  2. I should note that the table really isn't up to date. If you want to complicate things even further, you can throw in the various XML schema languages (RELAX NG, W3C XML Schema) and lots of other specifications that are currently making their way through the W3C and various other standards organizations. [ Back to article ]

  3. My real opinion about XML and about structured authoring: Until we come up with a better way of doing it, it makes absolutely no sense these days to talk about structured authoring outside of the context of XML and SGML. Mistrust anybody who tries to tell you otherwise. I also think it's not very useful to talk about XML-based structured authoring without talking about a specific XML vocabulary. That's why this article focuses exclusively on DocBook. [ Back to article ]

  4. Disclosure: I am a member of the DocBook Technical Committee at OASIS. If you reckon that's an indication that I'm a little less objective about this topic than the average man on the street, well, you're right. [ Back to article ]

  5. As I take a moment to step off my soapbox and readjust it to the proper speaking height, I want to put in a good word for the TEI DTD, a standard developed by the Text Encoding Initiative.

    Much of the curriculum in this lesson may apply to TEI as well as to DocBook. Without a doubt, certain types of documents are better marked up in TEI than in DocBook: poems, plays, and literary texts in general, because those are the types of documents the TEI DTD was specifically designed for; also things like speech transcriptions and dictionaries, and probably lots of other things I don't even know about.

    DocBook, on the other hand, is the better choice for documentation--especially for computer software and hardware documentation--because that's one of the areas it (in contrast to TEI) was specifically designed for. You and I can get ourselves into a knock-down-drag-out over which one, TEI or DocBook, is better for general things, like this article (which does happen to be written in DocBook, by the way). But in the end, we'd be agreeing that we're both a thousand times better off than the guy who's still authoring in Microsoft Word or one of its equivalents. [ Back to article ]

  6. It's more accurate to call DocBook a dialect because, in addition to defining "words" (elements and attributes), DocBook also (like any other XML dialect) specifies a grammar: a set of rules that constrains how the words can be used. [ Back to article ]

  7. You can also think of DocBook as a library of reusable content models. Programmers rely on standard libraries of reusable code for tasks that are common to many applications, rather than wasting time writing their own custom code to perform those same tasks. Similarly, DTD designers and customizers can rely on the wealth of content models for common structures that DocBook provides, rather than reinventing the wheel by writing their own content models from scratch to define those same structures.

    Essentially, you're free to use as much or as little of DocBook as you need. Or (because it's completely extensible) you're free to add to it and customize it in any way that you like. And it's built in such a way that you can use it to author any kind of modular content. That is, you can use it not just to author printed books and articles, but also to create reusable sections or topics that can be combined and presented in any form you choose, delivered as pages for a Web site, for example, or as a series of slides in a slide presentation, or as a set of help topics.

    There is an unfortunate misconception in some provincial neighborhoods that DocBook was designed only for authoring complete books and articles. Please don't fall into the trap of accepting or propagating that misconception any further. DocBook does have a uniquely rich history of being used, among other things, for book-oriented authoring, and not just for authoring Web content or other online content (which makes sense because DocBook actually predates the Web). But that history does not limit it; DocBook is just as well suited to authoring reusable, modular content for delivery over the Web, as online help, or in any other form. [ Back to article ]

  8. And others are dissatisfied enough with DocBook to argue that it just isn't a very good standard. The common expressions of dissatisfaction with DocBook can be paraphrased as follows:

    • There's a lot more in DocBook than I need
    • There's not enough in DocBook that I do need

    Both of these are valid complaints, but the DocBook designers have tried to account for them by building a sophisticated customization layer into the DTD. And the DocBook Technical Committee has tried to account for them by providing a mechanism by which anyone can submit a DocBook Request for Enhancement (RFE). It seems to me that it makes more sense to first try to use, understand, and improve a current, very popular standard like DocBook than it does to jump into attempting to create alternatives to it.

    That said, I am certain there are some kinds of technical-document authors for whom DocBook is just not the right DTD; but before you decide you're one of those authors, I think it's wise to first try to understand and use DocBook, and then make that determination for yourself. [ Back to article ]

  9. Writers working on open source documentation don't have this alternative, of course. In the open source world, where there is a common need to exchange or interchange documentation, a structured-authoring standard is absolutely essential. [ Back to article ]

  10. Another cost is, of course, the cost of developing a custom DTD from scratch, measured against the cost of using DocBook or a DocBook-based customization. A DTD-development team unfamiliar with DocBook or with DocBook customization could very well end up wasting a lot of time and money creating a DTD that basically reinvents content models that already exist in DocBook, or ones that could easily be added to a DocBook customization layer. [ Back to article ]

  11. Many of the same people who designed the DocBook DTD went on to contribute very significantly to the development of XML and its related technologies. In addition to Eve Maler, Jon Bosak, Murray Maloney, Dave Hollander, and Conleth O'Connell--all members of the original W3C XML Working Group--the list also includes Terry Allen, Dale Dougherty, Eduardo Gutentag, and many others. For more details, see the article "The Making of the DocBook DTD" and the "What is DocBook?" page at the official DocBook site. [ Back to article ]

  12. For example, when it comes time to create your own customization of DocBook (still legal in most countries, though some people will try to tell you otherwise), you will need to learn more about DTDs. And when you get to the point of finding that you want to modify the output that the modular DocBook stylesheets produce, you'll need to learn more about XSLT so that you can build a customization layer.

    Or you might need to learn about DSSSL (Document Style Semantics and Specification Language), if you happen to be working with the DSSSL stylesheets that are also available for DocBook--they are still very widely used and still widely useful. Spirited debates have been known to occasionally flare up about the relative merits of tried-and-true DSSSL versus the young contender, XSLT. Inexplicably, these debates have failed to produce any "DSSSL versus XSLT" bumper stickers, though you would expect something similar to the Ford versus Chevy ones featuring that comic-strip character Calvin to be widespread by now. [ Back to article ]

  13. I considered but then rejected another analogy: The difference between trying to learn a second language by first studying general linguistics, and learning a second language by actually speaking and using it. (The team of legal-assistants-in-training that I keep on retainer tells me that to protect myself from the Bad Analogy Enforcement Bureau, I must explicitly state the following: while I meant for DocBook to be the "language" in the analogy and XML to be the "general linguistics," I admit that the analogy is not a very good one.)

    What I had in mind with that analogy was the point that while it may be valuable to learn about linguistics in broad terms, not much of that learning will help you actually communicate with anyone in any specific language that you might want to learn. In fact, it seems like it works the other way around: the more you learn about and use a specific language, and the more languages you understand, the better prepared you are for learning about linguistics in general. [ Back to article ]

  14. The documentation is easy to get; the wrinkle here is getting an XML-editing application installed and working, though that's not all that hard either. One editing package worth highlighting is Paul Kinnucan's open source XAE, which aims to be a single-download solution that will work right out of the box, as long as you already have the GNU Emacs or XEmacs text editor installed. If that's not novice-friendly enough for you, there are freely downloadable demo versions of several very good commercial applications, including at least one higher-end one and some that cost as little as $150 (USD) or so. For more details, post a question to the xml-doc mailing list, where this sort of thing is discussed. [ Back to article ]

  15. Wait, let me backtrack for a minute and give you a warning: you will run into problems and you will need help figuring them out, so your real first step should be to subscribe to mailing lists where you can get answers. Without a doubt, try the docbook and docbook-apps mailing lists. Optionally, you might also want to subscribe to xml-doc, a list I started a while back for general (i.e., non-DocBook-specific) discussions of XML and structured authoring, particularly for documentation. It started out with one subscriber (me), but through a very successful leafleting and subway-poster campaign, I've managed to get another 1,200 or so people to sign up. [ Back to article ]

  16. Another Disclosure: I helped to get the DocBook Open Repository Project started and I continue to contribute to it in an administrative role. [ Back to article ]
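
A postscript to footnote 6, for readers who like to see the idea run: the distinction between "words" (elements) and a "grammar" (rules about where those words may appear) can be sketched in a few lines of Python. Everything here is an assumption made for illustration. The TOY_GRAMMAR table is an invented, drastically simplified stand-in and not the real DocBook content models, and check() is a hypothetical helper, not part of any DocBook tool. A real project would validate against the actual DTD with a validating parser.

```python
import xml.etree.ElementTree as ET

# Toy grammar, invented for illustration; NOT the real DocBook content models.
# Each element name (a "word") maps to the child elements it may contain.
TOY_GRAMMAR = {
    "article": {"title", "para", "section"},
    "section": {"title", "para", "section"},
    "title": set(),
    "para": {"emphasis"},
    "emphasis": set(),
}


def check(element):
    """Return a list of grammar violations in element and its descendants."""
    problems = []
    if element.tag not in TOY_GRAMMAR:
        # The word itself is not in the vocabulary.
        problems.append(f"unknown element <{element.tag}>")
        return problems
    allowed = TOY_GRAMMAR[element.tag]
    for child in element:
        # A known word used in a place the grammar does not permit.
        if child.tag in TOY_GRAMMAR and child.tag not in allowed:
            problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        problems.extend(check(child))
    return problems


good = ET.fromstring(
    "<article><title>T</title><para>Some <emphasis>text</emphasis>.</para></article>"
)
bad = ET.fromstring("<article><para><title>Misplaced</title></para></article>")

print(check(good))
print(check(bad))
```

Both documents use only known words, but the second one puts a title inside a para, which the toy grammar forbids. That is the sense in which a dialect is more than a vocabulary.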