O'Reilly Databases

The XML Elements of Style

by Steve Muench
10/18/2000

In honor of the eminently pragmatic William Strunk, Jr. and E. B. White, I present the XML Elements of Style: the rules you must follow as you create your own documents. If your XML document follows these ten basic rules, it qualifies as a "well-formed XML document."

  1. Begin each document with an XML declaration. The first characters in any XML document should be an XML declaration. The declaration is case-sensitive and looks like this in its simplest form:

    <?xml version="1.0"?>

    The special delimiters <? and ?> distinguish this declaration from other tags in the document. The <?xml characters in the XML declaration must be the very first characters in the document: no spaces, carriage returns, or anything else may come before them.
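
    One way to see this rule in action is to hand both variants to a parser. The sketch below assumes Python and its standard-library xml.etree.ElementTree module, neither of which appears in the original article; the helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


declaration_first = '<?xml version="1.0"?>\n<Question>Is this legal?</Question>'
newline_first = "\n" + declaration_first  # a stray newline before <?xml

print(is_well_formed(declaration_first))
print(is_well_formed(newline_first))
```

    With the parser bundled in CPython, the first document should be accepted and the second rejected, because the declaration is no longer at the very start.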

  2. Use only one top-level, enclosing document element. The first, outermost element in an XML document is called the document element because its name announces what kind of document it is--<FAQ-List>, <Book>, <Transaction>, <TrackingStatus>, etc. You must have only one document element per document. So the following is legal:

    <?xml version="1.0"?>
    <Question>Is this legal?</Question>

    But the following is not:

    <?xml version="1.0"?>
    <Question>Is this legal?</Question>
    <Answer>No</Answer>

    because both <Question> and <Answer> are top-level elements. You can't even have the same element name repeated at the top level: there must be exactly one. So the following is also illegal:

    <?xml version="1.0"?>
    <Question>Is this legal?</Question>
    <Question>Is that your final answer?</Question>

    You need to pick a single name and use that element to enclose the others, like:

    <?xml version="1.0"?>
    <FAQ-List>
        <Question>Is this legal?</Question>
        <Question>Is that your final answer?</Question>
    </FAQ-List>
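
    The same check can be done programmatically. This is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree parser (an assumption, not something the article uses); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

two_roots = (
    '<?xml version="1.0"?>\n'
    "<Question>Is this legal?</Question>\n"
    "<Question>Is that your final answer?</Question>\n"
)
one_root = (
    '<?xml version="1.0"?>\n'
    "<FAQ-List>\n"
    "    <Question>Is this legal?</Question>\n"
    "    <Question>Is that your final answer?</Question>\n"
    "</FAQ-List>\n"
)

# A second top-level element makes the parser give up.
try:
    ET.fromstring(two_roots)
    two_roots_ok = True
except ET.ParseError:
    two_roots_ok = False

# With a single enclosing element, both questions are reachable as children.
root = ET.fromstring(one_root)
questions = [q.text for q in root.findall("Question")]
print(two_roots_ok, root.tag, questions)
```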

  3. Match opening and closing tags properly. XML is case-sensitive, so the following are not considered matching tag names:

    <Question>Is this legal?</question>
    <QUESTION>Is this legal?</Question>

    You'll find that XML syntax is rigid and unforgiving. You cannot get away with being sloppy about the order of closing tags. The following is illegal:

    <Question><Link href="http://qa.com/">Is this
        legal?</Question></Link>

    You need to close </Link> before closing </Question>, like this:

    <Question><Link href="http://qa.com/">Is this
        legal?</Link></Question>

    Simply keeping your tags neatly indented helps you avoid this mistake:

    <Question>
        <Link href="http://qa.com/">Is this legal?</Link>
    </Question>

    Note that adding extra spaces, carriage returns, or tabs between nested tags to make an XML document look indented to the human eye does not affect its structural meaning when working with datagrams, although clearly it increases the document's size slightly.
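
    To illustrate, the samples above can be run through a parser. The sketch assumes Python's standard-library xml.etree.ElementTree (not part of the original article), and parse_error is a hypothetical helper name. It checks that mismatched case and misordered closing tags are rejected, and that indentation leaves the element structure unchanged.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return the parser's complaint, or None if `text` is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


wrong_case = "<Question>Is this legal?</question>"
wrong_order = '<Question><Link href="http://qa.com/">Is this legal?</Question></Link>'
compact = '<Question><Link href="http://qa.com/">Is this legal?</Link></Question>'
indented = (
    "<Question>\n"
    '    <Link href="http://qa.com/">Is this legal?</Link>\n'
    "</Question>"
)

for sample in (wrong_case, wrong_order):
    print(parse_error(sample))

# Indentation adds whitespace text but leaves the element structure alone.
compact_tags = [el.tag for el in ET.fromstring(compact).iter()]
indented_tags = [el.tag for el in ET.fromstring(indented).iter()]
print(compact_tags == indented_tags)
```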

  4. Add comments between <!-- and --> characters. You can include comments anywhere after the XML declaration as long as they are not inside attribute values and don't occur in the middle of the < and > boundaries of a tag. So the comments in the following document are legal:

    <?xml version="1.0"?>
    <!-- Comment Here ok -->
    <FAQ-List>
        <!--
            | And here, multiple lines are fine
            +-->
        <Question>Is this legal?<!-- Here is fine --></Question>
        <!-- Here too -->
        <Answer>Yes</Answer>
    </FAQ-List>
    <!-- Even Here -->

    but all four comments in this example are not:

    <!-- NOT before XML declaration -->
    <?xml version="1.0"?>
    <FAQ-List>
      <FAQ Submitter="<!-- NOT in an attribute value -->" >
        <Question <!-- NOT between < and > of a tag --> >Is this
            legal?</Question>
        <Answer>Yes</Answer>
        <!-- Illegal for comment to contain two hyphens -- like this -->
      </FAQ>
    </FAQ-List>
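
    Each of the four illegal placements can be tested in isolation. The following is a rough sketch, assuming Python's standard-library xml.etree.ElementTree parser; the shortened sample documents and the parse_error helper are illustrative and do not come from the article.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return the parser's error message, or None if `text` is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


legal = (
    '<?xml version="1.0"?>\n'
    "<!-- Comment here is OK -->\n"
    "<FAQ-List>\n"
    "  <Question>Is this legal?<!-- Here is fine --></Question>\n"
    "  <Answer>Yes</Answer>\n"
    "</FAQ-List>\n"
    "<!-- Even here -->\n"
)

# One minimal document per illegal comment placement.
illegal = {
    "before the declaration": '<!-- no --><?xml version="1.0"?><FAQ/>',
    "inside an attribute value": '<FAQ Submitter="<!-- no -->"/>',
    "inside a tag": "<FAQ <!-- no --> />",
    "containing two hyphens": "<FAQ><!-- no -- no --></FAQ>",
}

print("legal:", parse_error(legal))
for where, document in illegal.items():
    print(where, "->", parse_error(document))
```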

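  5. Start element and attribute names with a letter. Element and attribute names must be a contiguous sequence of name characters (letters, digits, and a few punctuation symbols). They must start with a letter or an underscore, so they cannot start with a digit, and they cannot include spaces. The following are not allowed: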

    <2-Part-Question>        <!-- Error: element name starts with a digit -->
    <Two Part Question>      <!-- Error: has spaces in the name -->
    <Question 4You="Yes">    <!-- Error: attribute name starts with a digit -->

    Some punctuation symbols (like underscore and hyphen) are allowed in names, but most others are illegal:

    <_StrangeButLegal>Legal</_StrangeButLegal>
    <More-Normal-Looking>Legal</More-Normal-Looking>
    <OK_As_Well>Legal</OK_As_Well>
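
    A quick way to try out candidate names is to let a parser judge them. The sketch below assumes Python's standard-library xml.etree.ElementTree; the is_well_formed helper is hypothetical, and the illegal samples are written as empty elements so that each snippet is a complete document.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


legal_names = [
    "<_StrangeButLegal>Legal</_StrangeButLegal>",
    "<More-Normal-Looking>Legal</More-Normal-Looking>",
    "<OK_As_Well>Legal</OK_As_Well>",
]
illegal_names = [
    "<2-Part-Question/>",       # element name starts with a digit
    "<Two Part Question/>",     # spaces in the name
    '<Question 4You="Yes"/>',   # attribute name starts with a digit
]

for sample in legal_names + illegal_names:
    print(sample, is_well_formed(sample))
```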

  6. Put attributes in the opening tag. Attributes are listed inside the opening tag of the element to which they apply. The following is correct:

    <FAQ Submitter="smuench@oracle.com">
        <!-- etc. -->
    </FAQ>

    while the following is illegal:

    <FAQ>
        <!-- etc. -->
    </FAQ Submitter="smuench@oracle.com">

  7. Enclose attribute values in matching quotes. Either of the following is fine:

    <FAQ Submitter="smuench@oracle.com">
    <FAQ Submitter='smuench@oracle.com'>

    but the following two are not. You can't omit the quotes:

    <FAQ Submitter=smuench@oracle.com>

    or close the value with a different quote character than the one that opened it:

    <FAQ Submitter='smuench@oracle.com">

  8. Use only simple text as attribute values. Elements are the only things that can be nested. Attributes only contain simple text values. So the following is illegal:

    <Task Subtasks="<Task Name='Learn XML Syntax'>"/>

  9. Use &lt; and &amp; instead of < and & for the literal less-than and ampersand characters. The less-than and ampersand characters have a special meaning in XML files, so when you need to use either of these characters literally, you need to use &lt; and &amp; instead:

    <Company>AT &amp; T</Company>                 <!-- AT & T -->
    <Where-Clause>SAL &lt; 5000</Where-Clause>    <!-- SAL < 5000 -->

    On occasion, &quot; and &apos; also come in handy to represent literal " and ' in attribute values:

    <Button On-Click="alert('Print a &quot; and &apos;');"></Button>
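
    When documents are generated by a program, the escaping is usually left to a library rather than done by hand. Here is a minimal sketch, assuming Python and its standard-library xml.sax.saxutils helpers escape and quoteattr (an assumption; the article does not prescribe a tool). Note that quoteattr chooses the surrounding quote character itself, so its output may differ in form from the hand-written sample above while decoding to the same value.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

company = escape("AT & T")       # ampersand becomes &amp;
where = escape("SAL < 5000")     # less-than becomes &lt;

# An attribute value containing both kinds of quote character.
handler = """alert('Print a " and '');"""

# quoteattr adds the surrounding quotes and escapes whatever needs it.
document = f"<Button On-Click={quoteattr(handler)}><Company>{company}</Company></Button>"
root = ET.fromstring(document)

print(company)
print(where)
print(root.get("On-Click") == handler)   # the parser decodes the attribute back
print(root.find("Company").text)
```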

  10. Write empty elements as <ElementName/>. Elements that do not contain other elements or text nested within them can be written with the more compact empty element syntax of:

    <Task Name="Learn XML Syntax">
        <Task Name="Use Empty Elements"/> <!-- Empty Element -->
    </Task>

    As shown above with the Name attribute on the empty <Task> element, attributes on empty elements are still legal.
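
    The compact form is only a shorthand: a parser should treat it the same as an opening tag immediately followed by its closing tag. A small sketch of that equivalence, assuming Python's standard-library xml.etree.ElementTree (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

compact = ET.fromstring('<Task Name="Use Empty Elements"/>')
verbose = ET.fromstring('<Task Name="Use Empty Elements"></Task>')

same = (
    compact.tag == verbose.tag
    and compact.attrib == verbose.attrib       # attributes survive either way
    and compact.text == verbose.text           # both None: no text content
    and len(compact) == len(verbose) == 0      # no child elements
)
print(same)
```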

Steve Muench is Oracle's lead XML Technical Evangelist and development lead for Oracle XSQL Pages.

