XHTML: The Clean Code Solution

XHTML Part 2: Differences

First, the basics. Here are a few simple changes to the way you currently code. As you'll see in these examples, it's nothing dramatic, just some rules to remember about tags and attributes.



All HTML must be in lowercase

Since XML is case-sensitive, all HTML element and attribute names must be in lowercase. You can no longer get away with what many of us used to do to improve the readability of code -- entering the element and attribute names in uppercase and the values in lowercase, or other coding styles.

HTML:  <BODY BGCOLOR="#ffffff">
XHTML: <body bgcolor="#ffffff">

Fortunately, most good HTML editors give you the option of inserting your HTML code in uppercase or lowercase, and many even convert the case of existing tags. An exception to this rule is that user-defined attribute values can be any case you want. For example, the #ffffff hex color above can also legally be written as #FFFFFF.
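To see the case-sensitivity rule in action, here is a minimal sketch in Python (my choice of language, not something the article itself uses) that hands two snippets to a standard XML parser. The helper name is_well_formed is made up for illustration. Note that this only checks XML well-formedness: an all-uppercase <BODY>...</BODY> pair would still parse, and it takes validation against the XHTML DTD to reject it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if a standard XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Mixed case: to an XML parser, <BODY> and </body> are two different elements.
print(is_well_formed('<BODY bgcolor="#ffffff"></body>'))  # False

# Lowercase names with an uppercase attribute value, which is allowed.
print(is_well_formed('<body bgcolor="#FFFFFF"></body>'))  # True
```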

All attribute values must be quoted

This one is pretty straightforward -- no more <table border=0>. You now need to quote every attribute value, even if it's numeric:

HTML:  <table border=0>...
XHTML: <table border="0">...

This one will be particularly annoying to some Perl coders I know, who for years have been writing:

print "<table border=0>\n";

instead of:

print "<table border=\"0\">\n";

or, even better:

print qq{<table border="0">\n};

It's also frustrating that some HTML editing programs that claim not to change your code do remove quotes around numeric attribute values.
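If an editor or an old script has left unquoted values behind, a small clean-up pass can put the quotes back. The following Python sketch is a deliberately naive illustration, not a real HTML parser. The names UNQUOTED_ATTR and quote_attributes are hypothetical, and it assumes you feed it a single start tag whose already-quoted values don't themselves contain text that looks like name=value.

```python
import re

# A whitespace-led attribute name, an equals sign, then a value that does
# not start with a quote and runs until whitespace or the end of the tag.
UNQUOTED_ATTR = re.compile(r'(\s[A-Za-z][\w:-]*)=([^\s"\'<>]+)')


def quote_attributes(tag):
    """Wrap every unquoted attribute value in a start tag in double quotes."""
    return UNQUOTED_ATTR.sub(r'\1="\2"', tag)


print(quote_attributes('<table border=0>'))    # <table border="0">
print(quote_attributes('<table border="0">'))  # <table border="0">
```

Values that are already quoted are left alone because the pattern requires the value to begin with something other than a quote character.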

All non-empty elements must be terminated

Remember when the <p> tag was used to separate paragraphs? Well that was never the intended use for that tag. But many HTML coders, including myself, used it that way. Some web developers still preach against the </p> tag at the end of a paragraph. That's a whole can of worms I'm going to avoid.

What I've learned is that the <p> tag is designed to mark the beginning and end of a paragraph. That makes it a "non-empty" tag since it contains the paragraph text. I still occasionally use it by itself, especially on pages that don't use a style sheet. But in XHTML that's a big no-no.

HTML:
Paragraph 1<p>
Paragraph 2<p>

XHTML:
<p>Paragraph 1</p>
<p>Paragraph 2</p>

In addition to the <p> element, this also applies to list elements, which are often left unterminated: <li></li>, <dt></dt>, and <dd></dd>.
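One rough way to find unterminated paragraphs and list items in existing pages is to count opening and closing tags. This Python sketch uses the standard library's html.parser for that. The class name TagCounter and the function unterminated are invented for the example, and a simple count like this says nothing about where the missing end tags belong.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each element is opened and closed."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def unterminated(markup, tags=("p", "li", "dt", "dd")):
    """Return {tag: number of missing end tags} for the given elements."""
    counter = TagCounter()
    counter.feed(markup)
    counter.close()
    return {
        tag: counter.opened[tag] - counter.closed[tag]
        for tag in tags
        if counter.opened[tag] > counter.closed[tag]
    }


print(unterminated("Paragraph 1<p>\nParagraph 2<p>"))                # {'p': 2}
print(unterminated("<p>Paragraph 1</p>\n<p>Paragraph 2</p>"))        # {}
```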

Elements must nest, not overlap

In practice, HTML browsers don't care whether you overlap elements. For example, if you have a bold tag at the end of a paragraph, it works pretty much the same whether you close the <b> first or the <p> first. With XML and XHTML, you need to close the most recently opened tag first, then the others in succession:

HTML:  <p>here is a bolded <b>word.</p></b>
XHTML: <p>here is a bolded <b>word.</b></p>
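The "most recently opened, first closed" rule is a stack discipline, which makes it easy to check mechanically. Here is a minimal Python sketch of that idea. The names NestingChecker and nesting_errors are hypothetical, and the sketch assumes every element is explicitly closed (as XHTML requires), so old-style unclosed tags such as a bare <br> would show up as false alarms.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Track open elements on a stack and record end tags that arrive out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # elements currently open, innermost last
        self.errors = []  # (end tag found, element that should have closed first)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            expected = self.stack[-1] if self.stack else None
            self.errors.append((tag, expected))


def nesting_errors(markup):
    """Return a list of out-of-order end tags found in the markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


print(nesting_errors("<p>here is a bolded <b>word.</p></b>"))  # [('p', 'b')]
print(nesting_errors("<p>here is a bolded <b>word.</b></p>"))  # []
```

In the first call, the </p> arrives while <b> is still the innermost open element, which is exactly the overlap that XHTML forbids.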

Required elements

These little tidbits are pretty obvious, and I imagine that most of us are doing them already.

  • The <head> and <body> elements cannot be omitted.
  • The <title> element is a required element within the <head> element.
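A quick check for these two rules might look like the Python sketch below. The function names (local_name, child, missing_required) are made up for illustration, the input is assumed to be well-formed already, and the check is far from a full validation against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return element.tag.rsplit("}", 1)[-1]


def child(parent, name):
    """Return the first direct child with the given local name, or None."""
    for element in parent:
        if local_name(element) == name:
            return element
    return None


def missing_required(xhtml):
    """List which of head, title, and body are absent from a well-formed page."""
    root = ET.fromstring(xhtml)
    missing = []
    head = child(root, "head")
    if head is None:
        missing.append("head")
    if head is None or child(head, "title") is None:
        missing.append("title")
    if child(root, "body") is None:
        missing.append("body")
    return missing


page = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head><body><p>Hello</p></body></html>"
)
print(missing_required(page))                                       # []
print(missing_required("<html><head></head><body></body></html>"))  # ['title']
```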
