O'Reilly Network    
 Published on O'Reilly Network (http://www.oreillynet.com/)


Writing RSS 1.0

by Rael Dornfest
08/25/2000

A step-by-step guide to building an RSS 1.0 document by hand. (Updated for RSS 1.0 RC1)

(This article assumes a certain familiarity with the basics of XML markup (the "pointies") and perhaps even a little fiddling with RSS itself. The introductory material is brief, focusing on the distinguishing characteristics of the recently proposed RSS 1.0.)

Introductions

RSS ("RDF Site Summary") is a lightweight multipurpose extensible metadata description and syndication format. Whew, that was a mouthful! Let's take that bit by bit, shall we?

That's about all I'll say about the overall picture of RSS. I do realize that this was a rather brief overview, but since our intention is to actually create an RSS document, I'll leave further introduction to the many wonderful RSS articles already in existence; visit the Resources section below for a list.

RSS Document Structure

A basic RSS document (or "channel") is structurally rather simple:

XML Declaration
Container
  Channel Description
  Image Description (optional)
  Item Description
  Item Description
  ...
  Item Description
  Text Input Description (optional)

Let's start at the top and work our way down, shall we? Since the proposed RSS 1.0 builds on the foundation of RSS 0.9, we'll start by building a 0.9 document and then cover the few basic mechanical changes necessary to bring it into compliance with 1.0; if you're already familiar with RSS 0.9, feel free to breeze through the first part of the tutorial.

Since we want to focus on the markup, let's keep our example as simple as pie. Mmmm ... pie. Our online pie shoppe, pie-r-squared.com, features a continuously changing lineup of delicious pies for download (alright, online ordering). We'll create an RSS feed to syndicate the choices du jour.

XML Declaration: <?xml version="1.0"?>

While XML documents are not required to begin with an XML declaration, it is generally good practice to do so. The declaration says "This is an XML document" and specifies the version thereof -- the current version of XML itself is 1.0.

Now the XML declaration does also afford you the opportunity to specify your preferred encoding type -- the way you'll be dealing with special characters. Unless specified otherwise, RSS 1.0 assumes UTF-8; let's go ahead and add it for pedantic/illustrative purposes. So the first line of our document (make sure it's the first line!) looks a lot like this:

<?xml version="1.0" encoding="utf-8"?>

(By the way, I'll be calling out changes in our evolving document as we go along by highlighting new bits in orange.)

The Container: <rdf:RDF>

Every XML structure can have one and only one outer container -- the "root element." RSS 1.0's root element is borrowed from the earlier 0.9 version. The root element also affords us the opportunity to declare the namespaces we'll be using in our document.

Let's take a pit stop and see what we mean by namespaces.

In my sphere, there exist two Tims, two Jons, and a number of Daves (or variations thereof). To avoid confusion (never mind embarrassment), I have to be sure to clarify which Tim or Jon or Dave I'm referring to. Thank goodness they all have different last names, making Tim O'Reilly distinct from Tim Berners-Lee.

Now, since XML elements and attributes don't have last names, it can be difficult to differentiate between <title> as in the title of a Web page and <title> as in the title of a book. The distinction, using XML namespaces, may be expressed so:

html:title
book:title

Now these namespace prefixes (the bit before the colon) are not particularly useful if you don't have a decent definition for what html and book are. They are, therefore, associated with a URL.

xmlns:html="http://www.w3.org/TR/REC-html40"
xmlns:book="http://www.oreilly.com/book"

This scheme effectively identifies the former as "title as defined by the HTML 4.0 specification" and the latter as "title as in O'Reilly book." URLs are used because they're a convenient way for everyone to invent unique names under their own control. The URLs don't have to point to anything useful, but it's nice if they do (documentation, for instance).

Now, since I work for O'Reilly & Associates, a book company, it's fair to assume that when I say the word "title" in the office I'm referring to a book title. I would always qualify when talking about an HTML document title by saying, well, "Web page title" or the like. So my "default namespace," then, in the book world, is declared in XML like so:

xmlns="http://www.oreilly.com/book"
xmlns:html="http://www.w3.org/TR/REC-html40"

You'll notice a lack of prefix associated with the http://www.oreilly.com/book realm, allowing me to refer to the two types of titles as:

title
html:title

Mind you, the namespace doesn't refer to the "title" itself, but to the vocabulary which defines it. This rather simplistic example should hopefully provide enough on namespaces to get you going; for more information, be sure to visit the Namespaces in XML W3C recommendation and Tim Bray's "XML Namespaces by Example."
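
To see the same idea from a parser's point of view, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The catalog wrapper element and the two title strings are made up for illustration; only the two namespace URLs come from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the book vocabulary is the default namespace,
# and the "html" prefix is bound to the HTML 4.0 URL from the text.
DOC = """<catalog xmlns="http://www.oreilly.com/book"
         xmlns:html="http://www.w3.org/TR/REC-html40">
  <title>A Book Title</title>
  <html:title>A Web Page Title</html:title>
</catalog>"""

root = ET.fromstring(DOC)

# ElementTree expands every name to "{namespace-URL}local-name", so the
# two titles stay distinct even though both are called "title".
tags = [child.tag for child in root]

BOOK = "{http://www.oreilly.com/book}"
HTML = "{http://www.w3.org/TR/REC-html40}"
book_title = root.find(BOOK + "title").text
html_title = root.find(HTML + "title").text

print(tags)
print(book_title, "/", html_title)
```

Note that the prefix itself (html) never shows up in the parsed names; only the URL it stands for does, which is why the choice of prefix is up to you.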

We'll add the default namespace for RSS 0.9 and one for RDF itself to our outer rdf:RDF element and drop the opening and closing tags into our document:

<?xml version="1.0" encoding="utf-8"?>

<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>

</rdf:RDF>

Channel

Welcome to the channel element, a place to describe a few aspects of our RSS channel. We're required to fill in a title, link, and description. How about:

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>

  <channel>
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>
  </channel>

</rdf:RDF>

The channel is titled "Pie-R-Squared," and the link element suggests that whatever renders the channel display that title as a link to our (imaginary) home page.
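
As a rough sketch of how a consumer might read those three required fields back out, the following Python uses the standard library's xml.etree.ElementTree. The function name channel_summary and its error messages are assumptions made for illustration, not part of any RSS tool.

```python
import xml.etree.ElementTree as ET

# Namespace of RSS 0.9 elements, in ElementTree's "{uri}" notation.
RSS09 = "{http://my.netscape.com/rdf/simple/0.9/}"

FEED = """<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>
  <channel>
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>
  </channel>
</rdf:RDF>
"""

def channel_summary(xml_text):
    """Return the channel's three required fields, or raise ValueError."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    channel = root.find(RSS09 + "channel")
    if channel is None:
        raise ValueError("no <channel> element found")
    summary = {}
    for name in ("title", "link", "description"):
        text = channel.findtext(RSS09 + name)
        if text is None or not text.strip():
            raise ValueError("channel is missing <%s>" % name)
        # Strip the indentation whitespace around multi-line values.
        summary[name] = text.strip()
    return summary

summary = channel_summary(FEED)
print(summary["title"], "->", summary["link"])
```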

Image

We can optionally associate a little image (usually 88x33 pixels) with our channel to be used in a My.Netscape-style newsbox rendering. A title element provides text for the image's alt attribute, a link element specifies where the image should hyperlink to, and the url element is the location of the image file itself. We'll use values similar to the channel definition above and add the URL of our imaginary logo.

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>

  <channel>
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>
  </channel>

  <image>
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>

</rdf:RDF>

Item(s)

We finally get to the meat (or tofu) of our RSS channel -- the items meant for syndication. There's not much room here for detail -- just a simple title and link. While we're allowed up to 15 items (1 at a minimum), we'll just add a couple.

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>

  <channel>
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>
  </channel>

  <image>
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>

  <item>
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>

  <item>
    <title>Key Lime</title>
    <link>http://www.pie-r-squared.com/pies/key_lime.html</link>
  </item>

</rdf:RDF>
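
Here is a small sketch, in Python with the standard library's xml.etree.ElementTree, of how the items might be pulled out of such a document and checked against the 1-to-15 limit mentioned above. The function name read_items is hypothetical, and the feed is trimmed to the channel and items to keep the example short.

```python
import xml.etree.ElementTree as ET

# Namespace of RSS 0.9 elements, in ElementTree's "{uri}" notation.
RSS09 = "{http://my.netscape.com/rdf/simple/0.9/}"

FEED = """<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>
  <channel>
    <title>Pie-R-Squared</title>
    <description>Download a delicious pie from Pie-R-Squared!</description>
    <link>http://www.pie-r-squared.com</link>
  </channel>
  <item>
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>
  <item>
    <title>Key Lime</title>
    <link>http://www.pie-r-squared.com/pies/key_lime.html</link>
  </item>
</rdf:RDF>
"""

def read_items(xml_text):
    """Return (title, link) pairs in document order; enforce 1..15 items."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    items = [
        (item.findtext(RSS09 + "title"), item.findtext(RSS09 + "link"))
        for item in root.findall(RSS09 + "item")
    ]
    if not 1 <= len(items) <= 15:
        raise ValueError("expected 1 to 15 items, found %d" % len(items))
    return items

items = read_items(FEED)
for title, link in items:
    print(title, link)
```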

Textinput

Finally we arrive at the textinput element, affording a method for submitting form data to an arbitrary URL -- a script handling the GET method. While I'm not a big fan of the textinput element, we'll throw it in as a searchbox for laughs. Title and description are self-explanatory; link is the URL of the receiving script, and name is the variable to which anything entered into the box is assigned.

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>

  <channel>
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>
  </channel>

  <image>
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>

  <item>
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>

  <item>
    <title>Key Lime</title>
    <link>http://www.pie-r-squared.com/pies/key_lime.html</link>
  </item>

  <textinput>
    <title>Search Pie-R-Squared</title>
    <description>Search our pie catalog...</description>
    <name>keyword</name>
    <link>http://www.pie-r-squared.com/search.pl</link>
  </textinput>

</rdf:RDF>

So entering "chocolate" into the above-specified searchbox would result in an HTTP GET of http://www.pie-r-squared.com/search.pl?keyword=chocolate
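
The URL construction can be sketched in a few lines of Python. The helper name textinput_url is made up for this example, and it assumes the link value carries no query string of its own.

```python
from urllib.parse import urlencode

def textinput_url(link, name, value):
    """Build the GET URL a reader would request for a textinput.

    link  -- the textinput's <link> (the receiving script)
    name  -- the textinput's <name> (the query variable)
    value -- whatever the user typed into the box
    """
    # urlencode escapes characters that are not safe in a query string.
    return link + "?" + urlencode({name: value})

url = textinput_url("http://www.pie-r-squared.com/search.pl",
                    "keyword", "chocolate")
print(url)
```

Letting urlencode do the escaping matters as soon as someone searches for two words: a space in the value is encoded rather than pasted into the URL as-is.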

Onward and upward to 1.0

While getting from our RSS 0.9 compliant document to RSS 1.0 takes only three simple mechanical changes, it opens up a whole new dimension of RSS extensibility and rich metadata relationships which we'll get to in a bit. Let's do the easy mechanical pieces first.

As I mentioned at the beginning of this tutorial, the proposed RSS 1.0 builds on the foundation of RSS 0.9. It's just a 0.9 core with a little "syntactic sugar" mixed in for extensibility's sake.

"Syntactic sugar?" While RSS 0.9 had fledgling hooks for extensibility in its rdf:RDF root element and rudimentary support for namespaces, a spot more syntax is in order for RDF-enabled software to grok (read: understand) the structure of an RSS document. Parsers, not being as smart as you or I, don't know the significance of an item's <link> element, for example, and must be explicitly told: "The <link>'s the URL of the item we're talking about."

"I don't know a thing about RDF." No worries! The RDF markup is a simple mechanical transformation and doesn't require any particular understanding of RDF principles or serialization. That's not to say I don't encourage you to look into RDF, just that it's not necessary. Tim Bray provides a wonderful layperson's guide to RDF and Metadata, and it's a quick read if you have a sec.

"Why should I care about RDF support?" RDF will allow the computers that currently throw information at our feet at an alarming rate to instead work with us to make sense of it all and point us at the bits and pieces we are seeking. Some folks are working very hard to make this happen; with just a few simple additions, you can aid in this effort with no skin off your nose.

New default namespace

RSS 1.0 has its very own namespace, distinct from that of 0.9. We'll change the root rdf:RDF element's default namespace declaration to reflect this difference. (I've left off the rest of the document for brevity.)

<?xml version="1.0" encoding="utf-8"?>

<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
>

...

</rdf:RDF>
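
Because the root element is rdf:RDF in both versions, the default namespace is the one reliable way for software to tell a 0.9 document from a 1.0 document. A minimal sketch in Python (the function name rss_version and the cut-down test documents are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

# The two default namespaces used in this tutorial.
RSS_VERSIONS = {
    "http://my.netscape.com/rdf/simple/0.9/": "0.9",
    "http://purl.org/rss/1.0/": "1.0",
}

# Cut-down document; "%s" is filled in with a default namespace URL.
TEMPLATE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="%s"
>
  <channel>
    <title>Pie-R-Squared</title>
  </channel>
</rdf:RDF>"""

def rss_version(xml_text):
    """Return "0.9", "1.0", or None, judged by the channel's namespace."""
    root = ET.fromstring(xml_text)
    for child in root:
        # Tags look like "{namespace-URL}local-name".
        uri, _, local = child.tag.lstrip("{").partition("}")
        if local == "channel":
            return RSS_VERSIONS.get(uri)
    return None

v09 = rss_version(TEMPLATE % "http://my.netscape.com/rdf/simple/0.9/")
v10 = rss_version(TEMPLATE % "http://purl.org/rss/1.0/")
print(v09, v10)
```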

What's it all rdf:about?

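Each resource (channel, image, item(s), textinput) we describe must have an associated URI to specify canonically what it is we're describing. This is accomplished in RDF by giving it an rdf:about attribute -- as in "we're talking about this URI." In our example, the channel is identified by the URL of the RSS document itself, the image by the URL of the image file, and each item and the textinput by the same URL as its link element.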

This leaves us with:

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
>

  <channel rdf:about="http://www.pie-r-squared.com/rss.rdf">
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>
  </channel>

  <image rdf:about="http://www.pie-r-squared.com/images/logo88x33.gif">
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>

  <item rdf:about="http://www.pie-r-squared.com/pies/pecan.html">
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>

  <item rdf:about="http://www.pie-r-squared.com/pies/key_lime.html">
    <title>Key Lime</title>
    <link>http://www.pie-r-squared.com/pies/key_lime.html</link>
  </item>

  <textinput rdf:about="http://www.pie-r-squared.com/search.pl">
    <title>Search Pie-R-Squared</title>
    <description>Search our pie catalog...</description>
    <name>keyword</name>
    <link>http://www.pie-r-squared.com/search.pl</link>
  </textinput>

</rdf:RDF>
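
Since this step really is mechanical, it can be scripted. The sketch below, in Python with xml.etree.ElementTree, follows the pattern used in the listing above: the channel gets the URL of the RSS file, the image gets its url value, and items and the textinput get their link value. The function name add_about is hypothetical, and the feed is shortened to one item.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"

FEED = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
>
  <channel>
    <title>Pie-R-Squared</title>
    <link>http://www.pie-r-squared.com</link>
  </channel>
  <image>
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>
  <item>
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>
  <textinput>
    <title>Search Pie-R-Squared</title>
    <name>keyword</name>
    <link>http://www.pie-r-squared.com/search.pl</link>
  </textinput>
</rdf:RDF>"""

def add_about(root, channel_url):
    """Give each top-level resource an rdf:about attribute, in place."""
    for el in root:
        local = el.tag.rpartition("}")[2]
        if local == "channel":
            uri = channel_url              # URL of the RSS file itself
        elif local == "image":
            uri = el.findtext(RSS + "url")
        elif local in ("item", "textinput"):
            uri = el.findtext(RSS + "link")
        else:
            continue                       # leave unknown elements alone
        el.set(RDF + "about", uri)
    return root

root = add_about(ET.fromstring(FEED), "http://www.pie-r-squared.com/rss.rdf")
abouts = [(el.tag.rpartition("}")[2], el.get(RDF + "about")) for el in root]
for name, uri in abouts:
    print(name, uri)
```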

Pulling it all together

Now the last thing to do to make your RSS document RDF happy is to tie together the various elements to the RSS channel itself.

"But aren't they tied together by virtue of being in the same RSS document?" Good question! While everything's syndicated/published in one document, one can't assume it'll all stay together as it travels the Net. Aggregators decouple items from their parent channels, stirring them in various combinations to cook up new RSS feeds for resyndication, incorporation into a Website, commentary, etc. RDFers are really gathering data ("stuff said about a particular URI") and merging them into large data structures for the purposes of poking and prodding to extract some meaningful relationships.

What I like to call the channel's "table of contents" allows for this decoupling and munging while retaining some memory of an RSS item's parentage. The idea is similar to referencing sources of quotes you've used in your term paper; that [14] after the quote serves to preserve the association between the words you quoted and the source thereof.

Let's start with the optional elements, image and textinput; this step is only necessary for the optional element(s) you do use. We simply copy each element's opening tag, replace the rdf:about with rdf:resource, make it an empty element by adding a / just before the closing angle-bracket, and paste it inside the <channel> element. A couple of copies, edits, and pastes later:

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
>

  <channel rdf:about="http://www.pie-r-squared.com/rss.rdf">
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>

    <image rdf:resource="http://www.pie-r-squared.com/images/logo88x33.gif" />
    <textinput rdf:resource="http://www.pie-r-squared.com/search.pl" />

  </channel>

  <image rdf:about="http://www.pie-r-squared.com/images/logo88x33.gif">
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>

  <item rdf:about="http://www.pie-r-squared.com/pies/pecan.html">
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>

  <item rdf:about="http://www.pie-r-squared.com/pies/key_lime.html">
    <title>Key Lime</title>
    <link>http://www.pie-r-squared.com/pies/key_lime.html</link>
  </item>

  <textinput rdf:about="http://www.pie-r-squared.com/search.pl">
    <title>Search Pie-R-Squared</title>
    <description>Search our pie catalog...</description>
    <name>keyword</name>
    <link>http://www.pie-r-squared.com/search.pl</link>
  </textinput>

</rdf:RDF>

The items piece is only marginally more complicated, and it will look familiar if you've ever used HTML's ordered lists: <ol> <li>something</li> ... </ol>

We create a new element called items which will hold our list of, well..., items. Since this will be an ordered list, we'll also wrap them in RDF's concept of a sequence, rdf:Seq. We then do precisely the same thing for each item that we did for image and textinput, except that we replace "item" with "rdf:li" (read: list item, borrowed from the RDF namespace) and place these entries in our preferred order inside our sequence list.

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
>

  <channel rdf:about="http://www.pie-r-squared.com/rss.rdf">
    <title>Pie-R-Squared</title>
    <description>
      Download a delicious pie from Pie-R-Squared!
    </description>
    <link>http://www.pie-r-squared.com</link>

    <image rdf:resource="http://www.pie-r-squared.com/images/logo88x33.gif" />
    <textinput rdf:resource="http://www.pie-r-squared.com/search.pl" />

    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.pie-r-squared.com/pies/pecan.html" />
        <rdf:li rdf:resource="http://www.pie-r-squared.com/pies/key_lime.html" />
      </rdf:Seq>
    </items>

  </channel>

  <image rdf:about="http://www.pie-r-squared.com/images/logo88x33.gif">
    <title>Pie-R-Squared du Jour</title>
    <url>http://www.pie-r-squared.com/images/logo88x33.gif</url>
    <link>http://www.pie-r-squared.com</link>
  </image>

  <item rdf:about="http://www.pie-r-squared.com/pies/pecan.html">
    <title>Pecan Plenty</title>
    <link>http://www.pie-r-squared.com/pies/pecan.html</link>
  </item>

  <item rdf:about="http://www.pie-r-squared.com/pies/key_lime.html">
    <title>Key Lime</title>
    <link>http://www.pie-r-squared.com/pies/key_lime.html</link>
  </item>

  <textinput rdf:about="http://www.pie-r-squared.com/search.pl">
    <title>Search Pie-R-Squared</title>
    <description>Search our pie catalog...</description>
    <name>keyword</name>
    <link>http://www.pie-r-squared.com/search.pl</link>
  </textinput>

</rdf:RDF>
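
The copy, edit, and paste routine for the channel's "table of contents" can also be sketched in Python with xml.etree.ElementTree. The function names add_table_of_contents and toc_resources are made up for this example, the feed is abbreviated, and each list entry is written in the RDF namespace (rdf:li), as RDF's container syntax expects.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"

FEED = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
>
  <channel rdf:about="http://www.pie-r-squared.com/rss.rdf">
    <title>Pie-R-Squared</title>
    <link>http://www.pie-r-squared.com</link>
  </channel>
  <image rdf:about="http://www.pie-r-squared.com/images/logo88x33.gif">
    <title>Pie-R-Squared du Jour</title>
  </image>
  <item rdf:about="http://www.pie-r-squared.com/pies/pecan.html">
    <title>Pecan Plenty</title>
  </item>
  <item rdf:about="http://www.pie-r-squared.com/pies/key_lime.html">
    <title>Key Lime</title>
  </item>
  <textinput rdf:about="http://www.pie-r-squared.com/search.pl">
    <title>Search Pie-R-Squared</title>
  </textinput>
</rdf:RDF>"""

def add_table_of_contents(root):
    """Reference every top-level resource from inside <channel>."""
    channel = root.find(RSS + "channel")
    # Optional elements: only referenced if they are actually present.
    for name in ("image", "textinput"):
        el = root.find(RSS + name)   # direct children of rdf:RDF only
        if el is not None:
            ET.SubElement(channel, RSS + name,
                          {RDF + "resource": el.get(RDF + "about")})
    # Items go into an ordered rdf:Seq, in document order.
    seq = ET.SubElement(ET.SubElement(channel, RSS + "items"), RDF + "Seq")
    for item in root.findall(RSS + "item"):
        ET.SubElement(seq, RDF + "li",
                      {RDF + "resource": item.get(RDF + "about")})
    return root

def toc_resources(root):
    """Return the URIs listed in the channel's rdf:Seq, in order."""
    seq = root.find(RSS + "channel").find(RSS + "items").find(RDF + "Seq")
    return [li.get(RDF + "resource") for li in seq.findall(RDF + "li")]

root = add_table_of_contents(ET.fromstring(FEED))
toc = toc_resources(root)
print(toc)
```

The same toc_resources helper doubles as a sanity check: every URI it returns should match the rdf:about of exactly one item in the document.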

That's all, folks!

And we're done! Now, that wasn't painful, was it? If you have any questions or recommendations as to how to make this tutorial even easier, please feel free to drop me a line. If you'd like to participate in the further development and fine-tuning of RSS 1.0, point your browser at the RSS-DEV Working Group mailing list and dive on in!

Conversion without being a convert

Want 1.0-compliance without becoming a convert? Then one of these Web-based RSS conversion tools just may be for you, my friend.

For RSS 1.0-compliant tools, libraries, XSLT stylesheets, code snippets, and more, be sure to visit the RSS-DEV tool shed.

Resources

For many more resources, visit the Resources section of the RSS 1.0 Specification proposal.

Suggestions, bug reports, and other feedback

We welcome any constructive criticism you might offer. Please post your suggestions, bug reports, praise, and other feedback to the O'Reilly Network RSS Forum.

Rael Dornfest is Founder and CEO of Portland, Oregon-based Values of n. Rael leads the Values of n charge with passion, unearthly creativity, and a repertoire of puns and jokes — some of which are actually good. Prior to founding Values of n, he was O'Reilly's Chief Technical Officer, program chair for the O'Reilly Emerging Technology Conference (which he continues to chair), series editor of the bestselling Hacks book series, and instigator of O'Reilly's Rough Cuts early access program. He built Meerkat, the first web-based feed aggregator, was champion and co-author of the RSS 1.0 specification, and has written and contributed to six O'Reilly books. Rael's programmatic pride and joy is the nimble, open source blogging application Blosxom, the principles of which you'll find in the Values of n philosophy and embodied in Stikkit: Little yellow notes that think.



Copyright © 2009 O'Reilly Media, Inc.