RDF and Metadata (XML.com)
Over the last few weeks I've been privileged to follow the work of a group of developers creating a proposal for the next generation of RSS. If you've not heard of RSS before, it's a popular XML format for describing items of content on a web site.
Applications such as My Netscape, My UserLand, and O'Reilly's Meerkat can use these descriptions to mix and match headlines from many web sites. For web authors, RSS has been a great way to get new traffic. For web surfers, aggregators like Meerkat and Moreover provide a convenient way to get news on a topic of interest from multiple providers.
Portal toolkits like Apache Jetspeed have incorporated RSS support, making it trivially easy for headline content to be imported into a portal.
So, if RSS was doing just fine, why a new version? Well, the success of RSS caused it to be used in many ways it wasn't originally designed for. The original purpose of RSS was only to support the My Netscape portal. As it started to be used for the propagation of metadata about web site content to other providers like UserLand, users of RSS started to feel its limits.
For example, a financial web site may well wish to annotate each headline with a stock symbol. RSS 0.91 had no way of doing this. Moreover, web sites want to indicate the author of an article, which wasn't supported either.
Unfortunately, as Netscape was no longer developing the format, these frustrations went unresolved.
The goal of RSS 1.0 has been to fix some problems, provide an extensible framework for the future, and bring RSS into community ownership.
Perhaps the most important change in RSS 1.0 is the move to use the W3C's Resource Description Framework (RDF) to encode the file. This effectively adds a little syntax to XML in order to provide modularity and extensibility to the format, and makes RSS a milestone in the move towards the Semantic Web vision of Tim Berners-Lee, inventor of the World Wide Web.
Several months ago I argued strongly that the next generation of RSS should use RDF, so I'm obviously happy to see this. Beyond semantic web feel-good factors, the new RSS 1.0 has a lot to offer the developer. The use of RDF has enabled a modular approach so, for instance, financial news providers can invent a module for sending ticker symbols and include it in their feed. Unlike with straight XML parsing, this new information will simply be ignored by processors that don't understand it or don't want to use it. This is a big advantage over RSS 0.91, which was held in a straitjacket by its DTD.
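To make the ticker-symbol scenario concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not part of the proposal: the `tk:` module and its namespace `http://example.org/ticker#` are hypothetical, invented for this example, as are the feed contents and the `read_items` helper. It also uses plain namespace-aware XML parsing rather than a full RDF toolkit, which would read the same file as a set of statements. The point it shows is only that a processor looking for the core RSS 1.0 elements passes over the module's element, while a processor that knows the module can pick it up.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
# Hypothetical module namespace, invented for this example only.
TICKER = "http://example.org/ticker#"

# A tiny illustrative feed: one channel, one item, one module element.
FEED = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns="{RSS}" xmlns:tk="{TICKER}">
  <channel rdf:about="http://example.org/news.rdf">
    <title>Example Financial News</title>
    <link>http://example.org/</link>
    <description>Illustrative feed</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/story/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/story/1">
    <title>Example Corp posts quarterly results</title>
    <link>http://example.org/story/1</link>
    <tk:symbol>EXMP</tk:symbol>
  </item>
</rdf:RDF>
"""


def read_items(xml_text, extra=None):
    """Return one dict per item, holding only the fields the caller asked for."""
    # Core RSS 1.0 fields every processor in this sketch understands.
    wanted = {"title": f"{{{RSS}}}title", "link": f"{{{RSS}}}link"}
    # Optional module fields, keyed by a name of the caller's choosing.
    wanted.update(extra or {})
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall(f"{{{RSS}}}item"):
        record = {}
        for name, tag in wanted.items():
            value = item.findtext(tag)
            if value is not None:
                record[name] = value
        items.append(record)
    return items


# A processor that knows nothing about the ticker module.
basic = read_items(FEED)
# A processor that has been told about the (hypothetical) ticker module.
aware = read_items(FEED, {"symbol": f"{{{TICKER}}}symbol"})

print(basic)
print(aware)
```

Neither call needs the other to change: the first processor never asks for the module's namespace and so never sees it, which is the kind of extensibility a fixed DTD could not offer.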
RSS 1.0 is at the "proposed" stage. The authors, having done significant work to craft a coherent specification, have opened the process to public participation in order to create a final spec. Interested parties can join the working group.
The authoring group is also working hard to provide tool support for RSS 1.0. Jonathan Eisenzopf has already upgraded his Perl XML::RSS module, the basis of many applications that generate or process RSS, to support the new format. Online validators, RSS 0.91 converters, and other tools are expected to be made available shortly, as is support for the new format from RSS aggregators. Rael Dornfest's Meerkat already supports RSS 1.0 -- as one would hope from one of the spec authors -- and registries like xmlTree are likely to support RSS 1.0 in the very near future.
I'm particularly excited about the possibilities that the use of RDF opens up. Building on some recent experiments converting e-mail metadata into RDF, I'd like to be able to tap into a worldwide database of web site content metadata too.
The RSS 1.0 proposal is a sensible step forward in ensuring the longevity and widespread application of one of the web's few successful metadata formats. It has done this by opening up the doors for participation by special interest communities, without requiring centralized authorization for the addition of new tags. I hope and expect to see RSS being adopted by new communities more diverse than the current web development community. Distribution of metadata is an important problem for many groups, and RSS 1.0 provides a solid framework on which to build.
Edd Dumbill is co-chair of the O'Reilly Open Source Convention. He is also chair of the XTech web technology conference. Edd conceived and developed Expectnation, a hosted service for organizing and producing conferences. Edd has also been Managing Editor for XML.com, a Debian developer, and GNOME contributor. He writes a blog called Behind the Times.
Copyright © 2009 O'Reilly Media, Inc.