At the Semantic Technologies conference in San Jose I attended an interesting presentation entitled “persistent identifiers for the real web”. XML often uses URLs for identifying schema namespaces, and I suppose could be credited for influencing RDF’s practice of using URLs for identifying resources. In using RDF to describe and annotate things a problem arises…are you describing the web page, or the thing the web page is talking about. For example, if I assert that:
<http://tcowan.myopenid.com> :likes <http://www.myspace.com/lettucefunk>
Does that mean I like the web page or the band the page is about? As you’re traversing the semantic web it’s going to be advantageous to distinguish between content assets and the real world entities they may represent. Their proposed solution involves PURLs (http://purl.org for example). Normally a permanent URL redirects you to the best representation of the resource via a 302 response. They propose that when the PURL represents a real world entity that the response be given as a 303 (see also). The computer agent can then understand that the “thing” is a real world entity, and that the redirect is not to the real thing, but to another web resource about the thing.
I’m very much in favor of permanent URLs. Otherwise all our assertions will become disjointed as links break, or we’ll have to keep our own “archives” of dead links and sites. I also appreciate the simplicity of Dave and Eric’s proposal, however, I’m not so sure this is really the best way to solve identifiers for real world things. Consider books for example…what would be the best way to represent a book, it’s URL on Amazon or it’s ISBN number as a URN? If we use the Amazon URL we can’t be sure it’s a book, it might be binoculars or a coffee table. The URN however makes it clear:
The urn namespace indicates that it’s a book, without a doubt. If PURL were to host a “see also” permanent URL scheme for each declared URN namespace we’d be able to visit that URL to find out more…
But on the practical web, we don’t use PURLs or URNs for books, we use the Amazon.com url. I think in practical terms things are going to be represented on the web by the domain that has the best collection with the best open content. Perhaps the best approach in the end is to take advantage of blank nodes.
<http://tcowan.myopenid.com> :likes _:a
<http://www.myspace.com/lettucefunk> :describes _:a
_:a a :funkBand
In English, http://tcowan.myopenid.com likes the funk bank described by http://www.myspace.com/lettucefunk. Now we’ve made it clear, and without the use of PURLs or some new PURL redirection strategy.