REST offers a great way to build simple applications that Create, Read, Update, and Delete resources. But what if you want to get at part of a resource?
I’m having a bit too much fun in working with Rails 2.0’s RESTful approach. First, I enjoyed the way it lets applications spew XML without spending forever pondering schemas and agreements. Now, I’m starting to wish for a counterpart to ActiveRecord that works with XML documents instead of relational databases.
I’m not yet nearly enough of a Ruby programmer to build that, but it did get me thinking about some old technology that solves a problem ActiveRecord never has to address.
A database row is a simple thing, even if it enables immense complexity. It contains named fields - no more than one to a given name, usually conforming to a rather predictable schema. There’s nothing floating between the fields, and every row contains the same set of fields. (They may be empty, of course, but they’re clearly defined.)
An XML document - or even an XML document fragment - is potentially incredibly complicated. While many XML documents are regular and relatively simple, the ones that aren’t simply holding data as it moves between databases are often very complicated. XML elements are kind of like fields, sure, but:
- There might be multiple elements with the same name (and rather different content structure);
- There might be text (not just whitespace either) between the elements;
- There’s all kinds of metadata in the attributes on those elements;
- And most techniques for addressing parts of XML documents have at least the possibility of selecting more than one piece in the same document!
Nonetheless, it seems like the basic operations most people would like to perform on these documents (and other loosely-structured resources) are the same operations people want to perform on database records: Create, Read, Update, Delete. CRUD is everywhere, and CRUD is good.
Typically, though, an XML document is treated as a single resource. A book might assemble itself using entities or XIncludes that pull in chapters, of course, and those chapters could be individually addressed as resources, but that has limits. Though it’s possible, I don’t think anyone wants to write paragraphs in which each sentence is included from a separate file using one of those mechanisms. As soon as you hit mixed content, the entity approach breaks down anyway. (Other formats, like JSON, don’t have entities but share a few of the same problems.)
So how can developers build RESTful applications that address parts of documents?
One approach that’s getting a lot of discussion in the last few days is to add a new verb, PATCH, to HTTP. As soon as I started reading about it the hairs on the back of my neck stood up, and red flashing lights and sirens went off in my head. Visions of infinite diff formats danced in my head, triggering an avalanche of memories from the too-many conference sessions I’ve attended on XML diff techniques.
It seems to me that the problem is not that developers want to do something that can’t be expressed with a RESTful verb - in this case, probably Update (PUT, in HTTP terms). The problem is that developers can’t address the resource on which they want to work with sufficient granularity given their current set of tools and agreements.
Though I’ve inveighed against the many many sins of XPointer for years, that incredibly broken process was at least working to solve the problem of addressing XML documents at a very fine granularity, extending the tool most commonly used on the client side for this: fragment identifiers.
There are some glitchy things about fragment identifiers that are great kindling for a flame war. Clients are the only tools that process them, and the server never actually sees them. Perhaps most complicating of all, every MIME type is entitled to its own flavor of fragment identifier. XML can use different syntax (and everything) from HTML, JSON, JPEG, PNG, etc. Combine that with content-negotiation, where a server picks what representation of a document to send a client, and it’s a recipe for endless spin and fruitless arguments.
The only reason that fragment identifiers work is that we typically only use them when we have some certainty what’s going to be coming down the pipe for a given request. For the kinds of situations I’m thinking about, that’s mostly okay - or at least when it breaks, it’ll probably be clear what happened.
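A rough sketch of that client-side split, in Python rather than Ruby and using only the standard library: `urlsplit` separates the part of a URL that would go into the request from the fragment the client keeps for itself. The helper name `request_target` is made up for this illustration, and real clients do more than this.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return (what the server would see, what stays on the client)."""
    parts = urlsplit(url)
    # The request carries only the path and, if present, the query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The fragment is never part of the request; the client keeps it.
    return target, parts.fragment


if __name__ == "__main__":
    sent, kept = request_target(
        "http://simonstl.com/book.xml#xpath1(//book/chapter/title)"
    )
    print("sent to server:", sent)
    print("kept by client:", kept)
```

Anything after the `#` ends up in the second value, so a server that should act on a fragment has to receive it some other way.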
So, since fragment identifiers don’t normally get sent to the server, how can we use them to get fragments and only fragments from the server?
The key is a shift that was mentioned in the earliest drafts for XLink (from which XPointer was eventually separated). They offered three different ways to identify a fragment within a URL:
If the XPointer is provided, the designated resource is a “sub-resource” of the containing resource; otherwise the designated resource is the containing resource.
- If the connector is “#”, this signals an intent that the containing resource is to be fetched as a whole from the host that provides it, and that the XPointer processing to extract the sub-resource is to be performed on the client, that is to say on the same system where the linking element is recognized and processed.
- If the connector is “?XML-XPTR=”, this signals an intent that the entire locator is to be transmitted to the host providing the resource, and that the host should perform the XPointer processing to extract the sub-resource, and that only the sub-resource should be transmitted to the client.
- If the connector is “|”, no intent is signaled as to what processing model is to be used to go about accessing the designated resource.
The notion of “subresources” didn’t go over very well, though I think it’s needed. The first of these bullets is the classic client-only fragment identifier approach. The second shifts the fragment identifier into the query string, where the server has access to it. (I believe that’s true of PUT, POST, and DELETE as well as GET, though I couldn’t find much in the way of discussion or examples.) The last was an interesting idea that never went anywhere.
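To make the three connectors concrete, here is a minimal Python sketch that splits a locator at whichever connector appears first and reports where the draft intended the XPointer to be processed. The function name `split_locator` and the labels "client", "server", and "unspecified" are my own shorthand for the draft's wording, not anything the draft defines.

```python
# Connector -> where the early XLink draft intended the XPointer to be processed.
CONNECTORS = {
    "#": "client",
    "?XML-XPTR=": "server",
    "|": "unspecified",
}


def split_locator(locator):
    """Split a locator into (containing resource, connector, XPointer).

    The earliest connector wins, so a "|" inside the XPointer itself
    (an XPath union, say) is not mistaken for the connector.
    """
    hits = [(locator.find(c), c) for c in CONNECTORS if c in locator]
    if not hits:
        # No XPointer: the designated resource is the containing resource.
        return locator, None, None
    index, connector = min(hits)
    return locator[:index], connector, locator[index + len(connector):]


if __name__ == "__main__":
    for example in (
        "http://simonstl.com/book.xml#xpath1(//book/chapter/title)",
        "http://simonstl.com/book.xml?XML-XPTR=xpath1(//book/chapter/title)",
        "http://simonstl.com/book.xml",
    ):
        resource, connector, xpointer = split_locator(example)
        print(resource, CONNECTORS.get(connector), xpointer)
```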
The idea that intrigues me at the moment is putting the fragment identifier into the query string. A URI referencing fragments might presently look like:
http://simonstl.com/book.xml#xpath1(//book/chapter/title)
A client processor that understood the xpath1 scheme (like Mozilla) would currently know that that referenced all the chapter titles in a book (to say it in English.)
To let the server know that something was directed toward that same set of titles, the query string syntax might look like:
http://simonstl.com/book.xml?XML-XPTR=xpath1(//book/chapter/title)
A server might then return just the titles, or perform an operation on those title elements if a verb other than GET was used. (Be very careful with identifiers that reference multiple fragments!)
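What the server side of that might look like, as a sketch under several assumptions: the sample book, the function names, and the rule that verbs other than GET must hit exactly one node are all invented for illustration. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and no absolute paths, so the sketch hangs the document under a stand-in node and evaluates the path relative to it; a real implementation would want a full XPath processor.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

# Hypothetical sample document standing in for book.xml.
BOOK_XML = """<book>
  <title>REST and XML</title>
  <chapter><title>Fragments</title><para>Some text.</para></chapter>
  <chapter><title>Queries</title><para>More text.</para></chapter>
</book>"""

XPATH1 = re.compile(r"^xpath1\((.*)\)$", re.DOTALL)


def xpointer_from_url(url):
    """Return the XPath inside ?XML-XPTR=xpath1(...), or None if absent."""
    values = parse_qs(urlsplit(url).query).get("XML-XPTR")
    if not values:
        return None
    match = XPATH1.match(values[0])
    if match is None:
        raise ValueError("unsupported XPointer scheme: " + values[0])
    return match.group(1)


def select(xml_text, xpath):
    """Evaluate a simple XPath against the document (ElementTree subset only)."""
    doc = ET.Element("document")  # stand-in for the XPath document node
    doc.append(ET.fromstring(xml_text))
    return doc.findall("." + xpath if xpath.startswith("/") else xpath)


def handle(method, url, xml_text):
    """GET returns every match; other verbs insist on exactly one target."""
    xpath = xpointer_from_url(url)
    if xpath is None:
        return [ET.fromstring(xml_text)]  # no fragment: the whole resource
    nodes = select(xml_text, xpath)
    if method != "GET" and len(nodes) != 1:
        raise ValueError(f"{method} needs exactly one target, got {len(nodes)}")
    return nodes


if __name__ == "__main__":
    url = "http://simonstl.com/book.xml?XML-XPTR=xpath1(//book/chapter/title)"
    for node in handle("GET", url, BOOK_XML):
        print(ET.tostring(node, encoding="unicode").strip())
```

Refusing a non-GET request that matches several nodes is just one conservative way to heed the warning about identifiers that reference multiple fragments; a server could equally well decide to apply the operation to every match.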
I have a lot of thinking to do on this, and hopefully eventually some coding, but this seems worthwhile. There’s been some interesting conversation on xml-dev around this, and hopefully some of these nearly 10-year-old ideas can finally get traction.
It may be that we finally need them!


Some XML database implementations support 'well known' query string variable names. For example, eXist provides ?_query=... where ... contains an XQuery expression or the location of an XQuery resource. XML databases increasingly support a native REST interface too (eXist does), or you can use your preferred framework such as Restlet or Rails.
A RESTful way to work with XML Schema is something to ponder. Speaking of Ruby on Rails, XML Schema scaffolding and using REST for maintaining state sounds great. In my industry, the latest HL7 version 3 standards rely heavily on XML. Thus, if we could work with XML like we do with databases, then life would be good.
It works as long as you have a format that has a processor for a notation in which a given address type works. The problem is all of the formats for which a different notation requires a different process even if the address type can be mapped to it.
The endless arguments about this a decade ago stopped from fatigue. The web designers punted it away and documents that cannot be linked to using sub-resource links don't have them. Lazily, it evolved into 'send them the whole document and if they don't have a processor, they get an error from the client saying there is no registered handler'. Then they move on.
The other side is that some addresses can be declared for the published or on-the-wire version that won't work in a running instance. For example, you can use an id to point to an object in a real-time 3D scene graph, but using other DOM methods is not likely to work because such scenes, being real time, are in flux, constantly inserting and deleting objects.
So properly, these ideas have to be scoped or you are doing Hytime again. I know you don't want that. ;-)
I don't think reinventing HyTime is necessary, though yes, those problems will resurface if you try to solve too many problems simultaneously.
One of the nice things about the ?XML-XPTR query notation is that it could be a "well known" query string, and it also identifies a fair amount about the content being identified. It's likely to be applied to an XML document, using XPointer notation to identify the fragment.
It probably helps that I'm not expecting these things to survive in independent contexts. If there isn't a server available to run this query against, you're just not going to get anything - period. If a document is in flux, you're probably going to get an error message. That's pretty much what happens now, and the world hasn't collapsed as a result.
I don't expect to be "doing HyTime again". Working with XML is just complicated enough to reveal corner cases, but not so complicated that the corner cases usually matter. That's probably why a lot of people are using XML, while HyTime is a memory for specialists.
>> The problem is that developers can’t address the resource on which they want to work with sufficient granularity given their current set of tools and agreements.
You are right. However, I don't think using xpath such as "#xpath1(//book/chapter/title)" in URIs is desirable for addressing partial resources. Such a URI design ties the URIs to a particular type of representation (i.e. XML in your case), and goes against content-negotiation.
"Such a URI design ties the URIs to a particular type of representation (i.e. XML in your case), and goes against content-negotiation."
Yep. It has to. Unfortunately, you can't really edit pieces of something unless you know how the pieces are defined.
Fragment identifiers are by their nature MIME-type dependent, which is why I like the ?XML-XPTR approach. The first piece says what your expected type context is (XML), and the second piece says what kind of fragment identifier you're using (XPointer).
(No, I don't think writing ?application/xml-XPointer would be an improvement.)
Effectively, though it may be redundant with other headers, this approach makes it extremely clear what kind of document you expect the fragment identifier to grapple with.
And it has to be clear, or there's no point whatsoever in bothering with this.
>> Yep. It has to. Unfortunately, you can't really edit pieces of something unless you know how the pieces are defined.
The client needs to know what it is changing, but it should be possible to express that independently of the types of representations possible for the resource, preferably by relying on a Content-Type for the PATCH request and not the URI. IMHO, URIs and content types should be treated as independent axes for dealing with various representations of a given resource.
Sincerely,
Subbu
"The client needs to know what it is changing, but it should be possible to express that independently of the types of representations possible for the resource, preferably by relying on a Content-Type for the PATCH request and not the URI."
First, I'm not using PATCH here. I think I made it pretty clear that I think PATCH is an extremely bad idea. "Red flashing lights" and all that.
Second, I don't think it's actually possible to express requests at the necessary level of granularity without some understanding of what it is I'm actually addressing. You just can't get ahold of anything much useful.
Now maybe it's conceivable that I could use a Content-Type header to say that I want this to apply to the XML and not an HTML, SVG, or JPEG representation - but I don't think that as a matter of practice that I should be using the same resource identifier for sub-resources in different formats.
I'm much more comfortable breaking the rules for URIs and content-negotiation than I am in breaking the rules for how REST works, if you really want a straight answer. Fragment identifiers have been a nightmare corner in the URI conversation for a very long time now, and I don't think that's ever likely to change.
"I don't expect to be "doing HyTime again". Working with XML is just complicated enough to reveal corner cases, but not so complicated that the corner cases usually matter. That's probably why a lot of people are using XML, while HyTime is a memory for specialists"
That's a lazy dodge, Simon. The means to address subresources in markup are part of Hytime. What Hytime does is to go beyond markup and consider the problem of locator addressing in non-markup formats. The problem isn't XML. That's easy. The problem is exactly what is suggested in the follow ons: content types. Without format knowledge, no locator is reliable. The link can be but not the locator. This is the sense in which the URL/URI is uniform. The locator or resource representation is still required, thus content types or NOTATIONS.
Yes, just do XML. But one favor: if other standards for notations insist on having their own locator types, then the test is not whether they conform to the XML sense of these, but whether they are behaviorally consistent with the REST verbs. Yes?
BTW: addressing into a real-time 3D scene graph will not return errors if the locators ignore XML locations and rely on the namespace. It is dynamic. Time is a necessary component of the address.
"That's a lazy dodge"?
Recognizing that HyTime went off a cliff because of its enormous and effectively infinite scope, and trying to avoid that fate, is a lazy dodge?
Remember, the Web itself was this useless little toy that couldn't possibly do anything relative to the much more powerful facilities that HyTime had. And even XLink vanished while the Web's mediocre linking soldiered on, ever stronger.
The rest of your comment is issues I've already addressed. I don't think it's sane to attempt to address fragments of something that's in constant motion, unless it has an underlying addressable structure that isn't in motion.
I certainly wouldn't object to other types with addressable components working along similar lines for their own content either. JSON seems like a good first case.
"Recognizing that HyTime went off a cliff because of its enormous and effectively infinite scope, and trying to avoid that fate, is a lazy dodge?"
Actually, yes. Your view won't change and I won't attempt to change it here. I simply am pointing out that the only way to make your proposal logical is to scope it properly.
Your title is "Addressing Fragments in REST". Your content is "Addressing Fragments in XML Using REST Verbs". So once again, if other standards for notations insist on having their own locator types, then the test is not whether they conform to the XML sense of these, but whether they are behaviorally consistent with the REST verbs. Yes?
In other words, formats beyond or different from XML devised for their own processor requirements may in fact use locator types not described in your proposals and still work with REST. REST isn't about XML linking and location systems. These are separable, yes?
Len - you've got to start somewhere, and with something solvable.
I lack HyTime's ambitions, and frankly think the lesson of HyTime is that solving the full set of problems they addressed requires a God-like level of knowledge that simply isn't possible for now.
As I've said repeatedly, I'd be happy to see this approach reused for other formats, but I think trying to solve the problem for all formats is simply never going to happen. If the developers of other formats want to go in another REST-compatible direction instead, great for them.
Let's solve granularity one format at a time. The size, shape, and specification of the granules vary from format to format - there's no getting around that.
You misunderstand. I am not saying use Hytime. Hytime can't be implemented politically. It requires too comprehensive an agreement which experience shows can't be achieved on the web. There are too many preconditions that have to be in place. But it is useful to understand the technical preconditions and leave the politics of the early days of the web aside. It was a naive design. Naive designs work as long as the implementor ambitions are similarly limited. That doesn't mean they don't change or the naive system doesn't work. But it is smart to know exactly where it runs out of steam. If that weren't the case, you wouldn't be resurrecting these topics or asking for a ten year review.
I use the example because it is the standard where the necessity of having locator types as distinct from links is made most clear. REST skirts that issue. You are trying to solve link/locators for XML, a problem solved by other unpopular standards in the past (XLink, XPointer, both derived from Hytime). What is different this time?
1. Frank admission that this is for XML only. Some locator types won't work for some flavors of HTML. Some won't work for PDF. Knowing in advance which locator types are supported by the specific implementation of a handler for a specific format is required to use these reliably. REST does not guarantee any of that for sub-resources. It can't. That is why all the discussions of 'what is a resource?' are ratholes.
2. Establishing the preconditions any format must meet to work with the locator types you are proposing or resurrecting.
BTW: if an XML 2.0 removed the requirement to have a root, how does that affect your locator types?
To your point 1, I've repeatedly discussed how my solution is for XML only. I'd be happy to see the same approach used for other solutions, but that isn't remotely the same as denying that this is for XML.
To your point 2, I don't know what preconditions you're talking about, as I've already made it clear that this is for XML.
Removing the root element wouldn't bother me in the slightest. XPath can already deal with document fragments.
As to the big question of what's different this time, it's pretty simple: I'm not trying to do very much. It seems very clear at this point that I'm not solving the problems you find interesting - and frankly, solving the problems you keep bringing up seems like a remarkably awful idea.
I'm certainly NOT trying to "solve link/locators" for anything - I'm trying to assemble fragments of failed solutions into something that works for specific use cases. That seems like a valuable enough project to me, and one that won't wander into the weeds you keep describing.
Best of luck then, Simon. I look forward to seeing how this works out.