I have a confession to make. I’ve never had a formal class in C++ (though I’ve written quite a few). At no point did I ever have a professor spend several days trying to make me understand the significance of *, **, void*, &, ., -> or all those other rather strange glyphs that make reading C++ much like trying to understand the Chicago Manual of Style with 95% of the words removed.

The other day, as I was reviewing my Stroustrup, it occurred to me that a whole lot of programmers out there, especially in the web space, as likely as not never sat through that C++ class either, and so I began a thought experiment - if someone’s only experience of programming was web development, how exactly could you teach them about C++ in those terms? Curiously enough, the more I dug into this conceit, the more I realized that this was actually a useful exercise in understanding some fairly deep notions about how we deal with the concept of reference in programming and on the web.

One of the core ideas that permeates most programming systems is the notion of a resource. A resource can be a data structure (or the header block for an object), HTML or XML from the web, the state information coming from some formal media or interrupt channel, but in every case the resource is essentially that thing that you want to do something with.

As important as the notion of such a resource is the notion of its addressability. If I have no way of addressing it, the resource is useless to me. The modes of addressability of course vary significantly depending upon the resource itself and the medium in which it is embedded - blocks in memory, file systems, web URLs, sockets, the list can get pretty extensive. In C++, the most important of such media is memory, because in general even if it is available from some external medium the significant actions within C++ ultimately tend to be on in-memory representations of such objects.

This memory-centric model is perhaps more important to C++ than it is to more contemporary languages, largely because C++ (and its antecedent C) worked within a domain where the memory space was both constrained and often non-linear; memory management was (and still is) a key rationale for the use of the language. In C++, then, there exists an operator & (the address-of operator) which acts upon an object in memory and returns its address in some idealized linear space. You can think of this operator in terms of a relationship: for a given resource r and an address a, the & operator is defined such that a = &r.

Now, let’s think about this in terms of the web. There is no formal operator or function that says for a given resource on the web, return the address of that resource, but nonetheless, there is (more or less) an equivalent notion (call it an abstract url() function) for which the relationship a' = &r' (where a’ and r’ are the corresponding web analogs) holds. That is to say, for a given resource on the web, there does exist a URL address for that resource - if there isn’t, then the resource isn’t addressable … or put another way, it cannot be retrieved via any direct agency of the web. In the initial years of the web, most of the data that existed in the noosphere was in fact unaddressable in just this sense. However, with the rise of web services, the percentage of addressable resources versus total resources on the web has steadily climbed, and the types of resources so supported similarly continue to grow, appearing increasingly either as XML (of which I’m considering HTML to be a somewhat eccentric instance), JSON (which is appearing increasingly in the role of a light-weight messaging layer) or binary encoded media assets that are too rich to encode in metadata formats.

Drop back down to the C++ level for a bit. If you have an address, then it stands to reason that you should be able to retrieve the resource from that address. In C++, this role is handled by the dereference (*) operator, applied to a pointer. This operator handles the inverse operation - if p is a variable that holds an address (a pointer), then r = *p. Since it is usually a lot easier to pass around addresses than it is to pass around large, variable-sized blocks of memory holding resources, a significant portion of both C and C++ focuses fairly heavily on this notion of pointer manipulation.

Note that there are a few unstated conventions here. For starters, an address only indicates the start of a resource - it is up to the language itself to determine the conventions used to indicate the boundaries of that resource. For instance, a string of characters (char *p) could conceivably go on forever - C uses a specific character (in this case the null character \0) to terminate the string, and then spends a great deal of effort on supporting functions that can both determine and work with the positioning of that null character.

In HTTP, the closest analog to the * operator is the HTTP GET method. A web client - a web browser, an XMLHttpRequest object, a curl or wget command or any of a myriad of other tools designed for retrieving resources from the web - takes the address of the resource as a URL and issues a GET request for it. The client then retrieves this content, such that the content effectively becomes local. In this way, GET and * are not really all that different - they retrieve a resource into a local context so that the resource can be manipulated in some manner. Again, the statement r' = *p' is true.

Note that there is one fairly major difference between the two, however - latency. In a C++ context, the time it takes to dereference a pointer is typically on the order of nanoseconds, stretching into microseconds or beyond only when the data has to be paged in; fast enough that it is, for all intents and purposes, instantaneous. This is not true, unfortunately, on the web. There, the latency can easily be millions of times greater (anywhere from tens of milliseconds to several seconds). One consequence of this is that web communication in general places a fairly high premium upon data abstraction, letting the user agent in essence handle the details of processing from a highly abstract structural starting point, whereas in C++ the memory latency is small enough that it makes more sense to deal with micro-operations.

A second consequence, one which tends to complicate things fairly significantly, is that in general you have to deal with web operations asynchronously, which in turn both complicates the acquisition process and requires that you have more states (the various HTTP status codes, from the 1xx range through the 5xx range) to indicate potential ranges of success or failure. In C++, at the lowest level you usually get only a binary flag indicating whether an operation was successful or not. Because richer reporting is not native to C++ at the pointer level, C++ layers exception handling on top as a hierarchy of classes with its own facilities (and its own significant overhead).

Note that both C++ and the web treat addresses somewhat abstractly. In C++, this abstraction usually involves creating a linear address space that often hides a considerably more complex addressing space in the background that reflects differing memory architectures. Indeed, in many systems the manipulation of information is performed not on pointers at all but on pointers to pointers (languages such as Pascal used to refer to pointers to pointers as handles, though this nomenclature is far from universal). In general, the idea here is that you have an address which points to a location in memory that in turn contains a second address to the final resource. Worked in this manner, the operating system can then relocate resources in memory internally during garbage collection runs while at the same time preserving the linkages to live data structures.

On the web, this indirection is handled by the HTTP 3xx status codes - the various redirects - and in XML is handled by the use of hrefs and idrefs. HTTP first - typically when a resource has been relocated on the web, the server will return a redirection code - 301 for a permanent redirect, 307 for a temporary redirect, and 302 (officially “Found”) for the older, more loosely specified temporary redirect. Some user agents will resolve such redirections automatically, in essence performing the second and subsequent requests transparently to the users, while others will return the redirect code but will place the onus of redirecting on the user. In either case, the notation - r = **p, where p is the initial pointer variable - illustrates the nature of the redirect, and is itself a shorthand notation for r = *p' and p' = *p.

When the resource in question is itself metadata, the notion of pointers gets to be a considerably more complex proposition. A web page typically includes within it a number of “links” - from hypertext links that cause a redirect to a different URL when activated to embedded media links (such as <img src="myImage.jpg"/>) that integrate external content into the page to more sophisticated links (seldom used) that define more generic relationships. The W3C XLink standard defines these characteristics in more formal XML terms, defining a fairly sophisticated vocabulary of linkages, and RDF takes such bundled linkages to its obvious next step, where linkages begin to bleed into the realm of taxonomy, classification and semantics.

One particular linkage structure should be examined fairly closely, however, in light of the C++ discussion. One thing that differentiates C++ from C is the notion of method and instance encapsulation (both C and C++ have structural encapsulation). C++ defines the notion of an object theoretically as being an instance of a class. In practice, what this means is that when a new class instance is created, its data is laid out according to the class definition, together with a way of reaching the functions that belong to the class. Such functions are privileged in that they have access to the internal state variables of the instance. Initially, such functions were often statically bound - there would be only one set of method functions for all possible instances of a class. This means that if you changed the implementation of a given method, such a change would end up getting reflected in all instances of that particular class.

As such, these functions were defined as part of a function map that would create pointers to these functions for the given class. In the instance itself, the dot (.) notation would provide a pointer to the named map entity: r.fn indicates to the system that it should look up the pointer to the named function fn and pass the resource identifier r to it in order to determine the context of operations. A similar notation p->fn indicated that the pointer which holds the resource instance should be resolved to the resource itself, then the language processor should look up the function with that name (and signature) from the function map and retrieve the appropriate function block. Put another way, (r.fn) = (p->fn) if *p = r.

Is there a corresponding analogy in HTTP land? I think there is, and it’s called syndication. A syndication feed - an RSS or Atom feed - contains an instance block (because it has a unique identifier, such as atom:id) that identifies a specific type (in Atom, this might be specified in the corresponding atom:category element) and a reference to its own internal link address. Indeed, this link address might also be a good candidate for the url() function, as it is a resource that nevertheless can identify its own URL. The entries in an Atom feed, on the other hand, bear a striking resemblance to the function entries in a function map of a C++ object - they aren’t the subordinate resources themselves, but they provide pointers to these resources.

Again, it’s worth remembering here that a function is simply a resource, albeit one that has some imperative functionality associated with it. If the resources are themselves services (i.e., web loci that are capable of taking parameters and returning a functional result based upon those parameters), then the object mapping becomes even stronger. Indeed, specifications such as OpenSearch have quietly been promoting the fairly radical concept of specifying URLs with templated parameters, something that WSDL was supposed to have done but that has never really translated cleanly outside of the realm of SOAP-based XML-RPCs. See RESTful Web Services by Leonard Richardson and Sam Ruby for more discussions on that front.

One of the key differences between C++ and XML based architectures (I’m including AJAX/JSON under the latter bailiwick) traditionally can be seen in the question about where exactly the formal public interfaces for an XML resource really reside. C++ works upon a localized containment model; the data is encapsulated within the instance, the interfaces are defined to act upon the instance data and as such typically are also treated as contained within the object instance (as any benefit to two instances utilizing the same methods in memory is generally outweighed by concurrency and deadlock issues).

XML, on the other hand, provides an encapsulated data object but no intrinsic method associations. Indeed, the typical processing model for XML tends to be stream and filter oriented - XML is streamed into a processing filter, possibly causing some residual side effect changes, but from an architectural standpoint the filter either consumes the XML, or uses the XML stream(s) to generate other XML(ish) content. Indeed, this model only begins to break down when you try to treat XML in a pure RPC model where the explicit goal is inducing state change at the linked nodes - i.e., when you try to make XML “behave” properly, like a C++ class.

For this reason, it turns out that XML-based systems generally tend to work best when the formal API is a publishing API. There are a number of very good reasons for this, reasons which work just as effectively when the XML is modelling data objects as when it is modelling documents. The first is that the GET method does provide an analog to object pointer reification; you do not know until you have reified the pointer what exactly the entity is (or what its type is) that you have invoked; if I needed a separate method for resolving an XHTML document from an SVG document (not displaying, just resolving) then the number of get functions explodes geometrically.

Similarly, the RESTful PUT operation makes no assumptions concerning the nature of the “data store” that it is embedding the resource in; the exact mechanics are irrelevant from server to server, only that such PUT documents can be retrieved with a GET statement at some later time based upon the unique identifier (in this case a URL) associated with that particular instance.

POST gets a little more complex (especially since it tends to be badly abused on the web for doing RPCish things) but in essence its role is the creation of subordinate entities in the space without prior knowledge of the identity of such entities. I need to explain this concept in a little more detail. One of the critical tasks of any sort of object oriented system is the internal maintenance of identity; two instances may be property-for-property identical with one another, but if their internal identifiers are different then they are different instances - changing the properties of one will not otherwise change the properties of the other. Such identity assignment is not something that can be handled by the objects themselves, because to do so they would need to be able to communicate at all times with all other instances of that particular object in order to verify that their internal identity does not match someone else’s.

Instead, this identity assignment must be handled by a factory object of some sort that, among other things, is responsible for the generation of new unique identifiers between objects and providing to these objects some mechanism for comparing these identifiers without publicly exposing them for change purposes. In the comparatively local space that C++ operates in, this is the role of the new operator, which works by allocating memory for the relevant class and creating the base internal state depending upon any initialization parameters, then handing back an identifier that doesn’t collide with existing identifiers (in practice, the address of the freshly allocated memory).

On the web, you don’t have a new operator. Instead, as you’re generally wanting to push out to the data store from your client, the model that has emerged is for the user to send an XML structure to the server via POST, which then adds the new instance to the data store, providing a unique identifier for that XML structure in the process. In other words, this POST mechanism acts in a manner somewhat analogous to a factory, but a factory where in general you drive in much of the raw materials on a flat bed and then end up getting a car made from those materials on the other side.

Much of the RPCification of the web also has occurred because people do not in general understand the potential that XML based services provide with regard to presentation. The web is fundamentally built on a series of concentric model/view/controller (MVC) architectures. There is a tendency to class MVC as a design “pattern” but I’ve become increasingly convinced that this tends to make people look upon MVC as simply another handy template for designing, rather than what I see as the “native” design of the web. The distinction between a web page and a web service that people make in their heads tends to be due to dismissing the importance of MVC here. A web page is simply one form of web service, and increasingly such a page is a constructed view on a reasonably complex data model. If you can “skin” that model with an Atom skin, then the same page becomes a syndicated Atom feed; skin it with an XML skin and it becomes grist for an XML consuming service; skin it with a JSON skin and the resulting content can feed an AJAX widget. These representations are simply different views on the same data model, and if you can decompose the space into model vs. presentation operators, you’ve radically simplified your overall design model.

Realistically, this leads into the last RESTful arena, the one that isn’t formally covered by the HTTP model: search. A search can be thought of as a filter made upon a data model that restricts that model to a more limited subset (or reorders the items within that model based upon given criteria). Apply a view onto this filtered data and you have a web page or web service. Provide one filter for handling the search on a given resource, another filter for handling the view (perhaps an editor page for creating or modifying new instances of that data) and you can build astonishingly sophisticated applications with comparatively little work. This is in fact the goal that I’m hoping to achieve with x2o, discussed previously in this space.

Add the search aspect to the publishing methods associated with RESTful web services and you have an API that is anonymous with regard to the semantics of the underlying model, meaning that you can work with many different kinds of models using the same core set of tools. Moreover, if you then assume one other aspect of OOP - the ability to trigger method invocations based upon different phases of the publishing process - then realistically you can keep your core interfaces simple, can perform critical validation and messaging upon invocation of a given publishing process, and can integrate workflow management, user permissions and ACLs into the system cleanly while keeping it simple to publish against, either with XForms, AtomPub or some similar set of services.

Ultimately, I believe that as we move into a post-OOP era, the things that we need to examine most critically from the lessons of Bjarne Stroustrup are not the way to build structured data, but the way that we link that structured data together. Linkages build networks, and ultimately it is the shape of that network that will determine how the web itself evolves. It’s an interesting lesson to take home.