November 2005 Archives

Bob DuCharme

Related link: http://www.snee.com/bobdc.blog/

This weblog has been an experiment for me. By keeping it focused on linking, my plan was to accumulate ideas about linking technology over history and to gather feedback about those ideas as I researched a potential book on it. I tried to examine all the linking technologies that led up to the web, starting with the twelfth century. I wrote about a nineteenth-century linking application that used typed, one-to-many links and that remains a multi-million dollar business, and I wrote about the use of unique IDs in usenet postings that allowed for linking on the Internet in 1982. I looked at more modern linking technologies and the role of metadata in these technologies such as trackback, RSS metadata, and the linking possibilities opened up by Jon Udell’s work with streaming audio.

Like a lot of weblogs, mine ended up focusing more on interesting new things that showed up. One nice bit of linking technology that alerted me to a lot of new developments was a del.icio.us feed I created for all new del.icio.us entries tagged with the word “linking.” An overly large proportion of these turned out to be by and for the search engine optimization crowd, whose mercenary attitude can be a bit annoying, but I can’t help but be fascinated by their tremendous efforts to quantify the value of a link in dollars and cents.

I’ve decided to start a new weblog on my own domain without such a narrow theme. I still promise never to discuss what I had for breakfast or my new favorite CD. Linking is a form of navigation of information, and there are other information navigation technologies that use or don’t use computers. I’ve been more and more interested in the assembly, organization, and retrieval of information throughout history, and would like to discuss those more. Doing it under my own domain name instead of on O’Reilly will give me more control over the weblog and therefore give me more opportunities to play with the various aspects of blogging technology.

I’ve learned from recent reading that most histories of computers focus on computers that specialized in the most advanced math possible for their time, which was as much of a niche application in 1900 and 1945 as it is now. Many key tasks that we use computers for today, particularly database tasks, have been carried out by automated, usually electrical machines since the nineteenth century, in a separate but parallel history to the Colossus-Mark I-ENIAC-EDVAC history of computers that you typically read about. Did you know that during World War I the U.S. Army could run automated queries against a database to find, for example, French-speaking soldiers with a chauffeur’s license? Lately, I’ve been fascinated by large-scale database applications that predate any database technology that geeks currently take seriously. A lot of people now consider any pre-relational technology to be prehistoric; that’s a pretty limited perspective.

The history of computing and computing applications (oddly, often a separate field of research) has a lot to teach us about the problems and innovations we’re working on now. I’m sure I’ll be spouting opinions on more developments as well, especially XML-related ones, which I’ve worked with and written about a lot since the days when XML was a four-letter word. When I see interesting linking-related news, I’ll probably add new entries to this O’Reilly weblog, but my main weblog from now on will be bobdc.blog. If you follow only one of these, that will be the more active one.

Kendall Clark

Related link: http://my.opera.com/community/sparql/

I’ve been pushing the convergence line lately: there’s no fundamental conflict between Web 2.0 and the Semantic Web. They may not be quite two sides of the same coin, but they’re definitely the same currency. Or something like that. Anyway.

The folks who run the Opera community portal have just done a really smart thing — they’ve deployed a SPARQL web service for their data. It’s the community’s data, and what better way to let the people at their stuff?

This is better than a bespoke API because it’s not another API-to-API integration problem. Rather, it exposes, essentially, a domain-specific (or “little”) language for arbitrary use by arbitrary third parties.

And it does so using a standard, lightweight web service interface, REST HTTP, with the possibility to deploy SOAP pretty easily too.

Want to add some Opera community data to your latest Web 2.0 mashup? It’s as easy as writing a few SPARQL queries and some code to handle the XML those queries return. Easy peasy. (And I’ve been working on a serialization of SPARQL query results in JSON, which will make it even easier to do in AJAX apps.)
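
To give a feel for how little code that is, here is a minimal sketch in Python. The endpoint URL and the query are hypothetical placeholders, not the Opera service's actual address or vocabulary, and the sample response is hand-written in the W3C SPARQL XML results format so that the parsing step can be tried without a network connection.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical endpoint and query, for illustration only.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT ?name WHERE { ?person <http://xmlns.com/foaf/0.1/name> ?name } LIMIT 2"

# Hand-written sample of what such a query might return.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="name"/></head>
  <results>
    <result><binding name="name"><literal>Alice</literal></binding></result>
    <result><binding name="name"><literal>Bob</literal></binding></result>
  </results>
</sparql>"""


def build_query_url(endpoint, query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_results(xml_text):
    """Turn a SPARQL XML result document into a list of {variable: value} dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = {}
        for binding in result.findall("sr:binding", SPARQL_NS):
            # Each binding wraps a single child element (uri, literal or bnode).
            row[binding.get("name")] = binding[0].text
        rows.append(row)
    return rows


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
    print(parse_results(SAMPLE_RESPONSE))
```

In a real mashup you would fetch the URL from `build_query_url` over HTTP and hand the response body to `parse_results`; everything after that is ordinary list-of-dicts code.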

There are some areas for improvement here, since the SPARQL query engine being used here isn’t the speediest and there’s no support for DESCRIBE queries yet. But still, this is a big deal!

I’ve hinted at this before, but I think I’ll put a stake in the ground and say it clearly:

The developer(s) of every Web 2.0 app/service should seriously consider exposing their data with a SPARQL query service.

What else ya gonna use?

Dan Zambonini

We’re currently re-writing our Personalisation application, which can tailor content, style and process (e.g. the order of a checkout form) to the type of user browsing a site. “Type of user” is a bit vague, so let me expand on that. Nearly all industries, from banking to medicine, retail to education, have histories of ‘segmenting’ their user bases: grouping their customers into generic stereotypes. This is typically based on common demographic attributes, such as location, age or gender, or sometimes through behavioural or psychographic analysis (such as personal interests or brand loyalty).

So, to place an individual user into a pre-defined segment, we need to match their attributes (age, location, etc.) to those of a particular segment. Obtaining the user’s attributes can be achieved through two types of data collection: explicit (asking the user for specific information, possibly even asking them to define a profile) and implicit (using only implied information, without asking any direct questions).

The purpose of such tailoring is (theoretically) to better meet the needs of the user: to provide information and functionality that better suit a specific type of user. Usually, the ultimate goal is to make the process of purchasing something easier/more probable.

Whether or not you agree with the ethical, commercial or practical aspects of transparently building user profiles, it makes for an interesting technical challenge. So, what can we guess about a user without asking them any questions?

Let’s take a look at the data we can collect:

  • HTTP Request
    • User Agent String
    • IP Address
    • Referrer
  • Javascript/Client Side
    • Display Information
    • History Length
    • Cookie Information
  • Click-Stream (the history of clicks on our site)
    • Content viewed
    • Searches performed
    • Timing information

We can expand on these to give us more specific information:

  • HTTP Request
    • User Agent String
      • Browser Product
      • Platform
    • IP Address
      • Location (GeoLocation/IP Lookup)
      • Service Provider (Reverse DNS Lookup)
    • Referrer
      • Referring Website
      • Search Terms (e.g. keywords entered into Google)
      • Marketing/Advert used (e.g. which Google AdWords advert was clicked on)
  • Javascript/Client Side
    • Display Information
      • Colour Depth
      • Resolution
    • History Length
      • Whether or not the URL was typed directly
    • Cookie Information
      • Previous Visits/Repeat User
  • Click-Stream (the history of clicks on our site)
    • Content viewed
      • Type of content (e.g. subject of content, format of content)
      • Order of content viewed/Path through site
    • Searches performed
      • Keywords
      • Types of search (e.g. boolean)
    • Timing information
      • Date/Time of clicks
      • Time between clicks
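
As a rough illustration of the HTTP Request branch above, here is a small Python sketch that pulls a platform, a browser product and any search keywords out of two request headers. The substring checks and the `q`/`p` parameter names are my own simplifying assumptions; real user agent strings and search engine URLs are far messier than this.

```python
from urllib.parse import urlparse, parse_qs


def platform_from_user_agent(user_agent):
    """Very rough platform guess based on substrings of the User-Agent header."""
    ua = user_agent.lower()
    if "mac" in ua:
        return "Mac"
    if "windows" in ua:
        return "Windows"
    if "linux" in ua:
        return "Linux"
    return "Unknown"


def browser_from_user_agent(user_agent):
    """Very rough browser product guess; the first matching token wins."""
    ua = user_agent.lower()
    tokens = (
        ("firefox", "Firefox"),
        ("opera", "Opera"),
        ("msie", "Internet Explorer"),
        ("safari", "Safari"),
    )
    for token, name in tokens:
        if token in ua:
            return name
    return "Unknown"


def referrer_details(referrer):
    """Split a Referer header into the referring site and any search keywords."""
    parsed = urlparse(referrer)
    params = parse_qs(parsed.query)
    # Assumption: the search engine passes keywords in a 'q' or 'p' parameter.
    terms = params.get("q") or params.get("p") or [""]
    return {"site": parsed.netloc, "search_terms": terms[0].split()}


if __name__ == "__main__":
    ua = "Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8) Gecko/20051111 Firefox/1.5"
    print(platform_from_user_agent(ua), browser_from_user_agent(ua))
    print(referrer_details("http://www.google.com/search?q=cheap+flights"))
```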

Some of these, such as location and service provider, are not always accurate (because of AOL users, roaming users, people using the web at work, and many other complexities), but we’re only building a best guess, so let’s use all of the information at our disposal.

We still need to translate this technical information into user attributes. So what kinds of guesses might we make about a user, from this information? Here are some starting points to get you thinking (again, these are not meant to be 100% accurate, but “the user is more likely to fall in this segment if they match this data”). Apologies if some of these seem to be based on groan-worthy stereotypes, but much of this comes from real evidence.

  • Affluent Users
    • HTTP Request > Browser > Platform > Mac
    • HTTP Request > IP Address > Location > Urban centre
    • Javascript > Display Information > Resolution > High
  • Female Users
    • HTTP Request > Browser > Product > NOT Firefox
    • HTTP Request > Referrer > Website > Yahoo/AOL/Ask Jeeves
    • Click Stream > Content Viewed > Type of content > Female targeted content
  • Younger Users
    • Click Stream > Searches performed > Boolean search performed
    • Javascript > Display Information > Resolution > NOT Low
    • Click Stream > Timing Information > Short time between clicks

What other rules might exist for matching a particular demographic? I have some thoughts, but these are currently based on my limited view of the world, rather than hard statistics (e.g. If you are browsing early in the morning, are you more likely to be male/young? If you have an old browser version, are you more likely to be less computer literate/older?).

Depending on the type of content on your site, it is sometimes relatively straightforward to only use a small subset (the type of content viewed) of this available information to build up an accurate profile. Amazon is an obvious example. If you browse the music of Kelly Clarkson and Avril Lavigne, the system will probably assume you are a white, middle class teenage girl who is in the middle of a temporary and short-lived attempt to ‘rock out’, whilst completely missing the point of rebellion. Look at Dixie Chicks and Green Day, and the communist left-wing checkbox will be ticked and an automatic email to the White House triggered. On the other hand, if you’ve been looking at The Flaming Lips and The Decemberists, then it could assume that you are highly intelligent, witty, with a deep cultural and political understanding of the world and a bright and happy future ahead of you.

Kurt Cagle

Related link: http://www.understandingxml.com

Massachusetts has become the site of the latest skirmish in the Open Source wars. A decision made in early 2005 by Eric Kriss, Secretary of Administration and Finance in Massachusetts, led to his recommendation that the state adopt the Open Document Format, an open standard promoted by OASIS, for all state government work. The recommendation followed a formal review of all document standards, including Microsoft Word’s new XML format, designated (somewhat cynically) the Microsoft Office Open XML Format (MOX, my own acronym).

This announcement was greeted favorably by a number of other large vendors, including Sun (which has supported the Open Document Format (ODF) in OpenOffice.org 2.0) and Adobe, but was, not unexpectedly, derided by Microsoft.

At that point, however, a couple of things happened: two additional advocacy groups, “Citizens Against Government Waste” and “Americans for Technology Leadership,” weighed in against the adoption of the ODF standard, claiming that because nearly 100% of the state currently runs Microsoft Word, shifting to OpenOffice would place an undue burden on the state and necessitate intense training, especially since Microsoft has announced that it has no plans to support ODF. (Article Continues …)

So do you think that this is the best way to get US Government officials to start implementing open source and open standards technologies?

Kurt Cagle

Related link: http://www.understandingxml.com

I hadn’t quite planned on turning the XML 2005 coverage into a single continuous blog, but I figure that one last time at that well couldn’t hurt, especially since it helps to springboard me into discussions for this week.

The Once and Future XForms

Without really intending to, I spent a great deal of time this last week in the domain of forms. Now, you have to understand the irony of this from my standpoint. I’ve long had a more or less consistent battle on with “the bureaucracy” for nearly as long as I’ve been alive - one of these people who, if I could fill out a form incorrectly I would, usually resulting in some dire calamity down the road because I put a period where a comma was expected … I suspect that if I had ever worked at NASA I would have been the hapless programmer who caused a billion dollar satellite to blow up half a mile from the launchpad because a stray comma in the source told it that it was now under attack by little green men from Proxima Centauri, and that it should self-destruct right NOW!! (Article Continued …)

Simon St. Laurent

Sometimes the easiest path from Linux to Windows is via the Macintosh.

I’ve been building a new Linux box, my first in a few years. (More on that later.) Thanks to a complicated series of mistakes on my part, I wound up with a spare 40GB drive on which I’d already installed Linux. It seemed simple enough to put it in an enclosure and use it as a USB drive for my perpetually short of space Windows laptop. Windows detected the drive, and everything seemed fine, except that I couldn’t do anything at all with it. (When did Windows obliterate the disk admin tools it used to have in NT 4.0? [Corrected: Not obliterated, as noted in comments. Just buried.])

I was still reinstalling Linux on the other system, so I finally plugged the drive into my iMac to see what it could do. Sure enough, options for partitioning and formatting came right up. Five minutes later, the drive was in boring old FAT32, and now my laptop recognized it immediately as extra space.

Now I just need to figure out why my laptop insists on connecting all USB devices as 1.0, when it had perfectly good 2.0 support until last week. (It still says it has USB 2.0 in the Device Manager - it just doesn’t in practice.)

Have third party interventions helped you deal with contending operating systems?

Jim Alateras

I have just read OASIS’s Reference Model for Service Oriented Architectures, Working Draft 10. It has changed significantly from Draft 7, which was the previous version I read. This version is a lot lighter (28 vs 42 pages) and deals with SOA at a more abstract level.

It does a good job of identifying and describing the various SOA concepts and has a good section on the importance of incorporating semantic information through ontologies and vocabularies to facilitate richer interactions between web services. Unfortunately, it has done away with the layered SOA diagram, which illustrated the relationship between the various SOA concepts. In general the document could do with more illustrations.

A Web Services binding document, illustrating how the reference model can be applied to a web services context, would be much appreciated and a good next step for the working group.

In general it will be interesting to see how the reference model is adopted by various industries and in particular whether it will be used to define more domain-specific reference architectures (e.g. finance, telecommunications).

Definitely a worthwhile read for software architects.

Michael Fitzgerald

Related link: http://www.lucenebook.com/

I was visiting with Erik Hatcher at the Rails gig in Reston last Saturday. I really like what he has done with his book Lucene in Action (which he coauthored with Otis Gospodnetić). Essentially, he created a search engine for his book on Lucene with Lucene. You can’t get all the content of the book, but you get plenty. I think it is a really cool way to present a book, a sort of recursion on the topic itself, by showing how quickly the contents of the book can be searched using the means described by the book. Whoa! I also think it says a lot about how books could be presented from now on. If writers and publishers don’t do something like what Erik is doing with his book, we are a little behind the times. I intend to follow Erik’s lead in making books more accessible to readers, as early as possible.

Did you check out lucenebook.com?

Michael Fitzgerald

Related link: http://pragmaticstudio.com/

Just got back from Reston, Virginia today after attending a three-day workshop on Ruby and Ruby on Rails put on by the folks at Pragmatic. Lots of Java people there; one said, “It’ll be hard to go back.” I don’t think he meant it as a ding on Java, or as a signal of abandonment, it’s just, that, well, it’s hard to curb your enthusiasm after an event like this. Lots of information from Dave Thomas and Mike Clark, with a keen eye on what might be flawed in Rails and Ruby. Not a lot to worry about there. And people are putting up DB-enabled sites in hours with Rails. I know, I know: The joy will wear off after time. For now, just let me be excited about something.

Have you tried Rails?

Niel M. Bornstein

Related link: http://www.xtech-conference.org/2006/call.asp

Way back in 2004, my .NET and XML tutorial was accepted by the program committee for XML Europe. Unfortunately, there was insufficient interest and my first trip to Amsterdam was cancelled in a shower of humiliation and recrimination.

That was followed by a similar fiasco at Extreme Markup 2004 in Montreal. Finally, I was vindicated by going all the way at XML 2005 in Washington DC.

Now that XML Europe is XTech, the old .NET and XML tutorial no longer fits in with the Web 2.0 agenda. So I won’t be proposing anything for XTech 2006.

It’s sure to be a great conference. I sure wish I could go.

But just because I’m not going to be there, don’t miss out! Be sure and send in your proposal for an exciting talk on the subject of your choice.

Kurt Cagle

Related link: http://www.understandingxml.com

Covering a show like the XML 2005 conference has been an intriguing experience in trying to capture the fleeting moments, worthwhile news and impressions while at the same time trying to make sense of a technical movement that is simultaneously up to date and stretching back decades.

The story of XML is, for all the dry specifications and eye-crossing syntax, ultimately a very human story of people who recognized a need (the need to communicate, to express ourselves in an open manner that could carry through the ages not just in the dialect of humans but in the often finicky and precise language of computers) and who have spent much of their lives going through the difficult task of getting people to come to a consensus on the very nature of language itself. (Article Continues …)

Kurt Cagle

Related link: http://www.understandingxml.com

Welcome to the last day of the XML 2005 Conference. I gave a talk this morning, and as a consequence am a little slow in catching up with this log, but will be fleshing this out. This is a live report, so please come back over the course of the day. (Live Article Continued …)

David A. Chappell

While giving a presentation at the Enterprise Architect Summit in Barcelona, on the subject of ESB and other SOA-related infrastructures, I got into a public altercation with a couple of Microsoft guys who were in the audience. It turned into a pretty heated debate over core architectural fundamentals, which we hammered out in front of an audience of about 75 unsuspecting conference-goers. I have to say that in all my years of public speaking this sort of spontaneous public debate has never happened quite like this. The conversation that ensued brought to light some really important issues regarding some fundamental differences between the use of an ESB as the foundation of building a Service Oriented Architecture vs. using a combination of Biztalk and WCF (formerly known as Indigo).

As part of defining your strategy for building and deploying a SOA, you may be considering a variety of SOA infrastructure support products. As an enterprise architect you are probably faced with these various approaches regularly as vendors offer their wares to you, positioning their software as the best available foundation for building Service Oriented Architectures. You should consider that there are key differences in how those SOA infrastructure offerings are architected, and the ramifications associated with how the infrastructure allows you to build and deploy your SOA across an extended enterprise.

The debate arose out of my discussion of the ESB lightweight service container model, and how that compares and contrasts to a hub-and-spoke integration broker architecture. The focus of my presentation turned toward scalability issues. In my discussion I was talking about how an ESB allows for the selective deployment of specific integration functionality as independently scalable mediation services. I used an example of an XSLT-based transformation service, and talked about how an ESB allows an instance of that transformation service to be separately deployed in its own lightweight ESB container.

It’s no secret that the parsing and manipulation of XML, including an XSLT-based data transformation, can be an expensive operation in terms of consuming computing resources. Using the ESB distributed container model, multiple instances of a particular transformation can be scaled and load-balanced across multiple containers across multiple machines in order to be able to support increased demands on the particular transformation as the transformation becomes more complex or the service invocation traffic increases.
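
To make the scaling idea concrete, here is a toy Python sketch of one transformation service spread across several containers with round-robin dispatch. It is not Sonic ESB code: the class names are hypothetical, the "transformation" is a stand-in that merely upper-cases a string instead of running XSLT, and a real ESB would place the containers on separate machines rather than in one process.

```python
from itertools import cycle


class TransformContainer:
    """Stand-in for a lightweight container hosting one transformation service."""

    def __init__(self, name, transform):
        self.name = name
        self.transform = transform
        self.handled = 0  # number of messages this container has processed

    def invoke(self, message):
        self.handled += 1
        return self.transform(message)


class LoadBalancedService:
    """Dispatches each invocation to the next container in round-robin order."""

    def __init__(self, containers):
        self._containers = list(containers)
        if not self._containers:
            raise ValueError("at least one container is required")
        self._rotation = cycle(self._containers)

    def invoke(self, message):
        return next(self._rotation).invoke(message)


def stand_in_transform(message):
    """Placeholder for an expensive XSLT transformation."""
    return message.upper()


if __name__ == "__main__":
    containers = [TransformContainer("container-%d" % i, stand_in_transform) for i in range(3)]
    service = LoadBalancedService(containers)
    for i in range(6):
        service.invoke("message %d" % i)
    print([(c.name, c.handled) for c in containers])
```

Adding capacity in this model means adding another `TransformContainer` to the list; nothing else in the process has to be deployed alongside it.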

I then talked about the contrast of the EAI/Integration broker approach, which typically employs a monolithic architecture that includes data transformation, messaging and connectivity, routing of messages based on business rules or scripting, application adapters, and process control all in one server implementation (an ESB, by the way, also does all these things, but each capability is separated out into its own separately deployable and independently scalable piece).

In order to scale up the XSLT transformation using the monolithic EAI broker approach, you have to install that EAI broker on a really big machine, or if the EAI broker supports the notion of clustering for scalability purposes, you would have to install that entire EAI broker stack across multiple machines. Keep in mind that we are simply trying to support the scaling up of the one XSLT transform that sits between two popular applications! All the while that transformation will still be trying to compete for computing resources with all the other things the EAI broker is trying to do – business rules, process control, execution of other services, application adapters, etc.

OK, I started this writeup by saying it was about an altercation with a couple of Microsoft guys……

I then said something like – “…even Biztalk and Indigo (WCF) with all of its fanfare still suffers from this problem! Indigo provides a nice Web Services enabled messaging bus, but when you’re doing the rest of the integration piece you need Biztalk, and you can’t selectively deploy and independently scale individual integration components within Biztalk”…. I was pretty fired up when I said it too.

I knew there were two Microsoft guys in the room—I was talking to them earlier that day. I didn’t mean to pick on Biztalk per se. I was just using that as an example—this same situation exists with any SOA infrastructure that relies on an EAI broker architecture – TIBCO BusinessWorks, webMethods Integration Server, etc. Sometimes I pick on them too.

Just then Jeromy Carrière, Sr. Technical Evangelist for Microsoft, interrupted and said “excuse me, I hate to interrupt your talk, but actually you can do that. You can actually selectively deploy individual components in a Biztalk server.” Then the other Microsoft guy, Arvindra Sehmi, Lead Architect, EMEA Developer and Platform Evangelism Group, also chimed in, citing a couple of document numbers that he said were how-to documents. Then he said “well, it’s your talk, you can say what you want up there…we’ll take this off line. Please continue.” So he was basically saying that I was full of crap, but that’s allowed because I had the floor. I wasn’t happy with that situation, so I decided to stop the rest of my presentation and debate the issue right then and there in front of the rest of the audience.

In the end, we determined that it was apparently possible to strip down a Biztalk server and deploy it with just one transformation engine in it. However, and this is a pretty BIG however, Jeromy did concede in front of the whole room that this stripped down Biztalk deployment would be a much heavier weight entity than the ESB container model that I was describing, and that there was a cost associated with that. We never did get to talking about whether this pared down Biztalk server could be easily deployed across multiple machines for the purpose of load balancing an individual mediation component. We also didn’t get to talking about the licensing models of either approach (using the Sonic ESB licensing model you are free to deploy thousands of containers across the extended enterprise without incurring additional license cost). And by the way, how is it that Arvindra was able to cite those document numbers from memory anyhow? It must be a pretty hot topic for them lately.

That wasn’t the end of it. Arvindra didn’t like how this conversation was going, so he then decided to attack my example and say that it was unrealistic. He tried to make a point that it was a completely invalid use case to have a separately deployable XSLT transformation service…that data transformation belongs at the application endpoints! How intriguing, I thought to myself. Just then another audience member chimed in, announced himself as someone who had been doing distributed computing since the DCE days, and stated that my argument for having a separately deployable and independently scalable transformation service is a very valid deployment scenario for anyone who has ever done any kind of n-tier architecture. Thank goodness! Someone else stood up to support me. It was getting pretty hot up there :).

I have been pondering this portion of the discussion since then. It’s interesting that the Microsoft way of thinking (having the transformations co-located with the application endpoints) represents a very endpoint-centric point of view about building a SOA. What exactly did he mean by that anyhow? Was he suggesting putting a Biztalk server with every application? Perhaps he was talking about using a message handler. There are so many SOA advocates who talk all day long about how SOA and web services are all about the SOAP stack, the WSDL interface, and how stuff gets serialized and deserialized across the wire. While those things are really important, they’re not the kind of thing a SOA architect should be thinking about when building a SOA. In fact, I would submit that the entire design center of building a SOA is about the whitespace between the endpoints! When I refer to what’s between the endpoints, I’m not just talking about reliable and secure protocols (which are important too!); I’m talking about mediation. Mediation comes in many forms. It can be provided in the form of intermediary services that provide content based routing and data transformation. Mediation can also be in the form of protocol mediation, for example being able to plug one application into a SOA using one type of protocol such as FTP, being able to plug in another application using an adapter, and being able to plug yet another application into the SOA using SOAP and web services. The mediation that a SOA infrastructure such as an ESB can provide in such a situation is 1) abstraction of the protocol details away from the services being implemented, 2) mediation between the interaction models that each connection model might imply (batch, sync RPC, async, event-driven), 3) mediation by providing a unified service invocation model, and 4) a consistent process control mechanism that controls the interactions between the services.

I also agree conceptually that data transformation should be logically associated with the application endpoints, in order to allow for a canonical data model to represent data as it passes between applications and services, and provide transformations to and from the particular proprietary formats as needed at the “edge”. In fact I wrote about this concept in my ESB book, and also in other articles on the subject of the VETO pattern (Validate, Enrich, Transform, and Operate). However the difference in opinion here is that the philosophy of using an ESB to build a SOA is based on the notion that the association should be a logical one. The physical entities should be capable of being deployed anywhere you choose to deploy them based on the machine resources you have, and the horsepower required to execute each operation. If you wish to have a transformation step co-located with an application endpoint, you should be able to do that, but you should not be forced to do that.
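
For readers who haven’t met the VETO pattern, here is a minimal Python sketch of its four steps applied to a single message. The order fields, the price table and the canonical format are hypothetical, invented only to show the shape of the pattern; in an ESB each step could be its own separately deployed service rather than a function in one file.

```python
def validate(order):
    """Validate: reject messages that are missing required data."""
    if "sku" not in order or order.get("quantity", 0) <= 0:
        raise ValueError("invalid order: %r" % (order,))
    return order


def enrich(order, prices):
    """Enrich: add data the sender did not supply (here, a unit price)."""
    enriched = dict(order)
    enriched["unit_price"] = prices[order["sku"]]
    return enriched


def transform(order):
    """Transform: convert the sender's format into a canonical format."""
    return {
        "item": order["sku"],
        "qty": order["quantity"],
        "total": order["unit_price"] * order["quantity"],
    }


def operate(canonical_order, ledger):
    """Operate: invoke the target application (here, append to a list)."""
    ledger.append(canonical_order)
    return canonical_order


def veto(order, prices, ledger):
    """Run the four steps in sequence on one message."""
    return operate(transform(enrich(validate(order), prices)), ledger)


if __name__ == "__main__":
    ledger = []
    print(veto({"sku": "A1", "quantity": 3}, {"A1": 5}, ledger))
```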

In the end, who cares if my XSLT example is valid or not? It doesn’t matter. Substitute another mediation component such as a content based router service, an application adapter, or a third party EDI to XML translator for that matter. In the ESB model that entity is still deployed in a lightweight service container that is remotely managed, load balanced, and scalable across as many machines as you need to deploy in order to support any increase in demands on those particular parts of your SOA. And those containers can be spread across a Linux machine, a Windows machine, a Solaris machine…anything you happen to have available for allocation. The management layer can deal with the fact that these deployment artifacts are distributed and make it just as easy to configure and deploy as if they were all in one location.

Finally, Arvindra said “Well, if it’s an XSLT transformation service you want, then Indigo provides that for you. You can put a service anywhere you wish using the Indigo architecture”. My reply was “yes perhaps, but we’re not talking about Indigo, we’re talking about Biztalk”. At this point, I thought maybe I should end the debate and try and salvage the rest of the allotted time to finish giving my presentation. However, this last comment got me thinking as well. We started out the debate over whether Biztalk servers could be stripped down and separately deployed, then ended up with the subject of XSLT transformation services deployed in Indigo (WCF). These two subjects are very different things when you are using those technologies together. The XSLT transformation service in WCF got me to thinking of another issue, which is about configuration rather than coding.

Using the WCF framework, you can plug in an XSLT transformation as a separate service. However, it has to be plugged into a known endpoint and coded into place rather than configured. In Sonic ESB, an XSLT transformation service is extracted from the service repository, configured through the tool, and added to an ESB process definition using visual drag and drop just like any other kind of service. You don’t have to code anything. What happens if further down the road the requirements for transformation change such that XSLT is not sufficient, and you need to swap in a third party transformation engine using an adapter? In Sonic ESB you simply change the process definition using visual configuration tools to swap out one service for the other.
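
The general idea of configuration over coding can be sketched in a few lines of Python. This is an illustration of the principle only, not how Sonic ESB or WCF is implemented: the registry, the service names and the string-mangling "services" are all hypothetical. The point is that the process definition is plain data, so swapping one service for another changes the definition and nothing else.

```python
# A registry of named services. The lambdas are trivial stand-ins for real
# mediation services; the names are hypothetical.
SERVICE_REGISTRY = {
    "xslt_transform": lambda message: message.upper(),
    "third_party_transform": lambda message: message.title(),
    "audit": lambda message: message + " [audited]",
}


def run_process(definition, message, registry=SERVICE_REGISTRY):
    """Pass the message through each named service in the process definition."""
    for step in definition:
        message = registry[step](message)
    return message


# The process definition is data, not code. Replacing the XSLT step with a
# third-party engine means editing this list, not the services or the runner.
original_process = ["xslt_transform", "audit"]
revised_process = ["third_party_transform", "audit"]

if __name__ == "__main__":
    print(run_process(original_process, "order received"))
    print(run_process(revised_process, "order received"))
```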

Back in July of this year I was at the Burton Group Catalyst Conference in San Diego, where the subject of hard-coded services in Indigo came up during a Q&A session with Ari Bixhorn (Indigo PM). A member of the audience asked whether that coding restriction was going to be removed, and whether Indigo/WCF would move more towards a configuration-friendly approach. Ari explained to the audience that the hard-coded approach was by design, and that the decision to keep it that way was based on feedback from the last Microsoft PDC. Well, c’est la vie! …or should I say c’est la guerre!

What’s really funny about this altercation is that just a week prior to this spontaneous debate, Arvindra posted a blog entry complaining about how he was uninvited from one of our SOA Architect Forums in London. In his blog entry Arvindra said – “Is it that Sonic Software (and perhaps Dave Chappell himself) is going to say something about Microsoft and our strategy that they don’t want me to hear? Are they embarrassed I might challenge them or kick up a stink? Not likely in a public forum!” How ironic is that?

Dave

A recording of the whole presentation, including the comments from the audience members, can be found here -
http://www.ftponline.com/channels/arch/reports/easbarc/2005/video/ Look for the presentation entitled “How the Enterprise Service Bus Delivers on the Value of SOA”

For more information on plugging in WCF services, here’s a great article –
Introduction to Building Windows Communication Foundation Services
Clemens Vasters
http://msdn.microsoft.com/webservices/indigo/default.aspx?pull=/library/en-us/dnlong/html/introtowcf.asp

or here’s another one written by my namesake –

Introducing Indigo: An Early Look
David Chappell (the other one)
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnlong/html/introindigov1-0.asp

Dan Zambonini

Related link: http://www.w3.org/2006/webapi/

The W3C have recently chartered a Web APIs Working Group, which could help with the development and standardization of underlying Web 2.0 technologies.

The group has been asked to look at a number of important issues, including:

  • An API specification for HTTP functionality, starting with XMLHttpRequest (the basis of AJAX).
  • An API specification for a client interface, including interfaces beyond the humble desktop browser.
  • API specifications for other network communication methods, which will look at adding (and standardizing) functionality such as instant messaging into web applications.
  • An API specification for persistent storage on the client. This one could be pretty exciting (and scary at the same time), as it would allow client-side applications to manage their own cookie equivalents.
  • An API specification for drag and drop.
  • An API specification for monitoring the progress of resources as they are downloaded.
  • An API specification for file upload.

And more. All we’d need then is for the major browsers to support these new APIs! How hard can that be…?

Kurt Cagle

Sitting here listening to Dr. Jim Hendler, a professor at the University of Maryland, talking about the “Semantic Web” - in other words, spending a great deal of time talking about RDF and RDFS and attempting to make the case that Semantic Web should be viewed as the natural counterpart to the XML document explosion. (Article continued (live)…).

Dan Zambonini

I was reading Pitchfork’s Worst Album Covers article, and wondered if a similar list could be compiled for programming books. OK, so coming up with ‘not very interesting’ programming book covers is a bit too easy, but there are many that make you think “Huh? Why?”.

Programming book covers

Here are some examples of the kind of thing I mean. In particular, pretty much all of the Wrox and Manning programming books follow the themes above (the first two on the left).

The Wrox ones in particular worry me. Credit where it’s due, and all that, but do we really need the author(s) on the front cover? It’s hard enough walking up to the attractive salesperson with a computer programming book, without it having a picture of a 50-year-old bearded man staring out from the cover.

Why can’t we have something cool, but still ambiguously relevant, on the front? Maybe some impressive extreme sports might work; Perl books could have Extreme Diving, C could have Extreme Yachting (for poor phonetic reasons only) and .NET technologies could have Extreme board games; maybe Monopoly?

Have you come across any dodgy book covers? Let me know.

Kurt Cagle

Related link: http://www.understandingxml.com

I’ve just recently arrived in Atlanta, getting hotel sorted out and grabbing a bite to eat with a friend who was on the same flight from Vancouver. Besides being singularly amused that most of the people I met in the first few minutes after reaching the hotel were Vancouverites - Paul Prescod and Philip Mansfield - I also JUST missed the speaker’s reception, so will likely be doing my networking tomorrow.

I have noticed that the Hilton’s internet access, even in the rooms, could definitely use some work, though with dozens of geeks no doubt pounding the server all at the same time, perhaps this isn’t as unusual as it seems. Still, paying $10 American a night for iffy Internet access is definitely a little disappointing. (Article continued …)

Do you have any questions you’d like asked of speakers at the XML 2005 conference?

Kurt Cagle

Related link: http://www.understandingxml.com/

I’ve rather started a thread here concerning startups and the difficulty of making money off the new web. In my earlier comments, I indicated that I think it will be difficult for startups to become the next Google or Microsoft, and that the era of such rags-to-riches stories is over, especially in the context of the Long Tail.

I believe strongly that this is the case, but I wanted to talk about the flipside of this - the area where I think there is potential, so long as expectations are managed. Somewhere between 2000 and now, I believe that we entered a new domain, one that is just really beginning to play out. It is marked by a number of factors, some of which are “web 2”-ish, some of which are simply the dynamics of the web and networked technologies carried out beyond the linear portion of the curve into the non-linear, and hence, more unpredictable portion of the IT space. In no particular order, these factors are as follows: (Article continues …)

What changes are you seeing in the way that you approach work in the new tech economy? Are these comments consistent, or is the new look a lot like the old?

Kurt Cagle

Lloyd Budd, author of the “A Fool’s Wisdom” blog, a colleague and former coworker, recently wrote a comment on my Long Tail posting, generally complimentary (Thanks!) but taking exception to my statement:

This has meant that it is becoming increasingly difficult for software shops to make money, even though the economic difficulties of years past have largely eased.

On the heels of that, I saw another essay on the web, Paul Graham’s The Venture Capitalist’s Squeeze, in which the author argued that it’s becoming increasingly difficult for VC fund managers to invest in IT startups because the cost of such startups is so low now on the marketing, hardware, and production side that the amount that most fund managers wish to invest is far larger than these companies need or increasingly want.
Article Continued

So what do you think of the dynamics of the long tail? Is Kurt right or is he full of it?

David A. Chappell

Related link: http://www.cio.com/archive/110105/wsquiz.html

Enterprise readiness for adoption of SOA and Web services is a very hot subject these days. Christopher Lindquist of CIO Magazine has published a nifty online survey, Are You Ready for Web Services, that can help an organization score its readiness, and offers some guidance based on your answers.

Dave

Kurt Cagle

Related link: http://www.understandingxml.com

Welcome, one and all, to the Metaphorical Web. My name is Kurt Cagle, author of books on XML, occasional blogger, computer consultant, and pundit wannabe. I want to thank O’Reilly for giving me the privilege of writing a blog under their banner, and hope to be able to take advantage of it to expound pell-mell on XML, the industry, programming, programming ethics, and the world of the technically bizarre.

I’ve noticed an interesting trend lately, an extension of a practice that’s been going on for a while but seems to be gathering steam. Recently, A9, the “new technologies” off-shoot of Amazon, launched a service called the Amazon Mechanical Turk. The concept here refers to a rather intriguing scam performed by the Hungarian nobleman Wolfgang von Kempelen in the late 1760s. Herr von Kempelen had created a most ingenious automaton, in which a mechanical man astride a complex block would challenge (and as often as not beat) all comers in a game of chess. He intrigued much of Europe with this particular robot, including such luminaries as Benjamin Franklin and Erasmus Darwin, and it was only much later that it came out that the secret of the automaton consisted of a midget with a prodigious talent as a chess grandmaster, who was carefully hidden within the box amidst mirrors and gears.

Amazon’s service uses the idea of the Turkish Automaton to handle a rather intractable problem with many tasks today. In essence, the idea is to take the notion of web services and turn it on its head - take those tasks which can’t be readily done by computer - anything from identifying photographic content to writing copy to evaluating essays - and turn them into web services that people can bid upon to complete. Once the process is completed and approved, the results are then dropped back into whatever computation or application is currently being done and life goes on.
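As a rough illustration of that flow, here is a small Python sketch. It does not use Amazon’s actual API; the `HumanTask` class, the `complete` helper, and the worker and approval functions are all hypothetical, made up only to show a task that is handed to a person, checked, and then dropped back into the surrounding computation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HumanTask:
    """A unit of work a computer can't readily do (hypothetical model)."""
    description: str
    reward_cents: int
    result: Optional[str] = None
    approved: bool = False


def complete(task: HumanTask,
             worker: Callable[[str], str],
             approve: Callable[[str], bool]) -> HumanTask:
    """Hand the task to a person, then accept the answer only if approved."""
    answer = worker(task.description)
    if approve(answer):
        task.result = answer
        task.approved = True
    return task


# Stand-ins for the human on the other end and for the requester's review.
def pretend_worker(description: str) -> str:
    return "cat"


def pretend_review(answer: str) -> bool:
    return answer in {"cat", "dog"}


task = complete(HumanTask("What animal is in photo 17?", reward_cents=3),
                pretend_worker, pretend_review)

# The calling program carries on with the result as if a machine produced it.
if task.approved:
    print(f"photo 17 tagged as: {task.result}")
```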

Similar ideas have been tried before, with mixed results - posting jobs to be done in a marketplace, answer services such as Google Answers, and related efforts have occasionally run afoul of simple market dynamics; the difficulty of setting up “atomic” tasks and the need to establish a reasonable means of determining reliability of service both serve as barriers to the adoption of these types of offerings. However, I’ve also seen, and used, other types of “bulletin board” services, including in my most recent move (with both good and bad results, including some damaged furniture).

However, I suspect that while the kinks may take a few more years to work out, the future does seem to point toward the notion that business is transforming into the individual entrepreneur model that was extolled (perhaps somewhat ridiculously) by much of the Wired digerati. I’m not necessarily sure that this is altogether a good thing, but before going into that, it’s worth looking at where this process is leading.

If you take a look at eBay, what you see is the rise of a whole new class of mid-range entrepreneurs amidst a sea of weekend merchants. The weekend merchants may occasionally put a few items up on eBay, but the mid-range entrepreneurs are now acting as middlemen to aggregate multiple other sellers into a package, and then selling into that package. They are, in essence, acting as the electronic equivalent of department stores, producing little themselves but acting as agents for others who have neither the time nor the desire to be involved with eBay full time. For these people, eBay IS their business.

I suspect that this will likely end up being the path taken within the Amazon model as well. Having run my own business, I can say that a significant proportion of the time and energy in that business goes into the process of securing work and accounting for that work when it’s done - operations which are necessary, but for which in general you do not get paid. Initially, the “contractors” on the other end of the equation will be independents, but while the overhead is reduced somewhat in this kind of model, it is replaced by establishing the business process and handling the monitoring of contracts. Those that succeed will not be the ones doing it themselves, but rather small to medium businesses that essentially specialize in being generalists.

A system like this is not, however, necessarily all that beneficial to the seller of the services (nor to the buyer, but for different reasons). The seller ends up with the headaches associated with running any small business, but without many of the legal or governmental benefits that accrue to businesses. Indeed, because current tax law requires that you maintain taxes on every 1099 type job (i.e., contract based) that you take, doing the taxes for a business that’s heavily built upon this model can become nightmarish. You become responsible for your own health care and benefits packages on top of that, and depending upon how you’re declared you may get hit with self-employment taxes as well.

Additionally, while the model currently proposed for Amazon requires that the money for the job be placed in escrow, the possibility that the work will not be found satisfactory is high, and the risk of litigation in this regard should be seen as significant.

For the buyer of these services as well, there’s a certain degree of risk, accounting overhead and unexpected costs that enter into this equation. I’ve worked on a contract basis for years, and have never been in one where a client can specify exactly and precisely what he or she needs out of the gate. Instead, most such projects are iterative in nature, with the client refining their vision with each successive cycle of work and review.

Ultimately, this raises a question that has disturbed me more than once about such technology-based solutions … in the attempt to create ever increasingly automated systems are we reaching a point where we’re sacrificing many of the things that contribute to quality - solid design, effective communication between parties, successive refinement of concepts and the time to think through problems? Moreover, are we destroying the social aspect of business in attempting to make business more machine-like?

One of the concepts I’ve always felt to be somewhat ludicrous was the use of UDDI as a means of automating the sales process - you post a web service with your request to the appropriate UDDI server, which will in turn send you back the vendor which best meets your criteria, based upon the SOAP bundle you send. From a technological standpoint, this sounds like a can’t-lose proposition. From a sales standpoint, however, this is an unmitigated disaster, because sales are ultimately social transactions, not automated ones. I do not find it surprising that UDDI has not even remotely taken off except in very narrow environments.

The principal challenge of this decade of the twenty-first century is to move the web from being a largely static medium for displaying content into a framework for social networking. I suspect that true eBusiness will likely only take off once we recognize that the most powerful aspect of computers is not in their processing powers but rather in their ability to act as communication devices that best facilitate human to human transactions, that take advantage of the social networks, and not just the physical ones, that bind us all together.

Should be interesting to see how well we do with that…

So what do you think about the Turkish Automaton? Flash-in-the-pan idea, or a harbinger of things to come?

Dan Zambonini

Giles Turnbull has already written an introduction to the basics of GarageBand, for those who have recently joined the world of digital recording. In this article, I’ll assume that you’ve already recorded your tracks, and/or used the pre-recorded loops, to assemble the basic parts of your song.

The next step is to mix the different elements together, transforming the raw samples, loops and MIDI sounds into a professional sounding combination. The importance of “the mix” should not be underestimated: major recording artists will spend at least as long mixing songs as recording them. A good mix creates a clear, balanced, full sound; a bad mix can be lifeless, erratic and confusing.

Before you begin

Before you start mixing, there are a few chores to perform:

  • Check the quality. Your tracks should have been recorded as cleanly and loudly as possible (without ‘clipping’, or distorting), and be as noise-free, in-tune and in-time as you can get them. If you have any bad notes or unsatisfactory parts, fix them now; the ‘Garbage In, Garbage Out’ programming adage applies to mixing too.
  • Organise your tracks. Label your tracks with an appropriate name and icon; it’ll save you time in the long-run. To set the name of a track, click the track (to select it), then click again on the current name. To choose an icon, double click the track (to open the track properties window), then click the arrow in the icon box. You can also re-order tracks by dragging them above/below other tracks — this will have no effect on their sound.
  • Spend a little money. If you’re serious about achieving a professional sound, you’ll need a pair of monitor speakers hooked up to your audio output. If these are out of your price range, consider a set of entry level monitor headphones. These speakers or headphones are specifically designed for mixing, as they handle a wide range of sounds, and transmit each type of sound equally (called a flat response). Most other speakers, such as those in your Hi-Fi system or guitar amplifier, do not. If you use these to mix, you will be mixing your song based on a biased version of the sound; your final mix could then sound wildly different when played on a friend’s Hi-Fi.
  • Prepare your Mac. During the mix process, you’ll be making greater and greater demands of your Mac. Change the Energy Saver settings (under System Preferences) to Highest Performance. Stop any unnecessary applications. Ensure that you have enough RAM (at least 512MB), free space on your disk (so that you can Lock Tracks) and, if you’re using a PowerBook, check it’s plugged in.
  • Know what you want. Mixing is partly science, but mostly an art; you’ll need to make many subjective personal choices. Listen to some of your favourite sounding songs, and try to identify patterns or ideas that you like in how the sounds are used. Write down any ideas that you’d like to try or emulate, which leads us on to:
  • Prepare for ideas and change. As your mix progresses, you’ll be re-mixing the same tracks over and over again. You’ll be listening to your mixes in a range of environments; on your MP3 player, in your car, on your PC at work. Each time you listen, you’ll need to be ready to write down any changes or new ideas that come to mind, so always have a notebook and pen ready.
  • Finally, make a rough guess. Play your song through, and roughly adjust the volume sliders so that each track of the song is about as loud as you’d like. This is your ‘rough mix’, and the starting point of your recursive mixing journey.

The basics of mixing

Let me introduce you to something that I like to call The Cube Of The Night of The Destiny of the Goose:

A cube showing left to right (stereo) on X, up to down (frequency) on Y, and in to out (depth) on the Z axis

The three axes of the cube are:

  • Left to Right (the X-axis): The stereo ‘pan’.
  • Up and Down (the Y-axis): The frequency - from low bass frequencies to high treble frequencies.
  • In and Out (the Z-axis): The depth - from “in-your-face” to the distant horizon.

The basic mixing process — apart from some special effects — is all about positioning sounds within this cube. And my secret of good mixing can be boiled down to one golden rule:

Balance sounds within the cube, so that no single point has a concentration of sounds

As you’re mixing your song, you will be moving sounds (instruments/tracks) inside the cube, trying to balance the distribution, and avoid any clashes.

Now that you know the challenge, you need to know how to move a sound around the cube. These are the basic tools in GarageBand that you’ll be using:

  • Left to Right: The pan dial. This is the most straightforward axis to adjust, with a simple dial that can be twisted to the right or left, to move the sound accordingly. If you’re using a PowerBook, you may find it easier to position the cursor over the dial, then use two finger scrolling to adjust the value.
  • Up and Down: Equaliser effects. GarageBand ships with a wide range of equaliser effects that let you adjust the shape of an instrument’s frequency spectrum (i.e. the strength of the sound at each frequency). You’ll find all of these in the effects settings for each track (by double clicking the track). There are some presets available, such as ‘Clear Vocals’ and ‘Improve Guitars’, which you may find useful when starting your mixing journey. As you progress, you’ll want more fine-grained control; try moving on to the basic manual equaliser (which just lets you adjust ‘bass’, ‘middle’ and ‘treble’). And finally, when you want to really tweak every little frequency, the multiple-band graphic equaliser gives you the most power. You may also want to try the Bandpass filter, which completely removes (or rather, only permits) ‘bands’ of frequency. This is useful if you want to remove all rumbling and bassiness from a vocal line, for example.
  • In and Out: Volume, reverb, echo and a little equaliser. If you think of a person talking to you in a large room, try to imagine how the sound would change as she moved further away from you, towards the back of the room. The sound of her voice, from your perspective, would become quieter, a little ‘wetter’ (basically, more echo-y) as her voice echoes off the walls, and also slightly bassier (because higher frequencies are absorbed more readily along the way, so the lower frequency bass notes are the better travellers). You can therefore push a sound further away along the Z-axis by decreasing the volume, adding reverb and/or echo, and possibly tweaking the equaliser treble parts down a little.
[Screenshots: the GarageBand pan dial; the basic manual, graphic and bandpass equalisers; the reverb and echo settings]

You probably won’t want to fill the entire cube with sound; to get a natural sounding mix, you’ll want to avoid the extreme edges of the cube. So, none of your sounds should be panned far right or left, have extremely high or low frequencies, or be dry and overpowering (in-your-face) or have maximum reverb and low volume (distant horizon). Of course, if you want to try a more experimental sound, go wild!

Now you know how to move sounds around the cube, where do you start? Well, if you want a natural sounding mix, many of the sounds will have natural/intrinsic positions within the cube, which you shouldn’t alter too much.

The natural (starting) position of a sound will be decided by the type of instrument (e.g. a double bass will naturally have low frequencies, hence be low on the Y-axis) and by the typical position that we’re used to hearing it at. If you watch a band live, the members of the band will typically follow traditional positions, as in the following diagram:

Traditional band member positions. The singer is happy because he gets all the attention

So, the starting point on the Y-axis (frequency) is determined by the type of instrument, and we can use the diagram above to place the sound in the stereo (left to right) and depth (near to far) fields. For example, you’ll notice that the drums, bass and lead vocal tracks should all appear close to the middle of the stereo field, and you could mix a guitar track to each side. Similarly, backing vocals and strings (or keyboards) could be further ‘back’ in the mix (in depth/Z-axis) - with more reverb and slightly less volume.
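If you like to think in code, the cube and the golden rule can be sketched as a toy model in Python. This is only an illustration, not anything GarageBand does: the `Sound` class, the 20Hz to 20kHz log scale used for the Y-axis, the single ‘centre frequency’ per instrument, and the 0.15 clash threshold are all assumptions of mine.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass
class Sound:
    name: str
    pan: float      # X-axis: -1.0 (hard left) to 1.0 (hard right)
    freq_hz: float  # Y-axis: a rough centre frequency (assumed, one per sound)
    depth: float    # Z-axis: 0.0 (in-your-face) to 1.0 (distant horizon)


def position(sound: Sound) -> Tuple[float, float, float]:
    """Map a sound to a point in a unit cube."""
    x = (sound.pan + 1.0) / 2.0
    # Log scale across an assumed 20 Hz - 20 kHz range (three decades).
    y = math.log10(sound.freq_hz / 20.0) / 3.0
    return (x, y, sound.depth)


def clashes(sounds: List[Sound], min_distance: float = 0.15) -> List[Tuple[str, str]]:
    """Return pairs of sounds that sit too close together in the cube."""
    crowded = []
    for a, b in combinations(sounds, 2):
        if math.dist(position(a), position(b)) < min_distance:
            crowded.append((a.name, b.name))
    return crowded


mix = [
    Sound("lead vocal", pan=0.0, freq_hz=3000, depth=0.2),
    Sound("bass drum", pan=0.0, freq_hz=80, depth=0.3),
    Sound("bass guitar", pan=0.0, freq_hz=100, depth=0.3),
    Sound("guitar (left)", pan=-0.6, freq_hz=1000, depth=0.4),
    Sound("guitar (right)", pan=0.6, freq_hz=1000, depth=0.4),
]

print(clashes(mix))
```

With these made-up numbers the only pair flagged is the bass drum and the bass guitar, which both sit low and in the centre. That is the kind of clash the equaliser is there to sort out.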

Probably the most difficult axis to handle is the Y-axis, the frequency. GarageBand’s graphical volume display gives you useful feedback about how loud and panned each track is, and the track properties window (double click the track) quickly shows you the amount of echo and reverb. However, the frequencies that each track transmits are not directly available. So, to get you started, here are the typical (equaliser) changes you should try making for common instruments, to help condense (move) them into less cluttered parts of the Y (frequency) spectrum:

  • Vocals: Like many people, I don’t have a fantastic condenser microphone (I use a gig-quality Shure). As with many of the cheaper microphones, this doesn’t have a wonderful dynamic response, so it tends to give quite a muddy, bassy sound. Try reducing at around 200Hz or 250Hz, and increasing at around 3kHz and 5kHz. You should also try adjusting at 10kHz, which may need a kick up or down, depending on your mic.
  • Guitar: You won’t want to keep much beneath 100Hz (which will interfere with the bass drum), but you can try adding at anywhere between 150Hz and 5kHz to get the correct sound; possibly even higher frequencies (up to 7kHz) if you directly recorded your guitar (i.e. not a mic’ed amplifier).
  • Bass Guitar: As with most instruments, drop the very low frequencies that could detract from the drive of the bass drum; for the bass guitar, this means a drop at about 250Hz or 300Hz. You could kick some life into your bass guitar by increasing at 2.5kHz to 5kHz.
  • Bass Drum: Keep it nice and tight; increase at around 80Hz, maybe 100Hz. Drop above this, from 150 up to 600Hz. Again, you can add some bite at around 2.5kHz to 5kHz.
  • Snare Drum: A tricky one; the snare drum is often said to be one of the most important sound shapes in the mix, as it defines the rhythm. Spend some time to get the sound you want, but you could try cutting some boxiness at 800Hz to 1kHz, and adding at 8kHz to 10kHz.
  • Cymbals: You won’t want any bassiness from these; cut below 200Hz, and again at 1kHz to 2kHz.
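For quick reference, the suggestions above can be written down as a small lookup table. This Python sketch is just my own shorthand, not a GarageBand feature: the `EQ_HINTS` name, the `hints_at` helper, and the decision to treat each ‘around X’ figure as a simple range are all assumptions.

```python
from typing import Dict, List, Tuple

# (action, low Hz, high Hz), taken from the starting points listed above.
# "Around" figures are approximated as ranges; treat them as rough guides.
EQ_HINTS: Dict[str, List[Tuple[str, int, int]]] = {
    "vocals": [("cut", 200, 250), ("boost", 3000, 5000), ("adjust", 10000, 10000)],
    "guitar": [("cut", 0, 100), ("boost", 150, 5000)],
    "bass guitar": [("cut", 250, 300), ("boost", 2500, 5000)],
    "bass drum": [("boost", 80, 100), ("cut", 150, 600), ("boost", 2500, 5000)],
    "snare drum": [("cut", 800, 1000), ("boost", 8000, 10000)],
    "cymbals": [("cut", 0, 200), ("cut", 1000, 2000)],
}


def hints_at(instrument: str, freq_hz: int) -> List[str]:
    """Return the suggested actions for an instrument at a given frequency."""
    return [action
            for action, low, high in EQ_HINTS[instrument]
            if low <= freq_hz <= high]


print(hints_at("bass drum", 90))
print(hints_at("bass drum", 300))
print(hints_at("snare drum", 500))
```

An empty list simply means there is no suggestion for that band, so leave it alone and trust your ears.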

There were some additional topics I wanted to cover, but this entry is now getting far too long, so I’ll just quickly jot them down and leave them as exercises for the interested reader:

  • Start mixing with the rhythm/drum tracks.
  • Using the Compressor is critical for reducing ‘wandering’ on the Z axis (sounds changing from very loud to very quiet).
  • Lock tracks when you’re not mixing them; reverb and echo have a high impact on the CPU.
  • Use the repeat/loop feature whilst mixing.
  • Learn the keyboard shortcuts! And for added efficiency, check out iControl.

M. David Peterson

Related link: http://www.xsltblog.com/archives/2005/08/via_my_mother_u_1.html

I plan to leave the peanut gallery comments in the peanut gallery and I ask that you be willing to do the same. This post isn’t about differences of political opinion, nor is it a post for or against the United States Government, President George W. Bush, or Senator Orrin Hatch. As such I ask you to please set aside your opinions regarding any of the above and instead focus on the content of this letter, which is a response from Senator Orrin Hatch of the United States Senate to an email I sent him regarding the current crisis taking place in Zimbabwe. In this email I asked, in no uncertain terms, what plans, if any, the United States Government has for bringing a stop to the atrocities currently taking place in this country.

Given Senator Hatch’s seniority, I think the content of this letter, while not an official US Government response, represents well the views held by the current Senate. As such, the response has value as an indication of what physical action, if any, the US Government might take, and I feel it is important and needs to be brought to the surface. His response (more than likely written by one of his staff members and signed with a signature machine, but still a proper representation of the views of Senator Hatch nonetheless) is as follows. (NOTE: Because of size restrictions this image is more than likely too small to be legible. A larger version of this letter can be viewed here.)

[Image: the letter from Senator Hatch]

[NOTE: You might notice that this letter is dated as being written during the first part of September. One of the strange anomalies of the US Government is that its mail system is separate from the services provided by the US Postal Service. But instead of this meaning better, it actually means slower, in some cases slower by several orders of magnitude. I’m not sure of the exact reasoning behind this, but I can state that any letters you might receive from US Government representatives tend to arrive much later than you would expect. But while the date it was written is almost 2 months back, the content is still very much relevant to the current mindset of the United States Senate in general (currently a Senate whose majority is of the Republican Party, to which Senator Hatch belongs, and of which, I believe, he is still the majority leader, but I need to verify this to be certain).]

As many of you may know, Zimbabwe is currently in a state of intense crisis, labeled (I believe by Amnesty International but I need to verify this to be certain) one step away from complete and total genocide. The United Nations recently released a report assessing the current situation, calling for an immediate end to the current “Slum blitz” taking place. According to this same linked report from the BBC:

“The scale of suffering is immense,” it said. About 700,000 people have lost their homes or livelihoods and another 2.4 million people have been affected.

However, and again, according to this same report:

But Zimbabwe said the allegations in the report were “definitely false”.

With an ongoing and horrific history of abuse, I feel we can only assume that the response from the Zimbabwe government can not be trusted and that the allegations brought forward by Mrs. Anna Kajumulo Tibaijuka, the author of the above linked United Nations report, are true. As such I feel that as an international community we must act on these allegations accordingly and we must do so swiftly and with much urgency in mind.

In the United States, tomorrow is the day we head back to the polls to place our votes regarding various positions up for reelection and ballot measures of importance to our individual state and local governments. While this is an off-year election, every election is an important one and should not be overlooked or set aside as anything else. As such, please take the time to vote tomorrow if you are currently registered in any district in the United States.

Again, I plan to keep my comments specific to this letter to myself. As such, I don’t plan to attempt any “political persuasion” speeches. However, I can state that when it comes time to vote again for positions in the Congress and Senate of the US Government, as well as in the next Presidential election, the candidates for whom I plan to vote are those who have been willing to stand up and publicly take a stance against any government, whether within the US or internationally, that oppresses, murders, or in any way violates the rights of the human beings over whom it holds power. If you are an individual who plans to run for Congress, Senate, or for President of the United States and my vote is important to you (I am currently registered in Salt Lake City, Salt Lake County, Utah, United States) then you may want to keep this in mind if you wish to sway my vote in your direction.

Thank you for taking the time to read this post. Our response as an international voice is an important one. I hope we are able to make that response soon.

[NOTE: For more details on why Zimbabwe and its current state of affairs are important to me, beyond the obvious reasons, please see this post from my personal blog.]

Thanks again for keeping your comments on this post as neutral as possible toward the US Government, President George W. Bush, and Senator Orrin Hatch, and focused instead on the specifics of the letter I received from Senator Hatch in response to my concerns.

Jennifer Golbeck


I have several posts in mind about XFN – the XHTML Friends Network.

I am a Semantic Web person, and partial to FOAF (mentioned in earlier posts of mine), but unlike many RDF and OWL aficionados, I am very much in favor of microformats, folksonomies, and the like. I think they have the potential to allow us to do powerful things.

XFN, however, seems questionably useful to me.

According to their website:

XFN™ (XHTML Friends Network) is a simple way to represent human relationships using hyperlinks. In recent years, blogs and blogrolls have become the fastest growing area of the Web. XFN enables web authors to indicate their relationship(s) to the people in their blogrolls simply by adding a ‘rel’ attribute to their <a href> tags, e.g.:

<a href="http://jeff.example.org" rel="friend met">

They go on to detail what kind of information can go in the “rel” attribute. I will eventually write another post to discuss that, but for now I want to stick to the core idea: build a social network by putting a bit of metadata on a link.

The problem with this is that we are annotating URLs of webpages, not people. This annotation above reads to me as “http://jeff.example.org is a friend and someone I met”, not “The person described at http://jeff.example.org is a friend and someone I met”. How do we actually know who the person is that is described on the webpage? Short answer: we don’t. This format is only allowing us to annotate links between web pages, not links between people. It may be a socially knowledgeable network of webpages, but it is not a social network.
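To make the point concrete, here is a minimal Python sketch of what a crawler could pull out of an XFN-annotated page. It is an illustration under stated assumptions, not how any real XFN tool works: the class and function names, the source URL http://my.example.org, and the chosen subset of relationship values are all hypothetical. Notice that every edge it produces connects two URLs; nothing in the markup names a person.

```python
from html.parser import HTMLParser

# A small subset of XFN relationship values, chosen for illustration only.
XFN_VALUES = {"friend", "met", "acquaintance", "contact", "co-worker", "colleague"}


class XFNLinkParser(HTMLParser):
    """Collects (source_url, target_url, rel_values) triples from <a> tags."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url  # URL of the page being parsed
        self.edges = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        rel = attr_map.get("rel") or ""
        # Keep only the rel tokens that are XFN values in our subset.
        values = sorted(set(rel.split()) & XFN_VALUES)
        if href and values:
            self.edges.append((self.source_url, href, values))


def extract_xfn_edges(source_url, html):
    """Return the URL-to-URL edges found in one page's markup."""
    parser = XFNLinkParser(source_url)
    parser.feed(html)
    return parser.edges


# Hypothetical page at http://my.example.org linking to Jeff's page.
page = '<a href="http://jeff.example.org" rel="friend met">Jeff</a>'
edges = extract_xfn_edges("http://my.example.org", page)
print(edges)
```

The result is a graph whose nodes are web addresses. Turning those nodes into people requires information that the rel attribute does not carry.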

Perhaps that is OK – if the end consumer of this information is human, he or she can figure out that the person who owns one page knows/is friends with/has met the person who owns the linked page. If that were the intent, I might look more favorably on XFN. However, their website claims it as something that can be machine processed and aggregated. The site talks extensively about the potential for extracting social networks from XFN, but contains no concrete details about how we actually identify who the people are.

A section of their website is called “Delusions of Grandeur” and I think it is exactly that. For example, one item on this list reads:

2. Commercial services like Amazon, which currently ask users to manually register all their friends in order to make “wish list” and other information sharing simpler, may find it easier simply to crawl XFN relationships on the open Internet. This would allow a user to enter the URL of their site, and let the service programmatically analyze XFN relationships to build a list of friends.

I do not see how Amazon, or any service, could build a list of friends from XFN data. Sure, they could find a link that was marked with XFN. But that link does not point to a person; it points to a website. So Amazon may know that I have met the person who owns a particular website, but there is no way of knowing who that person is. The data that can be extracted is worthless to Amazon for their purposes.

There are places where quick and easy annotation is good, but for machine processing of information that will be used in applications, XFN is just not going to cut it for representing social networks. RDF and OWL may be more than a basic user can handle on their own, but to achieve certain goals, something much more structured than XFN is necessary. This is where FOAF has its place.
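As a rough illustration of the difference, the Python sketch below groups pages by a person identifier of the kind FOAF can carry. It assumes each page owner is described with a hash of their mailto: URI, in the style of FOAF's foaf:mbox_sha1sum property; the page URLs, e-mail addresses, and function names are hypothetical, and real FOAF data would be RDF rather than Python dictionaries.

```python
import hashlib


def mbox_sha1sum(email):
    """Hash a mailto: URI, in the style of FOAF's foaf:mbox_sha1sum property."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


# Hypothetical descriptions of page owners, as an aggregator might collect them.
# With XFN alone, only the "page" field would be available.
descriptions = [
    {"page": "http://jeff.example.org", "mbox_sha1sum": mbox_sha1sum("jeff@example.org")},
    {"page": "http://blog.example.com/jeff", "mbox_sha1sum": mbox_sha1sum("jeff@example.org")},
    {"page": "http://ann.example.net", "mbox_sha1sum": mbox_sha1sum("ann@example.net")},
]


def group_pages_by_person(items):
    """Map each person identifier to the list of pages that describe that person."""
    people = {}
    for item in items:
        people.setdefault(item["mbox_sha1sum"], []).append(item["page"])
    return people


people = group_pages_by_person(descriptions)
pages_only = {d["page"] for d in descriptions}

print(len(pages_only), "distinct pages")  # what a URL-only view sees
print(len(people), "distinct people")     # what an identifier-based view sees
```

With only URLs, the two pages belonging to the same hypothetical Jeff look like two unrelated nodes. With a shared identifier, an aggregator can tell they describe one person.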

Dan Zambonini


Ignoring the fragility, performance and even architectural arguments associated with superclasses (I like to just put my fingers in my ears, repeat ‘na na na’ over and over, and pretend these problems don’t exist), I’m fond of declaring a single base class that can be extended by almost every other class in an application. (For the sake of this question, I’m thinking of PHP4, so will ignore the loveliness of interfaces, and other functionality beyond my means.)

So, if you were to implement such a ubiquitous base class, what methods would you put in it? To begin with, I’d consider the following:

  • A save or persist method, which could record the current object data to a configurable location: database, disk, memory (e.g. memCache), or session (cookie). This would also require a set of methods that define the mapping of properties to database tables/fields, and relations between classes (e.g. compositions).
  • Conversely, a load method, based on a unique identifier or condition.
  • A getAsXml() method that returns an XML representation of the object (either as a string, or appended to a provided DOM Node). Actually, maybe this should just be an ultra-flexible export or getRepresentation type method, that can also return the object as HTML, CSV and other formats.
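The three methods listed above might be sketched as follows. This is only an illustration: it uses Python rather than PHP4, the in-memory dictionary stands in for whatever configurable backend (database, disk, session) you would really use, and every class and method name is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


class BaseObject:
    """Hypothetical ubiquitous base class; every name here is illustrative."""

    _store = {}  # in-memory stand-in for a database, disk, or session backend

    def __init__(self, object_id, **properties):
        self.object_id = object_id
        self.properties = dict(properties)

    def save(self):
        # Key by class name and id so that different subclasses do not collide.
        BaseObject._store[(type(self).__name__, self.object_id)] = dict(self.properties)

    @classmethod
    def load(cls, object_id):
        # Raises KeyError if nothing was saved under this identifier.
        return cls(object_id, **BaseObject._store[(cls.__name__, object_id)])

    def get_representation(self, fmt="xml"):
        """Export the object as an XML or CSV string."""
        names = sorted(self.properties)
        if fmt == "xml":
            root = ET.Element(type(self).__name__, id=str(self.object_id))
            for name in names:
                ET.SubElement(root, name).text = str(self.properties[name])
            return ET.tostring(root, encoding="unicode")
        if fmt == "csv":
            buffer = io.StringIO()
            writer = csv.writer(buffer)
            writer.writerow(names)
            writer.writerow([self.properties[name] for name in names])
            return buffer.getvalue()
        raise ValueError("unsupported format: " + fmt)


class Article(BaseObject):
    """Example subclass that inherits persistence and export unchanged."""


article = Article(1, title="Linking", author="Dan")
article.save()
restored = Article.load(1)
xml_text = restored.get_representation("xml")
print(xml_text)
```

Putting the export formats behind one get_representation method keeps the subclass empty, which is the appeal of the base-class approach. The cost is that every class in the application now depends on the storage and export code.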

What else might be useful, for debugging, logging, portability or otherwise? What about useful abstract methods, maybe for comparing or merging?
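As one possible answer, here is a small Python sketch of comparison, merging, and a debugging dump. It is a guess at what such helpers could look like, not a recommendation from the post; the class name, method names, and the rule that the other object's values win on conflict are all assumptions.

```python
class Comparable:
    """Hypothetical helper methods; the names are illustrative only."""

    def __init__(self, **properties):
        self.properties = dict(properties)

    def equals(self, other):
        # Same class and same property values count as equal.
        return type(self) is type(other) and self.properties == other.properties

    def merge(self, other):
        """Return a new object; values from `other` win on conflicts."""
        merged = dict(self.properties)
        merged.update(other.properties)
        return type(self)(**merged)

    def debug_dump(self):
        # Sorted so that log output is stable between runs.
        pairs = ", ".join(f"{k}={v!r}" for k, v in sorted(self.properties.items()))
        return f"{type(self).__name__}({pairs})"


a = Comparable(title="Linking", year=2005)
b = Comparable(year=2006, author="Dan")
merged = a.merge(b)
print(merged.debug_dump())
```

Returning a new object from merge, rather than changing either input, makes the method easier to reason about when logging or debugging.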
