May 2002 Archives

Simon St. Laurent

Related link: http://www.xml.com/pub/a/2002/05/29/perry.html

Walter Perry is well known at XML conferences as the heavy-duty contrarian who tells XML developers that conventional standardization practice is not only not helpful, it’s harmful. While this message is rarely well-received, it’s well worth deep consideration.

An enormous portion of the XML community overreacts to claims that “XML is the Tower of Babel” by retreating to descriptions of XML as a meta-language for defining standardized vocabularies that happen to share a syntax. This emphasis on standardization gives life to a growing number of projects, many of them crossing organizational boundaries and themselves giving life to various consortia.

Walter Perry runs in the opposite direction, insisting that local understandings of information are far preferable to global understandings, pushing against the increasing tide of insistence that “there can be only one”. To some extent, he reminds me of environmentalists who point out that the monocultures which have served us well have their own set of potentially explosive dangers.

Even if you insist that agreement is a prerequisite to communications, I strongly recommend this (densely-written) piece. There are a few similar pieces out there worth exploring, notably Edd Dumbill’s “The Selfish Tag”. I’ve written a somewhat weaker piece questioning how standards and communities interact - or don’t.

One contradictory area I’ll admit to finding quite amusing is the reluctance of standard-creators to require conformance to their specifications. The “error-suppressing” nature of HTML (and some RSS) processing, while in some ways convenient, demonstrates all too well how local processing that doesn’t care about global rules can in fact subvert the global rules quite drastically. Local understandings can be picky or they can be loose, depending on the situation. At least with XML, developers are likely to have that choice for themselves, and not be at the mercy of far-off vendors.

Why can’t we all just conform?

Ben Hammersley

Two days ago, Matt Griffith came up with the smart idea of using the HTML link element to point to a site’s RSS feed from the site itself - thus allowing automatic discovery of RSS feeds.

By adding a line like this:

<link rel="alternate" type="text/xml" title="XML" href="http://rss.benhammersley.com/index.rss" />

a site would be providing metadata as to the location of its feed - and this would allow newsreaders, browsers and search engines to automatically locate the feed.
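To make the discovery step concrete, here is a minimal sketch of how a newsreader might pull the feed location out of a page. It is only an illustration, not how any of the tools mentioned here actually work: the `FeedLinkFinder` class, the `find_feeds` helper, and the set of accepted MIME types are all assumptions made for the example.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> elements that look like feeds."""

    # Assumption: which MIME types count as a feed. The example line above uses text/xml.
    FEED_TYPES = {"text/xml", "application/rss+xml"}

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed_type = attr.get("type", "").lower() in self.FEED_TYPES
        if "alternate" in rels and is_feed_type and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html_text):
    """Return the feed URLs advertised by link elements in html_text."""
    finder = FeedLinkFinder()
    finder.feed(html_text)
    return finder.feeds
```

Calling `find_feeds` on a page containing the link element shown above should return a list holding that one feed URL, while ordinary stylesheet links are ignored.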

Things are moving quickly with this. This morning, Mark Pilgrim released bookmarklets for Radio and Amphetadesk that allow users to subscribe to RSS feeds found in this manner.

As Mark says, “If we can persuade existing weblog authors to insert this one line of code, and then get it into the default templates of Radio, Manila, and Movable Type, and we could make news aggregation an order of magnitude easier.”

Many already are, from my Content Syndication with XML and RSS blog to all of the channels at News is Free.

There is much good thinking about this to be found at today’s DiveintoMark.

UPDATE: Bill Kearney already posited this idea last year.

UPDATE TWO: Syndic8 is now on the memewagon. All of the feed info pages now have their very own <link> tags pointing back to the feed shown on the page. And so is Meerkat. Blosxom has also been updated to add the correct code.

UPDATE THREE 22:47GMT: Dave Winer has just created the feature in Radio Userland. Instructions are here - and a Manila upgrade is in progress, he tells me.

Ben Hammersley

The full blogging dream, of two-way links, semantic web, and a massively powerful related-items search facility in every browser, may be a few years off, but the chaps responsible for Metalinker are certainly doing their bit.

Every time you add a link to your page, Metalinker uses JavaScript to add a second link next to it, pointing to the Blogdex page that lists other sites that mentioned that link. So, a link to Oreillynet [b] looks like that.

Edd Dumbill

One of the most striking differences between the Eleventh International World Wide Web
conference (WWW2002, http://www2002.org/) and its previous edition in 2001 is the large
increase in activity related to the Semantic Web. The premise of Tim
Berners-Lee’s brainchild is that the Semantic Web fulfils the other half of
his dream for the Web: for computers to be able to communicate generally over
the web, thus making the vast amount of information out there a lot more
useful to human beings.

Let me give one of the more common examples: currently if you wish to book
a flight on the web, you need to look up several sources of information: your
own schedule, flight timetables, connecting transportation, frequent flyer
details. Through various means you juggle all this information and come up
with some candidate flights, which you then attempt to purchase. If all this
information were available to a program on your computer, it would be a
relatively simple matter for the computer, your “user agent”, to come up with
the likely flights for you, just requiring you to choose between them.

In order to make this sort of thing possible in the general case, a lot of
infrastructure needs to be put into place. First, it will help if all the
sources of information are able to present the same syntax to the user agent.
The work to enable this is almost done, in the shape of XML and RDF: the W3C
RDF Core Working Group are now well advanced in their revision of the
original RDF specification, aimed at tidying up the first edition from 1998.
Second, all the sources of data need to be able to name and describe the
various shapes their data can take. This is where the semantics start to come
in. The fancy word for such descriptions is “ontologies” (practically a
synonym for “schema”, but that word is somewhat overloaded by SQL and W3C XML
Schema these days). The W3C’s WebOnt Working Group is charged with this piece
of the puzzle, and they’re building on existing work such as DAML (http://www.daml.org/). As somebody remarked to me, there were
a surprisingly large number of papers focusing on ontologies at the WWW2002
conference. There is a lot of practical research work going on in the
field.

One other vital element is that of trust. If your computer’s going out to
fetch flight schedules, you want to be able to trust it to return accurate
information about such a high cost purchase. If it is sending your personal
information out in order to do this, you also want to know it’s being done
securely. Enter the work on XML Encryption and Signatures. This effort is in
a pretty mature state for XML itself, but work needs to be done to integrate
this technology with other Semantic Web technologies to create the “web of
trust” that will enable users to decide how trustworthy a source of
information is.

That Semantic Web development work is progressing is all well and good,
but clearly what matters to developers and users is the question of when all
this will emerge from the research lab. That it is research is undeniable,
something that makes many normal developers think it dusty and obscure. That
it is incomplete and emerging is also true, something that makes some experts
in knowledge management think it ill-conceived. However, Berners-Lee met with
these kinds of reactions on the creation of the Web itself: it isn’t
difficult to imagine the same doubts being raised in the early 90s. Simply
because he pulled it off once is no indicator he can repeat the feat, but
other factors are pointing to a turning in the tide of opinion about the
Semantic Web.

Despite the official positions of their employers, I have noticed several
prominent members of the Web community taking a harder, more serious look at
the Semantic Web. They may not be on the road to Damascus just yet, but
there’s a rise in credibility of the Semantic Web idea not seen last year.
Another significant activity is the creation of a European research project
specifically focused on delivering useful software components, which will
take into its sphere the excellent Java and C RDF frameworks, Jena (http://www.hpl.hp.com/semweb/) and Redland (http://www.redland.opensource.ac.uk/).

The Semantic Web effort is not just Tim Berners-Lee. There are now a large
number of people committed to the idea and its development. While work
continues in research and specifications, these people also need to start
finding ways of making Semantic Web technology useful in everyday computing
scenarios. This is the critical issue: the Web succeeded because of the
problems it solved, and the aptness of the solution. The developers of the
Semantic Web need to see past their academic surroundings to address, however
simply, some of today’s information management problems.

What’s your view on the Semantic Web? Share your opinion in the forum.

Simon St. Laurent

Related link: http://xml.coverpages.org/patents.html

Robin Cover, bibliographic genius of the markup world, has assembled a draft document exploring a collection of resources covering “Patents and Open Standards”.

Non-proliferation, armageddon, or neither?

Edd Dumbill

In his opening keynote today at the Eleventh World Wide Web
Conference in Honolulu, Hawaii, Tim Berners-Lee made a strong
appeal for the development of the web to continue unencumbered by
patent royalties.

In a talk entitled “Specs Count”, Berners-Lee
outlined how important it was that today’s web technology
specifications remain open and freely implementable. He described
how accessing a web page today involved many layers of standards
(Ethernet, IP, TCP, HTTP, MIME, XML, Namespaces, XHTML), each layer
of which relies critically on the layer below. As Berners-Lee is
fond of noting, the web is “not done yet”, therefore it
is not unreasonable to imagine a future with a similar number of
layers built upon the existing ones. For that reason, it is still
highly critical that the “communal” nature of the specifications is
preserved.

Berners-Lee took off his hat as W3C Director for his speech,
stressing that it was delivered as personal opinion: he was highly
pointed in his support of royalty free licensing for web
technology, a position that doesn’t meet universal approval within
the consortium. The W3C has itself had a difficult journey through
issues of licensing its own standards. Reacting to a large amount
of dissent from the web and free software community, it reversed
plans to allow RAND (“reasonable and non-discriminatory”) licensing
terms on its specifications. The new patent policy is that every
working group will aim to achieve royalty free licensing terms by
the time a spec reaches the final Recommendation stage at the
W3C.

Outlining both the pros and cons of enforcing royalties on open
specifications, Berners-Lee speculated that if the specifications
driving the web had not been royalty-free, then none of the
900-strong audience would actually be at the conference. Enforcing
royalties discourages adoption both by the open source community,
who simply cannot pay royalties, however “reasonable”, and other
companies who will shy away from the issues associated with
licensing the technology.

In closing, Berners-Lee encouraged the delegates to get involved
in the patent and licensing debate. He mentioned the effect that
the large amount of public feedback on the W3C RAND debate had had,
which included a change in W3C patent policy and the invitation to
the table of representatives from the open source world. He
encouraged continued involvement, stressing that thoughtful
contributions were important.

  • Slides from Tim Berners-Lee’s presentation:
    http://www.w3.org/2002/Talks/www2002-tbl/slide25-0.html

Share your comments in the forum.

Simon St. Laurent

Related link: http://lists.xml.org/archives/xml-dev/200205/msg00303.html

Uche Ogbuji has nicely summed up why XML has done so well at cutting across environments, platforms, and mindsets. In a posting on the xml-dev mailing list, he notes:

It is precisely the fact that every
programming language, platform, tool, DBMS, etc. out there
has a different and usually mutually incompatible notion of
core data types that makes it valuable that XML is grounded
in text.

This divorces data expressed in XML from physical representation issues (save Unicode), and I think it is the single most significant reason for XML’s success as an integration tool.

XML’s textual approach, relative verbosity, and structures that fit neither relational databases nor object structures are often criticized by developers, but Ogbuji sees a much graver threat in force-fitting XML’s structures to match programmer expectations:

As soon as you start to inject the welter
of all these other systems into the foundation of XML, you
lose this facility, or more precisely, as the Schema group
did with their data types, you invent yet another different
and incompatible type system.

So where does this leave programmers? They should “impose the desired view on what is lexically expressed in XML in a separate layer.”
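As a rough illustration of what such a separate layer could look like, the sketch below keeps the XML purely textual and applies a local mapping from element names to types only after parsing. The sample document, the `LOCAL_VIEW` mapping, and the `typed_view` function are hypothetical names invented for this example; they are not taken from Ogbuji's posting.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical document: as far as XML is concerned, every value is just text.
ORDER_XML = """<order>
  <id>1042</id>
  <placed>2002-05-29</placed>
  <total>19.95</total>
</order>"""

# The "separate layer": a local mapping from element names to the types this
# particular application wants. Another consumer is free to choose differently.
LOCAL_VIEW = {
    "id": int,
    "placed": date.fromisoformat,
    "total": Decimal,
}


def typed_view(xml_text, view):
    """Parse xml_text and convert each child element's text using view."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        convert = view.get(child.tag, str)  # no rule: leave the value as text
        result[child.tag] = convert((child.text or "").strip())
    return result
```

The same document read with an empty mapping comes back as plain strings, which is the point: the types live in the consumer, not in the markup.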

XML does more by doing less, and will likely do less if it is expected to do more.

How strong do you want your types?

Sam Ruby

Paul Prescod has submitted a request for an ETCON BOF. With luck, the details will be posted here shortly. Meanwhile, I have a few suggestions for topics to be discussed.

Don Box commented that
"If it takes three minutes for a response, it is not really HTTP any
more".  In situations where responses may take five days, is it
possible to apply the architectural style that made the web so successful, or is
a new approach required?  Put another way, is REST tightly bound to HTTP or
can the architectural principles it embodies be applied to other protocols?

When the request and response are sufficiently temporally separated, one
essentially ends up with the equivalent of UDP datagrams. Is it possible to layer
on the equivalent of what TCP provides in terms of error recovery, flow control,
and reliability in such a context?

The REST wiki suggests
that the REST architectural style is most closely related to that of TupleSpaces.
One important difference is that in TupleSpaces the sender does not identify the
recipient.  Data is addressed and routed based on content.  Is there a
place for such a model in "Alternative Web Services Architectures"?
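For readers who have not met the model, here is a toy sketch of content-based addressing, offered only to make the question concrete. The `TupleSpace` class and its `out` and `take` methods are names assumed for this example, and unlike a real tuple space the `take` call here does not block while waiting for a match.

```python
class TupleSpace:
    """Toy in-memory tuple space: writers never name a recipient."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Publish a tuple into the space."""
        self._tuples.append(tuple(fields))

    def take(self, *pattern):
        """Remove and return the first tuple matching pattern, or None.

        None in the pattern acts as a wildcard; any other value must match exactly.
        """
        for index, candidate in enumerate(self._tuples):
            if len(candidate) == len(pattern) and all(
                want is None or want == have
                for want, have in zip(pattern, candidate)
            ):
                return self._tuples.pop(index)
        return None
```

A reader asking for `("quote", None, None)` receives whatever quote tuple is available, regardless of who wrote it, which is the contrast with addressing a request to a named resource.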