Related link: http://www.mysqluc.com

On the last day of the 2005 MySQL conference,
I finally heard a speaker who stretched the audience’s assumptions and
pointed toward a liberating path forward. This is the sign of a good
conference, incidentally–most of the sessions deal intensively with
the problems of today, but one or two keynotes prepare the listeners
for tomorrow.

I wrote in my earlier weblog about this conference
that MySQL was becoming conventional. Many people are doing innovative
things with it–I sat in today, for instance, on a session about MySQL
as an embedded server or library–but the largest attendance has been
reserved for traditional topics such as replication and performance
tuning. MySQL AB itself is concerned with catching up to its
competitors in terms of SQL features that centralize more and more
control in the database engine.

Adam Bosworth, in his keynote today, threw all that out and set his
ship headed in a different direction. The problem he found with
centralizing processing–with stored procedures and triggers and so
forth–is that it doesn’t scale. His talk also implied that it
restricts users from making innovative connections. Google, the most
recent landing place in Bosworth’s long and impressive career,
illustrates an entirely different way to handle data.

Adam Bosworth’s view of an open data query protocol

The promise of the Web was to aggregate the contributions of
individuals everywhere and make retrieval easy along any lines one
chose to use. As the volume of content became unmanageable, XQuery was
supposed to provide a Web-aware search mechanism, and Web Services the
infrastructure and protocols to connect sites. XQuery and Web Services
were too big and came too late, however. Nobody actually wants to use
them, even if they know how.

So the gap has been filled with RSS, the model highlighted by Bosworth
for the next stage in search. RSS and Atom are lightweight and easy to
understand. They put control in the hands of the content providers and
the potential viewers.
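
To give a sense of how little machinery a feed consumer needs, here is a minimal sketch in Python that pulls titles and links out of an RSS 2.0 document using only the standard library. The feed text and the `read_items` name are invented for illustration; nothing here comes from Bosworth’s talk.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document, standing in for any provider's feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example weblog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in read_items(SAMPLE_FEED):
    print(title, link)
```

The provider decides what goes into each item, and the reader decides what to do with the list that comes back, which is roughly the division of control described above.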

Bosworth’s extended vision is for a protocol that provides raw access
to data, somewhat as XQuery is supposed to do. It would be a very
simple and database-independent protocol that would make all data in
the world open. Then, he says, everybody could do what Google
does. And more–we could provide distributed updates too.

Where to impose structure

The Google approach to data, carried through in Bosworth’s vision,
runs head-on against the ideals of the relational database model.
The entire relational approach, from the canon of Third Normal Form
(three is a holy number) to the enormously complex collection of
analytic functions, subqueries, and other ways to impose structure in
SQL, is an attempt to be as precise as possible about the data chosen
and returned.

Bosworth isn’t interested in that. If the user gets a few hundred
results and has to scroll through them a little bit, that’s fine. We
don’t need no stinkin’ metadata or knowledge management.

The philosophical debate underlying relational database design

Bosworth evoked earlier debates that I’ve found valuable and aired
several concerns of mine; his views of the XML specs and RSS/Atom are
familiar. But his brief critique of the trend toward putting more and
more features into the database engine–a critique that he whisked
through on the way to grander visions–left open a question about the
basic philosophy of SQL.

When MySQL was bare-bones and lightweight (which it still is compared
to commercial database management systems or PostgreSQL), it put
responsibility in the hands of the application programmer. If a value
was supposed to be limited to a particular range or two columns were
supposed to be entered in tandem, it was the application programmer
who made sure of it.

In contrast, traditional database design takes as much control away
from the application as possible and puts it in the database. A
constraint or trigger or stored procedure or foreign key can make sure
that no one gives someone an absurdly high salary or fires an employee
while leaving his phone number in the database.
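
The two styles can be put side by side in a short sketch. It assumes Python’s built-in `sqlite3` module as a stand-in for MySQL, and the salary cap, table layout, and function names are all hypothetical.

```python
import sqlite3

MAX_SALARY = 500_000  # hypothetical business rule


def make_db(enforce_in_database):
    """Create an in-memory employee table, with or without a CHECK constraint."""
    conn = sqlite3.connect(":memory:")
    check = f"CHECK (salary <= {MAX_SALARY})" if enforce_in_database else ""
    conn.execute(
        f"CREATE TABLE employee (name TEXT NOT NULL, salary INTEGER {check})"
    )
    return conn


def insert_checked_by_app(conn, name, salary):
    """Application-aware style: validate before the row reaches the database."""
    if salary > MAX_SALARY:
        raise ValueError(f"salary {salary} exceeds the limit of {MAX_SALARY}")
    conn.execute(
        "INSERT INTO employee (name, salary) VALUES (?, ?)", (name, salary)
    )


# Database-aware style: the engine itself rejects the bad row.
strict = make_db(enforce_in_database=True)
try:
    strict.execute(
        "INSERT INTO employee (name, salary) VALUES (?, ?)", ("Pat", 9_000_000)
    )
except sqlite3.IntegrityError as err:
    print("database refused:", err)

# Application-aware style: the table accepts anything, so the code must check.
loose = make_db(enforce_in_database=False)
try:
    insert_checked_by_app(loose, "Pat", 9_000_000)
except ValueError as err:
    print("application refused:", err)
```

In the second style, any program that writes to the table without going through `insert_checked_by_app` bypasses the rule, which is the risk that the traditional design is meant to remove.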

This centralized control is a relic of the 1970s, when corporate staff
would sit at command-line processors and type in SQL to do what they
wanted. Nowadays, when an application and even a Web interface stand
between the user and the database engine, the never-trust-the-user
philosophy is less valid. At the very least, an application has to
know the rules the database is enforcing and translate error messages
into something the user can understand. The wall between application
and database engine is porous, so the application can take on more of
the validation and logic.
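
As a rough illustration of that translation step, the sketch below catches a constraint error and turns it into text a user could act on. It again assumes Python’s `sqlite3` module rather than MySQL; the constraint name, the message table, and `friendly_error` are hypothetical, and matching on the error text is a shortcut that depends on how the engine words its messages.

```python
import sqlite3

# Hypothetical mapping from constraint identifiers to user-facing messages.
FRIENDLY_MESSAGES = {
    "salary_limit": "That salary is above the maximum allowed.",
    "employee.name": "Please enter the employee's name.",
}


def friendly_error(err):
    """Translate a database constraint error into text a user can act on."""
    text = str(err)
    for key, message in FRIENDLY_MESSAGES.items():
        if key in text:
            return message
    # Fall back to a generic message when the rule is not recognized.
    return "The record could not be saved."


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT NOT NULL,"
    " salary INTEGER,"
    " CONSTRAINT salary_limit CHECK (salary <= 500000))"
)

try:
    conn.execute(
        "INSERT INTO employee (name, salary) VALUES (?, ?)", ("Pat", 9000000)
    )
except sqlite3.IntegrityError as err:
    print(friendly_error(err))
```

Note that the application still has to know which rules exist in order to fill in the message table, so the rule ends up described in two places either way.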

But both philosophies are valid, and now MySQL offers a choice. I
suggested to Arjen Lentz, the organizer of this year’s conference,
that he offer a debate next year between the application-aware
philosophy and the database-aware philosophy–when is each
appropriate?

Most of us still need to find that phone number for an employee and do
other everyday tasks; we’ll be using a relational database for that,
and MySQL will be providing that service for more and more sites. The
people with day jobs who came this year to find out whether MySQL
could bring home the bacon got their answers. But MySQL can also
support fun applications, and I hope to see more coolness next year.