Related link: http://www.mysql.com/news-and-events/users-conference/

If you attend enough computer conferences, you run into
every occupation on Earth. At the
MySQL Users Conference
last week, I sat next to a person at lunch who announced
proudly that his job was to destroy data. He works for a
firm that specializes in data services for law cases, and at
the end of many cases the judge orders the total destruction
of data related to the case.

On the other side of me, at that lunch, sat a database
administrator whose facility is planning a migration from
Oracle to MySQL. A few years ago, people would have assumed a site
would start with MySQL and move up to Oracle as its needs
grew. Now there’s a quiet trend in the other direction. (I
should mention here, though, that MySQL managers downplay
the obviously competitive situation and like to say that the
different products are for different markets.)

My lunch partner said his firm would save an enormous amount
of money on both licenses and support. I was left with the
impression that Oracle took a big risk by moving from a
perpetual license to a four-year one: they set the timetable
for this company’s move.

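In this article I’ll cover:

  • The business model: why did MySQL grow so fast?

  • SAP adds its muscle

  • The licensing of free software

  • Cluster around and take a close look

  • So, what is Novell’s Linux client environment?

  • Snips from the discard bin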


The business model: why did MySQL grow so fast?

MySQL represents the most impressive market success, exceeded only
perhaps by Apache, in free and open source software. In terms of
installed base, MySQL has left the technically impressive rival
PostgreSQL in the dust. It has marginalized mSQL, SQLite, and SAP DB
(the last of which I’ll return to
later).
It has started to challenge the proprietary database
companies on their own turf, as already mentioned. Nobody
can say why licensing costs for proprietary databases have
plummeted in recent years, but one suspects that it’s due to
MySQL’s competition, as are the large discounts Microsoft
has offered certain customers.

Not convinced yet?

  • MySQL AB claims an installed base of five million systems,
    the largest of any database engine.

  • The mysql.com domain sees almost as much traffic as
    ibm.com.

  • Six hundred attendees flocked to the recent conference.

  • MySQL AB has recently started, and has been heavily
    marketing, its own publishing outlet,
    MySQL Press.

  • MySQL has made the use of a database so commonplace that
    industry observer Clay Shirky, in his recent article
    Situated Software,
    writes:

    You can of course build these kind of features [rapidly
    developed applications for small, localized groups of users]
    in other ways, but MySQL makes the job much easier, so much
    easier in fact that after MySQL, it becomes a different kind
    of job. There are complicated technical arguments for and
    against using MySQL vs. other databases, but none of those
    arguments matter anymore. For whatever reason, MySQL seems
    to be a core tool for this particular crop of new
    applications.

So how did MySQL achieve this charmed status?


A textbook case of a disruptive technology

MySQL, first of all, illustrates in almost pure form the
sequence of events Clayton M. Christensen documented as a
“disruptive technology” in his ground-breaking book The
Innovator’s Dilemma. Early versions of MySQL lacked the
basic features, such as ACID transactions and referential
integrity, that experienced users expected from a relational
database. In a pattern familiar to anyone who has read
Christensen’s book, knowledgeable observers dismissed MySQL
as a toy.

But MySQL’s very simplicity made it so small and fast that
it quickly won over small users who wouldn’t even understand
what they were missing and how to use the fancy features
offered by “real” database engines. In particular, MySQL
proved ideal for the exploding area of dynamic Web content.

Most indicative of its mantle as a true disruptive
technology, MySQL proved that many of the missing high-end
features weren’t as indispensable as people used to
claim. For instance, referential integrity (jeez, who could
be opposed to integrity?) wasn’t required in a database
when it could be achieved in the application code, often
more reliably. You could also achieve efficient locking
without row-level locks; in fact, supporting row-level locks
took so much overhead that the application was almost better
without them.
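To make the referential-integrity point concrete, here is a minimal sketch of what enforcing a relationship in application code might look like. It is my own illustration, not anything shown at the conference: it uses Python's built-in sqlite3 module as a stand-in for a MySQL connection, and the customers and orders tables and the add_order function are hypothetical names.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with NO foreign key constraints declared."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.commit()
    return conn

def add_order(conn, customer_id, item):
    """Insert an order only if the referenced customer exists.

    The schema itself does not protect the relationship, so this
    check in the application is the only guard.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        row = conn.execute(
            "SELECT 1 FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            raise ValueError(f"no such customer: {customer_id}")
        conn.execute(
            "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
            (customer_id, item),
        )
```

Calling add_order(conn, 1, "book") succeeds, while add_order(conn, 99, "book") raises ValueError and leaves the orders table untouched. The trade-off is that every program writing to the table has to go through a check like this one.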

Having rewritten the rules for what constituted a useful
relational database engine, MySQL AB proceeded to invest
resources to implement the very features which they were
originally sneered at for lacking. Bit by bit they have
added check-off items to their T-shirts. And what’s most
interesting is how they found the resources to pull off this
kind of upgrade cycle.


The importance of dual-licensing

Of course, any agreement under which you release free
software (other than the public domain) is a license, but
“licensing” usually refers to selling licenses. And MySQL AB
has become one of the most successful companies with a
completely complementary dual-licensing model: they offer
everything under an open license for certain users, but
charge money for everything under other circumstances.
(These circumstances will be discussed further down under
“The licensing of free software.”)
As we’ll see, the parallel existence of GPL licensing and
commercial licensing leaves a mark on every aspect of the
company.

The CEO of MySQL AB, Marten Mickos, said that more than half
of their money comes from license fees. This contrasts with
an impression of open source software left by Novell vice
president Chris Stone in his keynote (described
later).
Stone, claiming that Novell had already settled on a
maintenance model for revenue, suggested that, because of
this, the move to open source will not be as hard for Novell
as for other traditional computer companies. The remarks
implied that an open source business model has to be a
support model, but MySQL AB staff pointed out that support
contracts have been shown to be insufficient to fund software
development. They may be enough in the future, but they’re not
yet.

The other side of dual licensing is equally important. In
terms of adoption, open licenses do more for a software
project than twenty thousand billboards and glossy ads. The
GPL allowed MySQL to penetrate millions of sites that would
never have otherwise known about it.

But the GPL also created a hotbed of user participation that
can be witnessed to this day, as MySQL AB employees
repeatedly ask their users for feedback. MySQL AB also
benefits directly from contributions; for instance, its most
feature-rich storage engine, InnoDB, started as an outside
project.

But MySQL would have remained a stepping stone to other
databases for many people, were it not for its continual
growth and improvement. This rate of improvement is not
exceedingly fast (managers stress that they always check for
stability, correctness, and performance before releasing
enhancements) but it’s fast enough to give customers the
impression that features are worth waiting for–that what
they want will in due time be added to the product.

And there’s a symbiosis between technical development and
payments for licenses. Each requires the other. If a
substantial body of enhancements to MySQL grew up outside
the company–even if they were put under the GPL and MySQL
AB could incorporate them into its version of MySQL–they
would not be part of the value MySQL AB could offer paying
customers. There would thus be few paying customers, and MySQL AB
could not afford to hire people to keep up development. In
order to keep up with customer needs, MySQL AB has managed
one of the coolest tricks in open source development:
keeping most development in-house. And making users happy
about it!

Founder David Axmark told me there’s tremendous power in
having a product unambiguously associated with a single
company. Whereas Linux and Apache belong to everybody and
nobody, MySQL is taken seriously by large companies with
money to spend because there’s a company that owns a
trademark on it and markets it like a proprietary product.

So MySQL succeeds at maintaining two faces. To paying
customers, it’s a traditional, responsible vendor. To
programmers and database administrators, it’s a flexible,
responsive network of independently-minded developers in
free-software style.


SAP adds its muscle

Nobody would be sorry to have the backing that comes from
such a large and well-established corporation as SAP. But in
addition to SAP’s prestige and endorsement of MySQL, what is
the main contribution of the partnership?

Not MaxDB. This is the new name for SAP DB, and was honored
with several sessions at the conference, all poorly
attended.

And probably not the money SAP invested in MySQL AB as part
of the partnership they announced in May 2003. Certainly
this helped to spur the enormous hiring campaign MySQL has
been on during the past year. (They announced that they
doubled the size of their company to 134 staff.)

The impression given by Kaj Arnö, in his presentation
on the SAP partnership, was that the best part lies in the
expertise SAP brings to areas where MySQL needs to upgrade.
SAP DB contains a number of features that MySQL AB would
like to implement, and through the partnership they can do
so much more quickly. In particular, MySQL 5.1 is supposed
to contain server-side cursors, views, standard error
handling, standard security handling, schemas, and
constraints.

There are three reasons for incorporating SAP DB features
into MySQL:

  1. They are genuinely useful.

  2. They are needed to run SAP.

  3. They are ANSI-compliant.

We have to start with the understanding that complete
compliance with the ANSI SQL standard (one always has to
ask, “Which standard?”) is pretty much impossible. See, for
instance, the negative assessments by SQL standards experts
Michael M. Gorman
and
Peter Gulutzan
(the latter now a MySQL AB employee). But MySQL AB would like to
approach compliance with the core SQL standards. They don’t plan
complete compliance even with this limited part, because it would
require them to sacrifice other crucial selling points: speed, and
ease of use and management.

Meanwhile, Arnö laid out a roadmap for merging MySQL
with MaxDB, beginning with a proxy that translates the
protocol used by a MySQL client (and eventually, the
particular syntax of MySQL commands where they differ from
MaxDB) into a format a MaxDB server can recognize.


The licensing of free software

As I said
earlier,
dual-licensing is central to MySQL’s business model. So
under what circumstances must you license MySQL? There’s a
“nice guy” answer that’s fairly clear, and a formal legal
answer that’s considerably murkier.

The nice guy answer is (I believe I am quoting Monty
Widenius directly here): “If you distribute MySQL for free,
you get it for free, but if you charge money for it you give
us money.” I believe this covers most cases neatly. For
instance, I think everybody agrees that a store can run its
inventory application on MySQL, or an airline book tickets
through its Web page backed up by a MySQL database, without
paying for it. These businesses are not making money by
distributing MySQL; they’re just users. And I’m pretty sure
the GPL covers them.

Most situations requiring payment are also clear. If you
enhance the MySQL source code in some way and sell it
without distributing the source code, you have to pay for a
license.

But what about the case of the application service provider?
This is a common problem in GPL-land that I don’t believe
has ever been resolved. At a Birds of a Feather session at
the MySQL conference one evening–a session well attended by
about 25 very interested people–one programmer for a game
company laid out a situation where they run their multi-user
game on a server backed up by MySQL, and distribute only a
client. Do they have to pay for a license? After all, they’re
not distributing MySQL itself.

Zak Greant, a long-time MySQL public figure (listed in the
conference brochure as their “community advocate”) said the
game company should pay. The game could not run without
MySQL, and the client was the means of access by paying
customers.

Several attendees then tried to extend Greant’s
reasoning. Why, then, shouldn’t users of Web browsers pay
license fees for accessing Web pages backed by MySQL? Well,
besides the absurdity of trying to enforce such a payment
regime, the Web server does not use a proprietary,
specialized protocol as the game does.

I found Greant’s argument strained, but I appreciate the
need for MySQL AB to share in the profits from services that
depend on MySQL.

(UPDATE, April 22: Zak wrote to explain the thinking behind
asking a game company to pay for a license on a system where the client
and server use the MySQL protocol. It makes a lot more sense now. If
the client and server communicate using the MySQL protocol, the client
is no doubt written with the MySQL library that implements the
protocol. (Who would reinvent the wheel just to save a few bucks?)
Under the GPL–although perhaps not the LGPL–the game client is an
extension of MySQL and qualifies for the commercial license.)

The length and heat of this late-night
argument show that open source licensing still has to shake
out. But let’s remember that proprietary licensing is an
even deeper pit.

There are clauses in most software licenses (such as
prohibitions of reverse engineering) that are flat-out
illegal. Many more clauses are so ambiguous that any guess
about their interpretation by the courts would be as good as
a coin flip. Many organizations probably pay a lot more in
license fees than they’d have to pay if they took the time
to examine the licenses with a fine-toothed comb and showed
a willingness to go to court. And of course, we’re still
arguing over what’s covered by fair use, what constitutes a
trade secret, and whether the DMCA outlaws Web links to
illegal code.

So let’s see if we can pull ahead of the pack in free
software. Let’s see whether the field can establish a system
that’s readily understandable, fair, and conducive to
growth. I’ll return to this question when covering
Brian Behlendorf’s keynote.


Cluster around and take a close look

My own close look at the new MySQL Cluster product leaves me
puzzled, and several other people I talked to at the
conference had the same feeling.

At recent LinuxWorld conferences, I’ve noticed several
companies marketing cluster solutions that support MySQL.
MySQL AB has apparently decided these companies had a good
idea. At this conference, they announced their own
clustering solution and offered several sessions on it.

MySQL Cluster is a separate network of nodes that replicate
data through striping. The key for each table row (which is
added behind the scenes if the programmer does not specify
it) is hashed to determine which nodes store the row.
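The idea of hashing a row's key to pick its storage nodes can be sketched in a few lines. This is only an illustration of hash partitioning in general. The MD5 hash, the "next node holds the replica" rule, and the names nodes_for_key and place_rows are my assumptions, not a description of MySQL Cluster's internals.

```python
import hashlib

def nodes_for_key(primary_key, node_count, replicas=2):
    """Return the node numbers assumed to hold the row with this key.

    The key is hashed to choose a first node; the remaining copies go to
    the following nodes (an illustrative placement rule, not the real one).
    """
    if not 1 <= replicas <= node_count:
        raise ValueError("replicas must be between 1 and node_count")
    digest = hashlib.md5(str(primary_key).encode("utf-8")).digest()
    first = int.from_bytes(digest[:8], "big") % node_count
    return [(first + i) % node_count for i in range(replicas)]

def place_rows(keys, node_count, replicas=2):
    """Map each node number to the list of keys stored on it."""
    placement = {node: [] for node in range(node_count)}
    for key in keys:
        for node in nodes_for_key(key, node_count, replicas):
            placement[node].append(key)
    return placement
```

Because the placement depends only on the key and the number of nodes, any server can work out where a row lives without consulting a central directory.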

At the MySQL server, the clusters are supported by a new
storage engine (a.k.a. table type) that has many of the
features of InnoDB, but apparently not all. Other than
specifying the new storage engine, programmers don’t need to
make any changes to their code, although some types of
optimization are different when working with clusters.
Developer Mikael Ronström–who has been working on this
technology for over 15 years and did an implementation for
phone company Ericsson before coming to MySQL AB–claimed
that MySQL AB offers five to six nines of availability.
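For readers who don't think in "nines," the arithmetic behind that claim is easy to work out. The sketch below is mine (the function name is made up for illustration) and simply converts a number of nines into allowed downtime per year.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap days

def downtime_minutes_per_year(nines):
    """Minutes of downtime per year permitted by the given number of nines.

    Five nines means 99.999% availability, i.e. an unavailable
    fraction of 10 ** -5.
    """
    unavailable_fraction = 10 ** (-nines)
    return MINUTES_PER_YEAR * unavailable_fraction
```

Five nines works out to a little over five minutes of downtime a year, and six nines to roughly half a minute.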

Now for the catch. All databases handled by the cluster have
to be stored in primary memory. One can spread the data
across several nodes, but their combined memory is a limit
on the size of databases.
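A back-of-the-envelope estimate shows how quickly that limit bites. The sketch below is my own, with a hypothetical function name and made-up numbers. It assumes every row is kept in a fixed number of copies and ignores index and bookkeeping overhead, so real capacity would be lower.

```python
def usable_capacity_gb(node_count, ram_per_node_gb, copies_per_row):
    """Rough upper bound on database size for an all-in-memory cluster.

    Total memory across the nodes is divided by the number of copies
    kept of each row; overhead is deliberately ignored.
    """
    if copies_per_row < 1 or node_count < copies_per_row:
        raise ValueError("need at least one copy, and a node for each copy")
    return node_count * ram_per_node_gb / copies_per_row
```

Under these assumptions, eight machines with 4 GB of memory each, keeping two copies of every row, top out at 16 GB of data.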

In discussions, it seemed to several of us that any company
willing to devote 6, 8, or 12 systems to their database will
have more data than fits in a few systems’ memory. MySQL
Cluster will add disk storage eventually, but it will take
some time to come, and when it does it will probably erode
some of the vaunted speed advantages of MySQL Cluster. For
instance:

  • Updates will no longer be so fast (nearly as fast as reads,
    currently).

  • Restarting nodes will take longer.

  • Restarting as the main way of recovering from
    inconsistencies may become less appealing.

Emic Networks
gave me a data sheet that compared their product, Emic Application Cluster, to
MySQL Cluster. Everybody is very polite about these matters, of
course, and says that different products are appropriate for different
markets. Essentially, MySQL Cluster offers speed–particularly for
updates–whereas Emic offers larger data sizes. Emic is also more
robust at handling soft failures, such as a node overwhelmed by a high
volume of queries. The key market for MySQL Cluster seems to be
telecom (where the technology emerged), whereas Emic has customers in
more traditional business areas.


So, what is Novell’s Linux client environment?

Not desktop! No–a Linux client. That’s the word I heard
from keynote speaker Chris Stone, the vice president on
whose advice Novell spent 250 million dollars to buy the
companies Ximian and SUSE. When I asked how Novell would
combine all those assets into something new and
synergistically superior, Stone said he couldn’t announce
anything yet, but promised something he called a “Linux
client environment,” something “completely new and
different” and “much better than simply substituting Linux
desktop systems for Microsoft desktop systems” as
Munich did.

Stone also said during his keynote (perhaps in answer to the
anticipated questions about Ximian being based on GNOME
while SUSE features KDE) that people shouldn’t ask “KDE or
GNOME?” but should recognize instead that “the money lies in giving each
customer what it needs.” This might be a Linux-based kiosk
for call centers, a PDA environment for mobile users, and so
on. Specialization is the path to success.

I thought, as I listened to Stone’s keynote, how vendors
switching to open source tend to go through stages.

  1. First, a tentative recognition of the historic shift to free
    software.

  2. Then a phase of loudly announcing over and over (in words attributed
    to Steve Jobs), “We love open source.”

  3. A mingling of their traditional proprietary offerings with
    open source software they licensed from elsewhere.

  4. A serious commitment to adding value in the open source
    area.

Further stages are likely to emerge, but I haven’t
seen them yet during the evolution of major vendors.

HP appears to be in the third stage, whereas IBM has reached
the fourth. Apple lies between the third and fourth stages,
because few people use Darwin on its own (or other software
released under a free license by Apple). Sun is the outlier
here, having jumped into the fourth stage through its
release of OpenOffice.org and JDS, while barely sticking
its toes into the second.

Stone’s speech reflected the second stage of development,
and Novell’s offerings the third. They already sell a number
of their products on SUSE, and can use them to tie together
SUSE with NetWare. These products include Novell’s directory
offering, eDirectory, which offers single sign-on and other
sophisticated identity services, and their clustering
filesystem, Novell Storage Services.

While the 250 million dollar expenditure shows the grit in
Novell’s teeth as it determines to reach the fourth stage, I
can’t say they’ve reached it yet. Ximian is still Ximian and
SUSE is still SUSE. But Stone is hinting that Novell has a
broader vision, and in fact sees the Linux market as broader
than most conventional vendors do.


Snips from the discard bin

Brian Behlendorf of Apache likes to see software development
as an art as well as a science. In his keynote he decried
the approach to development that treats “software engineers as
cogs.” He also described some of the government efforts
around the world to move from proprietary software to open
source software, driven by pressure from U.S. companies to
get serious about enforcing licenses, and the resulting new
laws that countries have to pass to conform to World Trade
Organization regulations. “The WTO is the open source
software field’s best friend,” Behlendorf put it.


Apple Computer faces a challenge that precisely mirrors
Linux’s: having captured hearts and minds as a desktop system,
Apple’s Macintosh is trying to push its way into heavier
applications as a server and a basis for clusters. Dr.
Ernest Prabhakar of Apple gave a keynote listing the many
levels where Apple uses free software and insisted they try
to conform to standards when innovating (“to enhance and
open, rather than embrace and extend”). And in classic free
software style, Apple includes development tools on every OS
X system shipped–and not just standard tools such as
gcc, but Apple’s finest programming
environments–so that every user in theory can be a
developer.


Why would
Trolltech,
the vendors of the cross-platform Qt toolkit, show interest
in a conference about a database? While Qt is most famous
for building interfaces–particularly as the basis of the
KDE desktop–its APIs form an umbrella over a huge range of
programming activities. Now these include connecting to
relational databases. Thus, Qt takes its place next to Perl
DBI, JDBC drivers, and other APIs from the many other
languages that interface to MySQL. And I suppose this is a
benefit to people who want to build interfaces for many
different platforms, because they can settle on a unified
programming style and expect such conveniences as having
data types from different parts of the application conform.

Most of the API is familiar to any programmer who has made a
connection to a database, but Trolltech went a bit farther
and offered a C++ class that replaces SQL altogether. This
was perhaps going too far. SQL syntax is very flat and very
frustrating–a legacy of its origin in the 1970s, when
language designers expected end-users to type in their
queries manually–but it fits the job it has to do. Trying
to specify the same activities in C++ syntax is even more
awkward and less streamlined. Trenton Schulz of Trolltech
told me that many people expressed the same opinion I had,
and that the non-SQL interface might be removed.


The annoying but irreplaceable syntax of SQL continued to
show its face in MySQL Query Browser, a new graphical tool
for viewing and manipulating data from a MySQL database.
This tool is in some ways an IDE for writing SQL, complete
with such debugging aids as single-stepping and breakpoints.
In other ways, the tool is just a convenient way to look at
and change data, or compare two results from different
queries.

What are the important trends in MySQL adoption?