An Interview with Chris Date
Tony: On the dedication page you quote Leonardo da Vinci on theory without practice. Do you feel there is a lack of theory in most practice of database design? Do you feel this is a problem for the industry?
Chris: There are several issues here. If by "database design" you really do mean design of databases as such, then (as I explain in Database in Depth) there really isn't very much theory available anyway--though it's true that what little theory does exist is ignored far too often (especially in data warehouses, where some truly dreadful designs are not only found in practice but are actually recommended by certain pundits). But even if designers do abide by what design theory we have, there are still far too many decisions that have to be made without any solid theory to help. We need more science!
On the other hand, if by "database design" you really mean "DBMS design"--if not, then forgive me, but use of the term "database" for "DBMS" and/or "DBMS instance" is all too common and is (in my opinion) very confusing and deserves to be thoroughly resisted--then most certainly there are all too many departures from theory to be observed. And yes, it's a huge problem for the industry, and indeed for society at large (since society at large places such reliance on those defective products). This is an issue I do feel very strongly about, but I'm afraid it'll take a few minutes for me to explain why. Again, please bear with me.
First of all, it's a very unfortunate fact that the term "theory" has two quite different meanings. In common parlance it's almost a pejorative term--"oh, that's just your theory." Indeed, in such contexts it's effectively just a synonym for opinion (and the adverb merely--it's merely your opinion--is often implied, too). But the meaning in scientific circles is quite different.
To a scientist, a theory is a set of ideas or principles that, taken together, explain something: some set of observable phenomena, such as the motion of the planets of the solar system. Of course, when I say it explains something, I mean it does so coherently: It fits the facts, as it were. Moreover (and very importantly), a scientific theory doesn't just explain: It also makes predictions, predictions that can be tested and--at least in principle--can be shown to be false. And if any of those predictions do indeed turn out to be false, then scientists move on: Either they modify the theory or they adopt a new one. That's the scientific method, in a nutshell: We observe phenomena; we construct a theory to explain them; we test predictions of that theory; and we iterate. That's how the Copernican system replaced epicycles; how Einstein's cosmology replaced Newton's; how general relativity replaced special relativity; and so on.
As another example, consider the current debate in the USA over the theory of evolution versus creationism (also known as "intelligent design"). Evolution is a scientific theory: It makes predictions, predictions that can be tested, and in principle it can be falsified (namely, if those predictions turn out to be incorrect). In particular, evolution predicts that if the environment changes, the inhabitants of that environment will probably change too. And this prediction has been widely confirmed: Bacteria evolve and become resistant to medical treatments, insects evolve and become resistant to pesticides, plants evolve and become resistant to herbicides, animals evolve and become resistant to disease or parasites (in fact, something exactly like this seems to have happened in the US recently with honeybees). There's also the well-documented case of the Peppered Moth (Biston betularia) in England, which evolved from a dark form, when the air was full of smoke from the Industrial Revolution, to a light form when the air was cleaned up. By contrast, creationism is not a scientific theory; it makes no testable predictions, so far as I know, and as a consequence it can be neither verified nor falsified.
By the way, Carl Sagan has a nice comment in this regard:
In science it often happens that scientists say, "You know, that's a really good argument, my position is mistaken," and then they actually change their minds, and you never hear that old view from them again. They really do it. It doesn't happen as often as it should, because scientists are human and change is sometimes painful. But it happens every day. I cannot recall the last time something like that happened in politics or religion.
Now let's get back to databases. The point is, the relational model is indeed a theory in the scientific sense--it's categorically not just a matter of mere opinion (though I strongly suspect that many of those who criticize the model are confusing, deliberately or otherwise, the two meanings of the term "theory"). In fact, of course, the relational model is a mathematical theory. Now, mathematical theories are certainly scientific theories, but they're a little special, in a way. First, the observed phenomena they're supposed to explain tend to be rather abstract--not nearly as concrete as the motion of the planets, for example. Second, the predictions they make are essentially the theorems that can be proved within the theory. Thus, those "predictions" can be falsified only if there's something wrong with the premises, or axioms, on which the theorems are based. But even this does happen from time to time! For example, in Euclidean geometry, you can prove that every triangle has angles that sum to 180 degrees. So if we ever found a triangle that didn't have this property, we would have to conclude that the premises--the axioms of Euclidean geometry--must be wrong. And in a sense exactly that happened: Triangles on the surface of a sphere (for example, on the surface of the Earth) have angles that sum to more than 180 degrees. And the problem turned out to be the Euclidean axiom regarding parallel lines. Riemann replaced that axiom by a different one and thereby defined a different (but equally valid) kind of geometry.
In the same kind of way, the theory that's the relational model might be falsified in some way--but I think it's pretty unlikely, because (as I said in my answer to Question 3) the premises on which the relational model is based are essentially those of set theory and predicate logic, and those premises have stood up pretty well for a very long time.
So, to get back (finally!) to your question--do I feel the lack of attention to theory is a problem?--well, of course my answer is yes. As I said in another recent interview: This is the kind of question I always want to respond to by standing it on its head. Database management is a field in which, in contrast to some other fields within the computing discipline, there is some solid theory. We know the value of that theory; we know the benefits that accrue if we follow that theory. We also know there are costs associated with not following that theory (we might not know exactly what those costs are--I mean, it might be hard to quantify them--but we do know there are going to be costs).
If you're on an airplane, you'd like to be sure that plane has been constructed according to the principles of physics. If you're in a high-rise building, you'd like to be sure that building has been constructed according to architectural principles. In the same kind of way, if I'm using a DBMS, I'd like to be sure it's been constructed according to database principles. If it hasn't, then I know things will go wrong. I might find it hard to say exactly what will go wrong, and I might find it hard to say whether things will go wrong in a major or minor way, but I know--it's guaranteed--that things will go wrong.
So I think it's incumbent on people not to say "Tell me the business value of implementing the relational model." I think they should explain what the business value is of not implementing it. Those who ask "What's the value of the relational model?" are basically saying "What's the value of theory?" And I want them to tell me what the value is of not abiding by the theory.



