Tired of not knowing what SQL is? Any serious application developer is going to need to learn some SQL at some point in their career. And once you start learning it, you'll use it. A lot.
This column is all about SQL and to get things started, this time we'll be focusing on what SQL is and why you'd want to use it. Trust me, it'll be fun.
Back in the old days, the 1970s, computers were expensive but crucial ways to process data. Banks, large companies, and governments had the need to not only store but manipulate huge stores of information. Of course IBM was the prime source of all things computer at that time, and a research group in San Jose came up with a new database paradigm based on relational calculus. System R was a giant step forward because it focused on the relationships between data instead of efficient tape storage or pure sequential data dumps. Relational databases may seem passà in the era of object-oriented programming, but at the time it was a giant leap forward.
Part of the development of System R was a non-procedural, roughly English-like language for structuring database queries. This language was dubbed Structured English Query Language, or SEQUEL. Other vendors saw the power in relational databases as a tool, so Oracle and other companies released their own relational databases that included languages based on the same principles as SEQUEL. As more relational database products were developed, it became clear that a standard was necessary, so in 1986 the so-called SQL-86 standard was adopted by the American National Standards Institute (ANSI) and early the following year by the International Standards Organization (ISO). Since 1986, three new versions of the SQL standard have been developed. SQL-89, SQL-92, and last year's SQL-99 (also called SQL3) all add additional functionality to the specification and tend to be driven by developments in the database tool community.
The SQL standard is not a simple document - over 2000 pages of hardcore technical jargon, relational calculus, logic, and standards-speak does not make compelling reading! But SQL is probably one of the few 30-year-old standards you're likely to run into in your daily programming life. Even though object-oriented databases are currently far more sexy, SQL plays a fundamental role in virtually any significant application that reads, stores, or manipulates data. Long a client-server workhorse, SQL has been leveraged by a number of web tools, ranging from Perl and Java to PHP, ColdFusion, and ASP. If you haven't run into SQL yet, you're going to. Prepare now, gentle reader -- we'll have you speaking SQL in no time!
All right, let's be honest right off the bat. No tool fully implements the SQL3 standard. In fact, a lot of tools don't implement the full SQL-92 standard. Or the SQL-89 standard. You get the picture. Most implement large portions of the standards and for all practical purposes, everything you're likely to need. In my mind, SQL is much more of a guideline than a strict standard. Of course ANSI and ISO would argue with me, but for all intents and purposes, SQL serves as a common glue for fundamental relational database development. SQL also is key to making database applications portable not only across operating systems, but also across database platforms as well. Plus it provides relatively easy access to a great deal of data manipulation power for folks who aren't too technical. As we'll see in future columns that delve into the arcana of SQL, it also normally reads intelligibly, unlike Perl for example, where phrases like '/a(.*)b\1c/;' have meaning to the initiated.
The SQL standard defines three major classes of entities: objects, data types, and language elements. Table 1 has a quick overview of these classes and some examples. If you've done any work with databases, most of these terms should ring bells.
Table 1: SQL Language Entities
table, column, procedure, function
float, real, int, but, char, boolean
keywords, tokens, and special characters that make up SQL (
So the standard provides the basic definitions for the functionality of an arbitrary database. The next step is up to the development team for a specific database tool - they decide which features to include and ignore; which features to add that aren't in the specification; and which parts of the specification they'll implement in a non-standard method. The extent of any single product's conformance with the SQL standard is basically driven by the standard programming principles of costing - time and effort required for development compared to the essential nature of that feature plays a strong role. It's easy to see why Oracle costs thousands (tens or hundreds) with its rich feature set compared to something that's free like mSQL. But it's important to note that both use standard SQL objects, data types, and language elements so that most (or even all) of your application will port between the two databases.
If you've ever taken a good computer science course, you've talked about all the fundamental programming concepts -- algorithms, control constructs, and data structures, for example. But once you've created those data structures for your program, you need to store them somewhere. In some instances, especially for simple data, a straightforward sequential text file is fine. In other cases, you may implement a random-access binary file for your data records. But when your data starts to increase in size and complexity, and your application needs to be robust and fast, you'll probably wander towards the world of database tools. The advantage of choosing a database tool to manage your application data is that you get to be lazy. I know that I can't even write a good bubble sort off the top of my head, let alone some seriously efficient searching algorithm for more complex data. But if I abstract the data functionality into a database, I can simply deal with the interface to that database and rely on thousands of person-years that have been spent optimizing relational database functionality. As far as the programmer is concerned, that interface to the world of high-powered data crunching is SQL.
Imagine writing an algorithm to search through a text file of tab-delimited (columnular) data to find a particular last name (Ashenfelter) in the 3rd column of the file of authors. You'd have to deal with all sorts of hard stuff like arrays, comparison operators, recursion, and other things you'd hope to leave behind in your computer science courses or the unread computer books stacked on the bookshelf next to your computer. I mean, all you want to do is select the text "Ashenfelter" from the column that holds the last name in the data file! Well, I can rewrite that request in SQL while you write the algorithm in the language of your choice (I'd vote for Perl). Go ahead, I'll wait…
SELECT Lastname FROM Authors WHERE Lastname='Ashenfelter';
I bet I beat you, and that my code is shorter. Note the similarities between the SQL code in the example above and the English request in the previous paragraph. Basically, you've got a verb (SELECT), a subject the verb acts on, a few modifiers, and a final semi-colon. I don't have to know details of the algorithm, physical data storage methods, or even the brand of the database I'm querying - SQL hides all of that behind a simple, standard interface. That, in a nutshell, is the beauty of a standard language.
Note: All SQL statements must end with a semicolon (;), but capitalization is not required.
Most languages and application platforms provide some underlying technology to connect to a relational database, as shown in Table 2. These technologies provide the tools to open a database connection between the programming environment and the database engine, and basically allow the developer to pass SQL code into the database on that connection. Many higher-end databases also provide native call-level interfaces that can be called from programming languages like C and C++.
Table 2: Database--application technology interfaces
Visual Basic, ColdFusion, ASP, and many others
Microsoft tools (ASP, VB)
Open DataBase Connectivity (ODBC) is a standard interface for vendor-independent communication between an application and a database. ODBC provides an interface where database-specific drivers are used to translate an ODBC request into a native database call. The other tools in Table 2 basically provide the same vendor-independent layer for communicating with a database -- as long as there's a driver for a given SQL database, you can swap the databases on the back end of the application by copying the data to the new platform and installing the new driver.
In many real-world scenarios, you may be using ODBC to connect an application to a database and then using SQL to frame the query. This is especially true of web application built using tools like ColdFusion and ASP. This make SQL a great tool for all the lazy programmers, which included most good code hackers. You learn SQL once and you're golden -- the server folks install the proper ODBC drivers, the DBA manages the database and creates the structure, and you simply stick the data in there with SQL. You can develop on an NT box using ColdFusion and Access and then deploy on a multiprocessor Solaris box running Oracle simply by swapping in the appropriate ODBC driver. Ditto for Perl, or JDBC, or just about any other tool. The only caveat is that you need to make sure that your SQL is fairly cross-platform. That means no proprietary extensions, no esoteric structures or unusually complex nesting. In most cases, that won't limit you a bit.
I hope you're now convinced that SQL is the best thing since chocolate-covered expresso beans or H2Joe. This column is dedicated to getting you into SQL and saving you development time and effort. I'll be coming at you weekly with information on SQL keywords, functions, tricks, and hacks. We'll talk about the quirks of commands in different database, and we'll occasionally explore the wider world of database applications, particularly those delivered over the web. Until then, feel free to send questions, comments, criticism, and flames to firstname.lastname@example.org.
Next: In the second half of his inaugural column, John Paul Ashenfelter introduces us to several good SQL learning tools and begins to define database terms. Continue
Getting Started with SQL
OpenBSD as a Database Server
Uploading, Saving and Downloading Binary Data in a MySQL Database
Discuss this article in the O'Reilly Network Linux Forum.
Return to the Linux DevCenter.
Copyright © 2009 O'Reilly Media, Inc.