Relational databases can be excellent tools for system administration. A relational database is accessed and administered using Structured Query Language (SQL) statements. As a result, it is a good idea for system administrators to learn at least the basics of SQL. The goal of this appendix is not to make you a full-time database programmer or even a real database administrator; that takes years of work and considerable expertise. However, we can look at enough SQL that you can begin to fake it. You may not be able to speak the language, but you’ll at least get the gist if someone speaks it at you, and you’ll know enough to go deeper into the subject if necessary. These basic building blocks are used extensively in Chapter 7, SQL Database Administration, where we integrate SQL and Perl.
This excerpt is from Automating System Administration with Perl, Second Edition . Thoroughly updated and expanded in its second edition to cover the latest operating systems, technologies, and Perl modules, Automating System Administration with Perl will help you perform your job with less effort. The second edition not only offers you the right tools for your job, but also suggests the best way to approach particular problems and securely automate pressing tasks.
SQL is a command language for performing operations on relational databases and their component parts. Tables are the component parts you’ll deal with most often. Their column and row structure makes them look a great deal like spreadsheets, but the resemblance is only surface-level. Table elements are not used to represent relationships to other elements—that is, table elements don’t hold formulas, just data. Most SQL statements are devoted to working with the data in these rows and columns, allowing the user to add, delete, select, sort, and relate it between tables.
Let’s go over some of the operators offered by SQL. If you want to experiment with the operators we’ll be discussing, you’ll need access to a SQL database. You may already have access to a server purchased from Oracle, Sybase, IBM, Microsoft, or elsewhere. If not, you can download an excellent open source database called MySQL from http://www.mysql.org. Another, simpler (no server required) open source database engine can be found at http://www.sqlite.org.
For this appendix, we’ll be using a mostly generic SQL dialect, though each database server has its own SQL quirks. SQL statements particular to a specific database implementation will be noted.
The SQL code that follows will be shown using the capitalization standard found in most SQL books. This standard capitalizes all reserved words in a statement.
Most of the example SQL code in this appendix will use a table that mirrors the flat-file machine database we saw in Chapter 5, TCP/IP Name and Configuration Services. As a quick refresher, Table D.1, “Our machine table” shows how that data looks in table form.
Prod: Not sure this will work as a table with the long values and 9 columns, but AU says it was OK in 1st edition?
Table D.1. Our machine table
shim shimmy shimmydoodles
sandy micky mickydoo
CREATE DATABASE sysadm ON userdev=10 LOG ON userlog=5 GO
This SQL statement creates a 10 MB database on the device
userdev with a 5 MB log file on the
userlog device. This statement is
Sybase/Microsoft SQL Server-specific: database creation (when performed at
all) takes place in different ways on different servers. In this case, it is placing the database in a predefined
storage device (an area defined as part of the storage allocation for the
database server) and keeping the logging information (all of the info
about the operations on the database and other housekeeping info) in a
GO command is used with
some interactive database clients to indicate that the preceding SQL
statement should be executed. (These clients also often provide
additional commands beyond just plain SQL for working with the
databases; e.g., MySQL has a
command for displaying information about tables.) It is not a SQL statement itself. Other databases require you to type
a semicolon at the end of each statement. In the following examples,
we’ll assume that
GO or a semicolon
will follow each individual SQL statement if you are using one of these
clients. We’ll also be using the SQL commenting convention of
-- for comments in the SQL code.
DROP DATABASE sysadm
Now let’s actually create an empty table to hold the information shown in Table D-1:
USE sysadm -- Last reminder: you need to type GO or ; here (if you are using -- an interactive client that requires this) before entering the -- next statement CREATE TABLE hosts ( name varchar(30) NOT NULL, ipaddr varchar(15) NOT NULL, aliases varchar(50) NULL, owner varchar(40) NULL, dept varchar(15) NULL, bldg varchar(10) NULL, room varchar(4) NULL, manuf varchar(10) NULL, model varchar(10) NULL )
Next, we create a table by specifying the name, data type/length,
NULL/NOT NULL settings for each
column. Let’s talk a little bit about data types.
It is possible to hold several different types of data in a database
table, including numbers, dates, text, and even images and other binary
data. When each column is created, the kind of data it will hold is
specified. Our needs are modest, so this table is composed of a set of
columns that hold simple strings of
varchars (non-space-padded characters). Some SQL
servers allow you to create user-defined aliases for data types, like
employee_id. User-defined data types are used in
table creation to keep table structures readable and data formats
consistent between columns across multiple tables.
The last parameter in our previous command declares a column to be
either mandatory or optional. If this parameter is set to
NOT NULL, a row cannot be added to the table if
it lacks data in this column. In our example, we need a machine name and
IP address for a machine record to be useful to us, so we declare those
NOT NULL. All the rest (though
highly desirable) are optional, so we declare them
NULL. There are other constraints besides
NULL that can be applied to a column for the purposes of data
consistency. For instance, in some SQL dialects, we could ensure that two
machines cannot have the same name by changing this:
name varchar(30) NOT NULL,
name varchar(30) NOT NULL CONSTRAINT unique_name UNIQUE,
unique_name is the name of
this particular constraint. Naming your constraints makes the error
messages generated by constraint violations more useful. See your server
documentation for other constraints that can be applied to a table.
USE sysadm DROP TABLE hosts
USE sysadm INSERT hosts VALUES ( 'shimmer', '192.168.1.11', 'shim shimmy shimmydoodles', 'David Davis', 'Software', 'Main', '309', 'Sun', 'Ultra60' )
The first line tells the server we are going to work with objects in
sysadm database. The second line
hosts table and adds a row,
one column at a time. This version of the
INSERT command is used
to add a complete row to the table (i.e., one with all columns filled
To create a new row with a partial record we can specify the columns to fill, like so:
USE sysadm INSERT hosts (name,ipaddr,owner) VALUES ( 'bendir', '192.168.1.3', 'Cindy Coltrane' )
INSERT command will fail if
we try to insert a row that does not have all of the required (
NOT NULL) columns.
INSERT can also be used to add
data from one table to another; we’ll see this usage later. For the rest
of our examples, assume that we’ve fully populated the
hosts table using the first form of
As an administrator, the SQL command you’ll probably use the most
SELECT is used to query information from a
server. Before we talk about this command, a quick disclaimer:
SELECT is a gateway into a whole wing of the SQL
language. We’re only going to explore some of its simpler forms. There is
an art to constructing good queries (and designing databases so they can
be queried well), but more in-depth coverage of this topic is best found in
books entirely devoted to SQL and databases.
SELECT form is used
mostly for retrieving server- and connection-specific information. With
this form, you do not specify a data source. Here are three
-- Sybase/MS-SQL - retrieve server name SELECT @@SERVERNAME -- MySQL - retrieve current version SELECT VERSION(); -- Oracle - retrieve STARTUP_TIME SELECT STARTUP_TIME from v$instance;
These examples show significant differences in the retrieval of database-specific information.
USE sysadm SELECT * FROM hosts
This returns all of the rows and columns, in the column order in which our table was created:
name ipaddr aliases owner dept bldg room manuf model --------- ------------- -------------------------- --------------- --------- ----- ----- ------ ------------- shimmer 192.168.1.11 shim shimmy shimmydoodles David Davis Software Main 309 Sun M4000 bendir 192.168.1.3 ben bendoodles Cindy Coltrane IT West 143 Apple Mac Pro sander 192.168.1.55 sandy micky mickydoo Alex Rollins IT Main 1101 Dell Optiplex 740 sulawesi 192.168.1.12 sula su-lee Ellen Monk Design Main 1116 Apple Mac Pro
If we want to see specific columns, we just need to specify them by name:
USE sysadm SELECT name,ipaddr FROM hosts
When we specify the columns by name they are returned in the order we specify them, independent of the order used when the table was created. For instance, to see IP addresses per building, we could use this query:
USE sysadm SELECT bldg,ipaddr FROM hosts
bldg ipaddr ---------- --------------- Main 192.168.1.11 West 192.168.1.3 Main 192.168.1.55 Main 192.168.1.12
USE sysadm SELECT * FROM hosts WHERE bldg='Main'
name ipaddr aliases owner dept bldg room manuf model --------- ------------- -------------------------- ------------- --------- ----- ----- ------ ------------- shimmer 192.168.1.11 shim shimmy shimmydoodles David Davis Software Main 309 Sun M4000 sander 192.168.1.55 sandy micky mickydoo Alex Rollins IT Main 1101 Dell Optiplex 740 sulawesi 192.168.1.12 sula su-lee Ellen Monk Design Main 1116 Apple Mac Pro
The set of available conditional operators for
WHERE clauses contains the standard
= > >= < <= <>
Unlike Perl, SQL does not have separate string and numeric comparison operators.
Conditional operators can be combined with
AND/OR and negated
NOT. We can test for an empty
IS NULL, or for a
non-empty column with
IS NOT NULL.
For instance, this SQL code will show all of the machines without owners
listed in our table:
USE sysadm SELECT name FROM hosts WHERE owner IS NULL
USE sysadm SELECT name FROM hosts WHERE dept IN ('IT', 'Software')
This shows all of the machines in use in either the IT or software
departments. SQL will also allow you to return rows that match a certain
range of values (most useful with numeric or date values) with
Here’s an example that shows all of the machines in the main building
between the 10th and 19th floors (presuming you use a simple convention
for room numbers):
USE sysadm SELECT name FROM hosts WHERE (bldg = 'Main') AND (room BETWEEN '1000' AND '1999')
WHERE clause can
be used with
LIKE to choose rows
using weak pattern matching (weak, that is, in comparison to Perl’s
regular expressions). For instance, this will select all of the machines
that have the string “doodles” somewhere in their aliases:
USE sysadm SELECT name FROM hosts WHERE aliases LIKE '%doodles%'
Table D.2, “LIKE wildcards” lists the supported
Table D.2. LIKE wildcards
Closest Perl regexp equivalent
Zero or more characters
A single character
A single character that is one of a specified set or range
Some database servers have added extensions to SQL to allow for
regular expression use in
SELECTs. For instance, MySQL offers the
operator for use with
REGEXP doesn’t have all the power of
Perl’s regular expression engine, but it offers a substantial increase
in flexibility over the standard SQL wildcards.
USE sysadm SELECT COUNT(*) FROM hosts
The second allows us to eliminate duplicate records returned by a
query. If we want a list of all of the distinct manufacturers
represented in our
hosts table, we
USE sysadm SELECT DISTINCT manuf FROM hosts
The third clause allows us to specify the order of the returned
results. If we want to see our data returned in a sorted order, we can
USE sysadm SELECT name,ipaddr,dept,owner FROM hosts ORDER BY dept
Experienced database users often habitually add
ORDER BY clauses to queries that return
multiple rows because it makes dealing with the returned information
SQL has several operators that can be used to modify the output
returned by a query. They allow you to change column names, do summary
and intra/intercolumn calculations, reformat how fields are displayed,
perform subqueries, and a whole host of other things. Please see a
dedicated SQL book for more detail on
SELECT’s many clause operators.
USE sysadm SELECT name,ipaddr INTO itmachines FROM hosts WHERE dept = 'IT'
This statement works just like those we’ve seen previously, except
that the results of the query are added to another table called
itmachines. With some servers, this table is
created on the fly if it does not exist. You can think of this operator
clause as the equivalent of the
operator in most Unix- and Windows-based operating system command-line shells.
Some database servers (like MySQL) do not support
INTO; they require the use of different syntax. For example,
Oracle uses something like this:
CREATE TABLE COPY AS SELECT name,ipaddr FROM hosts WHERE dept = 'IT'
Some other servers instead use an
INSERT command to
perform this action. Still others, such as Microsoft SQL Server and
Sybase, require that a special flag be set on a database before
SELECT INTO can be used within that
database, or the command will fail.
Our working knowledge of the
command comes into play with other commands as well. For instance, the
INSERT command we saw earlier can also
SELECT clause. This
allows us to insert query information into an existing table. If our
software department were to merge with IT, we could add their machines to
USE sysadm INSERT itmachines SELECT name,ipaddr FROM hosts WHERE dept = 'Software'
If we want to change any of the rows in our table, we can use
UPDATE command. For
example, if all of the departments in the company moved into a single
Central, we could
change the name of the building in all rows like so:
USE sysadm UPDATE hosts SET bldg = 'Central'
USE sysadm UPDATE hosts SET dept = 'Development' WHERE dept = 'Software'
That changed the name of the Software department to Development. This moves the machine called bendir to our Main building:
USE sysadm UPDATE hosts SET bldg = 'Main' WHERE name = 'bendir'
USE sysadm DELETE FROM hosts WHERE bldg = 'East'
While there’s no standardized way to undo a straight
DELETE operation, you can gain some safety using transactions (outside the
scope of this appendix). In many cases you can run the
DELETE command as a
SELECT first to gain an understanding of just
what effect the
DELETE will have.
Still, be careful with these operations.
Relational databases offer many ways to forge connections between the data in two or more tables. This process is known as “joining” the tables. Joins can get complex quickly, given the number of query possibilities involved and the fine control the programmer has over the data that is returned. There are different flavors of joints (inner, outer, etc.) but we’re not going to get into those here. If you are interested in this level of detail, your best bet is to seek out a book devoted to SQL.
Here is one example of a join in action. For this example we’ll use
another table called
contains information on the maintenance contracts for each of our
machines. That table is shown in Table D.3, “Our contracts table”.
Table D.3. Our contracts table
Here’s one way to relate our
hosts table to the
contracts table using a join:
USE sysadm SELECT contracts.name,servicevendor,enddate,bldg,room FROM contracts, hosts WHERE contracts.name = hosts.name
The easiest way to understand this code is to read it from the
hosts tells the server that we wish to relate the
contracts.name = hosts.name says we will match a row in
contracts to a row in
hosts based on the contents of the
name field in each table. Note that we say
because we need to distinguish which
name field we are using (the one on the
contracts table). Finally, the
SELECT line specifies the columns we wish to
appear in our output.
Before we close this tutorial section, there are a few more advanced SQL topics you may encounter in your travels.
Most SQL servers allow you to create different
views of a table. Views are like magic permanent
SELECT queries. Once you create a
view using a special
query, the specification of your query sticks around. Each time
you access anything from the view, the original query is run to provide
that information. Views can be queried like any other table.
Modifications to a view, with a few restrictions, are propagated back to
the original table or tables.
Note I said tables. Here’s where the magic of views comes in: a view on a table can be created that consists of a join between that table and another. This view behaves as one large virtual table. Changes to this view are propagated back to the original tables that are part of the join that created the view.
A view can also be created with a new column consisting of
calculations performed between other columns in that table, almost like
in a spreadsheet. Views are also useful for more mundane purposes, such
as query simplification (e.g., you may be able to
SELECT fewer columns) and data restructuring
(e.g., table users see a view of the data that doesn’t change, even if
other columns in the underlying table structure are modified).
Here’s a view-creation example that demonstrates query simplification:
USE sysadm CREATE VIEW ipaddr_view AS SELECT name, ipaddr FROM hosts
Now we can use a very simple query to get back just the information we need:
USE sysadm SELECT * FROM ipaddr_view
The result of this query is:
name ipaddr ------------------------------ --------------- shimmer 192.168.1.11 bendir 192.168.1.3 sander 192.168.1.55 sulawesi 192.168.1.12
Like tables, views are dropped using a form of the
USE sysadm DROP VIEW ipaddr_view
Dropping the view has no effect on the underlying data tables.
In all the queries we’ve seen thus far, we’ve asked the server to hand us back all of the results once the query has completed. But sometimes it is preferable to receive the answer to a query one line at a time. This is most often the case when embedding SQL queries in other programs. If your query returns tens of thousands of lines, chances are pretty good that you’ll want to process the results one line at a time, rather than storing them all in memory for later use. SQL programming in Perl often uses this line-at-a-time method. Here’s a small native-SQL program that demonstrates cursor use on a Sybase or Microsoft SQL Server:
USE sysadm -- declare our variables DECLARE @hostname varchar(30) DECLARE @ip varchar(15) -- declare our cursor DECLARE hosts_curs CURSOR FOR SELECT name,ipaddr FROM hosts -- open this cursor OPEN hosts_curs -- iterate over the table, fetching rows one at a time, -- until we receive an error FETCH hosts_curs INTO @hostname,@ip WHILE (@@fetch_status = 0) BEGIN PRINT "----" PRINT @hostname PRINT @ip FETCH hosts_curs INTO @hostname,@ip END -- close the cursor (not strictly necessary when followed -- by a DEALLOCATE) CLOSE hosts_curs -- undefine cursor def DEALLOCATE hosts_curs
This produces the following output:
---- shimmer 192.168.1.11 ---- bendir 192.168.1.3 ---- sander 192.168.1.55 ---- sulawesi 192.168.1.12
Most database systems allow you to upload SQL code to the
server, where it is stored in an optimized, post-parsed form for faster
execution. Such uploads are known as stored
procedures. Stored procedures are often a critical component
of SQL for administrators, because large parts of server administration
for some servers rely on them. For example, to change the owner of the
sysadm database in Sybase, you might
USE sysadm sp_changedbowner "jay"
Some databases also support something called “triggers.” Triggers are stored
procedures that automatically fire when some event takes place in the
database (e.g., when a row gets
INSERTed). Each database vendor implements
triggers slightly differently, so check the documentation of the
database you are using for the details on how to use
CREATE TRIGGER and
Now that you’ve seen the basics of SQL, you’re ready to tackle Chapter 7, SQL Database Administration.
 In fact, different servers even have different ideas about what the meaning of “database” is. The term is broader for an Oracle DBA than it is for a MySQL DBA.
 Experienced SQL users would probably suggest you always specify
the column destinations for each piece of data (even when inserting a
complete row), as per the next example. This makes the
INSERT statement more robust, because it
isn’t prone to errors in order (e.g., if you should add another field
to the database).
 Just to be clear: MySQL 5.x does have a
SELECT .. INTO, but it dumps data to a
regular file, not to a database, as we’ve been discussing. For
MySQL, you’ll want to use
 Oracle 10g and beyond offer a flashback facility that can undo
DROP operations, depending on the amount of
data in play and how much the database has changed since the
destructive operations were performed.
If you enjoyed this excerpt, buy a copy of Automating System Administration with Perl, Second Edition .
Copyright © 2009 O'Reilly Media, Inc.