The Fifteen-Minute SQL Tutorial: Appendix D - Perl for System Administration
by David N. Blank-EdelmanThis excerpt is from Perl for System Administration.
Perl for System Administration is aimed at all levels of administrators on the Unix, Windows NT, or MacOS platforms. Assuming only a little familiarity with Perl, it explores the pockets of administration where Perl can be most useful, including filesystem management, user administration, directory services, database administration, log files, and security and network monitoring. Perl for System Administration is for anyone who needs to use Perl for administrative tasks and needs to hit the ground running.
Relational databases can be an excellent tool for system administration. A relational database is accessed and administered using Structured Query Language (SQL) statements. As a result, it is a good idea for system administrators to learn at least the basics of SQL. The goal of this appendix is not to make you a full-time database programmer or even a real database administrator; that takes years of work and considerable expertise. However, we can look at enough SQL so you can begin to fake it. You may not be able to speak the language, but you’ll at least get the gist if someone speaks it at you, and you’ll know enough to go deeper into the subject if you need to. In Chapter 7, we’ll use these basic building blocks extensively when we integrate SQL and Perl.
SQL is a command language for performing operations on databases and their component parts. Tables are the component parts you’ll deal with most often. Their column and row structure makes them look a great deal like spreadsheets, but the resemblance is only surface-level. Table elements are not used to represent relationships to other elements—that is, table elements don’t hold formulas, they just hold data. Most SQL statements are devoted to working with the data in these rows and columns, allowing the user to add, delete, select, sort, and relate it between tables.
Let’s go over some of the operators offered by SQL. If you want to experiment with the operators we’ll be discussing, you’ll need access to an SQL database. You may already have access to a server purchased from Oracle, Sybase, Informix, IBM, Microsoft, etc. If not, an excellent open source database called MySQL can be downloaded from http://www.mysql.org.
For this appendix, we’ll be using mostly generic SQL, though each database server has its own SQL quirks. SQL statements particular to a specific database implementation will be noted.
The SQL code that follows will be shown using the capitalization standard found in most SQL books. This standard capitalizes all reserved words in a statement.
Most of the example SQL code in this appendix will use a table that mirrors the flat-file machine database we saw in Chapter 5. As a quick refresher, Table 4.1 shows how that data looks in table form.
Table D.1. Our Machine Database
|
name |
ipaddr |
aliases |
owner |
dept |
bldg |
room |
manuf |
model |
|---|---|---|---|---|---|---|---|---|
|
shimmer |
192.168.1.11 |
shim shimmy shimmydoodles |
David Davis |
software |
main |
309 |
Sun |
Ultra60 |
|
bendir |
192.168.1.3 |
ben bendoodles |
Cindy Coltrane |
IT |
west |
143 |
Apple |
7500/100 |
|
sander |
192.168.1.55 |
sandy micky mickydoo |
Alex Rollins |
IT |
main |
1101 |
Intergraph |
TD-325 |
|
sulawesi |
192.168.1.12 |
sula sulee |
Ellen Monk |
design |
main |
1116 |
Apple |
G3 |
In the beginning, the server will be empty and void of objects useful to us. Let’s create our database:
CREATE DATABASE sysadm ON userdev=10 LOG ON userlog=5 GO
This SQL statement creates a 10MB database on the device
userdev with a 5MB log file on the
userlog device. This statement is Sybase/Microsoft
SQL Server-specific, since database creation (when performed at all)
takes place in different ways on different servers.
The
GO command is used with interactive database
clients to indicate that the preceding SQL statement should be
executed. It is not an SQL statement itself. In the following
examples, we’ll assume that GO will be typed after each
individual SQL statement if you are using one of these clients.
We’ll also be using the SQL commenting convention of
“—” for comments in the SQL code.
To remove this database, we can use the DROP
command:
DROP DATABASE sysadm
Now let’s actually create an empty table to hold the information shown in Table 4.1.
USE sysadm -- Last reminder: need to type GO here (if you are using an interactive -- client) before entering next statement CREATE TABLE hosts ( name character(30) NOT NULL, ipaddr character(15) NOT NULL, aliases character(50) NULL, owner character(40) NULL, dept character(15) NULL, bldg character(10) NULL, room character(4) NULL, manuf character(10) NULL, model character(10) NULL )
First we indicate which database (sysadm) we
wish to use. The USE statement only takes effect
if it is run separately before any other commands are executed, hence
it gets its own GO
statement.
Then we create a table by specifying the name, datatype/length, and
the NULL/NOT
NULL settings for each column. Let’s talk a
little bit about datatypes.
It is possible to hold several different types of data in a database
table, including numbers, dates, text, and even images and other
binary data. Table columns are created to hold a certain kind of
data. Our needs are modest, so this table is composed of a set of
columns that hold simple strings of characters.
SQL also allows you to create user-defined aliases for datatypes like
ip_address or employee_id.
User-defined datatypes are used in table creation to keep table
structures readable and data formats consistent between columns
across multiple tables.
The last set of parameters of our previous command declares a column
to be mandatory or optional. If this parameter is set to NOT NULL, a row cannot be added to the table if it lacks data
in this column. In our example, we need a machine name and IP address
for a machine record to be useful to us, so we declare those fields
NOT NULL. All the rest are optional (though highly
desirable). There are other constraints besides
NULL/NOT NULL that can be
applied to a column for data consistency. For instance, one could
ensure that two machines are not named the same thing by changing:
name character(30) NOT NULL,
to:
name character(30) NOT NULL CONSTRAINT unique_name UNIQUE,
We use unique_name as the name of this particular
constraint. Naming your constraints make the error messages generated
by constraint violations more useful. See your server documentation
for other constraints that can be applied to a table.
Deleting entire tables from a database is considerably simpler than creating them:
USE sysadm DROP TABLE hosts
Now that we have an empty table, let’s look at two ways to add new data. Here’s the first form:
USE sysadm
INSERT hosts
VALUES (
'shimmer',
'192.168.1.11',
'shim shimmy shimmydoodles',
'David Davis',
'Software',
'Main',
'309',
'Sun',
'Ultra60'
)The first line tells the server we are going to work with objects in
the sysadm database. The second line selects the
hosts table and adds a row, one column at a
time. This version of the INSERT command is used
to add a complete row to the table (i.e., one with all columns filled
in). To create a new row with a partial record we can specify the
columns to fill, like so:
USE sysadm
INSERT hosts (name,ipaddr,owner)
VALUES (
'bendir',
'192.168.1.3',
'Cindy Coltrane'
)The INSERT command will fail if we try to insert a
row does not have all of the required (NOT NULL)
columns.
INSERT can also be used to add data from one table
to another; we’ll see this usage later. For the rest of our
examples, assume that we’ve fully populated the
hosts table using the first form of
INSERT.
As an administrator, the SQL
command you’ll probably use the most often is
SELECT. SELECT is used to query
information from a server. Before we talk about this command, a quick
disclaimer: SELECT is a gateway
into a whole wing of the SQL language. We’re only going to
demonstrate some of its simpler forms. There is an art to
constructing good queries (and designing databases so they can be
queried well), but more in-depth coverage like this is best found in
books entirely devoted to SQL and
databases.
The simplest SELECT form is used mostly for
retrieving server and connection-specific information. With this
form, you do not specify a data source. Here are two examples:
-- both of these are database vendor specific SELECT @@SERVERNAME SELECT VERSION( );
The first statement returns the name of the server from a Sybase or MS-SQL server; the second returns the current version number of a MySQL server.
To get at all of the data in our hosts table, use this SQL code:
USE sysadm SELECT * FROM hosts
This returns all of the rows and columns in the same column order as our table was created:
name ipaddr aliases owner dept bldg room manuf model --------- ------------ ---------------------------- ----------------- -------- ------ ---- ---------- --------- shimmer 192.168.1.11 shim shimmy shimmydoodles David Davis Software Main 309 Sun Ultra60 bendir 192.168.1.3 ben bendoodles Cindy Coltrane IT West 143 Apple 7500/100 sander 192.168.1.55 sandy micky mickydoo Alex Rollins IT Main 1101 Intergraph TD-325 sulawesi 192.168.1.12 sula su-lee Ellen Monk Design Main 1116 Apple G3
If we want to see specific columns, we just need to specify them by name:
USE sysadm SELECT name,ipaddr FROM hosts
When we specify the columns by name they are returned in the order we specify them, independent of the order used when creating the table. For instance, to see IP addresses per building:
USE sysadm SELECT bldg,ipaddr FROM hosts
This returns:
bldg ipaddr ---------- --------------- Main 192.168.1.11 West 192.168.1.3 Main 192.168.1.55 Main 192.168.1.12
Databases wouldn’t be very
interesting if you couldn’t retrieve a subset of your data. In
SQL, we use the SELECT command and add a
WHERE clause containing a conditional:
USE sysadm SELECT * FROM hosts WHERE bldg="Main"
This shows:
name ipaddr aliases owner dept bldg room manuf model --------- ------------ ---------------------------- ----------------- -------- ------ ---- ---------- --------- shimmer 192.168.1.11 shim shimmy shimmydoodles David Davis Software Main 309 Sun Ultra60 sander 192.168.1.55 sandy micky mickydoo Alex Rollins IT Main 1101 Intergraph TD-325 sulawesi 192.168.1.12 sula su-lee Ellen Monk Design Main 1116 Apple G3
The set
of available conditional operators for WHERE
clauses are the standard programming fare:
= > >= < <= <>
Unlike Perl, SQL does not have separate string and numeric comparison operators.
Conditional operators can be combined with AND/OR
and negated with NOT. We can test for an empty
column using IS NULL or non-empty with
IS NOT NULL. For instance, this SQL code will
show all of the machines without owners listed in our table:
USE sysadm SELECT name FROM hosts WHERE owner IS NULL
If you want to find all of the rows that have a column whose contents
is one of several specified values, you can use the
IN operator to specify a list:
USE sysadm
SELECT name FROM hosts WHERE dept IN ('IT', 'Software')This shows all of the machines in use in either the IT or software
departments. SQL will also allow you to return rows that match a
certain range of values (most useful with numeric or date values)
with the BETWEEN operator. Here’s an example
that shows all of the machines in the main building on the tenth
floor:
USE sysadm
SELECT name FROM hosts
WHERE (bldg = 'Main') AND
(room BETWEEN '1000' AND '1999')Finally, the WHERE clause can be used with
LIKE to choose rows using weak pattern matching
(in comparison to Perl’s regular expressions). For instance,
this will select all of the machines that have the string
“doodles” somewhere in their aliases:
USE sysadm SELECT name FROM hosts WHERE aliases LIKE '%doodles%'
Table 4.2 lists the supported
LIKE wildcards.
Table D.2. LIKE Wildcards
|
Wildcard |
Meaning |
Closest Perl Regexp Equivalent |
|---|---|---|
|
% |
Zero or more characters |
.* |
|
_ |
A single character |
. |
|
[ ] |
A single character that is one of a specified set or range |
[ ] |
Some database servers have added extensions to SQL to allow for
regular expression use in SELECTs. For instance,
MySQL offers the REGEXP operator for use with
SELECT. REGEXP doesn’t
have all the power of Perl’s regular expression engine, but it
offers a substantial increase in flexibility over the standard SQL
wildcards.
Two useful clauses for a
SELECT statement are DISTINCT
and ORDER BY. The first allows us to eliminate
duplicate records returned by a query. If we want a list of all of
the distinct manufacturers represented in our hosts table, we could
use DISTINCT:
USE sysadm SELECT DISTINCT manuf FROM hosts
If we want to see our data returned in a sorted order, we can use
ORDER BY:
USE sysadm SELECT name,ipaddr,dept,owner FROM hosts ORDER BY dept
SQL has several operators that can be used to modify the output
returned by a query. They allow you to change column names, do
summary and intra/intercolumn calculations, reformat how fields are
displayed, perform subqueries, and a whole host of other things.
Please see an SQL book for more detail on
SELECT’s many clause operators.
A new table containing the results of a
query can be created on the fly by using an INTO
clause on some SQL servers:
USE sysadm SELECT name,ipaddr INTO itmachines FROM hosts WHERE dept = 'IT'
This statement works just like those we’ve seen previously,
except the results of the query are added to another table called
itmachines. With some servers, this table is
created on the fly if it does not exist. You can think of this
operator clause as the equivalent of the “>” operator
in most Unix and NT command-line shells.
Our
working knowledge of the SELECT command comes into
play with other commands as well. For instance, the
INSERT command we saw earlier
can also take a SELECT clause. This allows us to
insert query information into an existing table. If our software
department were to merge with IT, we could add their machines to the
itmachines table:
USE sysadm INSERT itmachines SELECT name,ipaddr FROM hosts WHERE dept = 'Software'
If we want to change any of the rows in our table, we can use the
UPDATE command. For example, if all of the
departments in the company moved into a single facility called
Central, we can change the name of the building in
all rows like so:
USE sysadm UPDATE hosts SET bldg = 'Central'
It’s more likely that we’ll need to change only certain
rows in a table. For that task, we use the handy
WHERE clause we saw when discussing the
SELECT operator:
USE sysadm UPDATE hosts SET dept = 'Development' WHERE dept = 'Software'
That changed the name of the Software department to Development. This moves the machine called bendir to our Main building:
USE sysadm UPDATE hosts SET bldg = 'Main' WHERE name = 'bendir'
If we wanted to remove a row or set of rows from a table instead of
updating them, we can use the DELETE command:
USE sysadm DELETE hosts WHERE bldg = 'East'
There’s no way to undo a straight DELETE
operation, so be careful.
Relational databases offer many ways to forge connections between the data in two or more tables. This process is known as “joining” the tables. Joins can get complex quickly, given the number of query possibilities involved and the fine control the programmer has over the data that is returned. If you are interested in this level of detail, your best bet is to seek out a book devoted to SQL.
Here is one example of a join in action. For this example we’ll
use another table called contracts, which
contains information on the maintenance contracts for each of our
machines. That table is shown in Table 4.3.
Table D.3. Our Contracts Table
|
name |
servicevendor |
startdate |
enddate |
|---|---|---|---|
|
bendir |
Dec |
09-09-1995 |
06-01-1998 |
|
sander |
Intergraph |
03-14-1998 |
03-14-1999 |
|
shimmer |
Sun |
12-12-1998 |
12-12-2000 |
|
sulawesi |
Apple |
11-01-1995 |
11-01-1998 |
Here’s one way to relate our hosts table to the contracts table using a join:
USE sysadm SELECT name,servicevendor,enddate FROM contracts, hosts WHERE contracts.name = hosts.name
The easiest way to understand this code is to read it from the middle
out. FROM
contracts, hosts
tells the server that we wish to relate the
contracts and hosts tables.
ON contracts.name = hosts.name says we will match
a row in contracts to a row in
hosts based on the contents of the
name field in each table. Finally, the
SELECT... line specifies the columns we wish to
appear in our output.
Before we close this tutorial section, there are a few more advanced SQL topics you may encounter in your travels.
Some SQL servers allow you to create
different views of a table. Views are like magic
permanent SELECT queries. Once you create a view
using a special SELECT query, the results of your
query stick around and behave like their own table. Views can be
queried like any other table. Modifications to a view, with a few
restrictions, are propagated back to the original table or tables.
Note I said tables. Here’s where the magic of views comes in: a view on a table can be created that consists of a join between that table and another. This view behaves as one large virtual table. Changes to this view are propagated back to the original tables that are part of the join that created the view.
A view can also be created with a new column consisting of calculations performed between other columns in that table, almost like a spreadsheet. Views are useful for more mundane purposes also, like query simplification (i.e., may be able to select fewer columns) and data restructuring (i.e., table users sees a view of the data that doesn’t change, even if other columns in the underlying table structure are modified).
Here’s a view creation example that demonstrates query simplification:
USE sysadm CREATE VIEW ipaddr_view AS SELECT name, ipaddr FROM hosts
Now we can use a very simple query to get back just the information we need:
USE sysadm SELECT * FROM ipaddr_view
The result of this query is:
name ipaddr ------------------------------ --------------- shimmer 192.168.1.11 bendir 192.168.1.3 sander 192.168.1.55 sulawesi 192.168.1.12
Like tables, views are dropped using a form of the
DROP command:
USE sysadm DROP VIEW ipaddr_view
In all of the queries we’ve seen above, we’ve asked the server to hand us back all of the results once the query has completed. Sometimes it is preferable to receive the answer to a query one line at a time. This is most often the case when embedding SQL queries in other programs. If your query returns tens of thousands of lines, chances are pretty good that you’ll want to process the results one line at a time, rather than storing them all in memory for later use. Most SQL programming in Perl uses this line-at-a-time method. Here’s a small native-SQL program that demonstrates cursor use on a Sybase or MS-SQL Server:
USE sysadm
-- declare our variables
DECLARE @hostname character(30)
DECLARE @ip character(15)
-- declare our cursor
DECLARE hosts_curs CURSOR FOR SELECT name,ipaddr FROM hosts
-- open this cursor
OPEN hosts_curs
-- iterate over table, fetching rows one at a time,
-- until we receive an error
FETCH hosts_curs INTO @hostname,@ip
WHILE (@@fetch_status = 0)
BEGIN
PRINT "----"
PRINT @hostname
PRINT @ip
FETCH hosts_curs INTO @hostname,@ip
END
-- close the cursor (not strictly necessary when followed
-- by a DEALLOCATE)
CLOSE hosts_curs
-- undefine cursor def
DEALLOCATE hosts_cursThis produces the following output:
---- shimmer 192.168.1.11 ---- bendir 192.168.1.3 ---- sander 192.168.1.55 ---- sulawesi 192.168.1.12
Most database systems allow you to
upload SQL code to the server where it is stored in an optimized,
post-parsed form for faster execution. These uploads are known as
stored procedures. Stored procedures are often a
critical component of SQL for administrators because large parts of
server administration for some servers rely on them. For example, to
change the owner of the sysadm database in
Sybase, you might do this:
USE sysadm sp_changedbowner "jay"
See Chapter 7 for examples of calling stored procedures. Now that you’ve seen the basics of SQL, you’re ready to tackle Chapter 7.
If you enjoyed this excerpt, buy a copy of Perl for System Administration
