Learning the Mac OS X Terminal, Part 5

by Chris Stone
07/02/2002

The series continues in Learning the Terminal in Mac OS X, Automating Mail from the Mac OS X Terminal, Configuring Email from the Mac OS X Terminal, and Customizing the Mac OS X Terminal.

Today we'll take another look at cron. You'll learn how to have your own crontab run a script regularly and email reports to you just like the system cron jobs do. Your new script will copy a directory of your choosing to another drive for backup. While this is not a complete substitute for backing up with a commercial solution like Retrospect, for example, it does provide a free and easy way to ensure that at least your most important data exists on two drives.

Readers of Derrick Story's article, Taming the Entourage Database will know how important it is to keep your Office X Identities directory regularly backed up, and I'll use that directory as an example for this article. If you don't use Entourage, you can choose instead to back up any other directory, including even your entire ~/Documents folder.

Choosing the Right Tool

The heart of this procedure, then, is the actual command line that does the copying, and that's where we'll start. As with most Unix procedures, there's more than one way to skin a potato (to coin a cat-friendly phrase), so Mac OS X comes with several command-line tools that can copy files. Of these, there are four that you might consider for this job, but really only one that will do just what we need.

For this task, the most important consideration in choosing a file copy utility is its ability to preserve the permissions, resource forks, and creator/type codes of the original files. If we can ensure that these file attributes will be maintained, we can then safely copy any directory and be confident that the copy will be as good as the original. This is how the four tools meet those criteria:

Utility   Preserves Permissions?   Preserves Resource Forks & Creator/Type?
cp        yes                      no
CpMac     no                       yes
ditto     yes                      yes
rsync     yes                      no


ditto

As you can see, only one of these utilities, ditto, fills the bill. Briefly, ditto is an Apple-developed tool for copying entire directories. Most important, ditto has a -rsrc option flag that ensures resource forks as well as creator/type codes are preserved in the copies. For example, this command will copy the Office X Identities directory from my home directory to the Backups directory on the external drive "Secondary":

ditto -V -rsrc ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities"
The above command is all one line.

The additional -V flag turns on verbose copying, which instructs ditto to print a line for each directory and file it copies. These lines will then make their way into the cron job's emailed report, looking something like this:

>>> Copying Documents/Microsoft User Data/Office X Identities/
copying file ./Main Identity/Database ... 340328640 bytes
copying file ./Main Identity/Database Cache ... 17432 bytes
copying file ./Main Identity/Mailing Lists ... 20784 bytes
copying file ./Main Identity/Rules ... 20784 bytes
copying file ./Main Identity/Signatures ... 12560 bytes
copying file ./Newsgroup Cache ... 8 bytes

Here are a few things you should note about the two pathnames in the command:

  • Just like with a cp command, the first pathname names the source directory to be copied, and the second names its destination directory.
  • Normally, you would surround an entire pathname with quotes to "escape" any special characters it holds (spaces, in this case). However, the source directory pathname begins with a tilde ("~"), a special character whose meaning we want to preserve as the shortcut to the home directory. One way to have our shortcut and spaces too is to quote only the section of the pathname containing the spaces. We could instead preface each space with a backslash to escape it, but that doesn't look quite as good.
  • In almost all cases, the path to your external and network volumes starts from /Volumes.
  • Leaving the trailing slashes off the pathnames will ensure that ditto will correctly create that destination directory if it doesn't yet exist.
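The quoting rules described above can be checked without copying anything. The following is a minimal sketch, assuming a Bourne-compatible shell such as /bin/sh; the variable names are only for illustration, and the directory does not need to exist for the expansion to be visible.

```shell
#!/bin/sh
# Three ways to write a pathname that contains both ~ and spaces.
# "Microsoft User Data" is the article's example directory; nothing is
# copied here, the script only shows how the shell expands each form.

# 1. Quote only the part with spaces: ~ still expands to the home directory.
partial=~/Documents/"Microsoft User Data"

# 2. Escape each space with a backslash: same result, harder to read.
escaped=~/Documents/Microsoft\ User\ Data

# 3. Quote the whole pathname: ~ is left as a literal character.
quoted="~/Documents/Microsoft User Data"

echo "partial: $partial"
echo "escaped: $escaped"
echo "quoted:  $quoted"
```

The first two lines of output should show your home directory in place of the tilde, while the third still begins with a literal ~, which is why a fully quoted source pathname would not be found.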

To learn more about ditto, consult its man page with the man ditto command.

So, for our purposes, using ditto as shown in this example will do just what we need. There is, however, another option you should be aware of, rsync.

rsync

rsync is a directory synchronization tool that's smart enough to copy only new or changed files, thus speeding up some backups significantly. If you plan to back up a large directory that only partially changes from day to day (your entire ~/Documents directory, for example), rsync would seem to be the best solution. Unfortunately, as you see in the table, rsync doesn't preserve resource forks.

However, the good news is that there is a resource-fork-aware version of rsync in development called RsyncX. It's available here. (The installation contains a GUI front-end as well.) If you decide to use RsyncX instead of ditto, the documentation at the RsyncX site will get you going. Test your RsyncX commands well, making sure that all data copies properly. To use RsyncX in place of ditto in the command line above, for example, you would use:

rsync -ave /usr/bin/ssh ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities"
The above command is all one line.

Note that RsyncX still uses the rsync command. In fact, this command line would run with the original version of rsync installed with Mac OS X, but would not properly preserve resource forks or creator/type codes. Note also that, in my tests, when RsyncX does copy a file, it does so much more slowly than ditto. In the case of the Office X Identities directory, then, where most data does change between backups, ditto will in fact perform a faster backup, and therefore is the better choice for this task.

The Shell Script

Your next step should be to determine the proper ditto command line for your system and test it several times from the prompt, making sure that it works repeatedly, and that all copied data is in good shape.
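One way to check a test copy is to compare it against the original with diff. What follows is only a sketch: verify_copy is a hypothetical helper name, and the demonstration uses a throwaway directory and cp rather than ditto so that it can run anywhere. Note that diff reads only each file's ordinary data, so this check says nothing about resource forks or creator/type codes; those still need to be inspected by opening the copied files.

```shell
#!/bin/sh
# Hypothetical helper: compare a source directory with its backup copy.
# diff -r walks both trees and reports any file that differs or is missing.
verify_copy() {
    if diff -r "$1" "$2" >/dev/null 2>&1; then
        echo "OK: $2 matches $1"
    else
        echo "MISMATCH: $2 differs from $1"
        return 1
    fi
}

# Demonstration on a temporary directory (assumes mktemp is available).
demo=$(mktemp -d)
mkdir "$demo/src"
echo "sample data" > "$demo/src/file.txt"
cp -Rp "$demo/src" "$demo/copy"
verify_copy "$demo/src" "$demo/copy"
rm -rf "$demo"
```

With the article's example, the call would be verify_copy ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities", run after the ditto command has finished.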

Once you've worked out a good ditto command line, the next step is putting it into a shell script file. Just as the jobs in the system crontab refer to the actual daily, weekly, and monthly scripts that do the heavy work, so will your crontab refer to a "backup" script. Having this command in a separate script file will, for one thing, allow you to easily run it manually whenever you like.

The conventional directory for storing user scripts is ~/bin, which you should create if it doesn't already exist:

mkdir ~/bin

The ~/bin directory is a good place for your scripts since, by default, the shell will look for executables there whenever it receives a command. This directory is one of several known collectively as your search path. This list allows the shell to quickly find and execute a file without having to search the entire filesystem, and without you having to type the full pathname to each executable.
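If you want to confirm that ~/bin really is in the search path on your system, a short Bourne shell check can do it. This is a sketch under the assumption that the search path is exported in the PATH environment variable as a colon-separated list; in_search_path is a hypothetical function name. (At the interactive prompt, echo $PATH shows the same list.)

```shell
#!/bin/sh
# Report whether a directory is listed in the search path.
# PATH is a colon-separated list, so wrap both the list and the
# candidate in colons to match whole entries only.
in_search_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if in_search_path "$HOME/bin"; then
    echo "$HOME/bin is in the search path"
else
    echo "$HOME/bin is not in the search path"
fi
```

If the directory turns out not to be listed, the script can still be run by giving its full pathname, such as ~/bin/backup.sh.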

We'll call the file backup.sh, using the convention of naming Bourne shell scripts such as this with the .sh extension. Create the file with pico using this command:

pico ~/bin/backup.sh

Once in pico, enter this first line, which tells the shell to use /bin/sh (the Bourne shell) to run the script:

#!/bin/sh

Next use echo to output what will become the first line of your cron report:

echo "Results of the daily backup:"

Then enter your version of this ditto command line:

ditto -V -rsrc ~/Documents/"Microsoft User Data/Office X Identities" "/Volumes/Secondary/Backups/Office X Identities"

Your pico session should then look something like this:

[Screenshot: pico session]

Remember that pico does not wrap long lines. Instead, it displays the $ symbol where a line extends beyond the edge of the window.

Finally, type control + O, return, and then control + X to save the file and exit pico as usual.

The script will work fine as is when called from cron, but if you ever want to run it by name from the command line, you'll first need to make it executable, and then have the shell rebuild its list of executables found in its search path. To do this, use chmod to set the file's executable bit:

chmod +x ~/bin/backup.sh

Then enter the rehash command to rebuild the list of executables:

[localhost:~] chris% rehash
[localhost:~] chris%

This, then, will allow you to execute the script by simply entering the name of the script:

[localhost:~] chris% backup.sh
Results of the daily backup:
>>> Copying /Users/chris/Documents/Microsoft User Data/Office X Identities
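The three-line script assumes that the backup drive is always mounted. As an optional variation, and only as a sketch, the script could check both directories before calling ditto so that the report says plainly why a backup did not happen. The function name run_backup and the wording of the messages are assumptions made for this example; the pathnames and the ditto flags are the ones used above.

```shell
#!/bin/sh
# Sketch of a more defensive backup.sh. The two pathnames are the article's
# examples; run_backup is a hypothetical helper name, not part of ditto.

SRC="$HOME/Documents/Microsoft User Data/Office X Identities"
DEST="/Volumes/Secondary/Backups/Office X Identities"

run_backup() {
    src=$1
    dest=$2
    echo "Results of the daily backup:"

    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src"
        return 1
    fi

    # The destination's parent must already exist; if the external drive
    # is not mounted, /Volumes/Secondary/Backups will be missing.
    parent=$(dirname "$dest")
    if [ ! -d "$parent" ]; then
        echo "Backup location not available: $parent"
        return 1
    fi

    # Same command line as before, with the pathnames passed in quoted.
    ditto -V -rsrc "$src" "$dest"
}

run_backup "$SRC" "$DEST" || echo "Backup did not complete."
```

Because $HOME is used in place of the tilde, each pathname can be quoted in full, which sidesteps the partial-quoting trick needed on the command line.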

Once you're sure that the backup.sh script is working well, it is then time to create your crontab.
