Understanding CPIO
In the previous article, I demonstrated the usage of the tar archiver
utility. This week I'll continue by introducing the cpio archiver
utility.
While both tar and cpio will achieve the same results, the cpio
utility approaches things a little bit differently. The tar utility
assumes that you want to recursively archive everything under
the specified directory or directories, meaning that you have to
explicitly tell tar if you want to exclude certain portions of that
directory structure. In contrast, the cpio utility expects to be
explicitly told which files or directories you wish to archive; this
behavior is commonly referred to as "receiving from standard input." In other
words, cpio expects to receive a list that contains one file per line, and if
you remember from the "Finding Things in Unix" and "Find: Part Two" articles, that is exactly the type of list that the find utility creates.
The ls utility can also create this type of list, meaning that you will see
either of the ls or the find utilities used in conjunction with cpio. And
since cpio archives a list of files it receives from standard
input, you usually use a pipe (|) whenever you create an archive with the
cpio utility.
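To see the kind of list cpio expects, it can help to look at what ls and find print on their own before any archiving is involved. The following is only a sketch: the scratch directory and the file names in it are made up for illustration.

```shell
# Scratch directory for illustration only; all names here are hypothetical.
demo="${TMPDIR:-/tmp}/cpio-list-demo"
rm -rf "$demo"
mkdir -p "$demo/www/docs"
touch "$demo/www/index.html" "$demo/www/docs/readme.txt"
cd "$demo/www" || exit 1

# When its output goes to a pipe or a file, ls prints one name per line
# and does not descend into subdirectories.
ls > ../ls.list

# find walks the whole tree and prints one path per line.
find . -print > ../find.list

cat ../ls.list
echo "--"
cat ../find.list
```

Either list could be piped to cpio instead of being saved to a file. The difference between the two lists is what decides how much ends up in the archive.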
The tar utility also assumes that you want to write the archive to
your first SCSI tape drive, unless you explicitly specify a file using
the f switch. In contrast, the cpio utility writes a new archive to what is
known as standard output and reads an existing archive from standard input. This means that you will be using a redirector
(> when creating, < when listing or extracting) whenever you work with a cpio
archive file. Again, that file may be an actual file, or it may be your
floppy, or it may be a tape device, since in Unix everything is a file.
This may sound a bit more complicated at first, but a few examples should convince you that it really isn't.
Let's start by creating a cpio archive. In the last article, I created a
test user account and created a directory structure named www in this
user's home directory so I would have some files on which to practice using the
archiving utilities. I'll log in as the test user, cd into the www directory,
and see what happens if I use the ls command with the cpio utility:
cd www
ls | cpio -ov > backup.cpio
You'll note that I first cded into the directory that contained the
files I wished to archive. I used the ls utility to make a list of the
files in the current directory and used a pipe (|) to send that list to
the cpio utility. The o switch invokes what is known as "copy out mode," which tells cpio to create an archive. The v switch tells cpio to be verbose, meaning it will list each file as it archives it. Finally, I used the > redirector to write the results (the archive) to a file called
backup.cpio. I can call this file anything I like; I chose to give it a
cpio extension to remind me that it is a cpio backup file. I can verify the
file type using the file utility:
file backup.cpio
backup.cpio: cpio archive
Instead of using the redirector, I could have also used the F switch to
specify which file to write the archive to. So the following command will
achieve the same results:
ls | cpio -ovF backup.cpio
Once the archive was created, cpio told me how many blocks it wrote to
the archive; in my case, it was 48 blocks.
So to create an archive, use the o switch or copy-out mode. To either view
or extract the contents of the archive, use what is known as "copy-in mode." You invoke this mode by using the i switch. If you just want
to view the contents of the archive, also include the t switch, which
will list the contents of the archive without extracting them:
cpio -it < backup.cpio
You'll note that this time I used the other redirector (<), as I wanted
the contents of the backup.cpio file to be sent to the cpio utility. I
can also include the v switch, if I want to see a verbose listing of
the backup:
cpio -itv < backup.cpio
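Remember that it is important to view the contents of an archive before
attempting to restore it, as you want to ensure that the file names don't begin
with a /. Names that begin with a / are absolute paths, and restoring them may overwrite the original files instead of placing copies below the current directory.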
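The check for a leading / can be done by eye, but it can also be scripted. Here is a minimal sketch, assuming cpio is installed and using a made-up scratch directory; the file names and the wording of the messages are my own.

```shell
# Scratch paths are hypothetical; this assumes cpio is installed.
demo="${TMPDIR:-/tmp}/cpio-view-demo"
rm -rf "$demo"
mkdir -p "$demo/www"
touch "$demo/www/index.html" "$demo/www/about.html"
cd "$demo/www" || exit 1

# Create the archive one level up so it is not part of the list ls produces.
ls | cpio -o > ../backup.cpio

# Save the table of contents, then look for names that start with a slash.
cpio -it < ../backup.cpio > ../contents.list
if grep -q '^/' ../contents.list; then
    echo "warning: archive contains absolute paths"
else
    echo "all paths are relative"
fi
```

Because ls was run from inside the directory being archived, the names in the list are relative, so the archive can be restored anywhere.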
To restore this archive, I simply cd into the directory to which I'd like to
restore the archive, and repeat the above command without the t
switch. I'll cd back into my home directory and create a directory named
backup and do the restore there:
cd
mkdir backup
cd backup
cpio -iv < ~/www/backup.cpio
You'll note something interesting if you try this exercise yourself; if
you use the ls -F command, you'll see that you did indeed restore all of
the files and directories that were in the www directory. But if you cd
into any of those subdirectories, you'll note that they are empty. Even
more interestingly, if you try to remove any of those subdirectories with rm, you
still have to use its R switch, as they are still valid directories.
What happened here? Since the cpio utility received its file list from the
ls utility (and ls, as I ran it, only lists the names in the current
directory), cpio was unaware of all of the files that existed below the
current directory. Remember, cpio will only archive the files that are
sent to it in a list. This may seem odd at first, but it is an ideal way to
archive just the files in the current directory. In order to do this with
the tar utility, you would have to create an exclude file, as tar wants
to recursively copy everything in and below the current directory.
This doesn't mean that cpio can't archive recursively; it simply means
that if you want to just archive the current directory, you use ls and if
you want to archive recursively, you use find instead.
Let's try that backup and restore again, this time using the find
utility. First, I'll remove the old backup and empty out the backup
directory:
rm www/backup.cpio
rm -R backup/*
Then I'll cd into the directory I wish to back up (www) and archive its contents:
cd www
find -d . -print | cpio -ov > backup.cpio
When using the find utility with cpio, it is always a good idea to
include either the d or the depth switch. Remember from the find
article that this switch prevented permissions from interfering with a backup.
When using this switch, either put -d right after the word find and before
the directory to search (in this case, "."), or put the word -depth after
the directory to search, like so:
find . -depth -print | cpio -ov > backup.cpio
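If you are curious about what the depth switch actually changes, you can compare the order of find's output with and without it. A small sketch, using a made-up scratch directory, and the -depth spelling since that is the form I'd expect to work on the most systems:

```shell
# Scratch directory and file names are hypothetical.
demo="${TMPDIR:-/tmp}/find-depth-demo"
rm -rf "$demo"
mkdir -p "$demo/www/docs"
touch "$demo/www/docs/readme.txt"
cd "$demo/www" || exit 1

# Default order: each directory is printed before its contents.
find . -print > ../default.list

# With -depth: the contents are printed first, then the directory itself.
find . -depth -print > ../depth.list

cat ../default.list
echo "--"
cat ../depth.list
```

As I understand it, this ordering is what helps with permissions: since a directory's entry arrives after the files inside it, a restrictive mode on that directory is not applied until its contents have already been written.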
So as a recap on the find command, I told find to search the current
directory (".") and to "print" its contents; the | was used to send those
contents to the cpio utility, which created an archive (-o) and wrote
that archive to a file called backup.cpio. When I created this archive,
I noted that cpio wrote 43097 blocks, which is many more than the 48 I
received with the ls command.
Now let's see what happens when I try to restore this archive:
cd ../backup
cpio -iv < ~/www/backup.cpio
I received an interesting message on my screen when I did this restore:
<snip>
cpio: mod_tsunami/Makefile: No such file or directory
cpio: mod_tsunami/distinfo: No such file or directory
cpio: mod_tsunami/pkg-comment: No such file or directory
cpio: mod_tsunami/pkg-descr: No such file or directory
cpio: mod_tsunami/pkg-plist: No such file or directory
mod_tsunami
Makefile
.
43097 blocks
It looks like cpio read all 43097 blocks but complained about missing
files or directories. Indeed, if I do an ls on any of the restored
subdirectories, I'll discover that they are once again empty! Don't worry,
all of those files and directories are in that archive file; I've simply
demonstrated the default extraction behaviour of cpio. Unlike tar, the
cpio utility does not recreate any directories during the restore unless
you specifically ask it to with the d switch. And, unlike tar, the
cpio utility will not overwrite an existing file that is as new as, or newer than, the archived copy unless you
specifically ask it to with the u switch.
So let's try that restore again, this time using the d switch to create
the directories and the u switch to overwrite the files I've already
restored:
cpio -ivdu < ~/www/backup.cpio
This time I don't receive any error messages and I've successfully restored all of the subdirectories and their files.
There are a few more switches you may consider using when backing up and
restoring with cpio. If I compare the modification times of a file
before it was archived and after it was restored, I will see this:
ls -l www/zope/Makefile
-rw-r--r-- 1 test wheel 4308 May 11 09:53 www/zope/Makefile
ls -l backup/zope/Makefile
-rw-r--r-- 1 test wheel 4308 Jun 2 11:38 backup/zope/Makefile
ls -l www/backup.cpio
-rw-r--r-- 1 test wheel 22065664 Jun 2 10:39 www/backup.cpio
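You'll note that the original file was last modified on May 11, that it was
backed up on June 2 at 10:39, and that it was restored on June 2 at 11:38.
If you want to preserve the file's original times, include the a switch
when creating the archive, which resets the access times of the original files after cpio has read them, and the m switch when restoring the archive, which keeps the archived modification times on the restored files: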
cd www
find -d . -print | cpio -ova > backup.cpio
cd ../backup
cpio -ivdm < ~/www/backup.cpio
If you try this and repeat the ls -l command, you'll see that the
original times of the archived files were kept intact.
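Rather than comparing ls -l output by eye, you can let find do the comparison. This sketch assumes cpio is installed and uses invented paths and dates; the marker file exists only so that find has something to compare against.

```shell
# Scratch paths and dates are hypothetical; assumes cpio is installed.
demo="${TMPDIR:-/tmp}/cpio-mtime-demo"
rm -rf "$demo"
mkdir -p "$demo/www" "$demo/backup"
echo "data" > "$demo/www/Makefile"

# Give the original an old modification time (1 May 2000, 09:53).
touch -t 200005010953 "$demo/www/Makefile"
# Reference file dated a month later, used only for the check below.
touch -t 200006010000 "$demo/marker"

cd "$demo/www" || exit 1
find . -depth -print | cpio -oa > ../backup.cpio
cd "$demo/backup" || exit 1
cpio -idm < ../backup.cpio

# If the restored file kept its old time, it is not newer than the
# marker, and this find prints nothing.
find . -type f -newer ../marker -print
ls -l Makefile
```

Leaving out the m switch on the restore should make the find command print the restored file, since it would then carry the time of the restore.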
The nice thing about using the find utility with cpio is that you
have all of find's switches available to you, to fine-tune which files
you would like to back up. For example, if you'd like to do an incremental
backup, use find's -newer switch. In this example, I'll back up all of the
files in my home directory that have changed since 11 PM on June 1st:
cd
touch -t 06012300 June1
find -d . -newer June1 -print | cpio -ova > backup.cpio
Here I used the touch utility to create an empty file with a timestamp
of month 06 day 01 time 2300, then I told find to use the time on that
file as the reference point when searching the current directory.
Alternatively, if I wasn't concerned so much about the time as the date, I
could have used find's atime, ctime, or mtime switches. And if I only
want to archive files of a certain size, I can use find's size switch.
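Before trusting an incremental backup, it is worth checking which files find's -newer switch selects. The sketch below uses a made-up scratch directory and spells out the year in the touch timestamps so that it behaves the same whenever it is run; the -type f test is my own addition to keep directories out of the list.

```shell
# Scratch directory and file names are hypothetical.
demo="${TMPDIR:-/tmp}/find-newer-demo"
rm -rf "$demo"
mkdir -p "$demo/home"
cd "$demo/home" || exit 1

# Reference file stamped 1 June 2000, 23:00.
touch -t 200006012300 June1
# One file older than the reference, and one newer (created just now).
touch -t 200005150000 old.txt
touch new.txt

# Only files modified after June1 are listed.
find . -type f -newer June1 -print > ../changed.list
cat ../changed.list
```

Once the list looks right, the same find command can be piped to cpio as shown above.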
Before ending today's article, I'd also like to demonstrate cpio's third
mode, which is known as "copy-pass mode." This is an interesting mode, as it
archives and extracts in the same command, making it ideal for copying one
directory structure and recreating it in another location.
Let's say I want to copy the www directory structure from the home
directory of the test user to the home directory of the user genisis. I'll
have to become the superuser, as I'll be creating the archive in one user's
home directory and recreating it in another user's home directory:
su
Password:
cd ~test/www
find -d . -print | cpio -pvd ~genisis/www
Note that I first cded into the directory I wanted to archive, in this
case the www subdirectory of the test user's home directory. Then, with the
cpio command, I invoked copy-pass mode with the p switch and
specified that I wanted the archive recreated in the www subdirectory of
the home directory of the user genisis.
If I run this command and then do an ls -l of genisis' home directory,
I'll see that I've successfully recreated the entire www directory
structure. However, I'll want to fine-tune that above command as those
restored files still belong to the user "test." I'll repeat that command
using the u switch so it will overwrite that last restore, and I'll
include the R switch, which tells cpio to change the ownership of the files
as it recreates them:
find -d . -print | cpio -pvdu -R genisis ~genisis/www
When using the R switch, follow it by the name of the user you wish to
become the owner of the files, then follow that by the name of the
directory to restore the files to.
Finally, if I want to keep the original times of the files instead
of having them changed to the time the files were restored, I'd also add the
a and m switches:
find -d . -print | cpio -pvduam -R genisis ~genisis/www
This should get you started with the cpio command. If you're planning on
using cpio to copy between different computers, you'll want to read its
manpage first, as there may be considerations, especially if the computers
are running different versions of Unix or different architectures.
In next week's article, I'll continue the archiver series by introducing
the pax command and, if space permits, the dd command.
Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.
Copyright © 2007 O'Reilly Media, Inc.