In last week's article, we were left hanging in the
/usr/share/man directory. In today's article, we'll manipulate the files in this directory and learn something about formatted text, unformatted text, compressed data, and control characters. And, after traveling this circuitous route, we may even learn something interesting about manpages.
Let's do an
ls of the directory where the manpages are stored:
cd /usr/share/man
ls -aF
./          cat4/       catn/       man4/       mann/
../         cat5/       ja/         man5/       whatis
cat1/       cat6/       man1/       man6/
cat1aout/   cat7/       man1aout/   man7/
cat2/       cat8/       man2/       man8/
cat3/       cat9/       man3/       man9/
Remember that there are subdirectories for each of the 9 sections of the manual. The subdirectories that begin with
man contain unformatted data; the subdirectories that begin with
cat contain pre-formatted data. In just a moment, we'll do an exercise that will show the difference between formatted and unformatted data.
But first, do an
ls of the man1 and
cat1 directories; I've snipped my output (indicated by
<snip>) to just show the first 10 lines of each. You can
ls more than one directory at a time; use
-C to keep the multi-column output:
ls -C man1 cat1 |more
man1:
./                  indent.1.gz         pkg_version.1.gz
../                 indxbib.1.gz        pl2pm.1.gz
CC.1.gz             info.1.gz           pod2html.1.gz
Mail.1.gz           install-info.1.gz   pod2man.1.gz
[.1.gz              install.1.gz        popd.1.gz
a2p.1.gz            intro.1.gz          pr.1.gz
addftinfo.1.gz      introduction.1.gz   printenv.1.gz
addr2line.1.gz      ipcrm.1.gz          printf.1.gz
alias.1.gz          ipcs.1.gz           ps.1.gz
alloc.1.gz          ipftest.1.gz        psbb.1.gz
<snip>
cat1:
./                  indent.1.gz         pkg_version.1.gz
../                 indxbib.1.gz        pl2pm.1.gz
CC.1.gz             info.1.gz           pod2html.1.gz
Mail.1.gz           install-info.1.gz   pod2man.1.gz
[.1.gz              install.1.gz        popd.1.gz
a2p.1.gz            intro.1.gz          pr.1.gz
addftinfo.1.gz      introduction.1.gz   printenv.1.gz
addr2line.1.gz      ipcrm.1.gz          printf.1.gz
alias.1.gz          ipcs.1.gz           ps.1.gz
alloc.1.gz          ipftest.1.gz        psbb.1.gz
<snip>
Notice that every file has a
.gz extension. This means that all of the manpages have been compressed to conserve disk space. This is a good thing, as the online manual is huge. The utility used to compress the files is called gzip.
When I first discovered
gzip, I thought, "What a great way to conserve disk space"; at the time I had a 602 MB drive and disk space was an issue. I merrily became the superuser (mistake number one), went to the root directory (mistake number two), and told
gzip to compress every file on my FreeBSD system while giving me stats on how much space I had saved by issuing the command
gzip -rv (trust me, you don't want to try that one). After I had finished rendering that installation of FreeBSD useless, I learned a valuable lesson: Keep the files that came compressed with FreeBSD compressed, and keep the files that came uncompressed with FreeBSD uncompressed.
However, feel free to compress any files you have created in your home directory; compression can also be very useful when you want to e-mail a file to someone. Let's say I want to e-mail a friend a PDF file; PDF files are notoriously large, and my poor friend is still using his old 14.4 kbps modem. I can save him some time downloading that e-mail attachment if I do this first:
cd ~/pdf_files
ls -l framerel.pdf
-rwxr-xr-x 1 genisis wheel 31840 Sep 26 16:01 framerel.pdf
gzip -v framerel.pdf
framerel.pdf: 32.9% -- replaced with framerel.pdf.gz
ls -l framerel*
-rwxr-xr-x 1 genisis wheel 21392 Sep 26 16:01 framerel.pdf.gz
Notice that the gzip utility shrank this file by about a third; it also replaced the original file with its compressed counterpart and added a
.gz extension to the original name.
When my friend receives this file, he won't be able to do anything with it until he uncompresses it like so:
gunzip framerel.pdf
Note that my friend didn't have to specify the
.gz extension as
gunzip assumes the file it is unzipping will have a
.gz extension and will complain if it doesn't.
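If you'd like to watch the round trip without risking a real file, here is a minimal sketch that works in a scratch directory. The name sample.txt is made up for the example, and the exact savings will differ on your system; it also assumes your gunzip looks for sample.txt.gz when sample.txt is missing, as described above.

```shell
# Work in a scratch directory so nothing important gets compressed by accident.
workdir=$(mktemp -d)
cd "$workdir"

# Build a repetitive (and therefore very compressible) sample file.
i=0
while [ "$i" -lt 200 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > sample.txt
before=$(wc -c < sample.txt)

# gzip replaces sample.txt with sample.txt.gz and reports the savings.
gzip -v sample.txt
after=$(wc -c < sample.txt.gz)
echo "before: $before bytes, after: $after bytes"

# No .gz needed here: gunzip looks for sample.txt.gz on its own.
gunzip sample.txt
ls -l sample.txt
```

When the script finishes, sample.txt is back and sample.txt.gz is gone, which is exactly what happened to the PDF above.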
This is also a good time to introduce the
file utility; if anyone e-mails you an attachment or you happen to find a file on your FreeBSD system and don't know what type of data it contains, don't just send it to your screen using the
cat or more commands. If it is not a text file, it may do nasty things to your screen. The
file command will tell you what type of data is contained within the file like so:
file framerel.pdf.gz
framerel.pdf.gz: gzip compressed data, deflated, original filename, last modified: Tue Sep 26 16:01:34 2000, os: Unix
gunzip frame*
file framerel.pdf
framerel.pdf: PDF document, version 1.2
This is very useful information as I now know that the contents of this file will appear as random garbage characters unless I use a reader specifically designed to read PDF documents.
Let's compare these outputs to an executable file, say the ls utility:
whereis -b ls
ls: /bin/ls
file /bin/ls
/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), statically linked, stripped
And finally, let's compare it to a file I created using an editor and saved as myfile:
file myfile
myfile: ASCII text
Out of the four
file commands, the last command was the only one that revealed ASCII text; therefore,
myfile is the only file that is safe to send to the
more or cat commands or to a text editor.
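The habit of asking file first can be wrapped in a few lines of shell. This is only a sketch: view_if_text is a made-up helper name, not a standard command, and it assumes that any description containing the word "text" is safe to display.

```shell
# view_if_text is a hypothetical helper, not part of FreeBSD.
view_if_text() {
    # file prints something like "myfile: ASCII text";
    # the -b switch drops the leading "myfile: ".
    kind=$(file -b "$1")
    case "$kind" in
        *text*) cat "$1" ;;                           # looks like text, show it
        *)      echo "$1: not viewing ($kind)" ;;     # anything else, just report
    esac
}

# Try it on a text file and on compressed data.
demo=$(mktemp)
echo "just some words" > "$demo"
view_if_text "$demo"

# Reading from standard input keeps the original filename out of the gzip header.
gzip -c < "$demo" > "$demo.gz"
view_if_text "$demo.gz"
```

The first call prints the contents of the file; the second only reports what file found, so your screen stays intact.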
Now, let's go back to the
/usr/share/man directory to look at the difference between the unformatted manpages contained in the
man subdirectories and the formatted manpages contained in the
cat subdirectories. Since all of the manpages have been
gzipped, you will have to first uncompress the data using the
gunzip utility, then use
more to view the data, then finally remember to re-compress the file so you can continue to conserve disk space. Fortunately, the
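zcat utility saves you the trouble: it sends the uncompressed data straight to your screen and leaves the compressed file on disk untouched, so there is nothing to re-compress.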
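Here is a small sketch you can run anywhere to see what zcat does to a compressed file. The file notes.txt is invented for the example, and gunzip -c is shown only as an equivalent spelling.

```shell
# Build a tiny compressed file in a scratch directory.
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/notes.txt"
gzip "$workdir/notes.txt"

# zcat writes the uncompressed text to standard output...
zcat "$workdir/notes.txt.gz"

# ...which is the same thing gunzip -c does.
gunzip -c "$workdir/notes.txt.gz"

# Either way, only the compressed file remains on disk.
ls "$workdir"
```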
Let's try this on the
whatis manpage, as it is a nice short manpage that fits on one screen. We'll start with the unformatted version:
zcat man1/whatis.1
View the output of this command here.
Notice that this doesn't look anything like the
whatis manpage as you are used to seeing it. (If you forget what the
whatis manpage looks like, do a
man whatis.) Instead, this file contains remarks and formatting commands along with the actual data. Let's compare this to the pre-formatted version
contained within the cat subdirectory:
zcat cat1/whatis.1
View the output of this command here.
Notice that the pre-formatted version looks like the manpage you are used to seeing, minus the highlighting. However, something very interesting happens if we try to save a formatted manpage into a file. Let's send the output of
zcatting the formatted
whatis manpage to a test file in our home directory:
zcat cat1/whatis.1 > ~/test
Now, let's view the test file using the
cat utility:
cat ~/test
Your test file should look exactly like the
whatis.1 file. Now, send the test file to the
more paging utility:
more ~/test
Your results should look exactly like a manpage, with highlighting included. Finally, open up the test file using your favorite text editor; I'll use Pico, but you can use any editor.
pico ~/test
View the output of this command here.
Yuck, what a mess. It's funny how a lot of
^H characters can make a file so unreadable. However, if you look very carefully, and mentally try to remove the
^H's, you should be able to recognize the text in your file. In case you're wondering,
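^H is the backspace control character; a formatted manpage uses it to overstrike characters (print, back up, print again), and a pager shows the result as highlighted text.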
We've just discovered an interesting difference in functionality between the
cat utility, the
more utility, and an editor. By default,
cat passes control characters to your terminal without showing them,
more interprets control characters, and text editors display control characters. Thus we have an unhighlighted but readable file with
cat, a highlighted file with
more, and a mess with a text editor.
You can force
cat to display control characters instead of ignoring them by using a switch. Try this:
cat -v ~/test
Your output should display all of the
^H characters, just like the text editor did.
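You don't need a manpage to see this for yourself. The sketch below builds two overstruck words by hand; the words themselves are just examples, and \b is how printf spells the backspace that cat -v shows as ^H.

```shell
sample=$(mktemp)

# A highlighted "NAME": each letter, a backspace, then the same letter again.
printf 'N\bNA\bAM\bME\bE\n' > "$sample"

# An underlined "ls": an underscore, a backspace, then the letter.
printf '_\bl_\bs\n' >> "$sample"

# Plain cat hands the backspaces to the terminal, so you only see the words.
cat "$sample"

# cat -v makes every backspace visible as ^H.
visible=$(cat -v "$sample")
echo "$visible"
```

The last command should print N^HNA^HAM^HME^HE followed by _^Hl_^Hs, which is the same pattern that made the test file such a mess in the editor.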
Understanding this behavior will come in handy if you ever want to send a manpage to a file: Perhaps you want to transfer some manpages to your non-Unix laptop or want to include some interesting snippets of a manpage when replying to an e-mail. If you just redirect the output of the
man command to a file like this:
man whatis > ~/test
your resulting file will contain all of those irritating
^H characters. However, if you pipe the output through the
col command before sending it to your file, you will lose the
^H characters. Try it:
man whatis | col -b > ~/test
Now open
~/test in your favorite text editor to see the difference.
If you read the manpage for the
col command, you'll discover why this works; the
col command discards all of the control characters it doesn't recognize. And, fortunately for us,
col doesn't recognize that many control characters.
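A one-line experiment shows the effect on a single overstruck word. This is a sketch that assumes col is installed, as it is on a stock FreeBSD system; the word NAME is just sample data.

```shell
# "NAME" stored the way a formatted manpage stores highlighted text.
printf 'N\bNA\bAM\bME\bE\n' | cat -v

# With -b, col keeps only the last character written to each column,
# so the backspaces and the doubled letters disappear.
clean=$(printf 'N\bNA\bAM\bME\bE\n' | col -b)
echo "$clean"
```

The first command shows the raw ^H characters; the second leaves you with the plain word.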
This trick is also very handy if you ever transfer an ASCII file from an MS-DOS-based operating system to your FreeBSD system. If you've ever done this before, you've discovered that MS-DOS-based operating systems put a
^M at the end of every line to indicate the carriage return. You could use your arrow keys to navigate to each of these characters so you can press the delete key, but it is much easier to do this:
col -b < dosfile > unixfile
This command tells
col to strip the control characters from a file called
dosfile and then send the results to a new file called unixfile.
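If you don't have a DOS file handy, you can fake one. In this sketch the file contents are invented, and the tr command at the end is an alternative that is not part of the col trick described here; it simply deletes every carriage return.

```shell
workdir=$(mktemp -d)

# Fake a DOS text file: every line ends with a carriage return (\r).
printf 'alpha\r\nbeta\r\n' > "$workdir/dosfile"
cat -v "$workdir/dosfile"        # each line ends in ^M

# The col approach from the article.
col -b < "$workdir/dosfile" > "$workdir/unixfile"
cat -v "$workdir/unixfile"       # the ^M characters are gone

# An alternative, in case col is not available: delete carriage returns with tr.
tr -d '\r' < "$workdir/dosfile" > "$workdir/unixfile.tr"
cat -v "$workdir/unixfile.tr"
```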
Or, if it's too late and you've already opened up the file in
vi, try this:
:%! col -bx
This will remove all of those pesky
^H characters without having to leave
vi. We'll save the explanation of how that works for a later article dealing with the vi editor.
Now we've finally reached the part of the article where we can tie together all of this stuff to better understand how the
man command works. When you type:
man name_of_manpage
the
man utility searches the
/usr/share/man subdirectories, in order, for the first reference of the manpage you wish to view. You can alter this default behavior like so:
man -a name_of_manpage
which will force
man to read all of the subdirectories; this switch is useful if you think a manpage is in more than one section in the manual and you wish to view them all.
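To make the difference between the default search and the -a switch concrete, here is a toy version of the lookup. It is only a sketch: find_manpage is a made-up function, the numeric section order is an assumption rather than the order man really uses, and it runs against a fake directory tree instead of /usr/share/man.

```shell
# find_manpage ROOT NAME [all]
# Hypothetical helper; the section order below is an assumption.
find_manpage() {
    root=$1; name=$2; mode=${3:-first}
    found=1                      # non-zero exit status means "not found"
    for section in 1 2 3 4 5 6 7 8 9 n; do
        page="$root/man$section/$name.$section.gz"
        if [ -f "$page" ]; then
            echo "$page"
            found=0
            # Stop at the first hit unless asked for every match.
            [ "$mode" = all ] || break
        fi
    done
    return "$found"
}

# Build a fake manual tree with the same page name in two sections.
fake=$(mktemp -d)
mkdir "$fake/man1" "$fake/man8"
: > "$fake/man1/intro.1.gz"
: > "$fake/man8/intro.8.gz"

find_manpage "$fake" intro        # first match only
find_manpage "$fake" intro all    # every match, in the spirit of man -a
```

The first call prints only the section 1 page; the second prints both.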
If
man doesn't find the manpage here, it will then look in
/usr/local/man. If you do a listing of this directory and its subdirectories, you will find the manpages for the programs you installed yourself: i.e., any ports or packages that you built.
Once
man has found the manpage, the formatted copy is sent to a pager so it can be displayed on your screen one page at a time. If you are using FreeBSD 4.0 or earlier, the default pager is the
more utility. If you are using FreeBSD 4.1 or later, the default pager is the
less utility. In true
Unix style, the
less utility actually offers more functionality than the
more utility. Regardless of which pager your system uses, the pager will correctly interpret the
^H characters to expose the highlighted text. If you prefer to read your manpages without that glaring white text, you can start your manpage like so:
man whatis | col -b | more
You can replace the word more with
less if you prefer the
less paging utility.
We've covered a lot of ground in the last couple of articles. In the next few articles I want to discuss some of the neat utilities that can be built using the ports collection.
Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.
Copyright © 2009 O'Reilly Media, Inc.