I spend a lot of time searching text files. More accurately, I spend a lot of time searching nested directories of text files. For source code, I know I should use Ctags, but I’ve never quite made the switch.

For plain text files (books, articles, stories, weblog entries, notes, contracts, et cetera), I’m still a GNU grep fan.

I spent a few hours in the past week editing a book manuscript and producing well-formed and valid DocBook XML. (I wrote two books in DocBook XML. While it’s a great file format for producing a book, it’s a face-stabbingly hateful format for actually writing a book.) Unfortunately, the conversion process to DocBook revealed some problems in the source material. Specifically, certain links from one part of the manuscript to others were invalid.

I needed to find and fix the dangling links in all fifteen book chapters, spread across several dozen individual files. Grep and a little bit of command-line magic made the task much, much easier. I ended up with this command:

vi $( grep -l 'L<refactoring_strategies>' ?_*/*.pod)

That is, search all of the .pod files in directories whose names start with one character and an underscore. Grep’s -l switch prints only the names of the files which contain a link to an anchor named refactoring_strategies, rather than the matching lines themselves. The shell then hands that list of file names to Vim, which opens them all.
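To see what the grep half of that command returns without opening an editor, here is a small sketch that can run in a scratch directory. The directory and file names (1_intro, 2_design, notes, and the .pod files inside them) are made up for illustration and are not the manuscript's real layout.

```shell
# Build a throwaway tree; every name below is hypothetical.
demo=$(mktemp -d)
mkdir "$demo/1_intro" "$demo/2_design" "$demo/notes"
printf 'See L<refactoring_strategies> for details.\n' > "$demo/1_intro/overview.pod"
printf 'Nothing to link to here.\n' > "$demo/2_design/layout.pod"
printf 'L<refactoring_strategies> again.\n' > "$demo/notes/todo.pod"

# -l prints only the names of files with at least one match.
# The glob ?_*/*.pod skips notes/ because its name does not start
# with a single character followed by an underscore.
matches=$(cd "$demo" && grep -l 'L<refactoring_strategies>' ?_*/*.pod)
printf '%s\n' "$matches"

# Clean up the scratch tree.
rm -rf "$demo"
```

Only 1_intro/overview.pod should be listed. One caveat worth knowing: because the shell splits the output of $( ... ) on whitespace, the original one-liner assumes that none of the file names contain spaces.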

I still had to edit plenty of text, but finding only the files I needed saved me a tremendous amount of time. Throw in grep’s -r (recurse into subdirectories) and -i (use a case-insensitive match) switches, and I’m very happily productive.
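As a rough illustration of those two switches together, the sketch below searches a nested scratch tree without relying on a shell glob. The paths and the search phrase are invented for the example.

```shell
# Build a throwaway nested tree; every name below is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/part_one/drafts"
printf 'Refactoring Strategies, first pass\n' > "$demo/part_one/drafts/ch01.pod"
printf 'unrelated text\n' > "$demo/part_one/ch02.pod"

# -r descends into subdirectories, -i ignores case,
# and -l lists file names instead of matching lines.
found=$(cd "$demo" && grep -r -i -l 'refactoring strategies' .)
printf '%s\n' "$found"

# Clean up the scratch tree.
rm -rf "$demo"
```

The lowercase pattern still finds the capitalized heading two directories down, and the unrelated file is left out of the list.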

Thank you to everyone who’s contributed to grep and GNU grep through the years. Your work helps me work, every day.