
Writing "Learning PHP 5"

by David Sklar, author of Learning PHP 5

What are the tools and processes that I used to write Learning PHP 5?

Each chapter (and appendix) is its own file, formatted with the Docbook Lite XML dialect. I used XEmacs to edit the files. XEmacs's xml-mode helps with well-formedness checking and context-sensitive tag insertion. It also works with XEmacs's font-lock mode to make tags, attributes, and other XML goodies appear in pretty colors for easier readability.

One consequence of having the chapter files be XML documents and using an extremely programmable editor (or is it more accurate to describe Emacs as a programming environment that just happens to include a pretty good text editor?) is that even with my rudimentary pidgin elisp skills, I could write some simple functions and macros to automate common tasks.

Sample code in the chapter files generally appears wrapped in <programlisting> tags, something like this:

<programlisting>
print "You are signed up with address: " .
</programlisting>

One downside of using XML for tech writing is the extra work necessary to use XML's special characters like "&" and "<". For code examples, I avoided problems by wrapping the entire example in a <![CDATA[ ]]> block:

<programlisting><![CDATA[
if (($age < 17) && ($parental_permission == false)) {
    print "Sorry, you can't see the movie.";
}
]]></programlisting>
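To see why the CDATA wrapper is a safe shortcut, here is a minimal PHP sketch (not part of the book's toolchain) that embeds the same snippet both ways and parses it back with SimpleXML. The variable names and the shortened snippet are assumptions made for illustration.

```php
<?php
// The code we want to embed in an XML document (it contains "<" and "&").
$code = 'if (($age < 17) && ($parental_permission == false)) { print "Sorry"; }';

// Option 1: escape each special character as an entity.
$escaped = '<programlisting>' . htmlspecialchars($code, ENT_NOQUOTES) . '</programlisting>';

// Option 2: leave the code untouched inside a CDATA section.
// (This works as long as the code itself never contains "]]>".)
$cdata = '<programlisting><![CDATA[' . $code . ']]></programlisting>';

// An XML parser recovers exactly the same text from either form.
$fromEscaped = strval(simplexml_load_string($escaped));
$fromCdata   = strval(simplexml_load_string($cdata));

var_dump($fromEscaped === $code); // bool(true)
var_dump($fromCdata === $code);   // bool(true)
```

The escaped form is harder to read and edit in the source file, which is the extra work the CDATA block avoids.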

To speed the process of inserting these tags, I made two small keyboard macros and assigned them to function keys:

(defalias 'xml-programlisting-start
  (read-kbd-macro "<programlisting><![CDATA["))
(defalias 'xml-programlisting-end
  (read-kbd-macro "]]></programlisting>"))

(global-set-key [f9] 'xml-programlisting-start)
(global-set-key [f10] 'xml-programlisting-end)

After the O'Reilly production folks took a look at a sample chapter, they commented that each code example began and ended with extra white space. Having a newline after the <![CDATA[ and before the code starts (and after the code ends but before the ]]>) makes things easier to read, but not as nice in print. I cobbled together an elisp function to clean things up for me:

(defun xml-cdata-cleanup ()
  "Removes blank lines from the beginning and end of CDATA blocks"
  ;; Assumed addition: makes the function callable with M-x or from a key binding
  (interactive)
  (setq p (point))
  (setq topmatch 0)
  (setq botmatch 0)
  (goto-char (point-min))
  (while (re-search-forward "\C-J+\\]\\]>" nil t)
    (replace-match "]]>" nil nil)
    (setq botmatch (1+ botmatch)))
  (goto-char (point-min))
  (while (re-search-forward "\\[CDATA\\[\C-J+" nil t)
    (replace-match "[CDATA[" nil nil)
    (setq topmatch (1+ topmatch)))
  (goto-char p)
  (message "Replaced %d top and %d bottom blanks" topmatch botmatch))

That way, I could write with the extra white space and then slice it all out before I submitted a chapter.
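The same cleanup could also be done outside the editor. The following is a rough PHP sketch of the idea, not the function used for the book; `cdata_cleanup()` is a hypothetical name, and it assumes Unix-style line endings.

```php
<?php
// Hypothetical helper: strip newlines right after "<![CDATA[" and right before "]]>".
function cdata_cleanup($xml) {
    // Remove one or more newlines that directly follow the CDATA opener.
    $xml = preg_replace('/<!\[CDATA\[\n+/', '<![CDATA[', $xml);
    // Remove one or more newlines that directly precede the CDATA closer.
    $xml = preg_replace('/\n+\]\]>/', ']]>', $xml);
    return $xml;
}

$before = "<programlisting><![CDATA[\n\nprint 'hi';\n]]></programlisting>";
echo cdata_cleanup($before), "\n";
// prints: <programlisting><![CDATA[print 'hi';]]></programlisting>
```

Like the elisp version, it only touches newlines that sit directly against the CDATA markers, so blank lines inside an example are left alone.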

To submit chapters (or parts of chapters), I used a CVS repository and a mailing list. When I had new material for the editors and tech reviewers to examine, I committed it to the CVS repository. Commit messages went to the mailing list. Reviewers could read the new material and either commit changes and comments of their own or email me or the list with suggestions and new materials.

Not all reviewers found working with the raw XML as pleasant as I did. To make things more visually appealing, I used the db2h utility from the Docbook Lite distribution to generate HTML versions of each XML file. These HTML versions had enough markup (section headers, sidebars, code examples) to let reviewers focus on the content of what I was writing and not on the XML tags.

Another bonus of writing the book as XML is that I could write utilities outside of the editor to manipulate the data in the book. For example, here's the program I wrote to generate the downloadable code archive. It looks through all of the chapter files to find XML bits like this:

<example id="lphp-ch-7-ex-connect">
<title>Connecting with <literal>DB::connect()</literal></title>
<programlisting><![CDATA[require 'DB.php';
$db = DB::connect('mysql://penguin:top^');]]></programlisting>
</example>

It writes the code inside of the <programlisting> tag to its own file, and keeps track of what's in the <title> tag for each example so it can write an index file listing all the examples in each chapter.


<?php
// Where do the chapters live?
$inDir = 'C:/Documents and Settings/sklar/My Documents/Learning PHP/chapters';
// Where is the example code going to go?
$outDir = 'C:/Documents and Settings/sklar/My Documents/Learning PHP/code';

// Make sure output directory exists
if (! (is_dir($outDir) || mkdir($outDir))) { die("Can't create $outDir"); }

// It's PHP 5, so let's use an iterator to indicate which files to handle
class ChapterOrAppendixFilter extends FilterIterator {
    public function accept() {
        return preg_match('/^(ch\d{2}|app[a-z])\.xml$/i', $this->current());
    }
}

// Loop through each file in $inDir that matches the regex in the filter's
// accept() method
foreach (new ChapterOrAppendixFilter(new DirectoryIterator($inDir)) as $file) {
    // Processing an XML file with a well-defined structure in PHP 5?
    // It's a job for SimpleXML!
    $chapter = simplexml_load_file("$inDir/$file");
    // Get all the examples in the chapter with an XPath query
    $examples = $chapter->xpath('//example');

    // Only process a chapter if there are some examples in it.
    if (count($examples) > 0) {
        /* Prepare the Chapter name/number */
        // Case-insensitive, to agree with the regex in the filter above
        preg_match('/^(ch|app)(\d{2}|[a-z])\.xml$/i', $file, $matches);
        if (strtolower($matches[1]) == 'ch') {
            $chapterWord = 'Chapter';
            $chapterNum  = intval($matches[2]);
        } else {
            $chapterWord = 'Appendix';
            $chapterNum  = strtoupper($matches[2]);
        }
        // Make a new directory for all the examples in this chapter
        $chapterDir = "$outDir/$chapterWord$chapterNum";
        if (! (is_dir($chapterDir) || mkdir($chapterDir))) {
            die("Can't create $chapterDir");
        }

        $exampleNum = 1;
        $titles = array();
        // Write out the code in each example (inside the <programlisting> tag)
        // into its own file. The strval() call is needed to force the SimpleXML
        // object into a string.
        foreach ($examples as $example) {
            // $titles is a list of all the titles of the examples.
            // We want the text inside the <title> tag in the example,
            // but need to remove any markup inside the <title> tag.
            // Assumption: the third argument below and the rest of this loop
            // body are a reconstruction based on the comments above.
            $titles[$exampleNum] = str_replace(array('<![CDATA[', ']]>'), '',
                preg_replace('/<\/?[A-Za-z][^>]*>/', '', $example->title->asXML()));
            // Assumption: the output file name pattern here is hypothetical.
            file_put_contents("$chapterDir/example-$chapterNum-$exampleNum.php",
                              strval($example->programlisting));
            $exampleNum++;
        }
        // After writing each example to its own file, write an index file
        // that lists the titles of each example
        $fh = fopen("$chapterDir/index.txt",'w');
        foreach ($titles as $titleNum => $title) {
            // Pass the title as an argument so a literal % in it is not
            // treated as a format code
            fprintf($fh, "Example %s-%02d: %s\n", $chapterNum, $titleNum, $title);
        }
        fclose($fh);
    }
}

When it was time to generate the downloadable code archive, I ran the program and then zipped up the directories that it created. No need for any messy cutting and pasting.
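The core of that program, pulling code and titles out of `<example>` elements, can be tried in isolation. This is a minimal sketch that assumes a tiny inline document in place of a real chapter file; the element names match the article, but the ids, titles, and code are made up for illustration.

```php
<?php
// A tiny stand-in for a chapter file (hypothetical content).
$xml = <<<'XML'
<chapter>
<example id="demo-ex-1">
<title>Printing with <literal>print</literal></title>
<programlisting><![CDATA[print "1 < 2 && 3 > 2";]]></programlisting>
</example>
<example id="demo-ex-2">
<title>Adding</title>
<programlisting><![CDATA[print 1 + 1;]]></programlisting>
</example>
</chapter>
XML;

$chapter = simplexml_load_string($xml);

$listings = array();
$titles = array();
foreach ($chapter->xpath('//example') as $i => $example) {
    // strval() turns the SimpleXML node into its text, CDATA included.
    $listings[$i + 1] = strval($example->programlisting);
    // asXML() keeps the nested <literal> markup; strip element tags to get plain text.
    $titles[$i + 1] = preg_replace('/<\/?[A-Za-z][^>]*>/', '', $example->title->asXML());
}

echo $titles[1], "\n";   // Printing with print
echo $listings[1], "\n"; // print "1 < 2 && 3 > 2";
echo $titles[2], "\n";   // Adding
```

Because the listings come back as plain strings, writing each one to a file and building an index from `$titles` is ordinary file I/O from there.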

David Sklar is an independent consultant in New York City, the author of O'Reilly's Learning PHP 5, and a coauthor of PHP Cookbook.


Copyright © 2009 O'Reilly Media, Inc.