O'Reilly Network    
 Published on O'Reilly Network (http://www.oreillynet.com/)
 http://www.oreillynet.com/pub/a/network/2002/07/09/perlandxml.html


Perl in a Nutshell, 2nd Edition

Using XML Modules in Perl

by Nathan Patwardhan, coauthor of Perl in a Nutshell, 2nd Edition
07/09/2002

Introduction

For various reasons, updating Perl in a Nutshell was a challenge.

Given the new material that crept into the source kit between Perl 5.005 and the upcoming release of Perl 5.8, as well as things that demanded to be added or updated in the text, I knew that there was much to do. When I learned that the book update was going to be done in XML/DocBook instead of the original format (troff, then FrameMaker), the task of the update became even loftier, so I wasn't quite jumping for joy.

On the other hand, far be it from me to slam XML. In fact, prior to the second edition of Perl in a Nutshell, I hadn't used it with any regularity. Admittedly, as a system administrator, I hadn't found a specific use for XML. And I was already equally familiar with POD, HTML, and *roff, all of which offered some level of "WYSIWYGALAC" (What-you-see-is-what-you-get-at-least-after-conversion). I didn't quite know how XML would fit into my daily work, and, as it related to the update of Perl in a Nutshell, I didn't want to inflict pain on the O'Reilly production staff whilst they untangled my markup during the book production process.

But this article isn't a narrative about me writing Perl books, my favorite tool for text processing, or killing the O'Reilly production staff softly with ill-conceived markup. This article is really a revelation of a practical use I've found for XML, and a couple of things that I took with me after updating Perl in a Nutshell. I'll cover a couple of interesting XML-related Perlthings as well: XML::Simple and XMLRPC::Lite, and how I've come to use these Perlthings in my daily job.

Background

Related Reading

Perl in a Nutshell

By Stephen Spainhour, Ellen Siever, Nathan Patwardhan


At work, we'd been generating documentation for our servers in the old-fashioned way: copying and pasting IP addresses, (disk) partition tables, patch information, and so forth, into a Microsoft Word document, and placing this document on a shared drive. Not only was the process monotonous (no offense to my employer!), it was prone to error, since it didn't take much to "lose something in the translation" between a copy 'n' paste and inserting information into a table. Given that our documentation, in part, was used to track system changes, faulty data could mean the difference between truly finding a flawed server configuration, and being the victim of a typographical error. There had to be a better way! Right?

When I think of bread and butter, I think of sourdough and margarine. But, call me hungry, I also think of Perl's text-processing capabilities. Add some nifty modules and Val Kilmer in Thunderheart (read: pure magic), and you've got yourself an application.

And so it began.

I jotted down a grocery list of functionality that I'd write into my system self-documentation application, which included:

  1. System information (network configuration, [disk] partition tables, and so on) would be auto-generated.

  2. Auto-generated system information would live in a data structure, like a %HoH (hash of hashes). This data structure would be stored (cached) to disk, and this cache would be referred to upon subsequent executions of the system-reporting.

  3. The system would pass the data structure (%HoH) over the wire to a "reporting server", which would expand the data structure as XML. This XML would represent the system information in a table-like format. That is, if you applied a stylesheet to the XML, you could easily generate an (HTML) table from the XML.

  4. The final system report would use an (XML) template from which static content, Perl code, and the data structures were merged into a single document.

While there's more to the application than listed above, this article is intended to present a couple of concepts in short-form, so I'll stick to the crux of what I did with XML. I used the following to implement the points above:

  1. Standard Perlstuff here: qx(), and so on.

  2. Standard Perlstuff here: %HoH (see: perldsc documentation).

  3. XMLRPC::Lite.

  4. XML::Simple.
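
The first two items can be sketched with nothing but core Perl. This is a minimal illustration, not the application's actual code: the parse_pairs() helper, the sample command output, and the use of Storable for the cache are all assumptions made for the example. On a real client the text would come from qx().

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Hypothetical helper: turn "key value" lines (for example, text
# captured with qx()) into a hash reference.
sub parse_pairs {
    my ($text) = @_;
    my %pairs;
    for my $line (split /\n/, $text) {
        my ($key, $value) = $line =~ /^\s*(\S+)\s+(.+?)\s*$/ or next;
        $pairs{$key} = $value;
    }
    return \%pairs;
}

# Sample text stands in for real command output so that the
# sketch stays portable; these values are made up.
my $network_text = "hme0 192.168.0.100\nhme1 192.168.0.101\n";
my $disk_text    = "/ c0t0d0s0\n/var c0t0d0s3\n";

# The hash of hashes: one inner hash per kind of information.
my %HoH = (
    network => parse_pairs($network_text),
    disks   => parse_pairs($disk_text),
);

# Cache the structure to disk, then read it back the way a
# later run of the reporting program would.
my (undef, $cache_file) = tempfile(UNLINK => 1);
store(\%HoH, $cache_file);
my $cached = retrieve($cache_file);

print "$_ => $cached->{network}{$_}\n" for sort keys %{ $cached->{network} };
```

Storable is only one way to cache a data structure; it was picked here because it ships with Perl.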

At the heart of my self-documentation system is having the "client" machines do as little as possible: generate configuration information, cache it, and pass it to a "reporting server". Given this client/server model, much consideration was given to how system configuration information would be passed from the client to the server. Immediately, I ruled out things like ftp, rcp, scp, rdist, rsync, and similar. While each of these implementations would allow me to pass data from one host (a "client") to another (a "server"), I simply didn't want to pass user information (login and password) over the network; nor did I want to allow the admin user to have a null (RSA) key so that the administrator could have full permissions to pass system configuration information between the client and the server. Thus, ftp, rcp, and the others were eliminated. Further, even if I did something so evil as allowing root to gain access to a machine (even securely) without a password, such an implementation wouldn't answer the question of passing a data structure as is between the client and the server. Even if I used http and passed my system configuration in name/value pairs, I still wouldn't have answered the question of dealing with my data structure as is.

Diving Right In

Enter XMLRPC::Lite. Paul Kulchenko's XMLRPC::Lite is a Perl implementation of the XML-RPC protocol. XML-RPC does lots of things that you can read about at www.xmlrpc.com, but the most desirable feature of XMLRPC::Lite, with regard to my application, is one's ability to use it to call a subroutine -- with arguments -- on the remote end of the connection (and really using any transport, including http).

By using a hash reference, I could easily pass %HoH over the network as an argument to a function that lives on my "report server", which would - in turn - expand the %HoH as XML. Bear in mind that your XML-RPC program doesn't have to be written in Perl at all, but for the sake of this exercise, it will be!

Using XMLRPC::Lite is a two-phase process. First, you'll need to write a program that lives on the remote (server) end of your connection. My suggestion is to store your function(s) in a module that gets "used" by your XMLRPC::Lite program. Let's say that you write a module called ExpandHoH that implements a function called hoh_to_xml. Here's what ExpandHoH.pm looks like:

package ExpandHoH;

sub hoh_to_xml {
    shift; # remove 'ExpandHoH'
    my $hoh = shift; # our ref to %HoH

    # Ordinarily, we'd use something like XML::Simple here
    # to expand $hoh as XML.  But for the sake of simplicity
    # we'll simply make sure that what we've got is a
    # hash and return a status message.
    if(ref($hoh) eq 'HASH') {
        return 'OK';
    } else {
        return 'NOT OK';
    }
}

1;
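
To see why hoh_to_xml begins with a bare shift, it can help to call the function locally as a class method, which is how the first argument comes to be the package name. The sketch below inlines a condensed copy of the package so that it runs as a single file; the sample %HoH contents are made up.

```perl
use strict;
use warnings;

package ExpandHoH;

sub hoh_to_xml {
    shift;                 # class name, as in the module above
    my $hoh = shift;       # our ref to %HoH
    return ref($hoh) eq 'HASH' ? 'OK' : 'NOT OK';
}

package main;

my %HoH = ( atlas => { osname => 'Solaris' } );

# Called as a class method, so 'ExpandHoH' arrives as the
# first argument and the hash reference as the second.
my $with_hash  = ExpandHoH->hoh_to_xml(\%HoH);
my $with_array = ExpandHoH->hoh_to_xml([ 1, 2, 3 ]);

print "$with_hash / $with_array\n";   # prints "OK / NOT OK"
```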

You won't be "using" the module directly from your client program, however. You'll write a program that uses 'ExpandHoH', and calls the appropriate function with a reference to %HoH as the argument. Let's call the program expand_hoh:

#!/usr/local/bin/perl -w

use XMLRPC::Transport::HTTP;

# we'll use http and dispatch to 'ExpandHoH'.
XMLRPC::Transport::HTTP::CGI
    -> dispatch_to('ExpandHoH')
    -> handle
    ;

And here's the program that you'll run on the "client" end of the connection. Basically, you'll populate %HoH with your system's configuration information, create a reference to it, and pass this reference to ExpandHoH.hoh_to_xml. In the end, you'll get a status from the server by way of result():

#!/usr/local/bin/perl -w

use XMLRPC::Lite; # or "use XMLRPC::Lite +trace;" if you'd like

my $proxy = 'http://my.server.here/cgi-bin/expand_hoh';
my %HoH = (
    # system configuration stuff here
    );
my $ref_hoh = \%HoH; # a reference to %HoH itself, not a copy

my $xlite= XMLRPC::Lite
    -> proxy($proxy)
    -> call('ExpandHoH.hoh_to_xml', $ref_hoh);

my $status = $xlite->result;
# do something with $status here.

If all goes well, your client will receive an 'OK' status from the server.

We bustled through that pretty quickly. But it wasn't due to lack of interest! Just when you thought that it was all about %HoH, the tables are turned, and you find yourself smack in the middle of XML.

There are many XML-related modules in Perl: XML::Parser, XML::DOM, XML::Twig, to name a few. If you're getting your feet wet with XML and Perl, or you're just looking for something quick 'n' dirty to read (and write) XML files, you might consider using XML::Simple. Let's say that you have XML like the following:

<hosts>
  <host name="atlas" osname="Solaris">
    <addr>192.168.0.100</addr>
    <addr>192.168.0.101</addr>
    <addr>192.168.0.150</addr>
  </host>
  <host name="dns" osname="Linux">
    <addr>192.168.0.2</addr>
    <addr>192.168.0.3</addr>
  </host>
</hosts>

You can easily generate a %HoH from the above XML with XML::Simple:

#!/usr/local/bin/perl -w

use XML::Simple;

my $infile = 'configs.xml';
my $xml    = XMLin($infile); # read configs.xml into $xml

print XMLout($xml), "\n"; # show us how the XML looks

You might've noticed that the above code generates XML that isn't identical to the markup in configs.xml. One of the things that you'll need to be careful of if you use XML::Simple is that you provide XMLin() and XMLout() with the appropriate arguments so that your XML formatting is preserved, and so you don't accidentally fold attributes into your XML (elements) that you hadn't intended. You can cure the above code as follows:

#!/usr/local/bin/perl -w

use XML::Simple;

my $infile = 'configs.xml';
my $xml    = XMLin($infile, keeproot => 1, forcearray => 1, keyattr => []);

print XMLout($xml, keeproot => 1, keyattr => []), "\n";
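
Once attribute folding is switched off and arrays are forced, the structure can be walked with plain loops. The literal below is an assumption about the shape XMLin() returns for configs.xml under those options, written out by hand rather than captured from a run. The as_list() helper is hypothetical; it lets the loops cope whether or not a given level was forced into an array.

```perl
use strict;
use warnings;

# Assumed shape of XMLin()'s return value for configs.xml with
# keeproot => 1, forcearray => 1, keyattr => []. Not captured output.
my $xml = {
    hosts => [ {
        host => [
            { name => 'atlas', osname => 'Solaris',
              addr => [ '192.168.0.100', '192.168.0.101', '192.168.0.150' ] },
            { name => 'dns', osname => 'Linux',
              addr => [ '192.168.0.2', '192.168.0.3' ] },
        ],
    } ],
};

# Hypothetical helper: treat a scalar, an array ref, or undef uniformly.
sub as_list {
    my ($item) = @_;
    return () unless defined $item;
    return ref($item) eq 'ARRAY' ? @$item : ($item);
}

# Collect each host's addresses, keyed by host name.
my %addrs_for;
for my $hosts (as_list($xml->{hosts})) {
    for my $host (as_list($hosts->{host})) {
        $addrs_for{ $host->{name} } = [ as_list($host->{addr}) ];
    }
}

for my $name (sort keys %addrs_for) {
    print "$name: @{ $addrs_for{$name} }\n";
}
```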

Now that you've been thrown into shark-infested waters, and you don't hear Spielberg yell "cut!", let's throw you a line so that we can tie everything together (or: insert your own quip or cliché here).

Let's close things out by reviewing two key parts that we'd glossed over earlier. We'll start with XMLout() first. XMLout($hash_ref) lives in XML::Simple, and allows you to write an XML file from the contents of $hash_ref. $hash_ref, eh? Sound familiar? Second, if you were thinking, "hey, I just saw something about a hashref in that XMLRPC::Lite program!", then you're correct. We'll pass $hash_ref to ExpandHoH::hoh_to_xml(), which will in turn call XMLout($hash_ref). XMLout($hash_ref) will then generate the XML and write it somewhere on the remote end of the connection. Here's the revised ExpandHoH:

package ExpandHoH;

use strict;
use warnings;
use XML::Simple; # needed for XMLout()

sub hoh_to_xml {
    shift; # remove 'ExpandHoH'
    my $hoh = shift; # our ref to %HoH

    # Anything other than a hash reference can't be expanded.
    return 'NOT OK' unless ref($hoh) eq 'HASH';

    my $outfile = "/tmp/temp_xml.$$"; # for now, be temp

    # XMLout() dies when it finds an error, so trap that with
    # eval; $@ is only meaningful right after an eval.
    eval {
        XMLout($hoh, rootname => 'config', outputfile => $outfile, keyattr => []);
    };
    return $@ ? 'NOT OK' : 'OK';
}

1;
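
One detail about the error check: XMLout() reports problems by dying, and $@ only carries a message after an eval has trapped that die. Here is a minimal sketch of the pattern, in which a hypothetical write_report() stands in for XMLout() so that the example runs without XML::Simple installed:

```perl
use strict;
use warnings;

# Hypothetical stand-in for XMLout(): dies when handed a bad argument.
sub write_report {
    my ($hoh) = @_;
    die "not a hash reference\n" unless ref($hoh) eq 'HASH';
    return 1;
}

# Returns 'OK' or 'NOT OK', trapping any die with eval.
sub status_for {
    my ($hoh) = @_;
    my $ok = eval { write_report($hoh); 1 };
    return $ok ? 'OK' : 'NOT OK';
}

print status_for({ atlas => {} }), "\n";   # prints "OK"
print status_for('oops'), "\n";            # prints "NOT OK"
```

Testing the value that eval returns, rather than $@ alone, is a common defensive habit, since $@ can be cleared between the die and the check.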

So, there you have it: a little bit of XML, a little bit of XML-RPC, and a rasher of corniness. While this article could've done things in slightly different ways, I'm a firm believer in "TMTOWTDI". So, sure, it would've been possible to use XML::Parser or XML::Dumper instead of the modules I chose for this article. In fact, if you've been looking to implement something similar to what was covered here, I suggest that you try exploring any XML/XML-RPC Perl modules you can find.

And finally, if you're curious, I'll be releasing the system-reporting application at some point in the near future. I haven't decided on licensing terms yet; nor have I developed a suitable mechanism for you to install it on your systems. But, watch this space. It'll be soon!

O'Reilly & Associates recently released (June 2002) Perl in a Nutshell, 2nd Edition.

Nathan Patwardhan was a software developer and system administrator for Banta Integrated Media in Cambridge, MA.



Copyright © 2007 O'Reilly Media, Inc.