Here’s the problem. Perl? Great. Love the Perl. CPAN modules? Not so great. Well, yes, they are *great* in a they-kick-ass-in-all-the-functionality-they-provide way, but they’re not installed on the default Mac OS X user’s machine. So not so great. Sure, Apple includes the command line cpan utility (man 1 cpan), but don’t you want utilities that just, you know, work? Without any further laborious installation of packages?
Because Perl provides the perfect syntactic leavening for Dashboard widgets and AppleScript Studio Applications, I wanted a way around this. If you want to use Perl to do things like scrape Web pages and sort through XML, you need XML processing that’s free of CPAN constraints. And OS X does not provide a built-in Perl XML parser, as you might have noticed.
I tried using external command-line XML packages. They were all too much of a pain: not lightweight enough, wanting to install themselves into strange locations. Yuck.
So here’s what I ended up doing. I wrote a C-based libxml utility to create working (but half-assed) Perl hashes from an XML file. Hence, lightweight and portable. Yeah sure, it mostly sucks. But it does seem to work.
Here’s a real-life example. The following Perl utility scans your /tmp directory while Pandora is playing and lists the current and upcoming songs in a playlist. Simple, short and easy to integrate into a Dashboard widget. As with all my code: It is what it is. Use at your own risk. Don’t sue.
Got a better way? I want to know.
% cd Desktop
% ./playlist.pl
Laura by Billy Joel, The Nylon Curtain
/private/tmp/WebKitPlugInStreamoqZ3yw
Dear John by Elton John, Jump Up
/private/tmp/WebKitPlugInStream7NZ6CR
Shabby Doll by Elvis Costello, Imperial Bedroom
/private/tmp/WebKitPlugInStreamauiHr4
Every Breath You Take by The Police, Every Breath You Take - The Classics
/private/tmp/WebKitPlugInStreamsAY1Uu
To Make You Feel My Love by Billy Joel, Greatest Hits Volume III
/private/tmp/WebKitPlugInStreamBUGAa2
Real Good Looking Boy by The Who, Then And Now! (1964 - 2004)
Every Breath You Take (Live) by Sting, ...All This Time (Live)
I Know What I Like by Huey Lewis And The News, Fore!
%
Source: Playlist.pl, getxml.c. (Compile getxml as follows: cc getxml.c -o getxml -lxml2 -lcurl -w)
#! /usr/bin/perl
# Pandora Playlist - Erica Sadun, 26 May 06
my $goaway = "Pandora is not yet open and playing.\n";
# Find /tmp/Web* files containing the phrase "focusTrait"
my $foctrait = `grep -l focusTrait /private/tmp/Web*`;
if ($foctrait eq "") {print $goaway; exit(0);}
# Sort XML files by time and split into an array
my $xmlist = join(" ", split(/\n/, $foctrait));
my $lscmd = "ls -tr ".$xmlist;
$xmlist = `$lscmd`;
my @matchfiles = split(/\n/, $xmlist);
# Create a list of possible music files: entries whose ls -l line has
# 7 digits in a row (in practice, a file size of 1,000,000 bytes or more)
my $flist = `ls -ltr /private/tmp/WebKitPlugInStream* | grep [0-9][0-9][0-9][0-9][0-9][0-9][0-9] |sed "s/^.*private/\/private/"`;
my @filelist = split(/\n/, $flist);
# Create a list of songs in order
my $nth = 0;
my @plist = ();
foreach my $fitem (@matchfiles)
{
    # Read in the XML from the file
    my $r1 = "%riz = ".`./getxml $fitem`;
    my %riz; eval($r1);
    # Pandora always lists 4 songs per XML file, including upcoming
    # songs that have not yet downloaded to the cache.
    foreach my $i (0..3)
    {
        my $song = $riz{Response}->{methodResponse}->{params}->{param}->{value}->{array}->{data}->{value}[$i]->{struct}->{member}[1]->{value};
        $song =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
        my $artist = $riz{Response}->{methodResponse}->{params}->{param}->{value}->{array}->{data}->{value}[$i]->{struct}->{member}[10]->{value};
        $artist =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
        my $album = $riz{Response}->{methodResponse}->{params}->{param}->{value}->{array}->{data}->{value}[$i]->{struct}->{member}[13]->{value};
        $album =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
        print "$song by $artist, $album\n\t$filelist[$nth++]\n";
    }
}
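The script repeats the same substitution three times to undo the %XX escapes in the song, artist and album strings. Here is a rough sketch of pulling that into one helper. The name percent_decode and the sample string are made up for illustration; neither comes from Playlist.pl or from a real Pandora response.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode %XX escapes (for example %20 for a space) in a copy of the string.
# percent_decode is a hypothetical helper name, not part of Playlist.pl.
sub percent_decode {
    my ($text) = @_;
    return unless defined $text;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

# Made-up sample value, not taken from a real Pandora response.
print percent_decode('Every%20Breath%20You%20Take'), "\n";
```

Because the helper works on a copy, the original string is left alone, which may or may not matter for a script this small.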


You wrote an XML parsing utility in C because downloading Perl modules was too much trouble? What am I missing here?
Have you ever used CPAN? It does all the compiling, installing, and also takes care of dependencies for you. It is by far the easiest solution. You could have installed everything in a lot less time than it takes to rewrite something in C.
Dashboard widgets should only require being dragged into the ~/Library/Widgets directory. They should be small and not require people to install CPAN modules to run them.
Erica,
Have you ever used Xerces?
It's a fairly decent XML parser written in C/C++ with Perl/Java bindings.
As for CPAN - what's so hard about using it?
How about running perl -MCPAN -e shell in Terminal and issuing basic commands like search and install? Can't get any easier than that ...
Joseph
I think you guys are missing the point of having to write something in C instead of using CPAN. Are you going to require your users to download and compile CPAN modules to use a Widget? For the sake of ease of use, I'm guessing no. The programmer should have to jump through hoops so the user won't have to. We're not using Linux or Windows after all.
You can bundle the CPAN modules inside the Widget so that the user doesn't need to install them. You just have to be careful about different versions of Perl. For a Dashboard Widget you know that you're running Tiger so it is at least v5.8.6.
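For what it's worth, bundling could look roughly like this. It is only a sketch: it assumes the modules are copied into a folder named lib next to the script inside the .wdgt bundle (that folder name is my assumption), and modules with compiled parts would still need a build per architecture.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Directory containing this script, resolved at compile time.
use FindBin qw($Bin);

# Add a "lib" folder next to the script to the module search path.
# The folder name is an assumption made for this sketch.
use lib "$Bin/lib";

# Modules copied into that folder can now be loaded as usual, e.g.:
# use XML::Simple;

print "Bundled modules are searched for in $Bin/lib\n";
```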
Is it fair to end-users to stick entire libraries into their widgets? Widgets should be lightweight, small and very transparent so end-users can inspect and understand what that widget is doing.
You don't make any sense whatsoever. "Dashboard widgets should only require being dragged into the ~/Library/Widgets directory" and "Is it fair to end-users to stick entire libraries into their widgets? Widgets should be lightweight, small and very transparent so end-users can inspect and understand what that widget is doing." So, you go write a piece of C code that uses libxml and gets compiled into a binary that users can neither inspect nor understand. Or, if you include the source also, you are expecting that the average person "inspecting a Widget" will also know C and Perl. Or do you just include the source and make them compile it themselves, which also violates just being able to drag and drop install widgets.
There are a number of ways around this: 1) make an installer that installs the CPAN modules as part of its post-install or pre-install processing (cake to do with the Apple Installer), 2) write code in the widget that installs any needed (but missing) CPAN modules when the widget is first run, or 3) do as other posters have suggested and include the libraries in the widget. Yes, widgets should be small, but the ENTIRE perl module tree (including all the DOM stuff, which you wouldn't need, and the XML Stream stuff, which I don't think you need) is 1 MB. Cut out XML Stream and the DOM, and you are under 500K, which is smaller than just about any application you will run, build, or install.
The last thing is that the post is very misleading. Perl is portable. Compiled C code, not as much. You would have to make this an Xcode project and compile your C file as a universal binary, so it could work under Dashboard for 10.4 Intel and 10.4 PowerPC. Add that to the fact that your C code doesn't actually do its job under 10.3, and you have a truly "non-portable" solution.
All that being said, I enjoyed hacking your sample perl code, replacing your C parser with XML::Simple, and doing a few other modifications, like accessing the names of the parameters by name instead of number (i.e. {'member'}->{'songTitle'} instead of {member}[1], etc.) Makes the code much more readable and understandable (and also keeps the code from needing to be re-written when 'songTitle' is no longer at position 1.)
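To illustrate the by-name idea without XML::Simple, here is a plain-Perl sketch that turns a list of name/value member records into a hash. The data is invented: only songTitle is a name mentioned above, artistName and albumName are placeholders, and members_by_name is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a list of { name => ..., value => ... } records into a
# name => value hash so fields can be looked up by name.
# members_by_name is a hypothetical helper for this sketch.
sub members_by_name {
    my ($members) = @_;
    my %by_name;
    $by_name{ $_->{name} } = $_->{value} for @$members;
    return \%by_name;
}

# Invented data shaped like one XML-RPC struct. Only "songTitle" comes
# from the discussion; the other two names are placeholders.
my $struct = {
    member => [
        { name => 'songTitle',  value => 'Laura' },
        { name => 'artistName', value => 'Billy Joel' },
        { name => 'albumName',  value => 'The Nylon Curtain' },
    ],
};

my $field = members_by_name( $struct->{member} );
print "$field->{songTitle} by $field->{artistName}, $field->{albumName}\n";
```

The lookup no longer depends on where a member sits in the list, so a reordered response would not break it.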
I never heard of Pandora before I read your article. Wow, that's amazing and thanks for introducing me to it.
Also, it is hard to imagine how that Perl code could be much uglier. :(
Here's a quick cleanup to give an idea of what that should look like. This is without even addressing anything other than the shell-script-become-bad-Perl syndrome.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
# Return true if any line of $file matches $pattern
sub file_contains {
    my ( $file, $pattern ) = @_;
    my $rx = qr/$pattern/;
    open my $fh, '<', $file or return;
    while( <$fh> ) { return 1 if $_ =~ $rx }
    return;
}
my @xml_file = grep file_contains( $_, 'focusTrait' ), glob '/private/tmp/Web*';
die "Pandora is not yet open and playing.\n" if not @xml_file;
# Sort oldest first, matching the original's "ls -tr"
my %modified_time;
$modified_time{ $_ } = ( stat $_ )[9] for @xml_file;
@xml_file = sort { $modified_time{ $a } <=> $modified_time{ $b } } @xml_file;
# Collect full paths of candidate stream files
my @filelist;
find( sub {
    push @filelist, $File::Find::name
        if $File::Find::name =~ m!/private/tmp/WebKitPlugInStream!
        and /[0-9]{6}/;
}, '/private/tmp/' );
my $nth = 0;
foreach my $fitem (@xml_file) {
    my %riz = eval `./getxml '$fitem'`;
    my $outer = $riz{Response}->{methodResponse}->{params}->{param}
        ->{value}->{array}->{data}->{value};
    foreach my $inner ( @$outer ) {
        # Members 1, 10 and 13 hold the song, artist and album
        my ( $song, $artist, $album ) =
            map $inner->{struct}->{member}[$_]->{value}, 1, 10, 13;
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg for $song, $artist, $album;
        print "$song by $artist, $album\n\t$filelist[$nth++]\n";
    }
}
I hope this comes out correctly, since in preview it appears that all the indentation gets eaten.
Grant: Yes, I include the C-source. No, the end-user does not need to know C or Perl, but they should be able to get the gist of the code, as they would with the JavaScript and CSS source.
You suggest an installer that installs CPAN modules. Do you know of any Dashboard Widgets that use installers? Apple:
Widgets are much less complex than applications and should provide a light-weight install experience. The preferred packaging experience is to have widgets delivered in zip archive format and placed on your web server for download. Only archive the .wdgt bundle, omitting all other files.
Portable versions: Apple's Widget library service lets you list minimum requirements for widgets. It's easy enough to have separate versions per platform rather than make a universal version.
"it is hard to imagine how that Perl code could be much uglier. :("
It works.
"Yes, I include the C-source. No, the end-user does not need to know C or Perl, but they should be able to get the gist of the code, as they would with the JavaScript and CSS source."
Ok, go get your grandma, grandpa, mother, father, or anyone else who is not a programmer and let them read your getxml.c file. They will have NO CLUE what it is doing. You are blinded by the fact that you are a programmer. The average end-user will NEVER open up your widget. If they do, they will not have the foggiest idea what is going on.
Also from Apple:
"UNIX Commands
Any UNIX command or script, including those written in sh, tcsh, bash, tcl, Perl, or Ruby as well as AppleScript, can be accessed from the widget object. Here's an example of calling a UNIX command:
var obj = widget.system("/bin/ps -aux | grep Dashboard", null);
alert(obj.outputString);"
So, write a function called checkSetup(), and in that function determine if they have the appropriate CPAN modules installed. If they don't, use CPAN to install them before your widget runs. There are lots of ways to make sure this process only needs to happen once. It really is quite simple, and much better than writing your own C module which you then distribute as a platform-specific binary. As to having separate versions, that is plain stupid, especially when Apple came up with Universal Binaries so people don't have to worry about things like that.
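The checking half of that could be sketched in Perl along these lines, as a small script a widget might run through widget.system. The helper names module_available and missing_modules are hypothetical, the module list is only an example, and the sketch reports what is missing without installing anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this Perl, else 0.
# module_available is a hypothetical helper name.
sub module_available {
    my ($module) = @_;
    # Only accept plain module names before building a file path.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Return the modules from the argument list that cannot be loaded.
sub missing_modules {
    my @wanted = @_;
    return grep { !module_available($_) } @wanted;
}

# Example list only: File::Find ships with Perl, XML::Simple is from CPAN.
my @missing = missing_modules( 'File::Find', 'XML::Simple' );
if (@missing) {
    print "Missing: @missing\n";
}
else {
    print "All required modules are installed.\n";
}
```

A checkSetup() function in the widget's JavaScript could read this output and decide whether an install step is needed.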
"Got a better way? I want to know."
Do you really want to know? Or do you just want to defend a somewhat short-sighted approach that you happen to like? Writing something in plain C when you are using Perl and the functionality you need has already been done in Perl, and is available via CPAN is NOT the right way to do things, no matter how much you would like to defend it. If you were writing all of your hacks in C, it would make sense to use some function written in C. Of course if you like re-inventing the wheel, then go right ahead. Just don't try to put it forth as either the "right way" or the "best way".
Grant, thank you for adding your heart-felt responses.
Hi, guys. I'm on an Intel-based MacBook Pro, but I failed to install XML::Parser through CPAN. Has anyone experienced the same thing and solved the problem? Thanks.