The Code
This script needs to run on a publicly accessible web server that can execute Perl scripts. You'll need two modules that are not part of the standard Perl distribution: LWP::Simple, which fetches the movie list from Yahoo! Movies, and HTML::TableExtract, which does the tough work of deconstructing the HTML for you.
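Before uploading the script, it can help to confirm that both modules are installed on the server. The following is a minimal sketch, assuming you can run Perl from a shell on that server; the helper name module_available is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Accept only plain module names before using them in a string eval.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
    return eval "require $module; 1" ? 1 : 0;
}

# Report on the two modules the script depends on.
foreach my $module ('LWP::Simple', 'HTML::TableExtract') {
    printf "%-20s %s\n", $module,
        module_available($module) ? 'installed' : 'missing';
}
```

Both modules are distributed through CPAN, so a module reported as missing can usually be installed from there.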
This script relies on screen scraping to gather the movies in a list, which means it's picking through the HTML to find relevant information. This also means that if Yahoo! changes their movie list HTML, even slightly, this script will likely fail. Keep in mind that you might need to tinker with the script to keep up with changes to Yahoo! Movies.
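One way to notice this kind of breakage early is to check whether any movies were parsed at all and to show a short notice on the phone if not. This is only a sketch; the helper name scrape_notice and the wording of the message are assumptions, not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: given a reference to the list of parsed movies,
# return an empty string when rows were found, or a short WML notice
# when the list is empty.
sub scrape_notice {
    my ($movies) = @_;
    return '' if @$movies;    # at least one row was parsed
    return "<b>No movies found.</b><br />\n"
         . "The list page layout may have changed.<br />\n";
}

my @moviedata = ();           # pretend the table headers no longer matched
print scrape_notice(\@moviedata);
```

An empty result does not prove that the HTML changed (the list itself might be empty), so treat the notice as a prompt to look at the page rather than as a diagnosis.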
To keep your fingers from doing too much work when you're ready to bring it up on your phone, you'll want to keep the name of this script short. Save the following code to a file called m.cgi and be sure to include your unique movie list URL as the value of $listURL at the top of the script:
#!/usr/bin/perl
# m.cgi
# Convert a Yahoo! Movies list into WML for cell phones
# Usage: m.cgi
use strict;
use HTML::TableExtract;
use LWP::Simple;
# Set your Yahoo! Movie list URL
my $listURL = "insert your movie list URL";
# Set the base movie URL
my $movieURL = "http://acid1.oa.yahoo.com/mbl/mov/mdet?mid=";
# Set the titles of the Yahoo! Movies table you're parsing. Note
# that if the title contains HTML, so too must these headers.
my @tehs = ("#", "Movie Title", "User<br>Grade",
            "Avg. User<br>Grade", "Critics<br>Grade", "Status");
my $te = HTML::TableExtract->new(headers => \@tehs, keep_html => 1);
# Fetch the HTML
my $content = get($listURL) or die "Could not fetch $listURL\n";
my ($wml,@moviedata);
# Parse the table that matches the headers above.
$te->parse($content);
foreach my $ts ($te->table_states) {
    foreach my $r ($ts->rows) {
        next if $r->[0] =~ /grayText/; # final table footer.
        my ($title, $mid); # parse ID and title from "Movie Title" field.
        if ($r->[1] =~ m!.*?id=(.*?)"><b>(.*?)</b>.*?!gis) {
            $mid = $1; $title = $2;
        }
        my $thisMovie = {
            title   => $title,
            mid     => $mid,
            grade   => &clean_text($r->[2]),
            avg     => &clean_text($r->[3]),
            critics => &clean_text($r->[4]),
            status  => &clean_text($r->[5]),
        };
        push @moviedata, $thisMovie;
    }
}
# Assemble the WML by looping through the array of hashes
for my $i (0 .. $#moviedata) {
    $wml .= "<anchor>$moviedata[$i]{title} ";
    $wml .= "<go href=\"$movieURL$moviedata[$i]{mid}\"/>";
    $wml .= "</anchor><br />\n";
    $wml .= "<b>Status:</b> $moviedata[$i]{status}<br />\n";
    $wml .= "<b>Critics:</b> $moviedata[$i]{critics}<br />\n";
    $wml .= "<b>Users:</b> $moviedata[$i]{avg}\n";
    $wml .= "<br /><br />\n";
}
# Send final WML to the client
print "Content-Type: text/vnd.wap.wml\n\n";
print "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n";
print "<!DOCTYPE wml PUBLIC \"-//WAPFORUM//DTD WML 1.1//EN\" ";
print "\"http://www.wapforum.org/DTD/wml_1.1.xml\">\n";
print "<wml><card id=\"Menu\" title=\"Movie Wishlist\">\n";
print "<p><b>Movie Wishlist</b><br/><br />\n";
print $wml;
print "</p></card></wml>\n";
# This function removes HTML, space entities,
# linebreaks, and leading/trailing spaces from strings
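sub clean_text {
    my $text = shift(@_);
    $text =~ s!<.*?>!!g;       # strip HTML tags
    $text =~ s!&nbsp;!!g;      # strip space entities
    $text =~ s!\n!!g;          # strip linebreaks
    $text =~ s!^\s+!!;         # strip leading whitespace
    $text =~ s!\s+$!!;         # strip trailing whitespace
    $text =~ s!\s{16}!, !;     # turn a 16-space gap into a comma
    return $text;
}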
This script downloads the HTML from the URL you supply and picks relevant information from the HTML. When the script runs into a movie title, it also grabs the internal Yahoo! ID for that movie. Then, using the $movieURL as a base, the script assembles a link to that movie's detail page at Yahoo! Mobile. This means that if you're ever browsing your list on your phone and can't quite remember what that particular movie is about, you can simply click through to Yahoo!'s mobile site to get a summary of the movie.
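The ID extraction and link assembly can be tried in isolation. The sketch below uses a simplified form of the script's pattern on a made-up table cell; the helper name parse_title_cell and the sample markup are assumptions, and real Yahoo! Movies HTML may differ:

```perl
use strict;
use warnings;

# Base URL taken from the script above.
my $movieURL = "http://acid1.oa.yahoo.com/mbl/mov/mdet?mid=";

# Hypothetical helper: pull the movie ID and title out of one table cell.
# Returns (id, title), or an empty list when the cell does not match.
sub parse_title_cell {
    my ($cell) = @_;
    if ($cell =~ m!id=(.*?)"><b>(.*?)</b>!is) {
        return ($1, $2);
    }
    return;
}

# Made-up sample cell for illustration only.
my $cell = '<a href="/example?id=1800000000"><b>Example Movie</b></a>';
my ($mid, $title) = parse_title_cell($cell);
print "$title -> $movieURL$mid\n" if defined $mid;
```

Because the pattern anchors on the literal text id= and the bold tags around the title, a change to either piece of markup is enough to make the match fail.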
In addition to the titles in the list, the script includes whether the movie is in theaters or on DVD, the critics' grade, and the average grade assigned by Yahoo! users.
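To see how those fields end up on the phone, here is a sketch that formats a single movie record as a WML fragment. The helper name movie_to_wml and the sample values are invented for illustration; the hash keys match the ones the script stores:

```perl
use strict;
use warnings;

# Base URL taken from the script above.
my $movieURL = "http://acid1.oa.yahoo.com/mbl/mov/mdet?mid=";

# Hypothetical helper: build the WML fragment for one movie hash.
sub movie_to_wml {
    my ($movie) = @_;
    my $wml = "<anchor>$movie->{title} ";
    $wml .= "<go href=\"$movieURL$movie->{mid}\"/>";
    $wml .= "</anchor><br />\n";
    $wml .= "<b>Status:</b> $movie->{status}<br />\n";
    $wml .= "<b>Critics:</b> $movie->{critics}<br />\n";
    $wml .= "<b>Users:</b> $movie->{avg}\n";
    $wml .= "<br /><br />\n";
    return $wml;
}

# Made-up sample record.
my %sample = (
    title   => 'Example Movie',
    mid     => '1800000000',
    status  => 'In Theaters',
    critics => 'B+',
    avg     => 'B',
);
print movie_to_wml(\%sample);
```

The script also stores your own grade under the grade key, but it is not part of the WML output, so the sketch leaves it out as well.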
Notice that at the end of the script, when it prints the WML, the content type is set to text/vnd.wap.wml. Setting this content type ensures that the device viewing the page knows how to render it. Most desktop web browsers can't display WML directly, so you can either test the page exclusively on your cell phone or temporarily change the content type to text/xml to inspect the output in a web browser.
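Rather than editing the script each time you want to test, you could switch the content type with a query string. This is a sketch under assumptions: the parameter name test and the helper name content_type_for are invented here, and the web server is assumed to pass the query string in the standard CGI QUERY_STRING environment variable:

```perl
use strict;
use warnings;

# Hypothetical helper: return text/xml when the query string contains
# test=1, and the WML content type in every other case.
sub content_type_for {
    my ($query) = @_;
    $query = '' unless defined $query;
    return 'text/xml' if $query =~ /(?:^|[&;])test=1(?:[&;]|$)/;
    return 'text/vnd.wap.wml';
}

# Phones requesting m.cgi get WML; a browser requesting m.cgi?test=1 gets XML.
print "Content-Type: ", content_type_for($ENV{QUERY_STRING}), "\n\n";
```

With this in place, the print statement would stand in for the script's existing Content-Type line, and the default behavior for phones stays unchanged.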