Book:   Building Tag Clouds in Perl and PHP
Subject:   Disappointed
Date:   2007-06-20 16:27:54
From:   Anonymous Reader
Rating:  StarStarStarStarStar

The only useful thing I learned from this article that I hadn't already found for free after a few Google searches was why and how to use the log of the count, rather than the raw count, to determine the font size. That was a handy tip, and I did use that code. But given the amount of fabulous info that's available for free, I don't think it was worth ten bucks.
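To illustrate the log-scaling idea, here is a minimal sketch in Python. It is not the article's code: the function name `font_sizes` and the 12 to 36 pixel range are assumptions chosen for the example.

```python
import math

def font_sizes(counts, min_px=12, max_px=36):
    """Map raw tag counts to font sizes using the log of each count."""
    # Ignore non-positive counts, for which the log is undefined.
    logs = {tag: math.log(n) for tag, n in counts.items() if n > 0}
    if not logs:
        return {}
    lo, hi = min(logs.values()), max(logs.values())
    spread = hi - lo
    sizes = {}
    for tag, value in logs.items():
        # When every tag has the same count, fall back to the smallest size.
        weight = (value - lo) / spread if spread else 0.0
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes

sizes = font_sizes({"perl": 1, "php": 10, "cloud": 100})
```

With counts of 1, 10, and 100, the middle tag lands halfway between the smallest and largest size. Scaling on the raw count instead would put it at about 14 pixels, barely distinguishable from the rarest tag.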


What I wanted to learn, but didn't, was more about prepping the content. I wanted to create a cloud for a user forum site, showing graphically what the hot topics are in users' recent postings. I wanted some tips about trimming out uninteresting words, e.g. "I", "was", "wanted", "more", and so on. I was hoping to find pointers to some stock code, or at least word lists, to help me trim the content.
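The kind of trimming described here is usually done with a stop-word list. The following is a rough sketch in Python under stated assumptions: the tiny `STOP_WORDS` set and the `count_words` helper are hypothetical, and a real list would be much longer.

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption); a real one would be far longer.
STOP_WORDS = {"i", "was", "wanted", "more", "the", "a", "to", "and", "of"}

def count_words(text, stop_words=STOP_WORDS):
    """Count the words in text, skipping anything in the stop list."""
    # Lowercase first, then pull out words, allowing one internal apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(w for w in words if w not in stop_words)

counts = count_words("I wanted more help, and the help was good")
```

The resulting counts can then be fed straight into whatever font-sizing code is in use.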


I also wanted tips about normalizing the content. For example: "help", "helps", "helped" and "helping" are all about "help"; "Britney" and "Britney's" are both about "Britney". I can pass hardcoded arrays to PHP's str_replace to replace all verb forms with one form, but I'm not sure how to handle possessives. Again, I was hoping for pointers to existing work, or at least tips on doing the normalization.
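One possible approach to this normalization is to strip possessive endings and then a few common suffixes. The sketch below is in Python and deliberately naive: the `normalize` function, its suffix list, and the minimum stem length are all assumptions made for illustration, and it will mis-handle plenty of words ("news" becomes "new"). Established stemming algorithms such as the Porter stemmer are more careful.

```python
SUFFIXES = ("ing", "ed", "s")

def normalize(word, min_stem=3):
    """Reduce a word to a crude base form; a naive sketch, not a real stemmer."""
    word = word.lower()
    # Drop a possessive ending first: "britney's" -> "britney".
    for ending in ("'s", "'"):
        if word.endswith(ending):
            word = word[: -len(ending)]
            break
    # Strip the first matching suffix, unless the stem would be too short.
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= min_stem:
            return stem
    return word

forms = [normalize(w) for w in ("help", "helps", "helped", "helping")]
names = [normalize(w) for w in ("Britney", "Britney's")]
```

Here every verb form collapses to "help" and both name forms collapse to "britney", so their counts can be merged before the cloud is built.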


Finally, I wanted some tips about memory management and performance for good-sized textual analyses.


This article doesn't address generating the content at all; it starts by assuming you already have a pre-built list of words. It then walks you through relatively straightforward font-sizing code that is, for the most part, freely available for the googling. IMO, that's not worth $10.
