Data Mining Email (10 tags)
Thousands of useful facts lie inaccessible on your hard drive, hidden within email messages and attachments. How much more productive would you be if you could extract, index, and search that information? Robert Bernier demonstrates how to store data from emails in a database, where you can use data-mining techniques to analyze it.
Massive Data Aggregation with Perl (9 tags)
What do you do if you have a huge array of disparate data sources from which to collect and present data in multiple formats? First, reach for Perl. Then...good question. Fred Moyer explains how his team designed and built a system to aggregate and present huge amounts of data with Perl.
Top Ten Data Crunching Tips and Tricks (8 tags)
Every day, programmers perform unglamorous but necessary data crunching: recycling legacy data, checking configuration files, yanking data out of web server logs, and more. Knowing how to crunch data with the least amount of effort can make the difference between meeting a deadline and making another pot of coffee. Greg Wilson, author of Pragmatic's Data Crunching, offers ten tips for crunch time.
Calculating Entropy for Data Mining (5 tags)
Eww, statistics. Right? Not necessarily--for example, calculating the entropy of your web statistics can help you analyze trends and correlations. Paul Meagher demonstrates statistical programming in PHP while explaining single-variable entropy.
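For readers unfamiliar with the term, single-variable (Shannon) entropy is H(X) = -sum(p(x) * log2 p(x)) over the observed values of one variable. The following is a minimal sketch in Python, not the article's PHP code; the function name `shannon_entropy` and the sample browser data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Return the Shannon entropy, in bits, of the observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no observations, no uncertainty to measure
    # H(X) = -sum(p * log2(p)) over the relative frequency p of each value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample: the browser recorded for each of eight page hits.
hits = ["firefox", "firefox", "chrome", "chrome",
        "safari", "safari", "opera", "opera"]
print(shannon_entropy(hits))  # four equally likely values -> 2.0 bits
```

A column that always holds the same value has entropy 0; the more evenly the values are spread, the higher the entropy.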
Calculating Entropy for Data Miners (3 tags)
Quick--what's the relationship between the columns of your database? Don't know? Maybe it's time to pull out the information theory book and calculate how much data they store. Paul Meagher explains how this works while showing off premade PHP libraries to handle the details of the calculations for you.
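One common information-theoretic measure of the relationship between two columns is mutual information, I(X;Y) = H(X) + H(Y) - H(X,Y). The sketch below is in Python and assumes this is the kind of calculation meant; it does not reproduce the PHP libraries the article describes, and the names `entropy` and `mutual_information` and the sample rows are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of a non-empty list of hashable items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mutual_information(rows):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a list of (x, y) tuples, in bits."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    # The joint entropy treats each (x, y) pair as a single outcome.
    return entropy(xs) + entropy(ys) - entropy(rows)


# Hypothetical rows: (country, language) columns that determine each other.
dependent = [("us", "en"), ("us", "en"), ("fr", "fr"), ("fr", "fr")]
# Hypothetical rows: two columns with no relationship at all.
independent = [("a", 0), ("a", 1), ("b", 0), ("b", 1)]

print(mutual_information(dependent))    # 1.0 bit shared
print(mutual_information(independent))  # 0.0 bits shared
```

A result near zero suggests the columns vary independently, while a result close to the entropy of either column suggests that one largely determines the other.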