

Book:   Programming Collective Intelligence
Subject:   A visionary book that illuminates the Internet
Date:   2008-10-21 11:28:51
From:   AlexeySmirnov
Rating:  5 stars

This is a visionary book because it predicts a lot of what will happen to the Internet soon. How do we process information in the Internet age? Instead of reading magazines and newspapers, we use blogs as our source of news, because blogs offer a much more customized news feed. In a typical newspaper, how much of the content is of interest to a given reader? Half would be a generous estimate; typically it is less than that.


I start my working day by consuming two sweet drinks. One is a cup of coffee. The other is a virtual information soup made of 100 blogs. I glance over most of the stories quickly using Google Reader and select those that interest me. I might read them in greater detail later in the day, in the evening, or on a weekend. I do not know which drink gives me more pleasure - the delicious cup of coffee or the sweet virtual soup. I like the latter a lot because it is rich in media content - bright images, cool videos, and eye-catching web pages.


However, I often discover news that I wish I had found out about earlier. In other words, there are so many news sources that reading them all, or even just scanning the headlines of the major blogs, would take too much time. We need a targeted information delivery service.


This is the main idea of the book. In fact, it starts by explaining how to make recommendations given your own preferences and those of a number of other people. What are the cool things that you have not tried yet but everybody else has? The example described in the book is applied to Delicious, which does not offer recommendations yet.
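
To make the idea concrete, here is a minimal sketch of preference-based recommendation. It is my own illustration, not code from the book: the people, the blog names, the ratings, and the function names `similarity` and `recommend` are all made up, and the distance-based similarity score is just one reasonable choice among several.

```python
from math import sqrt

# Hypothetical preference data (person -> {item: rating}); not taken from the book.
prefs = {
    "alice": {"blog_a": 5.0, "blog_b": 3.0, "blog_c": 4.0},
    "bob": {"blog_a": 4.0, "blog_b": 2.0, "blog_c": 5.0, "blog_d": 4.5},
    "carol": {"blog_a": 1.0, "blog_b": 5.0, "blog_d": 2.0},
}


def similarity(prefs, a, b):
    """Score in (0, 1] from the Euclidean distance over items both people rated."""
    shared = [item for item in prefs[a] if item in prefs[b]]
    if not shared:
        return 0.0  # nothing in common, so no evidence of similar taste
    dist = sqrt(sum((prefs[a][i] - prefs[b][i]) ** 2 for i in shared))
    return 1.0 / (1.0 + dist)


def recommend(prefs, person):
    """Rank items the person has not rated by similarity-weighted average rating."""
    totals, sim_sums = {}, {}
    for other in prefs:
        if other == person:
            continue
        sim = similarity(prefs, person, other)
        if sim <= 0:
            continue
        for item, rating in prefs[other].items():
            if item in prefs[person]:
                continue  # only suggest things the person has not tried yet
            totals[item] = totals.get(item, 0.0) + rating * sim
            sim_sums[item] = sim_sums.get(item, 0.0) + sim
    ranked = [(totals[item] / sim_sums[item], item) for item in totals]
    ranked.sort(reverse=True)
    return ranked


recommendations = recommend(prefs, "alice")
print(recommendations)
```

With this toy data the only item alice has not rated is `blog_d`, and its predicted score leans toward bob's rating because his tastes are closer to hers than carol's are.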


I often try to work out what my interests are. The blogs that I read might answer this question if one groups them. In fact, I have done this manually, but I found that my categorization was not perfect. The book addresses this question in Chapter 3.
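
As a rough illustration of grouping blogs automatically, the sketch below merges the two most similar blogs (or groups) over and over until a chosen number of groups is left. Everything in it is an assumption of mine rather than the book's code: the blog names, the word-count vectors, the helper names `pearson_distance` and `cluster`, and the choice of a correlation-based distance with simple averaging of merged vectors.

```python
from math import sqrt

# Hypothetical word counts per blog (same four words for every blog); made-up data.
blogs = {
    "py_blog": [8, 7, 0, 1],
    "ml_blog": [6, 9, 1, 0],
    "food_blog": [0, 1, 9, 7],
    "cook_blog": [1, 0, 7, 8],
}


def pearson_distance(v1, v2):
    """1 - Pearson correlation: near 0 for similar word usage, up to 2 for opposite."""
    n = len(v1)
    m1, m2 = sum(v1) / n, sum(v2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(v1, v2))
    den = sqrt(sum((a - m1) ** 2 for a in v1) * sum((b - m2) ** 2 for b in v2))
    if den == 0:
        return 1.0  # a constant vector carries no correlation information
    return 1.0 - num / den


def cluster(blogs, k):
    """Agglomerative clustering: repeatedly merge the closest pair until k groups remain."""
    # Each cluster is (member names, representative vector).
    clusters = [([name], list(vec)) for name, vec in blogs.items()]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = pearson_distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        names = clusters[i][0] + clusters[j][0]
        # Represent the merged cluster by the plain average of the two vectors.
        vec = [(a + b) / 2 for a, b in zip(clusters[i][1], clusters[j][1])]
        del clusters[j]  # j > i, so delete j first to keep index i valid
        del clusters[i]
        clusters.append((names, vec))
    return [sorted(names) for names, _ in clusters]


groups = cluster(blogs, 2)
print(groups)
```

On this toy data the two programming blogs end up in one group and the two cooking blogs in the other, which is the kind of grouping I tried to do by hand.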


After that the book branches out into a number of additional topics such as search, neural networks, and discrete optimization. The author, Toby Segaran, has a great ability to explain difficult concepts using simple words and pictures. As most of the material was familiar to me, I kept noticing how easy each concept seemed here compared with how much time I had originally spent understanding it.


After that the main melody of the book returns - the next chapter explains how to filter documents, for example to decide whether a particular news story is interesting to you or not. Then the book deviates again into decision trees, building price models, and even matching people on a dating site. However, the melody comes back once more - this time the book explains how to extract trends from a large number of news sources, that is, to decide what people are discussing today. This feature is similar to Google News, except that the user has no control over the news sources.


I was surprised to find out that Python is such a popular language in the scientific community. The book describes lots of libraries for dealing with numerical data or displaying various charts. The book will serve as a great introduction to the Python language even though there are lots of introductory books available. In fact, learning Python this way is easier and more enjoyable.


After reading the book I definitely want to try out the tricks explained there and improve my information soup. This book is my virtual cookbook.