Everything that everyone else has said about how well written this book is, how applicable the examples are, etc., is spot on. It's a very engaging read.
I have a single request: could someone make a downloadable version of the code available with more descriptive variable names? (Terse is fine for the book itself.) I find I have to rename variables as I go so that I can more easily grasp the math.
An example of one that I've renamed:
from math import sqrt

def sim_pearson(prefs,p1,p2):
    # Get the list of mutually rated items
    mutuallyRatedItems={}
    for item in prefs[p1]:
        if item in prefs[p2]: mutuallyRatedItems[item]=1
    # If there are no ratings in common, return 0
    if len(mutuallyRatedItems)==0: return 0
    # Number of mutually rated items
    numMutuallyRatedItems=len(mutuallyRatedItems)
    # Sums of all the preferences
    sum_Person1_MutualRatings=sum([prefs[p1][item] for item in mutuallyRatedItems])
    sum_Person2_MutualRatings=sum([prefs[p2][item] for item in mutuallyRatedItems])
    # Sums of the squares
    sumOfSquaresOfPerson1MutualRatings=sum([pow(prefs[p1][item],2) for item in mutuallyRatedItems])
    sumOfSquaresOfPerson2MutualRatings=sum([pow(prefs[p2][item],2) for item in mutuallyRatedItems])
    # Sum of the products
    sum_ProductOf_Ratings_OfBothUsers_MutualItems=sum([prefs[p1][item]*prefs[p2][item] for item in mutuallyRatedItems])
    # Calculate r (Pearson score)
    numerator=sum_ProductOf_Ratings_OfBothUsers_MutualItems-(sum_Person1_MutualRatings*sum_Person2_MutualRatings/numMutuallyRatedItems)
    denominator=sqrt((sumOfSquaresOfPerson1MutualRatings-pow(sum_Person1_MutualRatings,2)/numMutuallyRatedItems)*(sumOfSquaresOfPerson2MutualRatings-pow(sum_Person2_MutualRatings,2)/numMutuallyRatedItems))
    # A zero denominator means one person's ratings do not vary, so r is undefined
    if denominator==0: return 0
    pearsonCorrelation=numerator/denominator
    return pearsonCorrelation
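For anyone else trying to follow the math, here is a rough sketch of the same score written in the mean-centered form, which I find easier to read. This is not the book's code: the function name pearson_from_means, the variable names, and the sample_prefs data are all made up for illustration, and it assumes prefs is a dict of dicts of numeric ratings, as in the function above.

```python
from math import sqrt

def pearson_from_means(prefs, person1, person2):
    # Items rated by both people (same idea as the mutually rated items above)
    shared_items = [item for item in prefs[person1] if item in prefs[person2]]
    if not shared_items:
        return 0
    count = len(shared_items)
    ratings1 = [prefs[person1][item] for item in shared_items]
    ratings2 = [prefs[person2][item] for item in shared_items]
    # Average rating each person gave over the shared items
    mean1 = sum(ratings1) / float(count)
    mean2 = sum(ratings2) / float(count)
    # How far each rating sits from that person's own average
    deviations1 = [rating - mean1 for rating in ratings1]
    deviations2 = [rating - mean2 for rating in ratings2]
    # Numerator: how much the two sets of deviations move together
    numerator = sum(d1 * d2 for d1, d2 in zip(deviations1, deviations2))
    # Denominator: scales the result into the range -1 to 1
    denominator = sqrt(sum(d * d for d in deviations1) *
                       sum(d * d for d in deviations2))
    if denominator == 0:
        return 0
    return numerator / denominator

# Hypothetical ratings, invented purely for this example
sample_prefs = {
    'Ann': {'Film A': 1.0, 'Film B': 2.0, 'Film C': 3.0},
    'Bob': {'Film A': 2.0, 'Film B': 4.0, 'Film C': 6.0, 'Film D': 5.0},
    'Cat': {'Film A': 3.0, 'Film B': 2.0, 'Film C': 1.0},
}

ann_bob = pearson_from_means(sample_prefs, 'Ann', 'Bob')  # ratings rise together
ann_cat = pearson_from_means(sample_prefs, 'Ann', 'Cat')  # ratings move oppositely
```

Algebraically this should give the same result as the sums-of-squares version, up to floating-point rounding. Ann and Bob agree perfectly on their shared films, so their score comes out at 1, while Ann and Cat rate in opposite order and come out at -1.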
Great book. I'm really enjoying it.