A while back, I began the new column Liberty On Beta and promised that each article would demonstrate a real-world problem that I solved for a real client. That was exactly what I did in my last article, including providing access to the source code on my website (click on Books, then on Articles).
Now, however, my incredibly brave (or nutty) editor has agreed to allow me an even more radical idea: let's document the creation of a real-world application from scratch. In real time--that is, as I develop it.
In this first column about my new application, I'll lay out what I have in mind and create a first specification for the project. In subsequent monthly columns, I'll show you how much I've built so far, what dead ends and problems I've run into, and how I've solved them. Each column will be a blow-by-blow diary of the creation of a real, meaningful, and somewhat tricky ASP.NET 2.0 application, warts and all. (If I show enough warts, I may never get work again!)
Project Concord will be a web-based application built with ASP.NET 2.0 forms-based security and personalization controls. Its purpose will be to allow you to find books on Amazon that were highly rated by people who liked other books that you liked. Rather than basing recommendations on what you've already bought, this will allow you to find books that were enjoyed by people who like the same books you do. Further, rather than settling for reviews of a given book by just anyone, you can find just those reviews written by people who share your taste.
Project Concord will consist of the following modules:
The ASP.NET project provides the following features (at a minimum):
Items with an asterisk (*) will not be in the first release. As a rule, I believe in shipping a stripped-down first release, and then adding features in response to feedback, rather than delaying release while you add every feature you think the user may want. It often turns out that features you thought the user would love are never actually used at all, while things you never thought of are essential. Hence, version 1 will be the core project stripped down to its essence.
For example, in the core project there will be only pre-created accounts (that is, just me, and maybe you, if you're nice to me). In version 2, we'll add a feature to let anyone create an account and use the system (why not?).
In the core project there will be, initially, only one query: "Show me all the books that I've not reviewed that were 'liked' by people who are a 'good match' for me." (The definitions of "liked" and "good match" are discussed below).
For version 1, the ASP.NET application will allow the user to sign in, and then will display recommendations. These will be books that were highly rated by people who match the user.
The second module is a Windows application (eventually a Windows Service) that connects to the Amazon Web Service periodically (e.g., every hour) and finds every review you've written to date. It records the book (ISBN, author, genre, etc.) and your rating (how many stars). It updates the database (initially, it just chucks all the old records and creates new ones, but pretty quickly we'll find existing records and update them).
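The "chuck the old records and create new ones" strategy can be sketched as follows. This is only an illustration in Python with SQLite, not the .NET code the project itself will use; the table name `my_reviews`, its columns, and the placeholder book identifiers are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: one row per book the user has reviewed on Amazon.
SCHEMA = """
CREATE TABLE IF NOT EXISTS my_reviews (
    isbn  TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    stars INTEGER NOT NULL CHECK (stars BETWEEN 1 AND 5)
)
"""

def refresh_reviews(conn, fetched):
    """Version-1 strategy: discard every stored review, then insert the fresh set."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM my_reviews")
        conn.executemany(
            "INSERT INTO my_reviews (isbn, title, stars) VALUES (?, ?, ?)",
            fetched,
        )

def count_reviews(conn):
    """Return how many reviews are currently stored."""
    return conn.execute("SELECT COUNT(*) FROM my_reviews").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
refresh_reviews(conn, [("isbn-a", "Book A", 4), ("isbn-b", "Book B", 2)])
refresh_reviews(conn, [("isbn-a", "Book A", 5)])  # a later periodic run
```

Wrapping the delete and the inserts in one transaction means a failed fetch never leaves the table empty. Finding and updating existing records instead would replace the `DELETE` with a per-row update-or-insert.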
It then asks Amazon for the identification of everyone else who has reviewed the books you have reviewed, and it creates a record for them, keeping track of how many stars they gave each book you reviewed.
The Windows application will then compute (and store in the database) a "match rating" for each person in the database, initially based on the criteria shown in Table 1.
Notes: The first column is how closely you matched the reviewer. An exact match means that you both rated the book with the same number of stars. The second column is how many "match points" the reviewer is awarded to indicate how closely that reviewer matches your taste. Columns 3 and 4 provide an example of how such a match might come about. For example, the fourth entry ("Off by 3") indicates that the reviewer might get the associated match score (-5) if you rated a book with two stars and the reviewer gave it five stars. Of course, this could also happen if you rated the book with four stars but the reviewer only gave it one.
| Match | Points | Example: You Rated | Example: They Rated |
| Exact | 5 | 4 stars | 4 stars |
| Off by 1 | 3 | 4 stars | 3 or 5 stars |
| Off by 2 | -3 | 3 stars | 1 or 5 stars |
| Off by 3 | -5 | 2 stars | 5 stars |
| Off by 4 | -10 | 1 star | 5 stars |
Table 1. The Match Points table
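Table 1 amounts to a lookup keyed on the absolute difference between the two star ratings. Here is a minimal sketch in Python (chosen for brevity; the project itself targets .NET). The names `MATCH_POINTS` and `match_points` are illustrative assumptions, not part of the specification.

```python
# Points from Table 1, keyed by the absolute difference in stars.
MATCH_POINTS = {0: 5, 1: 3, 2: -3, 3: -5, 4: -10}

def match_points(your_stars, their_stars):
    """Return the match points a reviewer earns for one book you both rated."""
    for stars in (your_stars, their_stars):
        if not 1 <= stars <= 5:
            raise ValueError("star ratings must be between 1 and 5")
    return MATCH_POINTS[abs(your_stars - their_stars)]
```

Because only the size of the difference matters, `match_points(2, 5)` and `match_points(4, 1)` both return -5, matching the "Off by 3" row.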
Each reviewer will have an individual entry in the reviewers table with their match score. The score will be (initially) just the average of the match points awarded for all the books you've both reviewed. In a later version we'll be able to define match criteria in a more sophisticated way; for example, taking into consideration the standard deviation or whatever other fancy statistical basis will give us the most confidence.
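The initial scoring rule, the plain average of match points over the books you have both reviewed, might look like this in Python (again only a sketch, not the project's .NET code). The function name, the dictionary layout, and the choice to return `None` when there is no overlap are assumptions for illustration.

```python
# Points from Table 1, keyed by the absolute difference in stars.
MATCH_POINTS = {0: 5, 1: 3, 2: -3, 3: -5, 4: -10}

def match_score(your_ratings, their_ratings):
    """Average the match points over books both people rated; None if no overlap.

    Each argument maps a book identifier to a star rating from 1 to 5.
    """
    shared = your_ratings.keys() & their_ratings.keys()
    if not shared:
        return None
    total = sum(
        MATCH_POINTS[abs(your_ratings[book] - their_ratings[book])]
        for book in shared
    )
    return total / len(shared)

# Placeholder book identifiers, not real ISBNs.
mine = {"book-a": 4, "book-b": 2, "book-c": 5}
theirs = {"book-a": 4, "book-b": 3, "book-d": 1}
score = match_score(mine, theirs)  # (5 + 3) / 2 = 4.0
```

Only "book-a" and "book-b" are shared, so the reviewer's unrelated rating of "book-d" does not affect the score.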
The next step is to query the Amazon Web Service for all the other reviews by each reviewer. That is, we want to know how each reviewer rated every book they reviewed that you have not reviewed (these are potential recommendations). For each book reviewed, we'll record the ISBN, title, genre and any other relevant information, including--most importantly--the rating this reviewer awarded the book.
The resulting database will provide the data used by the ASP.NET application to display results or recommendations to the user.
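To make the version-1 query concrete, here is one way it could be expressed over in-memory data, sketched in Python rather than the project's .NET code. The two thresholds are assumed values, since the specification says only that "liked" and "good match" will be hard-coded. The data layout and the ordering rule (most votes first) are likewise illustrative.

```python
LIKED_MIN_STARS = 4   # assumed meaning of "liked"
GOOD_MATCH_MIN = 3.0  # assumed meaning of "good match"

def recommend(my_books, reviewers, liked=LIKED_MIN_STARS, good_match=GOOD_MATCH_MIN):
    """Return books I have not reviewed that well-matched reviewers liked.

    my_books:  set of book identifiers I have already reviewed.
    reviewers: list of dicts with 'match_score' and 'ratings' (book id -> stars).
    Books are ordered by how many good matches liked them, then by identifier.
    """
    votes = {}
    for reviewer in reviewers:
        if reviewer["match_score"] < good_match:
            continue  # not a good match for me, so ignore their ratings
        for book, stars in reviewer["ratings"].items():
            if book not in my_books and stars >= liked:
                votes[book] = votes.get(book, 0) + 1
    return sorted(votes, key=lambda book: (-votes[book], book))

# Placeholder data, not real ISBNs or reviewers.
my_books = {"book-a", "book-b"}
reviewers = [
    {"match_score": 4.0, "ratings": {"book-a": 4, "book-x": 5, "book-y": 4}},
    {"match_score": 3.5, "ratings": {"book-x": 4, "book-z": 2}},
    {"match_score": -2.0, "ratings": {"book-w": 5}},
]
picks = recommend(my_books, reviewers)  # ['book-x', 'book-y']
```

Raising either threshold narrows the result: with `liked=5`, only "book-x" survives, because just one good match gave it five stars.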
There are all sorts of ways to enhance the user experience, both by improving the sorting of recommendations and by providing sophisticated search opportunities. You might, for example, want to sort in various ways, including:
In addition, rather than just receiving recommendations, you might request specific searches:
The possibilities are endless, and I'm sure you can imagine many more queries. For example: "Show me every science fiction book that is highly rated by people who 'agree' with me about Neuromancer and Ender's Game."
Notice that words like "match," "like," and "agree" are in quotes, as these are set by scores. In the first version, the meanings of these will be hard-coded. In version 2+, these will be adjustable by the user so that the user can fine-tune the effectiveness of the program.
An exciting alternative to having the user twiddle these settings is to have the program adjust its own heuristics by checking the ratings the user gives to books against what it would have predicted. This will allow the program to fine-tune itself, on a per-user basis, as it gains more information.
For example, suppose I have rated ten books, and there are 500 people who have rated these same books. The program can compare my rating for each book against the average match rating and optimize accordingly. As I rate more books, the system can adjust its internal meaning of "match" and "like" and "agree."
In addition, we can add a slider or other control to the ASP.NET program to allow the user to tighten or loosen the criteria, finding fewer or more matches respectively, and we can record this in the user's personalization record so that it is "sticky" and the user does not have to repeat the adjustment for every login.
At the risk of receiving a great many indignant emails, that is my entire specification, and my entire design, for now.
With this sketch in mind, I'll start building the project, following the well-tested methodology of "Get it working and keep it working" (sometimes called "successive approximation" and sometimes called "painting yourself into a corner").
In any case, I'll start building it with the spec I have, and see how it develops. I have two incredible luxuries:
I'm sure the specifications will evolve as I go, and I'm certain the design will change as I see opportunities or run into obstacles. My promise to you is to keep careful notes. In subsequent columns, I'll review problems I ran into, and discuss what I hope will be interesting design or implementation issues discovered along the way. I'll also post a link to the work in progress and discuss aspects of the code developed to date.
This is not an open source project, but you are invited to kibitz, criticize, suggest, etc. Please do not send me code, but do feel free to post feedback.
Finally, if you want to discuss this in detail (which would be terrific!) please join the dedicated discussion on my support forum, where you will find the Concord Project folder. 
 Project Concord: I'm told that Microsoft names its projects based on towns near Seattle. I've chosen to name mine after towns near Boston ("Cradle of Liberty").
 You have to join Delphi to participate in the forum, but it is free.
Jesse Liberty is a senior program manager for Microsoft Silverlight, where he is responsible for the creation of tutorials, videos, and other content to facilitate the learning and use of Silverlight. Jesse is well known in the industry in part because of his many bestselling books, including O'Reilly Media's Programming .NET 3.5, Programming C# 3.0, Learning ASP.NET with AJAX, and the soon-to-be-published Programming Silverlight.
Copyright © 2009 O'Reilly Media, Inc.