Windows DevCenter    
 Published on Windows DevCenter (http://www.windowsdevcenter.com/)


Liberty on Beta 2

Creating an Application from Scratch, Part 3

by Jesse Liberty
02/14/2006

In my previous column I began writing my application PeopleLikeMe in earnest.

Quick summary: PeopleLikeMe lets you ask the following question: please find folks who reviewed the books I've reviewed and whose taste matches mine, and then show me books they reviewed that I have not yet reviewed, that they rated highly. That is, show me books I'm likely to enjoy.

Since the last time I wrote, I have implemented all of the code to obtain records from Amazon (described below) and to save that data in a SQL Server 2005 database (this example should work equally well with SQL Server 2005, SQL Server 2005 Express, and SQL Server 2000).

In the original design, the user would log in to an ASP.NET application and then ask to update his or her records. If the records were already fully up to date, the user could instead ask to see books highly rated by other readers who matched the user's taste in books.

Unfortunately, updating the records turns out to take a very long time. For example, when I update my own records, Amazon reports that I have reviewed 46 books. These 46 books were also reviewed by a total of nearly 2,100 other people, and those 2,100 people have reviewed in total nearly 17,000 books! Also, the time it takes to gather this information is gated by the fact that my license to query the Amazon web service requires that I make no more than one query per second (in my un-optimized version I lose 1,100 seconds, or nearly 20 minutes, just waiting my turn).


In any case, the idea that the user would wait for all of this, while staring at an inert browser, is clearly untenable. There are a number of design alternatives. I've settled on two.

In the long run, I'll create a web application that lets the user choose either to update or to search for recommendations (using web-forms authentication, personalization, etc.). If the user asks to update, the application will come back and say that this is a lengthy process and an email will be sent when it is completed. There will be a second (Windows) application that will run on the server that will take the user's ID off a queue, and run the update and then fire off the email.

Now, one could certainly argue that what is needed is not an ASP.NET application at all, but rather a stand-alone desktop application that works only for one user. Each person who wants recommendations runs his or her own copy. We'll finesse this decision for now and come back to it in later articles.

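In either case, what is needed right now is a Windows application that does the work of retrieving my reviews from Amazon, finding everyone else who reviewed those books, scoring those reviewers against my ratings, and collecting all of their reviews.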

That work is unchanged no matter which design we end up with, and so that is what I've implemented for this article. To keep this example simple, rather than taking my CustomerID off a queue (or from a database table) for now, I enter my Amazon CustomerID in a text box and click Run. The program runs only once, for me, but modifying it to take the next CustomerID off a table and run again would not be difficult.

The Engine

The new Windows application (PeopleLikeMeEngine) updates my table of reviews and finds all the other reviewers, scores their records, and finds all their reviews. Because this is a Windows application, it is now able to provide far more extensive "real-time" feedback, updating the display as it finds reviewers and inserts records into the database, as shown in Figure 1.

Figure 1.

To begin, I enter my Amazon CustomerID (see the previous article for how to find your ID) into the text box at the top of the form and click Run. The program finds all the books I've reviewed, then all the other reviewers of those books, and in turn, all the books they've reviewed.

(It should be noted that what Amazon calls the CustomerID I call the ReviewerID. We can argue all day about whether or not that is a good idea.)

The topmost listbox shows the book ASIN (ISBN), the ReviewerID (Amazon CustomerID), and that reviewer's rating for the book (1-5 stars). As database errors occur, they are added to the second listbox. Errors are expected, because the database is set up to reject duplicate entries. The complete error text for the current error is shown in the (disabled) text box (figure 1, marked 1). Double-clicking on any of the errors displays its full text in the text box.

As the program runs, I display three running totals: the number of Reviewers who have reviewed the same books as I have (figure 1, marked 2), the number of Reviews by all these reviewers (figure 1, marked 3), and the total number of seconds we've lost waiting to ensure that we don't make a request to Amazon more than once per second (figure 1, "Seconds gated," marked 4).

The gated figures are so high (up to 20 minutes when I run the program) that this is an obvious place to look for optimizations (to be covered in future articles).

Implementation

There are two generic collections used throughout this program: myBooks, which is a type-safe List of Book objects, and allReviewers, which is a type-safe Dictionary of Reviewer objects keyed on the ReviewerID, as shown in listing 1.

  
 private List<Book> myBooks = new List<Book>();
 private Dictionary<string, Reviewer> allReviewers = 
    new Dictionary<string, Reviewer>();

Listing 1

The Book class holds an ASIN (ISBN) and my rating for that book. The Reviewer object holds the ID of the reviewer and that reviewer's score (how well that reviewer matches my taste). The reviewer's score is updated as the program runs, as described below. These two classes are incredibly simple, and are defined in the files Book.cs and Reviewer.cs respectively, as shown in listing 2.

    
public class Book
{
   private string asin;
   public string ASIN //... 
   private string myRating;
   public string MyRating //...
   public Book( string asin, string myRating )
    {
      this.asin = asin; this.myRating = myRating;
    }
}
public class Reviewer
{
   private string reviewerID = string.Empty;
   public string ReviewerID //...
   private int score = 0;
   public int Score //...
   private int numReviews = 1;
   public int NumReviews //....

    public Reviewer(string reviewerID, int score)
    {
      this.reviewerID = reviewerID;  this.score = score; 
    }
}
    
    Listing 2 (Compressed)
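
For readers who want to compile the two classes, here is one way the elided property bodies might be filled in. This is a sketch, not the author's source: the article only shows the fields and constructors, so which properties are read-only and which have setters is an assumption (Score needs a setter because later code adds to it).

```csharp
// Sketch only: the property bodies below are assumptions; the article elides them.
public class Book
{
    private string asin;
    private string myRating;

    public Book(string asin, string myRating)
    {
        this.asin = asin;
        this.myRating = myRating;
    }

    // The book's Amazon identifier.
    public string ASIN { get { return asin; } }

    // My rating, kept as a string as in the article's listing.
    public string MyRating { get { return myRating; } }
}

public class Reviewer
{
    private string reviewerID = string.Empty;
    private int score = 0;
    private int numReviews = 1;

    public Reviewer(string reviewerID, int score)
    {
        this.reviewerID = reviewerID;
        this.score = score;
    }

    public string ReviewerID { get { return reviewerID; } }

    // Writable, because the scoring code accumulates into it.
    public int Score { get { return score; } set { score = value; } }

    public int NumReviews { get { return numReviews; } set { numReviews = value; } }
}
```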

Database Interactions

The key to interacting with the database is to open the Data Sources window and create a DataSet for the tables, as shown in figure 2.

Figure 2.

With this in place, each table can be dragged onto the form. Delete the resulting data grid and the binding source and leave just the TableAdapters as shown in figure 3.

Figure 3.

Click on an adapter and use the smart tag to choose Edit Queries in DataSet, as shown in figure 4.

Figure 4.

For each of the TableAdapters you'll want to add queries (right-click on the adapter and choose Add Query). You'll need queries to add entries, to delete a set of reviews (by ReviewerID), and to query for how many reviews a given reviewer has entered, as shown in figure 5.

Figure 5.

These queries are implemented in code by passing in the appropriate parameters, as shown for example in listing 3.

 private void InsertBook(
    string asin,
    string title,
    string url,
    string author )
 {
    try
    {
       this.booksTableAdapter.InsertBook( asin, title, url, author );
 
Listing 3 (partial)

The key here is that you call the InsertBook method that corresponds to the query you created on the booksTableAdapter (see red box in figure 5), passing in the parameters specified for the query. The elided code updates the user interface and handles exceptions (e.g., duplicate entry).

Implementation Overview

The two key methods are FindBookInfo and FindReviewers.

FindReviewers is first called after the allReviewers collection has been filled with just my ReviewerID. To keep this clear, I differentiate between myReviewerID (my own Amazon CustomerID) and the ReviewerID for all the other reviewers I'll find who also reviewed the books I've reviewed.

To begin, I clear out the table of my reviews. This is simpler than looking for changes and updating the table; I simply start clean each time.

The FindReviewers method then retrieves all my reviews from Amazon, inserts them into the Reviews table, and adds each book to my type-safe list of Book objects (which holds my rating of the book).

When I've found all my reviews I call FindBookInfo.

FindBookInfo iterates through the myBooks collection, and for each book requests an ItemLookup from Amazon. It gets back a list of the CustomerIDs of everyone who has ever reviewed that book. I add an entry to the Books database table for each book found by calling InsertBook(), which in turn calls the InsertBook method we added to the booksTableAdapter (listing 3, above).

For more about the decision to store this book information in the database, see the sidebar "Book Table Design Question" below.

In any case, the reviewer's ID is added to the allReviewers dictionary, along with a score computed by taking the absolute difference between our two ratings and subtracting that number from 5 (a simple working algorithm that may well be adjusted in subsequent builds).
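
As a worked illustration of that scoring rule, here is a minimal, self-contained sketch. The names ScoreSketch, MatchScore, and Accumulate are hypothetical and do not appear in the application; the sketch also keeps totals in a plain dictionary of integers rather than Reviewer objects, purely to stay short.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: ScoreSketch and its members are hypothetical names.
public static class ScoreSketch
{
    // 5 minus the absolute difference between two star ratings (1-5).
    // Identical ratings score 5; opposite extremes (1 vs. 5) score 1.
    public static int MatchScore(int myRating, int theirRating)
    {
        return 5 - Math.Abs(myRating - theirRating);
    }

    // Adds one book's match score to a reviewer's running total.
    public static void Accumulate(
        Dictionary<string, int> totals, string reviewerId, int score)
    {
        int current;
        totals.TryGetValue(reviewerId, out current); // 0 when the key is absent
        totals[reviewerId] = current + score;
    }
}
```

With this rule, a reviewer who gave four stars to a book I gave two stars earns 3 points for that book, and a reviewer who shares several books with me accumulates points across all of them, so the total rewards both agreement and overlap.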

Once all the books I've reviewed have been traversed, we return to FindReviewers. This time, however, the allReviewers collection has the CustomerIDs of all the customers who have reviewed any of the books I've reviewed. For each reviewer, we make a call to Amazon to get all their reviews, and enter their rating. In addition, if the review corresponds to a book I've reviewed, their score is incremented based on how closely our reviews match.

It is worth noting that FindReviewers is thus called twice: once just to find my reviews, and a second time to find all the reviews of all those who have reviewed any of the books I've reviewed. This makes for more code reuse, and less redundancy, at the cost of having more conditional code (e.g., if ( reviewerID == myReviewerID ) ).

Book Table Design Question

It isn't yet clear to me whether I should store the data on each book (author, title, etc.) at this point, or just get that data as needed from Amazon.

My compromise for now is that since I have to get the book information for every book I've reviewed, I might as well store it away, but for the books reviewed by other reviewers, I plan to get it on demand from Amazon.

Interacting with Amazon

We make two types of calls to the Amazon E-Commerce Web Service. The first is in FindReviewers, in which we execute a call to CustomerContentLookup that returns a collection of customers and all the reviews by each customer.

The interaction is not completely intuitive, so let's walk through it step by step.

First, as described in the previous article, we add AWSECommerce.cs, which is the proxy code generated from the Amazon WSDL, and which is available ready-to-use from Amazon as part of their web services kit. A short excerpt of this file is shown in listing 4 to give you a flavor of what it contains.

 
 public class AWSECommerceService : System.Web.Services.Protocols.SoapHttpClientProtocol 
{
    
    /// <remarks/>
    public AWSECommerceService() {
        this.Url = "http://soap.amazon.com/onca/soap?Service=AWSECommerceService";
    }
     //...
         [System.Web.Services.Protocols.SoapDocumentMethodAttribute("http://soap.amazon.com", Use=System.Web.Services.Description.SoapBindingUse.Literal, ParameterStyle=System.Web.Services.Protocols.SoapParameterStyle.Bare)]
    [return: System.Xml.Serialization.XmlElementAttribute("CustomerContentLookupResponse", Namespace="http://webservices.amazon.com/AWSECommerceService/2005-01-19")]
    public CustomerContentLookupResponse CustomerContentLookup([System.Xml.Serialization.XmlElementAttribute("CustomerContentLookup", Namespace="http://webservices.amazon.com/AWSECommerceService/2005-01-19")] CustomerContentLookup CustomerContentLookup1) {
        object[] results = this.Invoke("CustomerContentLookup", new object[] {
                    CustomerContentLookup1});
        return ((CustomerContentLookupResponse)(results[0]));
    }
    //...
}

Listing 4

We will use the object-oriented wrapper classes created by the proxy to interact with Amazon. For example, we'll declare a set of objects for making a CustomerContentLookup call:

CustomerContentLookup customerLookup = new CustomerContentLookup();
CustomerContentLookupRequest customerLookupRequest =
    new CustomerContentLookupRequest();
CustomerContentLookupResponse customerLookupResponse;

When we are ready to make a call to find information about a particular customer, we must provide information through the CustomerContentLookup object to the CustomerContentLookupRequest object.

customerLookup.AssociateTag = this.associateTag;
customerLookup.SubscriptionId = this.subscriptionID;
customerLookupRequest.CustomerId = reviewerID;
//...
   customerLookupRequest.ReviewPage = pageNumber.ToString();
   customerLookupRequest.ResponseGroup = 
      new string[] { "CustomerFull" };
   customerLookup.Request =
      new CustomerContentLookupRequest[] { customerLookupRequest };
   try
   {
      OneSecond();
      customerLookupResponse =
         amazonService.CustomerContentLookup( customerLookup );
   }

We set the authorization we need (AssociateTag and SubscriptionId) and the CustomerId we want information about (we get this from our dictionary of reviewers). We then set the ResponseGroup to "CustomerFull", which the Amazon documentation indicates will return all the user's reviews. We call the amazonService object's CustomerContentLookup method, passing in the customerLookup object, which itself contains the customerLookupRequest object. What we get back is a CustomerContentLookupResponse object.

Note also that before calling Amazon, I call my own OneSecond method, whose job is to make sure that at least one second has elapsed since the last time I called an Amazon web service.

    
private void OneSecond()
{
   bool bWaiting = true;
   while ( bWaiting )
   {
      this.lblGating.Text = numMillisecondsGated.ToString( "N" ) + 
       " Seconds gated";
      Application.DoEvents();
      TimeSpan elapsed = DateTime.Now - this.lastAmazonCall;
      // TotalSeconds is the whole interval; Seconds is only the 0-59 component
      if ( elapsed.TotalSeconds >= 1 )
      {
         this.lastAmazonCall = DateTime.Now;
         bWaiting = false;
      }
      else
      {
         System.Threading.Thread.Sleep( WaitTime );
         // despite its name, this field accumulates seconds (WaitTime is in ms)
         numMillisecondsGated += WaitTime / 1000f;
      }
   }
}
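
The same one-call-per-second gate can be sketched without a polling loop by measuring how much of the interval remains and sleeping for exactly that long. The sketch below is an alternative, not the application's code: RequestGate is a hypothetical class, and it assumes blocking the calling thread is acceptable (for example on a background thread), whereas OneSecond above keeps the form responsive with Application.DoEvents.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Illustrative only: RequestGate is a hypothetical name.
public sealed class RequestGate
{
    private readonly TimeSpan minInterval;
    private readonly Stopwatch sinceLastCall = new Stopwatch();
    private TimeSpan totalWaited = TimeSpan.Zero;

    public RequestGate(TimeSpan minInterval)
    {
        this.minInterval = minInterval;
    }

    // Total time spent sleeping so far (the "seconds gated" figure).
    public TimeSpan TotalWaited { get { return totalWaited; } }

    // Sleeps for whatever remains of minInterval since the previous call
    // returned. The first call never waits.
    public void Wait()
    {
        if (sinceLastCall.IsRunning)
        {
            TimeSpan remaining = minInterval - sinceLastCall.Elapsed;
            if (remaining > TimeSpan.Zero)
            {
                Thread.Sleep(remaining);
                totalWaited += remaining;
            }
        }
        sinceLastCall.Reset();
        sinceLastCall.Start();
    }
}
```

A caller would create one gate with TimeSpan.FromSeconds(1) and call Wait() immediately before each web service request. Because only the remaining part of the interval is slept, time already spent on database inserts or UI updates between requests counts toward the one-second spacing.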

To get the reviews for that customer, I drill down through the CustomerContentLookupResponse returned by the Amazon web service.

Customers[] customersFound = customerLookupResponse.Customers;
if ( customersFound.Length > 0 )
{
   CustomerReviews custReviews;
   CustomerReviews[] customerReviews;
   Customer[] customerCollection;
   try
   {
      customerCollection = customersFound[0].Customer;
      customerReviews = customerCollection[0].CustomerReviews;
      custReviews = customerReviews[0];
   }
//..
 Review[] reviews = custReviews.Review;
 foreach ( Review review in reviews )
 {
    InsertReview(
       review.ASIN,
       customerLookupRequest.CustomerId,
       review.Rating,
       review.Summary,
       review.Content );

In the elided code (just before setting the Review[] array), I clear out all the reviews for this customer so as to start clean, and I also set the number of pages of reviews returned (there can be up to ten pages, each with up to ten reviews).
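
The paging pattern described here (request page 1, read the page count from that first response, then request the rest) can be sketched independently of the Amazon proxy types. In the sketch below, ReviewPage and PagedFetch are hypothetical stand-ins, and the fetchPage delegate represents whatever code issues the real request; none of these names come from the application.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: ReviewPage and PagedFetch are hypothetical names
// standing in for the Amazon proxy types.
public sealed class ReviewPage
{
    public int TotalPages;
    public List<string> Reviews = new List<string>();
}

public static class PagedFetch
{
    // Calls fetchPage(1), learns the page count from the first response,
    // then requests the remaining pages, never going beyond maxPages.
    public static List<string> FetchAll(
        Func<int, ReviewPage> fetchPage, int maxPages)
    {
        List<string> all = new List<string>();
        int totalPages = 1;
        for (int page = 1; page <= totalPages; page++)
        {
            ReviewPage result = fetchPage(page);
            if (result == null) break; // treat a failed request as the end
            if (page == 1) totalPages = Math.Min(result.TotalPages, maxPages);
            all.AddRange(result.Reviews);
        }
        return all;
    }
}
```

Passing 10 for maxPages mirrors the ten-page limit mentioned above, so a single reviewer costs at most ten gated requests.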

FindBookInfo works much the same way except that I use an ItemLookup object rather than a CustomerContentLookup object.

ItemLookup itemLookup = new ItemLookup();
ItemLookupRequest itemLookupRequest = new ItemLookupRequest();
ItemLookupResponse itemLookupResponse;
itemLookup.AssociateTag = this.associateTag;
itemLookup.SubscriptionId = this.subscriptionID;
foreach ( Book theBook in myBooks )
{
   pageNumber = 1;
   totalPages = 1;

   while ( pageNumber <= totalPages  )
   {
      itemLookupRequest.ItemId = new string[] { theBook.ASIN };
      itemLookupRequest.ReviewPage = pageNumber.ToString();
      itemLookupRequest.ResponseGroup = new string[] 
            { "Small", "Reviews", "ItemAttributes" };
      itemLookup.Request = new ItemLookupRequest[] { itemLookupRequest };
      try
      {
         OneSecond();
         itemLookupResponse = amazonService.ItemLookup( itemLookup );
      }
      catch ( Exception ex )
      {
         lblMessage.Text = ex.Message;
         break;
      }

Here I iterate through all the books I've reviewed, and for each one I make a request, setting the ItemId to the ASIN for the book. The ResponseGroup is important, as this tells the Amazon web service which information I want about the particular book.

Once I have the item response (that is, the information about each book I've reviewed) I can drill down to get the book info and insert that information into the database (see "Book Table Design Question" above).

Items[] itemsArray = itemLookupResponse.Items;
if ( itemsArray.Length > 0 )
{
   Item[] itemArray = itemsArray[0].Item;
   if ( itemArray.Length > 0 )
   {
      CustomerReviews customerReviews = itemArray[0].CustomerReviews;
      if ( customerReviews != null )
      {
         if ( pageNumber == 1 )
         {
            totalPages = Convert.ToInt32( customerReviews.TotalReviewPages );
            string url = itemArray[0].DetailPageURL == null ? 
               string.Empty : itemArray[0].DetailPageURL;
            string title = itemArray[0].ItemAttributes.Title == null ? 
               string.Empty : itemArray[0].ItemAttributes.Title;
            string author = string.Empty;
            if ( itemArray[0].ItemAttributes.Author != null &&
               itemArray[0].ItemAttributes.Author[0] != null )
            {
               author = itemArray[0].ItemAttributes.Author[0];
            }
            InsertBook( theBook.ASIN, title, url, author );
         }

That done, I can now get all the reviews for the book (that is, the reviews by other reviewers) and for each one found, I compare my rating to theirs, and update the reviewer in the allReviewers collection.

// find everyone who has reviewed this book
Review[] reviewArray = customerReviews.Review;
foreach ( Review theReview in reviewArray )
{
   string reviewerID = theReview.CustomerId;
   if ( reviewerID != myReviewerID )  // note me?
   {
      int score = 0;
      try  // make sure you have integral ratings (as you should)
      {
         int myRating = Convert.ToInt32( theBook.MyRating );
         int theirRating = Convert.ToInt32( theReview.Rating );
         score = 5 - ( System.Math.Abs( myRating - theirRating ) );
      }
      catch
      {
         continue;
      }

      try
      {
         if ( allReviewers.ContainsKey( reviewerID ) )
         {
            allReviewers[reviewerID].Score += score;
         }
         else
         {
            allReviewers.Add( reviewerID, new Reviewer( reviewerID, score ) );
         }
         this.lblInformation.Text = "Reviewers: " + allReviewers.Count.ToString();
      }
      catch ( Exception ex )
      {
         this.lblMessage.Text = "Unable to add reviewer " + reviewerID + ex.Message;
      }
   }  // end if not me
}     // end for each review

It is possible to ask for ten items (books) at a time, and that is an obvious opportunity for optimization, though it will slightly complicate the loop.
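
One piece of that optimization, splitting the ASIN list into groups of ten, is easy to sketch on its own. BatchSketch and InBatches are hypothetical names, and the sketch deliberately leaves out the harder part, which is how review paging would then work when one request covers several books.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: BatchSketch is a hypothetical name.
public static class BatchSketch
{
    // Splits items into consecutive groups of at most batchSize elements.
    public static List<string[]> InBatches(IList<string> items, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        List<string[]> batches = new List<string[]>();
        for (int start = 0; start < items.Count; start += batchSize)
        {
            int count = Math.Min(batchSize, items.Count - start);
            string[] batch = new string[count];
            for (int i = 0; i < count; i++) batch[i] = items[start + i];
            batches.Add(batch);
        }
        return batches;
    }
}
```

Each resulting array has the same shape as the string[] assigned to itemLookupRequest.ItemId in the loop above. For my 46 books this would mean 5 first-page requests instead of 46.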

Once we've found all the reviews for all the books I've reviewed, the collection allReviewers has the ID of every Amazon customer who has reviewed any books I've reviewed. We then pass that back to FindReviewers, and iterate through the list, getting all their reviews of all the books they've ever reviewed, and setting their score if I've reviewed the book as well. The end point is that I have a list of reviewers with scores for how well they match my taste, and a list of other books they've reviewed and how they rated those books. That is all I need to find books highly rated by readers who agree with my ratings of books we've both rated!
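
To make that end point concrete, here is a rough in-memory sketch of how recommendations might be picked from the gathered data. Everything in it is an assumption for illustration: the RatedReview and RecommendSketch names, the minScore and minRating thresholds, and the choice to rank books by the summed scores of the reviewers who liked them. The application itself stores this data in the database, and the article does not show the final query.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: RatedReview and RecommendSketch are hypothetical names.
public sealed class RatedReview
{
    public readonly string ReviewerId;
    public readonly string Asin;
    public readonly int Rating;

    public RatedReview(string reviewerId, string asin, int rating)
    {
        ReviewerId = reviewerId;
        Asin = asin;
        Rating = rating;
    }
}

public static class RecommendSketch
{
    // Returns ASINs I have not reviewed that were rated at least minRating
    // by reviewers whose match score is at least minScore, ordered by the
    // summed scores of those reviewers (highest first, ties by ASIN).
    public static List<string> Recommend(
        ICollection<string> myAsins,
        IDictionary<string, int> reviewerScores,
        IEnumerable<RatedReview> reviews,
        int minScore,
        int minRating)
    {
        Dictionary<string, int> weight = new Dictionary<string, int>();
        foreach (RatedReview r in reviews)
        {
            int score;
            if (myAsins.Contains(r.Asin)) continue;   // already reviewed by me
            if (r.Rating < minRating) continue;       // not rated highly
            if (!reviewerScores.TryGetValue(r.ReviewerId, out score)
                || score < minScore) continue;        // taste does not match
            int current;
            weight.TryGetValue(r.Asin, out current);
            weight[r.Asin] = current + score;
        }
        List<string> result = new List<string>(weight.Keys);
        result.Sort(delegate(string a, string b)
        {
            int byWeight = weight[b].CompareTo(weight[a]);
            return byWeight != 0 ? byWeight : string.CompareOrdinal(a, b);
        });
        return result;
    }
}
```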

The complete source for this application is available on my web site; just click on Books and then on Articles.

Jesse Liberty is a senior program manager for Microsoft Silverlight where he is responsible for the creation of tutorials, videos and other content to facilitate the learning and use of Silverlight. Jesse is well known in the industry in part because of his many bestselling books, including O'Reilly Media's Programming .NET 3.5, Programming C# 3.0, Learning ASP.NET with AJAX and the soon to be published Programming Silverlight.



Copyright © 2009 O'Reilly Media, Inc.