Amazon Hacks

Fail-Safe Amazon Images

by Paul Bausch, author of Amazon Hacks
10/22/2003

Amazon Web Services (AWS) allow anyone with some coding skills to create applications using Amazon's data. It's fairly easy to transform an AWS response into HTML and show a list of products and images on a remote site. Many ISPs put checks in place to stop "image leeching," that is, referencing an image URL on a remote server directly in an <img> tag. Amazon doesn't mind if you leech its images, though; in fact, it encourages the practice. But relying on someone else's data on someone else's servers introduces some challenges, and when you're putting together a dependent, distributed application, you need to prepare for the worst while you're planning for the best. In this article, I'll show you how to display Amazon product images reliably in your apps.

When you request information about a product through Amazon's Web Services, you get an impressive amount of data about that item. Each XML response also includes three URLs that point to small, medium, and large images of the product. (The XML tags that hold the image URLs are self-describing: ImageUrlSmall, ImageUrlMedium, and ImageUrlLarge.) With the URLs for the product images in hand, you have a couple of choices about how to use them.
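
As a rough illustration, the three image URLs can be pulled out of a response with a few lines of JavaScript. The XML fragment below is a simplified, hypothetical stand-in for a real response: only the three ImageUrl tag names and the medium URL come from this article, the wrapper element and the example.com URLs are placeholders, and the function name extractImageUrls is made up for this sketch. A regular expression is used to keep the sketch short; a real application would more likely use an XML parser.

```javascript
// Hypothetical, trimmed-down response fragment. Only the ImageUrl* tag
// names and the medium URL are taken from the article; the rest is filler.
var sampleXml =
    '<Details>' +
    '<ImageUrlSmall>http://example.com/small.jpg</ImageUrlSmall>' +
    '<ImageUrlMedium>http://images.amazon.com/images/P/0596005423.01.MZZZZZZZ.jpg</ImageUrlMedium>' +
    '<ImageUrlLarge>http://example.com/large.jpg</ImageUrlLarge>' +
    '</Details>';

// Returns an object with small, medium, and large properties.
// A property is null when the corresponding tag is not found.
function extractImageUrls(xml) {
    var urls = {};
    var sizes = ['Small', 'Medium', 'Large'];
    for (var i = 0; i < sizes.length; i++) {
        var tag = 'ImageUrl' + sizes[i];
        var match = new RegExp('<' + tag + '>([^<]*)</' + tag + '>').exec(xml);
        urls[sizes[i].toLowerCase()] = match ? match[1] : null;
    }
    return urls;
}

var urls = extractImageUrls(sampleXml);
console.log(urls.medium);
```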

To Host or Not to Host the Images?

If you want to display Amazon product images on your web site, you can save the image to your server and use a local path in the source attribute of your <img> tags, or simply use the Amazon URL in your <img> tags and let Amazon's servers do the work of displaying images.

At first glance, it seems that letting Amazon serve the product image is the best way to go: there's no extra coding to cache the image, and you save bandwidth. In practice, though, Amazon's image server can be unreachable for a variety of reasons. Keeping product images local means that you will have an image to display, whether Amazon's servers are responding or not. In my experience working with Amazon, the image server is rarely unreachable, but when it's not responding—even for just five or ten minutes at a time—a page full of broken images looks pretty bad. Another reason to consider caching product images locally is that some products don't have images. That sounds counterintuitive at first, too, but the process of caching the image gives you a chance to see if the product actually has an image.

When There Are No Images

Amazon has an incomprehensible number of products, and it's not possible for every one of them to have an image. The problem is that Amazon's API doesn't tell you which products have images and which don't. Amazon always returns the image URLs, whether the product actually has an image or not. If the product doesn't have an image associated with it, every image URL returned by the Amazon API will point to a single-pixel GIF. The GIFs are transparent, and with some designs, that works fine. But if your product images have a border, or your layout relies on the image being there for spacing, the single-pixel GIF can wreak havoc. There are a few ways to detect which products have images and which don't, and the method to use depends on whether you or Amazon is hosting them.

A Server-Side Solution

If you've decided to cache images locally, it makes sense to do the "image detection" at your server. In Amazon Hacks, Hack #84 provides code for this by checking the resulting image file's byte size. This method works well, but it requires your script to download the entire file, which can cause delays if you're working with the large image size. There's a quicker way to determine whether or not the image is there for you to display.

The transparent GIF that's returned still has the JPG extension. By all appearances, it's a valid file. But the HTTP headers don't lie. By examining the headers for a given image URL, you can find out whether it's really a GIF or JPEG you're about to download. The headers give all sorts of information about the response, but the only values we're really interested in are the Content-Type (GIF or JPEG, in this case) and the Content-Length.

For example, Amazon Hacks has an image and the image URL returned by the Amazon API for the medium image is:

http://images.amazon.com/images/P/0596005423.01.MZZZZZZZ.jpg

We can use a script to see what the relevant HTTP headers for the request are:

Content-Length: 5148
Content-Type: image/jpeg

By contrast, the book Using Email Effectively doesn't have an image. The URL returned by the Amazon API for the medium image is:

http://images.amazon.com/images/P/1565921038.01.MZZZZZZZ.jpg

And its relevant headers:

Content-Length: 807
Content-Type: image/gif

As you can see, the image type is completely different and the content length is much smaller. Using this difference as a criterion for whether or not the image "exists," you can write routines in any scripting language to do this check for you.
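
Before looking at the server-side versions, here is the decision itself written as a small JavaScript function. It is a sketch under one assumption drawn from the two examples above: real product images are served as image/jpeg and the placeholder as image/gif. The function name isProductImage and the shape of the headers object (lowercase header names as keys) are choices made for this illustration.

```javascript
// Decide whether a set of response headers describes a real product image.
// Assumption (from the two examples above): real images are image/jpeg,
// and the single-pixel placeholder is image/gif.
function isProductImage(headers) {
    var type = String(headers['content-type'] || '').toLowerCase();
    // Ignore any parameters that might follow a semicolon.
    type = type.split(';')[0].trim();
    return type === 'image/jpeg';
}

// The two header sets shown above:
console.log(isProductImage({ 'content-type': 'image/jpeg', 'content-length': '5148' })); // true
console.log(isProductImage({ 'content-type': 'image/gif', 'content-length': '807' }));   // false
```

Keeping the decision separate from the code that fetches the headers makes it easy to change the rule later, for example to compare Content-Length as well.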

In ASP

This ASP function uses the Microsoft XML parser to request the image headers. If the image's Content-Type is image/jpeg, the function returns True, otherwise False.

Function hasImage(URL_in)
    Dim xmlhttp, strCT
    Set xmlhttp = Server.CreateObject("Msxml2.ServerXMLHTTP")
    ' HEAD asks for the headers only, so the image itself is not downloaded
    xmlhttp.Open "HEAD", URL_in, False
    xmlhttp.Send
    strCT = xmlhttp.getResponseHeader("Content-Type")

    ' A real product image is a JPEG; the placeholder is a GIF
    hasImage = (strCT = "image/jpeg")
    Set xmlhttp = Nothing
End Function

In PHP

Here's a variation on the theme for PHP. You'll need a package that supports fetching HTTP headers, like the PEAR HTTP_Request class this function uses.

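require_once 'HTTP/Request.php';

function hasImage($URL_in) {
    $r = new HTTP_Request($URL_in);
    // Ask for the headers only, not the image itself
    $r->setMethod(HTTP_REQUEST_METHOD_HEAD);
    // sendRequest() returns a PEAR_Error object if the request fails
    if (PEAR::isError($r->sendRequest())) {
        return false;
    }
    // A real product image is a JPEG; the placeholder is a GIF
    return $r->getResponseHeader("Content-Type") == "image/jpeg";
}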

In Perl

Here's the same routine in Perl. You'll need a module that supports HTTP requests, and this example uses the convenient LWP::Simple.

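use strict;
use warnings;
use LWP::Simple;

sub hasImage {
    my ($URL_in) = @_;
    # In scalar context, head() returns a true value (the response
    # object) on success and a false value if the request fails
    my $response = head($URL_in);
    return 0 unless $response;
    # Perl has no true/false keywords, so return 1 or 0
    return $response->content_type eq "image/jpeg" ? 1 : 0;
}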

With these functions, you can check to see if an image exists and take the appropriate action: display it if it's there, replace it with a generic "no image available" graphic if not. Using these methods makes the most sense if you're caching images locally. You wouldn't want to examine the HTTP headers for every image you want to display every time someone requests a page on your server; all of this "pre-processing" would slow down your application. Also keep in mind that you can't cache images indefinitely — Amazon's terms of service require you to refresh any cached images every 24 hours. To get a jumpstart on caching Amazon images, refer to Hack #93 in Amazon Hacks, "Cache Amazon Images Locally."
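
A cache along those lines needs two small decisions: when to fetch an image again, and what to show when a product has no image. The sketch below is one way to express them in JavaScript. The record fields (localPath, hasImage), the function names, and the use of millisecond timestamps are all hypothetical; only the 24-hour limit comes from the text above.

```javascript
// 24 hours in milliseconds, the refresh limit mentioned above.
var REFRESH_INTERVAL_MS = 24 * 60 * 60 * 1000;

// Returns true when there is no cached copy yet, or when the cached
// copy is at least 24 hours old. Both arguments are millisecond timestamps.
function needsRefresh(cachedAtMs, nowMs) {
    if (cachedAtMs === null || cachedAtMs === undefined) {
        return true;
    }
    return (nowMs - cachedAtMs) >= REFRESH_INTERVAL_MS;
}

// Picks the path to put in the <img> tag: the cached product image if the
// product has one, otherwise the generic "no image available" graphic.
// "entry" is a hypothetical cache record: { localPath: ..., hasImage: ... }.
function imageToDisplay(entry, fallbackPath) {
    return (entry && entry.hasImage) ? entry.localPath : fallbackPath;
}

console.log(needsRefresh(null, Date.now())); // true: nothing cached yet
console.log(imageToDisplay({ localPath: 'cache/0596005423.jpg', hasImage: true }, 'book_noimage.gif'));
```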

And this still doesn't solve the problem of image availability. If the Amazon image server isn't up, there aren't any HTTP headers to look at, anyway. But there is another way to work around products without images — even if you're letting Amazon do the work of serving them.

A Client-Side Solution

By examining the resulting HTML document after it's loaded, you can tell which products have images and which don't. JavaScript has access to web pages and all of their elements, so you can create a script that resides in the page to verify that all of the images are valid. Instead of examining HTTP headers to find the single-pixel GIFs, you'll look at image height and width. Yep, you guessed it: single-pixel GIFs are only one pixel high and one pixel wide. Once you've found one, you can replace it with a suitable "no product image" image.

Using the document.images collection, you can loop through every image on a web page and get its attributes. The attributes we're most interested in are width, height, and source. The source is important because you only want to process images being served by Amazon. If the page has local single-pixel GIFs for design purposes, you wouldn't want to replace those with a "no product image" image. Here's a bit of JavaScript that loops through all of the images on a page, checks the height and width of those being served by Amazon, and swaps in the replacement graphic where needed:

for (var i = 0; i < document.images.length; i++) {
    var img = document.images[i];
    if (img.src.indexOf('images.amazon.com') >= 0) {
        var w = img.width;
        var h = img.height;
        if ((w == 1) || (h == 1)) {
            img.src = 'book_noimage.gif';
        }
    }
}

Take a look at the last if inside the loop. If the image width or height equals 1, you know you have a product without an image, and the image source is replaced with your generic "no product image" graphic.

While we're at it, why not tackle the problem of a non-responsive Amazon image server? Another image property JavaScript has access to is complete. If complete is true, the image has finished loading and your visitor can see it on the page. If complete is false, we can assume our visitor will soon be looking at a broken image. (If the browser doesn't provide complete at all, the check is skipped.) By adding a few lines to handle this, you can replace non-responsive images with your "no product image" graphic as well:

if ((img.complete != null) && (!img.complete)) {
    img.src = 'book_noimage.gif';
}

After putting it all together into a JavaScript function, you can add this to any page that has Amazon images to keep things looking good:

<script type="text/javascript">
function verify_images() {
    for (var i = 0; i < document.images.length; i++) {
        var img = document.images[i];
        // Only check images served by Amazon
        if (img.src.indexOf('images.amazon.com') >= 0) {
            var w = img.width;
            var h = img.height;
            if ((w == 1) || (h == 1)) {
                // Single-pixel placeholder: the product has no image
                img.src = 'book_noimage.gif';
            } else if ((img.complete != null) && (!img.complete)) {
                // The image never finished loading
                img.src = 'book_noimage.gif';
            }
        }
    }
}
</script>
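
One possible refinement, offered as a sketch rather than as part of the script above, is to pull the decision out of the DOM loop so it can be tried without a browser. The function name replacementFor, the FALLBACK_SRC variable, and the plain objects standing in for <img> elements (including their made-up dimensions) are assumptions for illustration; the checks themselves mirror the ones in verify_images().

```javascript
var FALLBACK_SRC = 'book_noimage.gif';

// Returns the fallback path if the image should be replaced,
// or null if the image can be left alone.
function replacementFor(img) {
    // Only consider images served by Amazon.
    if (img.src.indexOf('images.amazon.com') < 0) {
        return null;
    }
    // Single-pixel placeholder: the product has no image.
    if (img.width == 1 || img.height == 1) {
        return FALLBACK_SRC;
    }
    // The browser reports that the image has not finished loading.
    if (img.complete != null && !img.complete) {
        return FALLBACK_SRC;
    }
    return null;
}

// Plain objects standing in for <img> elements (dimensions are made up):
var noImage = { src: 'http://images.amazon.com/images/P/1565921038.01.MZZZZZZZ.jpg', width: 1, height: 1, complete: true };
var realImage = { src: 'http://images.amazon.com/images/P/0596005423.01.MZZZZZZZ.jpg', width: 100, height: 150, complete: true };
var localSpacer = { src: 'spacer.gif', width: 1, height: 1, complete: true };

console.log(replacementFor(noImage));     // book_noimage.gif
console.log(replacementFor(realImage));   // null
console.log(replacementFor(localSpacer)); // null
```

Inside the loop, the browser version would then call replacementFor(document.images[i]) and assign the result to src whenever it is not null.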

Make sure this function runs after the entire page has loaded by calling it from the onload attribute of the <body> tag, like so:

<body onload="verify_images();">

Don't forget to create a local graphic called book_noimage.gif for the script to substitute for missing or unresponsive images.

Wherever you decide to apply these checks, they should help ensure that your dependent application looks good no matter what the conditions are.

Paul Bausch is a co-creator of the weblog software Blogger, maintains a directory of Oregon-based weblogs at ORblogs.com, and is the author of the forthcoming Yahoo! Hacks.


O'Reilly & Associates recently released (August 2003) Amazon Hacks.

