Why Try to Out-Google Google?

How Google Could Out-Step Itself (Or How Other Search Engines Could Go Beyond What They're Offering Now)

On one hand, looking at the potential of Internet search is frustrating because of the limiting factors that aren't in your control. If only XML were widely adopted. If only everybody used title tags. If only domain names were more descriptive. Etcetera, etcetera. But on the other hand, other technology is developing that does make more powerful and more extensive searching and crawling possible. Google could expand what they've got and become even cooler than they already are. How? Here are five ways, off the top of my head.



  1. RSS feeds of all its properties.

    The RSS format is a very handy way to read a lot of different Web sites without spending a lot of time waiting for pages to load. I'd sure love an RSS feed of Google News searches on the keyword of my choice.

    It's weird about RSS. Lesser-known search engines like Daypop are making great use of it. But none of the major general search engines are. Why? If the concern is losing ad revenue, why not include an ad in the RSS feed?

  2. A customizable "all-in-one" search.

    I know, I know; an all-in-one search is veering dangerously close to Portalville. But I think in this case, it's warranted. Google has so many properties that would provide complementary resources -- Froogle and Google Catalogs, for example, or Google's web search paired with results from Google News. I'd love an interface where I could say, "Give me the results from this query and use Google News, Google Web Search, and Google Blogs (that last one is only if the rumors floating around are true), and then present all the results on one page." Can't Google (or any other search engine, really) aggregate its own search sources without it being portalitis?

  3. Expand its API to other Google properties.

    Last summer will forever be burned into my mind as the summer that I ate, drank, slept, and breathed the Google API to write Google Hacks. Even now a small part of my brain patiently grinds away, coming up with neat things to do with Google and the Google API (and until this part of my brain gives up and goes away, you can see its results at www.buzztoolbox.com/google/).

    But even though I rapidly discovered the tons of possibilities enabled by Google's API (and I'm very grateful to them for releasing it), I just as quickly found the limitations. No access to Google News or Google Images or most of the other collections. Only 1,000 queries per day, with only 10 results per query. Not all the special syntaxes (such as the phonebook: syntax) work. The API would be even more exciting if access to the other Google properties were available through it.

  4. Reach out to information-collection publishers.

    Google is often reluctant to discuss how the guts of its indexing technology work, and I can't say that I blame them. If too much were understood about how Google indexed and ranked its contents, people would spend too much time playing "Let's try to fool Google," instead of "Let's try to fill Google up with excellent content."

    The downside of this is that there's a group of content publishers who are caught in the middle. I'm referring to librarians and other information professionals who are often in charge of putting large collections of information online. Usually, those kinds of content publishers have far better things to do than spend extensive amounts of time trying to make sure their content gets indexed. This is a pity because it is exactly these kinds of information collections (extensive, unique, often not available anywhere else online) that are so valuable to search engines.

    It would be great if Google (or any other search engine) took some of its resources and made an effort to reach out to those groups (librarians, information professionals, government officials, and so forth) who are regularly publishing large information collections, and assist them in getting their content indexed as regularly and completely as possible. How should they be using title tags? Is their database-driven site restricting their chances of being indexed? How can they use Google tools to offer search on their own sites, perhaps with some specialty forms for sub-collection searching?

  5. Pay attention to successful uses of its API.

    I saved this one for last because I suspect that Google's already doing it, but it wouldn't hurt to mention it. It would be great if Google looked around at what folks are doing with the Google API and incorporated the best of those ideas into their offerings. No, I don't think they ought to have a "Goocookin'" section on their site, but what about Google Alert at www.googlealert.com? That site must have thousands of users. Isn't that an audience that Google wants to cultivate?

Like I said, I suspect Google is already doing this. I wonder if the other search engines are watching what's being created with the Google API and coming up with some ideas of their own?
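To put the API limits from item 3 in concrete terms, here is a minimal Python sketch of the arithmetic behind them. Only the two numbers (1,000 queries per day, 10 results per query) come from the text above; the function names and the zero-based offset scheme are hypothetical, chosen for illustration, and nothing here calls the real Google API.

```python
import math

# The two limits quoted in item 3. Everything else below is a
# hypothetical helper, not part of any Google library.
DAILY_QUERY_LIMIT = 1000
RESULTS_PER_QUERY = 10


def queries_needed(results_wanted: int) -> int:
    """Number of API calls needed to page through `results_wanted` results."""
    if results_wanted < 0:
        raise ValueError("results_wanted must be non-negative")
    return math.ceil(results_wanted / RESULTS_PER_QUERY)


def searches_per_day(results_wanted: int) -> int:
    """How many separate searches of that depth fit in one day's quota."""
    calls = queries_needed(results_wanted)
    if calls == 0:
        raise ValueError("results_wanted must be at least 1")
    return DAILY_QUERY_LIMIT // calls


def start_offsets(results_wanted: int) -> list[int]:
    """Assumed zero-based starting offset for each call, one page at a time."""
    return list(range(0, results_wanted, RESULTS_PER_QUERY))


if __name__ == "__main__":
    # Hard ceiling on results retrievable in a day under the quoted limits.
    print(DAILY_QUERY_LIMIT * RESULTS_PER_QUERY)
    # Cost of going 100 results deep on a single search term.
    print(queries_needed(100), searches_per_day(100))
```

Under these assumptions, going 100 results deep costs 10 calls per search, so a tool that monitors many keywords runs out of quota after about 100 of them. That is the kind of ceiling I kept bumping into.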

Trying to "out-Google Google" is a bad idea in the fast-moving timestream of the Internet. By the time you think you've out-Googled Google, they've out-Googled themselves and you're still behind. But if search engines took a close look at the cultural factors that led to some of Google's success, and then considered how they could leapfrog what Google's doing now -- we'd have some search engine competition that I'd really look forward to!

Tara Calishain is the creator of the site ResearchBuzz. She is an expert on Internet search engines and how they can be used effectively in business situations.


O'Reilly & Associates recently released (February 2003) Google Hacks.

