In addition to the tricks discussed in the ActiveState Field Guide To Spam, spammers have already started foiling the filters by throwing in random real words. I regularly get spam through two levels of filtering (SpamAssassin and Eudora) that looks like this:
Our rates are the lowest! You can get 3.45% fixed for
rough pencil final happy
30-years! Follow this link to get the best rates
napkins canine amazed
in the country, but only for a limited time!
The extra random non-spam text foils the filters. And since the words are random, tactics that rely on a checksum or signature of the message are, or will be, useless. I suspect it won't be long before spam comes through with three lines of spam content and a couple K of random words. If we get to where clearly random words are somehow caught, the spammers will turn to pulling random pages off the net for their obscuring text. Maybe they'll throw in, say, a few pages of Macbeth to foil things.
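To make the checksum point concrete, here is a minimal sketch in Python. It assumes a filter that fingerprints the whole message body with SHA-256, which is an illustrative stand-in and not how SpamAssassin or Eudora actually work. The filler word list and the function names are hypothetical; the payload lines are the ones from the sample above.

```python
import hashlib
import random

# The fixed payload, taken from the sample message above.
SPAM_LINES = [
    "Our rates are the lowest! You can get 3.45% fixed for",
    "30-years! Follow this link to get the best rates",
    "in the country, but only for a limited time!",
]

# Hypothetical filler vocabulary; a real spammer would draw on a full dictionary.
FILLER = ["rough", "pencil", "final", "happy", "napkins", "canine", "amazed"]


def make_variant(rng):
    """Interleave the fixed payload with lines of randomly chosen filler words."""
    out = []
    for i, line in enumerate(SPAM_LINES):
        out.append(line)
        if i < len(SPAM_LINES) - 1:
            out.append(" ".join(rng.sample(FILLER, 3)))
    return "\n".join(out)


def signature(body):
    """Whole-body fingerprint, standing in for a checksum-based filter."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Same payload every time, but the fingerprints do not line up.
    for seed in range(3):
        body = make_variant(random.Random(seed))
        print(signature(body)[:16], repr(body.splitlines()[1]))
```

Every variant carries the identical sales pitch, yet a single changed filler word is enough to produce a different fingerprint, so a list of known-bad checksums never catches the next copy.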
The answer is to stop the spammers before they get their message in. All content-based filtering depends on the spammer getting their payload to us first, instead of our checking them at the gate. That will mean replacing SMTP. Until then, SPF seems to have potential, but it has its drawbacks.
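As a rough illustration of what checking at the gate looks like with SPF, the sketch below tests a connecting IP address against a domain's published SPF record before any message content is examined. It is heavily simplified: it handles only the `ip4` and `all` mechanisms, skips the DNS lookup that would fetch the record, and ignores `a`, `mx`, `include`, `redirect`, and the rest of the specification. The function name and the sample record (using documentation-only address ranges) are assumptions for the example.

```python
import ipaddress

# Map SPF qualifier characters to their result names.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_ip4_spf(record, sender_ip):
    """Evaluate only the ip4 and all mechanisms of an SPF record.

    Returns "none" if the text is not an SPF record, otherwise the result
    of the first matching mechanism, or "neutral" if nothing matches.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"

    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # A mechanism with no qualifier means "pass".
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]

        lowered = term.lower()
        if lowered.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        elif lowered == "all":
            return QUALIFIERS[qualifier]
        # Every other mechanism or modifier is skipped in this sketch.

    return "neutral"


if __name__ == "__main__":
    # Hypothetical record: only hosts in 192.0.2.0/24 may send for the domain.
    record = "v=spf1 ip4:192.0.2.0/24 -all"
    print(check_ip4_spf(record, "192.0.2.10"))
    print(check_ip4_spf(record, "198.51.100.7"))
```

The decision here depends only on where the connection comes from, so random filler words in the body make no difference to it.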
Mind you, I'm not throwing away my SpamAssassin install. It helps stop a significant amount of the spam. Unfortunately, content-based filtering is a Band-Aid on the real problem.
Andy Lester is a QA & Release Manager for Socialtext. He is also in charge of PR for The Perl Foundation and maintains over 25 modules on CPAN.
oreillynet.com Copyright © 2006 O'Reilly Media, Inc.