Apache James (Java Apache Mail Enterprise Server) is a 100% pure-Java SMTP and POP3 Mail server that performs double duty as a completely configurable open source mail application platform.

James is really cool. If you’re a Java hacker like me, the whole idea of having a powerful, open-source email server that you can endlessly modify and extend is awesome. James has been around for a while and is already past release 2.0, so it’s pretty solid as well.

Recently, a discussion broke out on the james-user list about extending James so that it could act as a ‘spam honeypot’. The idea is to write a James-based application that waits for incoming spam and captures real-time statistics on the spam it receives.

One of the ideas generated was to use the application to dynamically create new spam filters as you discover new sources for spam.

Another subscriber recommended just working with the folks at Spamhaus instead. These people have tracking spam sources from around the globe down to a science.

For example, Spamhaus discovered that a spammer has been running a ‘dictionary’-type attack on Hotmail for over five months. According to them, this person has been testing email address combinations at a rate of 3-4 per second, around the clock, seven days a week. They were able to track the offenders down to a series of email servers based in Beijing, China that they believe are owned by American spammers.

So how would developing a spam honeypot with Apache James help? Well, to begin with, it would allow the spam filters on that James installation to be updated dynamically.

And if a group of people were to collaborate on a James mail-app that could capture spam and update a central, shared database, then it might be possible for *all* servers running the mail-app to notify each other (through the shared database) whenever a new source of spam was found.
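
To make the shared-database idea a bit more concrete, here is a minimal sketch in plain Java. Everything in it is my own assumption for illustration: the class name SharedSpamRegistry, the report and isBlocked methods, and the "block after N reports" rule are all hypothetical, and an in-memory map stands in for the real central database. It does not use the James API at all.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for the central, shared database.
// A real deployment would back this with a database that every server can reach.
final class SharedSpamRegistry {
    // Number of spam reports received so far, keyed by the sender's IP address.
    private final Map<String, AtomicInteger> reportsBySource = new ConcurrentHashMap<>();
    // Assumed policy: a source is blocked once this many reports have arrived.
    private final int threshold;

    SharedSpamRegistry(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
    }

    // Called by a honeypot each time it traps a message from sourceIp.
    // Returns the total number of reports now recorded for that source.
    int report(String sourceIp) {
        return reportsBySource
                .computeIfAbsent(sourceIp, key -> new AtomicInteger())
                .incrementAndGet();
    }

    // Called by any participating server before it accepts mail from sourceIp.
    boolean isBlocked(String sourceIp) {
        AtomicInteger count = reportsBySource.get(sourceIp);
        return count != null && count.get() >= threshold;
    }
}
```

With a threshold of 2, a source would be blocked everywhere as soon as two honeypots (or the same one twice) had reported it. The threshold is only there so that a single stray report doesn't get anyone blacklisted; how to tune it is an open question.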

Of course, there are already DNS-based real-time blacklists (RBLs) of spam senders, but using this approach you could filter spam using much more than just a DNS lookup on the sender’s IP address. You could perform all kinds of analysis on the content of the spam as well.
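
For comparison, this is roughly all a DNS-based blacklist check amounts to: reverse the octets of the sender’s IPv4 address, append the list’s zone, and see whether the resulting name resolves. The sketch below is plain Java using only java.net.InetAddress; the class name DnsblCheck and the zone dnsbl.example.org used in the usage note are placeholders of mine, not a real list.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class DnsblCheck {
    // Builds the DNS name a blacklist expects for an IPv4 address:
    // the four octets in reverse order, followed by the list's zone.
    static String queryName(String ipv4, String zone) {
        String[] octets = ipv4.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("expected a dotted IPv4 address: " + ipv4);
        }
        return octets[3] + "." + octets[2] + "." + octets[1] + "." + octets[0] + "." + zone;
    }

    // Treats the address as listed if the query name resolves to anything.
    // Note that a DNS outage also ends up here as "not listed".
    static boolean isListed(String ipv4, String zone) {
        try {
            InetAddress.getByName(queryName(ipv4, zone));
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

So DnsblCheck.queryName("127.0.0.2", "dnsbl.example.org") gives the name 2.0.0.127.dnsbl.example.org, and isListed does the actual lookup. Notice that the only input is the sender’s address; nothing about the message itself is ever examined, which is exactly the gap a honeypot could fill.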

Sort of grid-computing’s answer to spam trapping…