Honeyd Research: Honeypots Against Spam

Honeyd can be used effectively to fight spam. Since June 2003, Honeyd has been deployed to instrument several networks with spam traps, which lets us observe how spammers detect open mail relays and attempt to abuse them. The diagram below shows the overall architecture of the system.

The networks are instrumented with open relays and open proxies. We intercept all spam email and analyze why we received it. A single Honeyd machine is capable of simultaneously instrumenting several class C networks. It simulates machines running mail servers, proxies and web servers. Captured email is sent to a collaborative spam filter that allows other users to avoid reading known spam.

Curiously, this setup has also been very successful in identifying hosts infected with worms.

Our findings will be made available as a research paper in the near future. For questions, please contact Niels Provos.

Operating System Distribution and Spam Frequency
An interesting question for understanding how spammers operate is which operating systems they use.

Operating System Distribution Across Spammers

Using the support for passive fingerprinting in Honeyd 0.7, it is possible to identify the operating system that opens a connection to our spam traps. For each such connection, we try to identify the remote operating system from its TCP SYN segment. To determine the distribution of operating systems used to send spam, we count the number of times that each operating system connects to one of the spam trap systems and attempts to relay spam email.
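
The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the study: the record layout (a source address paired with a fingerprint result, or None when fingerprinting failed), the function name os_distribution and the sample data are all assumptions made for the example.

```python
from collections import Counter

def os_distribution(connections):
    """Return {os_name: percent of connections}.

    `connections` is an iterable of (source_ip, os_name) pairs, a
    hypothetical record layout. os_name is None when the passive
    fingerprint did not match; those are counted as "unknown".
    """
    counts = Counter(os_name or "unknown" for _src_ip, os_name in connections)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders the result from most to least frequent.
    return {name: 100.0 * n / total for name, n in counts.most_common()}

# Made-up sample records (documentation address ranges), not measured data.
sample = [
    ("192.0.2.10", "Linux"),
    ("192.0.2.11", None),
    ("198.51.100.7", "Linux"),
    ("203.0.113.5", "Solaris"),
]
dist = os_distribution(sample)
for name, pct in dist.items():
    print(f"{name:10s} {pct:5.1f}%")
```

Counting connections rather than distinct hosts matches the description above; a single busy host therefore weighs more heavily than a host that connects once.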

Even though we cannot identify the operating system for 53% of the connections, Linux accounts for at least 43% of all spam connections. That leaves at most about 4% for Solaris, Windows and FreeBSD combined, so each of them is used infrequently.

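In summary, among the connections we could identify, most machines that submit spam run Linux or, far less often, Solaris, whether they are operated by the spammers themselves or have been compromised. It seems that Unix is the favorite operating system flavor used to send spam.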

When looking at the number of spam emails intercepted by the honeypots, we see a noticeable increase in spam email in October.

There are several explanations for this. Spammers have become more aggressive in probing for open mail relays, and some of the honeypots have been published in MX records for mail domains.

The number of IP addresses submitting spam has also increased from month to month.
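
Both trends, the number of intercepted emails and the number of distinct submitting IP addresses per month, can be computed from the same records. The following Python sketch assumes each intercepted message is available as a (date received, source address) pair; that layout, the function name monthly_stats and the sample values are hypothetical and do not reflect real measurements.

```python
from collections import defaultdict
from datetime import date

def monthly_stats(messages):
    """Return {"YYYY-MM": (email_count, distinct_ip_count)}, sorted by month.

    `messages` is an iterable of (received_date, source_ip) pairs,
    a hypothetical record layout chosen for this example.
    """
    emails = defaultdict(int)
    senders = defaultdict(set)
    for received, src_ip in messages:
        month = received.strftime("%Y-%m")
        emails[month] += 1          # every intercepted message counts
        senders[month].add(src_ip)  # each address counts once per month
    return {m: (emails[m], len(senders[m])) for m in sorted(emails)}

# Made-up sample records, not measured data.
sample_messages = [
    (date(2003, 9, 3), "192.0.2.10"),
    (date(2003, 9, 17), "192.0.2.10"),
    (date(2003, 10, 1), "192.0.2.10"),
    (date(2003, 10, 2), "198.51.100.7"),
    (date(2003, 10, 9), "203.0.113.5"),
]
stats = monthly_stats(sample_messages)
for month, (n_emails, n_ips) in stats.items():
    print(month, n_emails, n_ips)
```

Keeping the two counts separate shows whether a rise in spam comes from more senders or from the same senders submitting more mail.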

Support

If you have suggestions on how to improve catching spam or would like to make resources available, please let me know.