Honeyd can be used effectively to battle spam. Since June 2003,
Honeyd has been deployed to instrument several networks with spam
traps. This lets us observe how spammers detect open mail relays and open proxies.
The diagram on the right shows the overall architecture of the system.
The networks are instrumented with open relays and open proxies. We
intercept all spam email and analyze why we received it. A single
Honeyd machine can simultaneously instrument several class C (/24)
networks. It simulates machines running mail servers, proxies
and web servers. Captured email is sent to a collaborative spam
filter that allows other users to avoid reading known spam.
Curiously, this setup has also been very successful in identifying
hosts infected with worms.
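
To make the open relay trap idea concrete, here is a minimal Python sketch of a fake mail server conversation that accepts every recipient and stores the message instead of forwarding it. This is not how Honeyd implements its simulated services. The class name `TrapSession`, the reply strings, and the host name `mail.example.org` are all assumptions made for illustration, and details such as SMTP dot-stuffing are deliberately ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TrapSession:
    """One SMTP conversation with a fake open relay (hypothetical helper)."""
    peer_ip: str
    sender: Optional[str] = None
    recipients: List[str] = field(default_factory=list)
    body: List[str] = field(default_factory=list)
    in_data: bool = False
    captured: List[dict] = field(default_factory=list)

    def handle(self, line: str) -> Optional[str]:
        """Consume one client line and return the server reply, if any."""
        if self.in_data:
            if line == ".":
                # End of message: keep it for analysis, never forward it.
                self.captured.append({
                    "peer_ip": self.peer_ip,
                    "sender": self.sender,
                    "recipients": list(self.recipients),
                    "body": "\n".join(self.body),
                })
                self.sender, self.recipients, self.body = None, [], []
                self.in_data = False
                return "250 OK: queued"
            self.body.append(line)
            return None
        verb = line.upper()
        if verb.startswith(("HELO", "EHLO")):
            return "250 mail.example.org"  # assumed host name
        if verb.startswith("MAIL FROM:"):
            self.sender = line[len("MAIL FROM:"):].strip()
            return "250 OK"
        if verb.startswith("RCPT TO:"):
            # Accepting any recipient is what makes the host look like an open relay.
            self.recipients.append(line[len("RCPT TO:"):].strip())
            return "250 OK"
        if verb == "DATA":
            self.in_data = True
            return "354 End data with <CR><LF>.<CR><LF>"
        if verb == "QUIT":
            return "221 Bye"
        return "502 Command not implemented"
```

Each captured entry records the connecting IP address alongside the message, which is the information needed later to analyze why the spam was received and where it came from.
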
Our findings will be made available as a research paper in the
near future. For questions, please contact Niels Provos.
Honeyd Spam Research Overview
Operating System Distribution and Spam Frequency
An interesting question for understanding how spammers operate
is which operating systems they use.
Using the support for passive fingerprinting in Honeyd 0.7, it is
possible to identify the operating system that opens a connection to
our spam traps. For each such connection, we try to identify the
remote operating system from its TCP SYN segment. To determine the
distribution of operating systems used to send spam, we count the
number of times that an operating system connects to one of the spam
trap systems and attempts to relay spam email.
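
The counting step described above can be sketched in Python as follows. The record layout `(source_ip, os_guess, attempted_relay)` and the function name `os_distribution` are assumptions for illustration; they do not reflect Honeyd's actual log format. A failed fingerprint is represented by `None` and counted as `"unknown"`.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record: (source_ip, os_guess, attempted_relay).
# os_guess is None when passive fingerprinting could not identify the system.
Record = Tuple[str, Optional[str], bool]


def os_distribution(records: Iterable[Record]) -> Dict[str, float]:
    """Return the percentage of spam-relaying connections per operating system."""
    counts: Counter = Counter()
    for _ip, os_guess, attempted_relay in records:
        if not attempted_relay:
            continue  # only connections that tried to relay spam are counted
        counts[os_guess or "unknown"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

Note that this counts connections, not distinct hosts, so one busy machine can dominate the distribution. That matches the per-connection counting described in the text.
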
Even though we cannot identify the operating system for 53% of the
connections, Linux is used for at least 43% of all spam
connections. Solaris, Windows and FreeBSD are used infrequently.
In summary, most of the machines we can identify that submit spam,
whether operated by spammers or compromised by them, run
either Linux or Solaris. It seems
that Unix is the favorite operating system flavor for sending spam.
Operating System Distribution Across Spammers
When looking at the number of spam emails intercepted by the
honeypots, we see a noticeable increase in spam email in
This can be explained by several factors. Spammers have
become more aggressive in probing for open mail relays,
and some of the honeypots have been published in MX records
for mail domains.
The number of IP addresses submitting spam
has also increased over the months.
Number of spam emails and IP addresses
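
The two monthly series discussed above, spam emails received and distinct submitting IP addresses, could be derived from captured mail roughly as follows. This is a sketch under assumptions: the input is taken to be an iterable of `(received_date, source_ip)` pairs, and the function name `monthly_totals` is made up for illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def monthly_totals(emails: Iterable[Tuple[date, str]]) -> Dict[str, Tuple[int, int]]:
    """Map 'YYYY-MM' to (number of spam emails, number of distinct source IPs)."""
    counts = defaultdict(int)
    sources = defaultdict(set)
    for received, ip in emails:
        month = received.strftime("%Y-%m")
        counts[month] += 1       # every captured email counts
        sources[month].add(ip)   # each IP counts once per month
    return {m: (counts[m], len(sources[m])) for m in sorted(counts)}
```

Keeping the two numbers side by side makes it possible to tell whether a rise in spam comes from more senders or from the same senders submitting more mail.
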
If you have suggestions on how to improve catching spam or would like
to make resources available, please let me know.