Spam prevention with Exim and greylistd - Part 1

Traditional methods of spam protection involve using Bayesian detection rules (usually via SpamAssassin) on messages after they have been accepted by your server. Most mail sysadmins will have encountered the constant cries from their users asking "can't you stop them sending it?". Of course you can't stop somebody sending a message, but you can refuse to accept it in the first place. Enter greylisting.

These two articles are follow-ons to my previous article on spam prevention in Exim mail servers. Think of them as an appendix. If you are starting from scratch you might find it useful to go and read that first.

What is greylisting?

Traditional anti-spam techniques employ the concepts of blacklists and whitelists. The former is a list of senders that the server will always refuse to accept mail from; the latter is the opposite. Both lists are generally managed manually at user or sysadmin level.

Greylisting inserts a third list which is more temporary in nature. When a sender attempts to deliver a message to the server, they are temporarily refused with a request to try again later. The sender is then added to the greylist. The theory is that legitimate mail servers will try again after a set period of time. Spammers tend to use their own servers or have one installed as part of a botnet. These are more "hit and run" delivery agents and are unlikely to attempt delivery to the same recipient again. Greylisting as a concept has been bounced around for a while but most modern implementations follow Evan Harris' white paper on the subject.

The greylist method I am concentrating on here (greylistd) also has two time-out settings. Any sender who attempts a retry too soon will still be refused; any sender who does not attempt a retry within the second period will be moved from the greylist to a blacklist. Any sender that retries successfully will be added to a whitelist and on their next delivery will not be refused at all (by greylistd). Figure 1 includes a diagrammatic summary of the process.
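
To make the timing rules above a little more concrete, here is a rough Python sketch. It is a toy model of the process as I have described it, not greylistd's own code: the Greylist class, its check method and the RETRY_MAX value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

RETRY_MIN = 300        # seconds a new sender must wait before a retry is accepted
RETRY_MAX = 8 * 3600   # seconds after which an unretried entry lapses (assumed value)


@dataclass
class Greylist:
    """Toy in-memory model of the three lists; not how greylistd stores its data."""
    grey: dict = field(default_factory=dict)    # key -> time of first attempt
    white: set = field(default_factory=set)
    black: set = field(default_factory=set)

    def check(self, key, now):
        """Return 'accept', 'defer' or 'reject' for a delivery attempt at time `now`."""
        if key in self.white:
            return "accept"
        if key in self.black:
            return "reject"
        first_seen = self.grey.get(key)
        if first_seen is None:
            # First contact: remember it and ask the sender to try again later.
            self.grey[key] = now
            return "defer"
        waited = now - first_seen
        if waited < RETRY_MIN:
            # Retried too soon: still refused, still on the greylist.
            return "defer"
        del self.grey[key]
        if waited > RETRY_MAX:
            # No retry within the second period: moved to the blacklist.
            self.black.add(key)
            return "reject"
        # Retried within the window: whitelisted from now on.
        self.white.add(key)
        return "accept"
```

Calling check with the same key at 0, 60 and 400 seconds walks through the defer, defer, accept sequence. In this sketch the move to the blacklist only happens when the late retry finally arrives, which keeps the example short; a real daemon would tidy its lists on its own schedule.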

Note that there is another form of spam prevention which is sometimes called greylisting. This involves a challenge-response process where new senders (e-mail address only in this case) are sent an autoreply asking them to either go to a web page or reply to the autoresponse in order to confirm they are a real sender. I am not talking about that here, nor do I advocate it: it does not allow for spoofing and in that context is little better than those "You sent us some spam!" warning/advertisement autoresponders.

Figure 1: A typical greylisting process

A word about "senders"

When I say "senders" here I am not referring to just the sending e-mail address. Spammers have long used e-mail spoofing to disguise themselves and will rarely send "from" the same address twice. Thus any filtering attempts made on the sending address only will ultimately give false negatives. Modern methods of identifying a sender will link the sending host and address. Greylistd maintains its lists using sender triplets of sending host+sender address+recipient address. This adds a further dimension: just because a sender/host combo has successfully delivered to one of your local users, it does not necessarily mean you should allow them to send freely to all your users. If you permitted this then all a spammer would need to do is make sure they had sent successfully to one local address.
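
A short Python sketch shows why the triplet matters. The Triplet type and the example addresses are made up for illustration and are not taken from greylistd itself.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """Hypothetical key type: one entry per host + sender + recipient combination."""
    host: str        # IP address of the connecting server
    sender: str      # envelope sender address
    recipient: str   # envelope recipient address


# One combination that has already retried successfully.
whitelist = {Triplet("192.0.2.10", "alice@example.org", "bob@yourdomain.com")}

same = Triplet("192.0.2.10", "alice@example.org", "bob@yourdomain.com")
other_recipient = Triplet("192.0.2.10", "alice@example.org", "carol@yourdomain.com")
spoofed_host = Triplet("203.0.113.7", "alice@example.org", "bob@yourdomain.com")

for candidate in (same, other_recipient, spoofed_host):
    status = "whitelisted" if candidate in whitelist else "greylisted"
    print(candidate, status)
```

Only the exact combination is let straight through. The same sender writing to a different local user, or the same address arriving from a different host, starts again on the greylist.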

Setting it up

I'll assume you have Exim installed and running by now and also note that I am referring to a Debian (Etch) system running Exim 4 here to keep it consistent with previous articles. If you use Postfix then take a look at the postgrey package. Here's the process for greylisting with Exim 4 on Debian:

  1. Back up your Exim configuration. Copy your Exim config files somewhere safe (cp -R /etc/exim4 /etc/exim4.pre-greylist will do the trick on a Debian system). Don't skip this step no matter how experienced you are.

  2. Install greylistd. Use your preferred package manager (apt-get install greylistd in my case).

  3. Configure greylistd. This is, unsurprisingly, a case of editing the /etc/greylistd/config file. You won't need to change too much, so just pay attention to the [timeouts] section, particularly retryMin. You want to ensure that spammers sending multiple messages in one session to the same local address are continually rejected, but if you set it too high then legitimate mail may be delayed too much for your users' taste. I would say 300 seconds (5 minutes) is a reasonable setting.

You may also need to add the Debian-exim user to the greylist group: adduser Debian-exim greylist (note that useradd would try to create a new user rather than add an existing one to a group). This may have been done when greylistd was installed though; debconf should have told you if it was. Don't forget to restart the greylistd daemon after you are done editing (/etc/init.d/greylistd restart).

  4. Reconfigure Exim. You need to reconfigure Exim to use greylistd as well. If you have used Debian's default Exim4 setup you can use the script supplied as part of the greylistd package. Running greylistd-setup-exim4 add will add the relevant config options to the correct ACL.

If you don't use the default Debian multi-file Exim setup (or you don't fancy trusting your MTA config to a script) you can read the files in /usr/share/doc/greylistd/examples/ for guidance on what to add and where. They are quite straightforward.

  5. Restart Exim. /etc/init.d/exim4 restart will do the trick there.
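
If you want a quick sanity check of the timings from step 3, something like the following Python sketch would do. It parses a sample [timeouts] section held in a string rather than your real file. The retryMax option name is given as I recall it from the stock config, so check it against your own copy, and the warning thresholds are simply my own rule of thumb.

```python
import configparser

# Sample text only; on a real system you would read /etc/greylistd/config instead.
SAMPLE_CONFIG = """
[timeouts]
# Seconds a new triplet must wait before a retry is accepted
retryMin = 300
# Seconds an unretried triplet stays on the greylist (option name assumed)
retryMax = 28800
"""


def load_timeouts(text):
    """Parse the [timeouts] section and return its values as integers."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # configparser lower-cases option names, so retryMin becomes 'retrymin'.
    return {name: parser.getint("timeouts", name) for name in parser["timeouts"]}


def check_timeouts(timeouts, low=60, high=3600):
    """Return a list of warnings; the bounds are illustrative, not greylistd rules."""
    warnings = []
    retry_min = timeouts["retrymin"]
    if retry_min < low:
        warnings.append("retryMin is very short; quick repeat attempts may get through")
    elif retry_min > high:
        warnings.append("retryMin is long; legitimate mail may be delayed noticeably")
    retry_max = timeouts.get("retrymax")
    if retry_max is not None and retry_max <= retry_min:
        warnings.append("retryMax should be larger than retryMin")
    return warnings


if __name__ == "__main__":
    for warning in check_timeouts(load_timeouts(SAMPLE_CONFIG)):
        print("warning:", warning)
```

With the 300 second setting suggested above, the check produces no warnings.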

Tailing your Exim mainlog (tail -f /var/log/exim4/mainlog) will tell you if it all worked. You should start to see log entries like this:

2008-09-11 06:25:59 H=([117.4.26.192]) [117.4.26.192] F=<a.sender@somedomain.com> temporarily rejected RCPT <an.address@yourdomain.com>: greylisted.

Eventually these will be joined by entries with the same host, sender and recipient address, but this time delivered as normal.
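
If you would rather count deferrals than watch them scroll past, a small Python sketch along these lines can pull the triplet out of each greylisted entry. The regular expression is fitted to the sample line above only, and the function names are my own; real mainlog lines can carry extra fields, so treat it as a starting point.

```python
import re
from collections import Counter

# Fitted to the sample log entry shown above; other Exim log layouts may differ.
GREYLISTED = re.compile(
    r"^(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"H=.*?\[(?P<host>[0-9a-fA-F.:]+)\] "
    r"F=<(?P<sender>[^>]*)> "
    r"temporarily rejected RCPT <(?P<recipient>[^>]+)>: greylisted"
)


def parse_greylisted(line):
    """Return (host, sender, recipient) for a greylisted entry, else None."""
    match = GREYLISTED.search(line)
    if match is None:
        return None
    return (match["host"], match["sender"], match["recipient"])


def count_deferrals(lines):
    """Count how many times each triplet was temporarily rejected."""
    triplets = (parse_greylisted(line) for line in lines)
    return Counter(t for t in triplets if t is not None)


if __name__ == "__main__":
    sample = (
        "2008-09-11 06:25:59 H=([117.4.26.192]) [117.4.26.192] "
        "F=<a.sender@somedomain.com> temporarily rejected RCPT "
        "<an.address@yourdomain.com>: greylisted."
    )
    for triplet, hits in count_deferrals([sample]).items():
        print(hits, triplet)
```

Triplets that rack up many deferrals without ever being delivered are the hit-and-run senders that greylisting is meant to catch.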

If all goes according to plan you'll see fewer spam messages arriving in your users' mailboxes. Your users may see a short delay in mail being delivered, but not too much. Just remind them that e-mail was never intended to be an instant delivery medium anyway. Greylisting may also become less effective as spammers start to find ways around it, but for now it works and, on free platforms at least, is fairly simple to set up.

Next time I'll look into maintaining a greylist installation and how to keep track of the items on the lists.

Comments

Maurice Cepeda

"any sender who does not attempt a retry within the second period will be moved from the greylist to a blacklist."

This is inconvenient. I'd hate to be left wondering if my important email got through, or having asked for their confirmation in my email, being left wondering if my contact knows to send it twice because the first confirmation may get stuck. Which may lead to an impetus to resend more than twice, out of sheer worrying.

Having asked my prof once about getting my email, he told me, "Yes I got all __five__ of them." Now I don't know if he actually got five, but he didn't like getting multiple emails, and I don't either.

And I don't like the idea of having my emails put in a limbo state of a grey list, and then possibly thrown into a black list based on a certain period of time I don't know of. The user shouldn't have to worry about these things. It makes for inefficient and unreliable mail delivery, and leaves a lot to be desired in terms of public confidence.

I attended a presentation held by a local UNIX users group where an entirely new model was proposed. I don't have the technical expertise to describe it in precise terms, but after receiving mail your mail delivery service prodded the sender's IP (something like pinging, not an email) --this after a certain amount of time had passed. If the sender was still there (meaning the IP was valid and there), the mail was delivered, if not ... no-go.

This was based on two assumptions, the first that grey listing isn't foolproof, or so he'd have us believe. He mentioned that listing methods are always needlessly restrictive in some instances --creating false positives-- and insufficiently so in other cases.

And second that spammers are never going to be located at the same IP for long. They do their business as quickly as possible and get out before getting tracked. Thus the need to wait the time period, followed by the prodding.

The presenter suggested this method did away with complicated listing methods or using multiple methods, was almost completely foolproof and would diminish false positives too, meaning you're going to get your --sometimes badly-- needed email. And in my case of a few days ago, I wouldn't have gotten that early morning weekend call --interrupting my sleep-- asking, "Where's my document!!??".

Honestly in my layman's view, I can't think of a better way to do this.

Ryan Cartwright

Read the section called "A word about senders" again. "Sender" in this case is not you, it's your outgoing/relaying server trying to deliver a message from you to your professor - the sending triplet. So you send a message to your professor and your outgoing/relaying server is asked to try again - which it does after a short period and the message is delivered.

Any relay server will encounter delays with delivering messages to remote hosts. Properly configured servers will try again after a short while. In fact most have a series of retry attempt periods before giving up entirely. The "try again" mechanism is already in place across the Internet and no user has to worry about it. Greylisting simply employs this - it places no further demands on either user.

Remember that it's only the first e-mail that is delayed (not the first attempt for each message you are trying to send). All subsequent ones are let through because the sending triplet is whitelisted.

The method you describe would fail in a lot of cases because a lot of servers reject such lookups. For example, I run a relaying server which cannot be pinged and will not accept inbound SMTP except from my public MX server. It has reverse DNS set on it but any attempt to connect to it by something other than a client within my LAN or my MX server will be dropped by my firewall. Thus delivery from me to you would not happen.

Greylisting is not fool-proof (no anti-Spam solution is) but in combination with spam scanning (e.g. SpamAssassin), reverse DNS lookups and sender (e-mail address) verification, it helps.

cheers
Ryan

Maurice Cepeda

Thanks for your patience and for explaining the pertinent details.

About dropping any prodding, yeah ... come to think of it, that's how I configure my firewalls.

Ryan Cartwright

You're welcome. To be honest it's a complex subject to describe. Familiarity with SMTP procedures helps of course and it is a lot simpler to use than write about.

cheers Ryan

Author information

Biography

Ryan Cartwright heads up Equitas IT Solutions who offer fair, quality and free software based solutions to the voluntary and community (non-profit) and SME sectors in the UK. He is a long-term free software user, developer and advocate. You can find him on Twitter and Identi.ca.
