Spam prevention with Exim and greylistd - Part 2 - management and stats


In part one of this tutorial I looked at installing and configuring greylistd alongside Exim to help combat the evils of spam. In this second part I will look at getting some information out of greylistd -- handy if you need to troubleshoot why the CEO's "urgent" message hasn't arrived yet!

Captain Slog

Greylistd is the daemon called by Exim during SMTP connections. The acl_check_rcpt ACL (if set up for greylisting) ensures that Exim logs every SMTP connection that results in greylisting. You can use this information, along with statistics from greylistd itself, to troubleshoot and track the progress of greylisted sender-triplets and messages.

A typical greylisting entry in the Exim mainlog will look like this:

2008-09-11 06:25:59 H=(mx_relay.somedomain.com) [117.4.26.192] F=<a.sender@somedomain.com> temporarily rejected RCPT <an.address@yourdomain.com>: greylisted.

This tells you the date (11 September 2008), the time and the sender-triplet of the message. If you want to know whether a greylisted message was subsequently delivered on a retry, you can check further down the Exim log for a "Completed" entry:

2008-09-11 06:37:51 1KgLbh-0007zz-Kh <=a.sender@somedomain.com H=(mx_relay.somedomain.com) [117.4.26.192] P=smtp S=25153 id=F89EE039F5154749BCDAF05850A8993C0E8044B7@ASENDERPC
2008-09-11 06:37:51 1KgLbh-0007zz-Kh => an.address@yourdomain.com R=verify_recipient T=remote_smtp H=17.10.189.25 [17.10.189.25]
2008-09-11 06:37:51 1KgLbh-0007zz-Kh Completed

This tells you a message was delivered by that server from that sender to that recipient. It might be a second message that just happened to have the same triplet, or it might be the second attempt for the first message. Because the first attempt was rejected at the RCPT stage, before any message was accepted, Exim did not assign it an ID. The second, successful attempt got one (1KgLbh-0007zz-Kh). Don't automatically assume that these two entries refer to a single e-mail message. That said, the second entry does mean that mail with this triplet is now being delivered and is likely on the whitelist.
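If you have many greylisting entries to work through, it can help to pull the triplets out of the log in one go. The following is a minimal sketch, assuming your log lines have exactly the layout of the example entry shown earlier; the function name greylisted_triplets is made up for this illustration, and the pattern will need adjusting if your ACL logs a different message.

```shell
# Print "ip sender recipient" for each greylisting entry read on stdin.
# Assumes the log line layout of the example entry in this article:
#   ... [ip] F=<sender> temporarily rejected RCPT <recipient>: greylisted.
# Lines that do not match the pattern are silently skipped.
greylisted_triplets() {
    sed -n 's/^.*\[\([0-9a-fA-F.:]*\)] F=<\([^>]*\)> temporarily rejected RCPT <\([^>]*\)>: greylisted.*$/\1 \2 \3/p'
}
```

You would feed it your Exim mainlog on standard input, for example greylisted_triplets < /path/to/mainlog (the location of the log depends on your installation). The output is already in the order the greylist command expects.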

Greylistd itself can confirm this through the greylist shell command. You need to supply the sender-triplet as <sending host ip address> <sender e-mail address> <recipient e-mail address>:

myserver:/# greylist check 117.4.26.192 a.sender@somedomain.com an.address@yourdomain.com
white
myserver:/# 

The response of white confirms that this triplet is on the whitelist and that future messages with that triplet will not be delayed. Be careful if this returns grey, though, because that is also the default response for triplets that are not on any of the lists: check asks greylistd what status the triplet would have, and an unknown triplet would be greylisted.
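One way to tell a triplet that really is on the greylist from one greylistd has never seen is to search the output of the list subcommand (covered later in this article) for it. This is only a sketch: the helper name triplet_listed is hypothetical, and it assumes the triplet appears in the list output as three fields separated by single spaces, as in the sample output in this article.

```shell
# Succeed (exit status 0) if the given triplet appears in the list output
# read on stdin. The match is a fixed-string match, not a regular expression.
#
# Intended use (needs a running greylistd):
#   greylist list --grey | triplet_listed <ip> <sender> <recipient>
triplet_listed() {
    grep -qF -- "$1 $2 $3"
}
```

If the check subcommand says grey but this helper fails against the output of greylist list --grey, the triplet is most likely unknown to greylistd rather than waiting on the greylist.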

Stats

The greylist command can also give you some statistics:

myserver:/# greylist stats
Statistics since Fri Aug 22 16:04:45 2008 (21 days and 2 hours ago)
-------------------------------------------------------------------
18655 items, matching 48655 requests, are currently whitelisted
    0 items, matching     0 requests, are currently blacklisted
 2841 items, matching  2854 requests, are currently greylisted

Of 167535 items that were initially greylisted:
 -  18655 ( 11.1%) became whitelisted
 - 148880 ( 88.9%) expired from the greylist
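
The whitelist line above works out to roughly 2.6 requests per whitelisted item (48655 / 18655). As a rough sketch, the same figure can be computed from the stats output with awk; the function name whitelist_ratio is invented for this example, and it assumes the stats output keeps the wording shown above.

```shell
# Read `greylist stats` output on stdin and print the number of requests
# per whitelisted item, to two decimal places.
# Relies on the line "<items> items, matching <requests> requests, are
# currently whitelisted", so field 1 is the item count and field 4 the
# request count. Prints nothing if the whitelist is empty.
whitelist_ratio() {
    awk '/are currently whitelisted/ && $1 > 0 { printf "%.2f\n", $4 / $1 }'
}
```

With a running greylistd you would call it as greylist stats | whitelist_ratio.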

You'll note the ratio of whitelist items to requests. This is another indication that greylisting is doing its job: regular senders will send several requests each, but have only one item in the whitelist. It's also possible to view details of the grey, white or black lists:

myserver:/# greylist list --white
Greylist data:
==============
  Last Seen            Count  Data
  2008-09-12 09:45:15      2  117.4.26.192 a.sender@somedomain.com an.address@yourdomain.com
...

The Count column indicates that two messages have been successfully sent via that sender-triplet, and Last Seen tells you when the last one was. These lists can be quite long, so you might like to pipe them into a pager (greylist list --grey | more) or filter them with grep (greylist list --grey | grep somedomain.com). That last one will display all greylist entries containing "somedomain.com".
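
If you want the busiest triplets for a domain first, the filtered list can also be sorted by the Count column. The sketch below assumes the six-field layout of the sample listing (date, time, count, then the triplet); entries_for_domain is a made-up name, and entries with an empty sender address would be skipped by the field-count check.

```shell
# Read `greylist list` output on stdin and print "count ip sender recipient"
# for every entry that mentions the given string, highest count first.
# Header lines and anything that is not a six-field data line are ignored.
entries_for_domain() {
    awk -v dom="$1" 'NF == 6 && $3 ~ /^[0-9]+$/ && index($0, dom) { print $3, $4, $5, $6 }' | sort -rn
}
```

With a running greylistd this would be used as greylist list --white | entries_for_domain somedomain.com.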

Greylist can also dump its data in a format suitable for MRTG, which you'll need to redirect into a file if you want to save it: greylist mrtg > ./filename will do the trick.

Altering the lists manually

So far, the greylist commands I've shown you do not affect the three lists greylistd maintains. You may come across situations where you need to move a triplet from one list to another, or perhaps add a triplet to the whitelist, and greylist gives you a way to do this. I'd recommend against adding a triplet to the blacklist, preferring to let greylistd do its job. Needless to say, these options should be used with some degree of caution (don't go adding your CEO's LinkedIn notifications to the blacklist unless they ask you to). greylist add --white <sender-triplet> will add the triplet to the whitelist. Replace --white with --grey or --black to add to those lists instead. Again, don't forget that it's the whole triplet that needs adding, not just the sender e-mail address. In case you need it, greylist delete <sender-triplet> will delete that triplet from whatever list it may be on.
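
Since it is easy to forget part of the triplet, a small wrapper can refuse incomplete input before it reaches greylistd. This is a sketch, not part of greylistd: the function name whitelist_triplet and the GREYLIST_CMD variable are both hypothetical, the latter added only so the wrapper can be dry-run with echo.

```shell
# GREYLIST_CMD is a hypothetical override for dry runs; by default the
# real greylist command is used.
GREYLIST_CMD=${GREYLIST_CMD:-greylist}

# Add a complete sender-triplet to the whitelist.
# Refuses to run unless exactly three arguments (ip, sender, recipient)
# are supplied, so a lone e-mail address is never passed on.
whitelist_triplet() {
    if [ "$#" -ne 3 ]; then
        echo "usage: whitelist_triplet <ip> <sender> <recipient>" >&2
        return 1
    fi
    "$GREYLIST_CMD" add --white "$1" "$2" "$3"
}
```

Setting GREYLIST_CMD=echo before calling the function prints the arguments that would be passed instead of changing any list.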

May contain traces of nut

In the UK, food products containing the word “diet” in their name (e.g. Diet Coke) are accompanied by the warning “Can help only as part of a calorie controlled diet”. In the same way, greylisting can help reduce spam levels only in partnership with other tools, for example Bayesian rule-based scanning, cautious use of one or more DNSBLs and client-side filtering. I have to say that it is one of the more effective measures I have implemented, and if nothing else, it will reduce the load on your server.

Author information

Biography

Ryan Cartwright heads up Equitas IT Solutions who offer fair, quality and free software based solutions to the voluntary and community (non-profit) and SME sectors in the UK. He is a long-term free software user, developer and advocate. You can find him on Twitter and Identi.ca.
