February 24, 2004

OS X Spam Filter Technique

I love Mail.app, but the junk mail filter isn't as good as it could be. I get more than 500 spam messages a day, thanks to some poor early decisions about my email addresses.

It's possible to write an AppleScript that interacts with external spam-classification programs, but Mail.app's AppleScript support is rather sparse and not very well documented.

Well! After a lot of hacking, research, and asking around (including looking at how other software packages handle it), I think I have figured out the keys to using rules that call AppleScripts to filter spam.

I like using Apple's filter to find spam, but there are a lot of false positives. And I don't want to look at all of them. What I needed was a second pass, using a different process to go through all my junk mail and get rid of the mails that were guaranteed to be spam, leaving the small number of doubtfuls and false positives to examine manually.

I am using SpamSieve 2.11 for this. (I have not yet confirmed whether 2.12 works the same way.)

SpamSieve's instructions say to turn off the internal Junk Mail Filter. I don't want to do that.

Apple's Junk Mail Filter is its own ruleset. What is not obvious is that the Junk Mail ruleset runs only after all of the user-defined rules have finished.

So, using a technique used in the freeware JunkMatcher, I reordered the rules:

ruleset

Here is each rule in turn:

This rule calls an external AppleScript supplied by SpamSieve. What's interesting is that the flag action is not applied unless the AppleScript fails. In other words, if SpamSieve catches the message, the message gets whatever the SpamSieve AppleScript does to it. If SpamSieve doesn't catch it, the message gets flagged.

spamsieve-rule

This next rule simply duplicates the regular junk mail rule, but we define it manually so that it runs in the middle of the user-defined rules instead of after them. In this case, it moves mail that SpamSieve didn't catch into the Junk Mail folder. I'm not sure why it doesn't also catch messages that SpamSieve has already identified as spam in the rule above, but it doesn't.

junk-rule

This next rule stops rule processing for all junk mail. Every rule after it applies only to messages the junk mail filter didn't catch, including my inactive rule that calls a separate AppleScript supplied by JunkMatcher.

junkstopper

Finally, the very last rule of your ruleset must stop all rule processing. This keeps messages from leaking through to Apple's default junk mail filter, which we have already reproduced above and which could possibly (I think) reclassify messages that you've already tried to organize.

fullstop
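To make the ordering easier to follow, here is a rough Python model of the rule chain described above. It is only a sketch and not how Mail.app or SpamSieve work internally: the `Message`, `Rule`, `process`, and `sort_mail` names are made up for illustration, the two verdict fields are stand-ins for what Apple's filter and SpamSieve would decide, and the flag action from the first rule is left out. The "still in the Inbox" check on the junk rule simply mirrors the behavior I observed, not a documented mechanism.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    # Hypothetical stand-ins for the two classifiers' verdicts.
    apple_says_junk: bool
    spamsieve_says_spam: bool
    mailbox: str = "Inbox"


@dataclass
class Rule:
    name: str
    matches: Callable[[Message], bool]
    action: Callable[[Message], None]
    stop: bool = False  # models a "stop evaluating rules" action


def move_to(mailbox: str) -> Callable[[Message], None]:
    def action(msg: Message) -> None:
        msg.mailbox = mailbox
    return action


def do_nothing(msg: Message) -> None:
    pass


# The four rules, in the order described in the post.
RULES: List[Rule] = [
    # 1. SpamSieve's script moves what it considers spam to "Spam".
    Rule("spamsieve-rule", lambda m: m.spamsieve_says_spam, move_to("Spam")),
    # 2. Manual copy of the junk rule. Assumption: it only affects
    #    messages SpamSieve has not already moved (observed behavior).
    Rule("junk-rule",
         lambda m: m.apple_says_junk and m.mailbox == "Inbox",
         move_to("Junk")),
    # 3. Stop processing for anything Apple's filter marked as junk.
    Rule("junkstopper", lambda m: m.apple_says_junk, do_nothing, stop=True),
    # 4. Stop processing for every remaining message.
    Rule("fullstop", lambda m: True, do_nothing, stop=True),
]


def process(msg: Message, rules: List[Rule]) -> bool:
    """Apply rules in order. Return True if the message fell off the
    end of the list, which is when the default junk filter would run."""
    for rule in rules:
        if rule.matches(msg):
            rule.action(msg)
            if rule.stop:
                return False
    return True


def sort_mail(apple_says_junk: bool, spamsieve_says_spam: bool):
    msg = Message(apple_says_junk, spamsieve_says_spam)
    fell_through = process(msg, RULES)
    return msg.mailbox, fell_through


if __name__ == "__main__":
    for apple in (True, False):
        for sieve in (True, False):
            print(apple, sieve, sort_mail(apple, sieve))
```

In this model, dropping the last rule (`RULES[:-1]`) lets an ordinary message fall off the end of the list, which is exactly the leak the final stop rule is there to prevent.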

That's it! The end result for me is that Apple's Junk Mail Filter liberally matches lots of spam and some false positives, which go to the Junk Mail folder. Then SpamSieve, which I have configured conservatively so that it leaves behind false negatives but NO false positives, keeps most of that junk out of the Junk Mail folder by moving it aside to a separate Spam folder (which is what the AppleScript does when it successfully identifies a message as spam). So I'm left with a Spam folder full of guaranteed spam that I can delete without examining, and a sparse Junk Mail folder that I can review for false positives.

Works great!

If you have read this far, you may enjoy this article on other spam filters that can be used with Apple's Mail.app.

Posted by Curt at February 24, 2004 10:38 PM