Spam Filtering and the Plague of False Positives

3 Comments

Pbanz says:

February 9, 2004 at 1:17 AM

I believe no spam filter can be 100% accurate all the time. I do tech support for one of the best every day, and we get really close. So something else is needed, and I think that’s a review/retrieve option. It takes only seconds to browse through hundreds of blocked messages, and Mr. Schwartz would be able to retrieve his false-positive email. We designed such a product and it works very well. Some users actually tell us that they lost mail before they used a spam filter and our product. It makes sense. If you look at hundreds of spam emails with only a few good ones in between, it’s easy to be a bit heavy on the delete key for just a couple of seconds. That’s a dozen erased messages on any reasonably fast system … and no telling whether they were good or bad.

jayfish says:

October 3, 2003 at 11:11 AM

False positives are largely a function of poor filtering software. While there will always be some, many companies have already moved to newer technologies that combine filtering techniques. A lot of these vendors have lowered their false positive rate to one in a million messages. If you receive 50 emails a day, this works out to something like one every 55 years – I would call that an acceptable rate…
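
As a quick check on the arithmetic in the comment above: one false positive per million messages at 50 messages a day comes to roughly 55 years between false positives, assuming the mail volume stays constant. A minimal Python sketch of that calculation (the function name is illustrative, not from any filtering product):

```python
def years_between_false_positives(messages_per_false_positive, messages_per_day):
    """Expected years between false positives at a steady mail volume."""
    # Days needed to receive enough mail for one expected false positive.
    days = messages_per_false_positive / messages_per_day
    # 365.25 days per year accounts for leap years on average.
    return days / 365.25


years = years_between_false_positives(1_000_000, 50)
print(f"about {years:.0f} years between false positives")
```

The same function shows how quickly the figure shrinks for heavier mail users: at 500 messages a day the interval drops by a factor of ten.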

donfitzgerald says:

September 30, 2003 at 8:46 AM

I’ve had it with spam. I finally found a service which offers accurate anti-spam filters on the servers (so I don’t have to download all that spam to use a program to determine if the messages are spam). That way, spam never fills up my inbox or other folders, and I don’t have to worry about quota.
I recently began using Bayesian filtering, and in only a few weeks it has outclassed many of the other filtering methods I’ve seen. It’s eerily accurate, with no false positives (not yet, anyway), and I truly believe that adaptive systems like that are the only way to go. Although spammers may do much to disguise their message from filters, they cannot disguise their sales pitch without distorting it enough to render it ineffective.
To be useful to the masses, server-side filtering must be configurable on a per-user basis (a problem with many of the ISP-based systems is that they treat all users as having exactly the same preferences). I have complete control over my spam filters, and I can easily correct any false positives without losing valuable mail (mail identified as spam is delivered to my spam folder, which I check occasionally).
I think people need to realize that fighting spam effectively will require a cooperative effort between service providers and end users – the filtering must be done at the server level, before mail is delivered, but it must follow each individual user’s preferences.
Forget online DNSBLs (which have come under recent attack) that cast too wide a net; forget static filters like SpamAssassin, which have too many false positives – adaptive technology is the way to fight the spam problem.
Don Fitzgerald
http://www.cotse.net
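
A minimal sketch of the kind of Bayesian filtering described in the comment above, assuming a simple word-count naive Bayes model with Laplace smoothing. The class name `NaiveBayesFilter`, the tokenizer, the sample messages, and the default threshold of 0.9 are all hypothetical illustrations, not the implementation of any service or product mentioned here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesFilter:
    """Tiny word-count naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message the user has labelled 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return P(spam | text) using Laplace-smoothed word likelihoods."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        words = tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior so an untrained class never produces log(0).
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                # The extra +1 in the denominator reserves mass for unseen words.
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = score
        # Normalize the two log scores into a probability.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)

    def is_spam(self, text, threshold=0.9):
        """Apply a threshold; a higher value means fewer false positives."""
        return self.spam_probability(text) >= threshold


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now limited offer", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("meeting notes attached for tomorrow", "ham")
spam_filter.train("lunch tomorrow with the project team", "ham")

print(spam_filter.is_spam("buy cheap pills now"))
print(spam_filter.is_spam("project meeting tomorrow"))
```

Two design points connect back to the comment. The filter is adaptive because correcting a misfiled message is just another `train` call with the right label, and the `threshold` argument is one place where a per-user preference could plug in, since each user can choose how cautious the filter should be about false positives.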
