Spam Filtering and the Plague of False Positives

Jeff Schwartz recently received an inquiry from a senior executive at a major wireless telephone carrier, asking if he could provide some product information about the software his company produces.

“So I e-mailed it,” Schwartz told TechNewsWorld. But it never arrived. “I sent it two more times,” he said. Still, the message didn’t make it through.

“Finally, I printed it out and snail-mailed it to him,” said Schwartz. The culprit was a spam filter.

“Since my e-mail was about spam and contained lots of trigger words, the filter deleted it,” said Schwartz, who is president of NextGen Development, the maker of an antispam software application called GoodbyeSpam.

ISPs Aggressive

Executives and professionals are increasingly frustrated by false positives, an unfortunate consequence of using spam filters. When a filter incorrectly identifies a legitimate e-mail message as spam, the message is treated in much the same way as a vile piece of e-mail from a pornographer, a Viagra pitchman or a low-cost mortgage refinancier.

A study released recently by consultancy Return Path, called the E-Mail Blocking and Filtering Report, concluded that 17 percent of permission-based e-mail messages are not being delivered to in-boxes by top ISPs, a statistic that is up from just 2 percent in a similar study done in the fourth quarter of 2002.

“In an effort to protect subscribers from the onslaught of spam, ISPs are inadvertently tagging opt-in e-mail as spam, resulting in them being deposited in bulk or spam folders, or not being delivered at all,” Remy Taylor, a director of marketing at Return Path, told TechNewsWorld. “Some companies saw 46 percent of e-mail sent to their customers not get delivered.”

Return Path — using a delivery monitoring service — eyed the deliverability of 10,000 e-mail marketing campaigns earlier this year across 12 major ISPs representing 60 percent of the domains of large companies’ mailing lists. “Non-deliverability numbers are rising,” said Taylor. But the problem goes beyond even mass online marketing.

Business Communication

Routine business communications — e-mails sent from personalized contact lists on people’s PCs — are being filtered out by corporate blacklists and shunted aside if they come from domains that have been home to problematic messages.

According to several recent studies, up to 15 percent of routine e-mail communications simply are not delivered.

“I publish an electronic newsletter called the Apogee that helps readers build more valuable and profitable companies,” Clifford R. Kurtzman, CEO of Adastro, told TechNewsWorld. “The problem I have is with the [whitelist] spam filters that require the sender of the message to go to a special page and enter in a code before the mail will go through.

“When I have 75,000 subscribers, if even a small fraction of my subscribers start using such software, it creates a nightmare for me to have to manually deal with each recipient using the filter to get past their block and get them the newsletter they signed up for,” Kurtzman said.

Not Going To Take It

“I predict that within the next 24 months, we’ll hear about a user suing their ISP because the ISP’s spam filter censored their e-mail and deleted an important message which caused the user to lose money,” said Schwartz.

Some ISPs are overreacting to complaints from customers about false positives, according to marketing lobbyists and some analysts.

“As one ISP [owner] told me, he is not even filtering anymore,” Michael Herrick, president and CEO of Matter Form Media, a developer of antispam software, told TechNewsWorld. “If there is one false positive, his phone rings off the hook.”

Technology companies are joining the debate, developing solutions that, they say, will generate fewer false positives.

“What people resent the most is having the IT department or ISP determine what is — and what is not — spam,” said Herrick. “No one else has the right to open your regular mail. It should be no different with e-mail.”

New Technologies

The new technologies that are coming on the market — such as SpamFire for Windows — are targeted toward PC users, not network honchos. “Wherever the filtering takes place, it should be with the end user,” said Herrick.

Other new antispam technologies are gaining traction in the market. Emeryville, California-based Sendmail this summer started shipping its antispam software, Sendmail Advanced Anti-Spam Filter, which can be customized down to the user level but deployed at the Internet gateway.

Meanwhile, San Francisco-based Webwasher USA this summer debuted a new antispam filter that employs several statistical algorithms for smaller companies. The filter analyzes the sender, the message and the subject simultaneously to determine whether a piece of e-mail is spam.

Newport Beach, California-based SpamSoap recently debuted a new technology that conducts 22,000 tests — in three seconds — on each e-mail before calling it spam. The technology checks for inefficient routing. For example, a message that originates in the United States but goes through Brazil, China or Germany before arriving in a user’s in-box is a hallmark of spam.
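
The routing check described above can be illustrated with a minimal sketch. This is not SpamSoap's implementation, which the article does not describe. The function name `has_detour` is hypothetical, and the sketch assumes each relay hop has already been resolved to a country code by some separate lookup.

```python
def has_detour(hop_countries):
    """Return True if a message starts and ends in the same country
    but passes through a different country along the way.

    hop_countries is a hypothetical, already-resolved list of country
    codes, one per relay hop, ordered from sender to recipient.
    """
    if len(hop_countries) < 3:
        # With fewer than three hops there is no intermediate relay.
        return False
    origin, destination = hop_countries[0], hop_countries[-1]
    if origin != destination:
        # Genuinely international mail is not treated as a detour here.
        return False
    # Any intermediate hop outside the origin country counts as a detour.
    return any(country != origin for country in hop_countries[1:-1])


print(has_detour(["US", "BR", "US"]))  # detour through Brazil
print(has_detour(["US", "US", "US"]))  # stays domestic
```

A real filter would presumably treat such a detour as one signal among many rather than as proof of spam, since legitimate mail can also take indirect routes.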

Some observers note, however, that no spam filtering software is perfect, and that all users ultimately wind up wasting a lot of time searching through their bulk mail folders for false positives.

Credit Report?

ISPs are talking about developing some sort of technological ratings system, like a credit report, for message senders. The idea is to move away from the concept of good versus bad filtering toward establishing secure, trusted online relationships.

But some Internet users like the spam status quo — and figure that everybody else is making too big of a deal about the rising tide of e-mail.

“I look at spam as a form of entertainment,” said Dan Seidman, a leading sales coach and founder of SalesAutopsy.com.

3 Comments

  • I believe no spam filter can be 100% accurate all the time. I do tech support for one of the best every day and we get real close. So something else is needed. I think that’s a review/retrieve option. It takes only seconds to browse through hundreds of blocked messages and Mr. Schwartz would be able to retrieve his false positive email. We designed such a product and it works very well. Actually some users tell us that they lost mail before they used a spam filter and our product. It makes sense. If you look at hundreds of spam emails with only a few good ones in between, it’s easy to be a bit heavy on the delete key for just a couple of seconds. That’s a dozen erased messages on any reasonably fast system … and no telling if they were good or bad.

  • False positives are largely a function of poor filtering software. While there will be some, lots and lots of companies have already moved to newer technologies that combine filtering techniques. A lot of these vendors have lowered their false positive ratio to one in a million messages. If you have 50 emails a day, this works out to something like one every 55 years – I would call that an acceptable rate…

  • I’ve had it with spam. I finally found a service which offers accurate anti-spam filters on the servers (so I don’t have to download all that spam to use a program to determine if the messages are spam). That way, spam never fills up my inbox or other folders, and I don’t have to worry about quota.
    I recently began using Bayesian filtering, and in only a few weeks it has outclassed many of the other filtering methods I’ve seen. It’s eerily accurate, with no false positives (not yet, anyway), and I truly believe that adaptive systems like that are the only way to go. Although spammers may do much to disguise their message from filters, they cannot disguise their sales pitch without distorting it enough to render it ineffective.
    To be useful to the masses, server-side filtering must be configurable on a per-user basis (a problem many of the ISP-based systems have is they treat all users as having the exact same preferences). I have complete control over my spam filters, and I can easily correct any false positives without losing valuable mail (mail identified as spam is delivered to my spam folder, which I check occasionally).
    I think people need to realize that in order to best fight spam, it will require a cooperative effort between service providers and end-users – the filtering must be done at the server level, before mail is delivered, but it must be according to a single user’s preferences.
    Forget online DNSBLs (which have come under recent attack) that cast too wide a net; forget static filters like SpamAssassin, which have too many false positives – adaptive technology is the way to fight the spam problem.
    Don Fitzgerald
    http://www.cotse.net
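
For readers unfamiliar with the Bayesian filtering mentioned in the last comment, here is a minimal sketch of the general idea: a naive Bayes word-count classifier with add-one smoothing. It is not the commenter's filter or any particular product. The class name `TinyBayesFilter`, its methods and the training messages are all hypothetical, and a production filter would tokenize and weight words far more carefully.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Hypothetical naive Bayes spam filter that learns from labeled mail."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label is "spam" or "ham" (legitimate mail).
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained class never yields log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_score[label] = score
        # Convert the two log scores into a probability that the text is spam.
        return 1.0 / (1.0 + math.exp(log_score["ham"] - log_score["spam"]))


mail_filter = TinyBayesFilter()
mail_filter.train("cheap viagra mortgage refinance now", "spam")
mail_filter.train("cheap mortgage offer", "spam")
mail_filter.train("meeting agenda attached", "ham")
mail_filter.train("product information about our software", "ham")

print(mail_filter.spam_probability("cheap mortgage"))
print(mail_filter.spam_probability("meeting agenda"))
```

Correcting a false positive in this sketch amounts to calling `train` on the rescued message with the label `"ham"`. That per-user retraining is what makes a filter of this kind adaptive.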
