Poco Forums • View topic - Spell check as spam filter?

Spell check as spam filter?

Posted: Tue Jun 27, 2006 1:50 pm
by wortgames
Hi all,

Is there a simple way to spell-check incoming mail in Poco and assign a spam score based on the results?

Most of my legitimate mail tends to be made up of real words, whereas much of my spam uses numbers, spaces and misspellings, presumably to avoid triggering banned word lists.

Perhaps I could create a white-list of 'known' offences committed by friends (or just edit my own dictionary as appropriate). Or set their address to bypass the check.


Possible? Easy? Difficult?

Posted: Wed Jun 28, 2006 2:46 am
by Michael
Unfortunately, this is something that would require development from Poco Systems; there is no access to the spell-checking engine from PocoScript.

Posted: Wed Jun 28, 2006 9:00 pm
by robin
But it's a nice idea!

Posted: Thu Jun 29, 2006 1:53 pm
by wortgames
Thanks Robin, I think there's definitely some value in it.

One advantage I see is that most important mail (i.e. business mail) would generally have few spelling errors, although the exact rate would no doubt vary depending on the business you are in.

Personal mail is likely to contain a few lazy errors and 'cute' spellings, but I would imagine that in a fairly short space of time it would be possible to 'teach' the dictionary and/or develop rules to reduce the false positives. For example, the 'English' dictionary could comprise English and American spellings, common misspellings, and modern abbreviations.

If we were going to get clever about it, we could also implement a 'hot list' of words that spammers try to misspell (e.g. viagra, mortgage), so the filter could assign a higher score if it thinks the misspelled word is close to a hotlist word. This hotlist could even update periodically / submit itself to a master database.

Any word containing a number should probably be given a higher score, and perhaps the same misspelling appearing more than once should not receive multiple scores (for example a brand name, industry jargon or model number that may be repeated).

I suspect it might prove quite effective.
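
To make the scoring rules above concrete, here is a rough sketch in Python. It is not PocoScript and does not use Poco's spell-check engine (which, as noted, is not accessible from scripts). The word lists, the weights, the 0.75 similarity threshold and the function names are all made up for illustration, and skipping purely numeric tokens such as dates is an extra assumption of mine.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, tiny word lists; a real filter would load a full dictionary.
DICTIONARY = {"please", "find", "the", "attached", "invoice", "buy", "now"}
HOTLIST = {"viagra", "mortgage"}


def near_hotlist(word, threshold=0.75):
    """True if the word looks like a disguised hotlist word (assumed threshold)."""
    return any(
        SequenceMatcher(None, word, hot).ratio() >= threshold for hot in HOTLIST
    )


def spell_score(text, dictionary=DICTIONARY):
    """Return a spam score based on words missing from the dictionary."""
    # Using a set means a repeated misspelling is only scored once.
    tokens = set(re.findall(r"[A-Za-z0-9]+", text.lower()))
    score = 0.0
    for word in tokens:
        if word in dictionary or word.isdigit():
            continue  # known word, or a plain number (assumption: not scored)
        score += 1.0  # base penalty for an unknown word
        if any(ch.isdigit() for ch in word):
            score += 1.0  # extra penalty: digits mixed into a word
        if near_hotlist(word):
            score += 2.0  # extra penalty: close to a hotlist word
    return score
```

With these made-up weights, a message made up entirely of dictionary words scores zero, while "Buy v1agra now" picks up the unknown-word, digit and hotlist penalties for "v1agra" exactly once, however often it is repeated.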

Posted: Mon Jul 03, 2006 10:26 am
by Slaven
Thanks, that is a neat idea. There may be some obstacles in terms of computer resources (there's already quite a lot of activity whenever a single message is checked for spam, so initiating the spell-check engine may slow down the process), but I'll add it to our wishlist! :)

Posted: Wed Jul 05, 2006 8:45 am
by Maximus
Hi, nice idea... but don't forget that people like me (and many others in Europe) receive messages in various languages. We should be able to select more than just one dictionary (e.g. German, French and English). Sometimes even a single email message comprises several languages, e.g. a lot of admin messages from airlines, ISPs, etc. contain the text in up to 5 languages. I live in Switzerland (4 national languages & English), but I guess this also applies to other nations like Canada (French and English).

Cheers
Adi
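
One possible way to handle several dictionaries, sketched in Python rather than PocoScript: treat a word as misspelled only if it is missing from every dictionary the user has selected. The language codes, the tiny word lists and the function name are hypothetical and only there to show the idea.

```python
import re

# Hypothetical, tiny word lists keyed by language code.
DICTIONARIES = {
    "en": {"your", "flight", "is", "confirmed"},
    "de": {"ihr", "flug", "ist", "bestätigt"},
    "fr": {"votre", "vol", "est", "confirmé"},
}


def unknown_words(text, languages):
    """Return the words not found in any of the selected dictionaries."""
    # Merge the selected dictionaries into one set of known words.
    known = set().union(*(DICTIONARIES[lang] for lang in languages))
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = set(re.findall(r"[^\W\d_]+", text.lower()))
    return words - known
```

With all three dictionaries selected, a trilingual airline notice built from these words has no unknown words; with only "en" selected, every German and French word would be flagged.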