Poco Forums • View topic - Junk Filters not always filtering

Junk Filters not always filtering

Discussion on Bayesian and standard junk mail filters

Moderators: Eric, Tomas, robin, Michael

Post by croths » Tue Oct 26, 2004 2:22 am

OK, this is getting silly!!!!!
I started my complaint in someone else's post titled "What's with the "banned words" not working?" Seeing as I have more to say I thought I'd better start my own post.
Since installing the latest Poco upgrade, my junk mail filters are NOT working as well as they did. Each morning when I start up and the night's mail comes in (usually 80 to 120 messages), junk mail pours into my Inbox, whereas before the upgrade the sorting was almost perfect. This morning, out of 82 messages, 22 pieces of obvious junk landed in my Inbox, while my son's mail and the Poco Report were put into the Junk folder.
I'm very non-techy but I've tried hard to understand what others have suggested for settings.
I had not altered it at all before this problem started, only in trying to repair it.
Right now I have:-
Sensitivity Level at High
Run Standard Filters ticked
Junk Threshold .99
Good Mail Bias 2.0
Junk Score 100
Good Mail -100

All that whooshes over my head; I don't understand it at all.
The bad word collection is a waste half the time; words like Rolex and Cable still pour in.
I don't want to, but I'm thinking of re-installing K9.
Am I the only one having this trouble with the latest upgrade?

Maureen.
croths
Poco Tourist
 
Posts: 35
Joined: Mon Jul 26, 2004 12:12 am
Location: Ontario, Canada

Post by Michael » Tue Oct 26, 2004 2:51 am

My junk mail filtering is still working fairly well, although I'm not being flooded with as many messages as you are. This morning I did have a false positive, and a bad one. It was not caused by the Bayesian (BF) filters, but the good bias also didn't kick in to help.

I use a combination of techniques. First, several regular filters and scripts move legitimate messages (these will not be touched by the junk mail filter, or JMF, so I don't risk false positives on them). I then use some custom filters and scripts to seed the JMF score. Next I run Poco's built-in filters, including the Bayesian ones, and finally I run a script that checks for styled messages trying to get past the filters.

My BF filters are set with a junk threshold of 0.90, my good bias is set to 2. My junk score is set to 20 and I just increased my good score from 2 to 10 (I'll monitor this to see if this helps deal with the false positives).

The support group at work has instituted some aggressive junk mail filtering so my rate of junk mail at work has dropped from 40 a day to 2-3 a week. This used to be where I got the bulk of my jmf information from but it has obviously dried up (to my great relief).
Michael
Moderator
 
Posts: 866
Joined: Mon Jul 26, 2004 12:14 pm
Location: Victoria BC, Canada

Re: Junk Filters not always filtering

Post by Michael » Tue Oct 26, 2004 2:56 am

croths wrote:I've tried hard to understand what others have suggested for settings.
I had not altered it at all before this problem started, only in trying to repair it.
Right now I have:-
Sensitivity Level at High
Run Standard Filters ticked
Junk Threshold .99
Good Mail Bias 2.0
Junk Score 100
Good Mail -100

All that whooshes over my head; I don't understand it at all.
The bad word collection is a waste half the time; words like Rolex and Cable still pour in.
I don't want to, but I'm thinking of re-installing K9.
Am I the only one having this trouble with the latest upgrade?

Maureen.


Maureen:

With your junk score and good mail score set the way they are, Bayesian filtering will virtually always be the sole determining factor in whether a message is classified as junk. Assuming a message makes it through to your JMF, once BF makes a decision about it the other filters will almost certainly have no impact (hence your thought that the "bad word collection is a waste of time"). To overcome a wrong analysis by the BF filters, too many other JMF filters would have to be triggered to negate the 100-point positive or negative score added by the BF filter.
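To put rough numbers on this, here is a small Python sketch (not PocoScript). It assumes, as the paragraph above implies, that each filter simply adds points to a running total. The 10-point weight for a single "bad word" style rule is a made-up value for illustration; Poco's real per-rule weights are not given in this thread.

```python
import math

# Hypothetical weight of one ordinary (non-Bayesian) filter hit.
# This value is an assumption, not a Poco default.
RULE_POINTS = 10


def rules_needed_to_override(bayes_points: int, rule_points: int = RULE_POINTS) -> int:
    """Count the opposing rule hits needed to cancel the Bayesian contribution.

    Assumes a purely additive score, so the sign of bayes_points does not
    matter: only its size has to be matched by the other filters.
    """
    return math.ceil(abs(bayes_points) / rule_points)


# Compare the two Junk Score settings mentioned in this thread.
for junk_score in (100, 20):
    hits = rules_needed_to_override(junk_score)
    print(f"Junk Score {junk_score}: {hits} opposing rule hits to cancel it")
```

Under that assumed weight, a Junk Score of 100 needs ten opposing hits to cancel, while a Junk Score of 20 needs only two, which is why the smaller setting leaves room for the other filters to matter.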

Post by croths » Wed Oct 27, 2004 12:44 am

... and so it continues.
I changed my settings to match Michael's, and it was the same thing this morning. I had 57 messages, all junk, and 27 of them came into my Inbox.
What ARE the best settings to have?

Post by Michael » Wed Oct 27, 2004 5:30 pm

Maureen:

With both BF and non-BF filters in use, it is difficult to know whether a message was caught by the BF filters or by something else. So I typically check the messages caught by the JMF and, for those that were not caught by the BF filters, I train the junk mail filters on them.

I use the following script to detect messages not caught by the BF filters:
{  Tag non-Bayesian messages. }

ReadAllHeaders $Hdrs %Message
StringPos #ix "%BAYES%" $Hdrs
If #ix > 0 Then BayesCaught
:BayesMissed
  TagMessage %Message
Exit
:BayesCaught
  LocateLine #lx "%BAYES%" $Hdrs
  GetLine $Line #lx $Hdrs
  StringPos #lx ":" $Line
  ChopString $Line 1 #lx
  Trim $Line
  StringPos #lx " " $Line
  ChopString $Line #lx 9999
  Set #Score $Line
  If #Score < 0 Then BayesMissed
  UntagMessage %Message


To run this script, select all messages in the Junk Mail mailbox (Ctrl+A) and then run the script. Next, use the "Show Tagged" setting in the show only bar to isolate the tagged messages. Then open Poco's Junk Mail filter window (Ctrl+F4) and classify the messages as junk. I know this is a lot of work, but it has helped my junk mail filtering rate.

PS: 1. Make sure you "File as Junk" the junk mail messages that got past your filters.
2. I assign the above script to a button for easier access.
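For readers who don't use PocoScript, the decision the script makes can be sketched in Python. This is only an illustration of the logic, not something Poco can run. The header name `X-Bayes-Score` is a hypothetical stand-in for the script's `%BAYES%` placeholder, and the sketch assumes the header value starts with a number followed by optional text.

```python
from typing import Optional

# Hypothetical header name standing in for the script's %BAYES% placeholder.
BAYES_HEADER = "X-Bayes-Score"


def bayes_score(headers: str, header_name: str = BAYES_HEADER) -> Optional[float]:
    """Return the first number after the colon on the Bayesian header line.

    Returns None when the header is missing or its value is not numeric.
    """
    for line in headers.splitlines():
        if header_name in line:
            # Keep only what follows the first colon, then take the first field.
            _, _, value = line.partition(":")
            fields = value.split()
            if not fields:
                return None
            try:
                return float(fields[0])
            except ValueError:
                return None
    return None


def should_tag(headers: str) -> bool:
    """Tag a message that the Bayesian filter did not score as junk.

    Mirrors the script: no Bayesian header, or a negative score, means tag.
    """
    score = bayes_score(headers)
    return score is None or score < 0


if __name__ == "__main__":
    print(should_tag("Subject: hello\nX-Bayes-Score: 20 (junk)"))
    print(should_tag("Subject: hello"))
```

A message with a non-negative Bayesian score is left untagged, while one with no Bayesian header or a negative score is tagged for manual training.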

Post by croths » Mon Nov 01, 2004 3:15 am

Nothing here has changed!!
I'm teaching it all over again, and through the day it's not too bad (though not as good as before). But in the morning, when there is a load of mail, junk just pours into my Inbox, as if it just cannot handle 80-odd emails. It used to be better than this and I'm not pleased.
Wish I'd never installed the latest upgrade.

Post by Pete » Mon Nov 01, 2004 7:12 am

croths wrote:[...] as if it just cannot handle 80 odd emails.

You could test this theory by selecting one of the junk messages in the index pane (top-right pane) and then clicking on "Tools > Run filters now > Incoming".

If the message goes to the junk mailbox, then that might indicate a timing issue during downloads. If the message stays in the In mailbox however, then I would just say that the Bayesian filter seems to work better after you have taught it a lot of words.

On my system, a lot of messages are filtered to the Junk mailbox, and a lot of them are not. I currently have 30,000 junk words and 5,000 good words in the databases. I don't have a problem with false positives (i.e. good messages incorrectly marked as junk), so I haven't bothered to increase the number of good words because I'm guessing that that wouldn't increase the number of junk messages that Pocomail catches.

croths wrote:[...] words like Rolex and Cable still pour in.

I have reported this too, and I think that the answer was that junk words that you teach the Bayesian filter are not used until they are encountered and taught several times (perhaps at least five times?). Personally, I'd like junk words to be used by the Bayesian filter sooner, but I'm not at all familiar with the technical details, so I don't know if this is possible.
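The "at least five times" guess matches the rule in Paul Graham's published "A Plan for Spam" scheme, where a word is ignored until its weighted count reaches five and good-mail counts are doubled. Whether Pocomail works this way is not confirmed anywhere in this thread, so the Python sketch below is only an illustration of that published scheme. Mapping Poco's "Good Mail Bias" setting onto the `good_bias` parameter is an assumption.

```python
from typing import Optional


def token_junk_probability(good_count: int, junk_count: int,
                           n_good_msgs: int, n_junk_msgs: int,
                           good_bias: float = 2.0,
                           min_occurrences: int = 5) -> Optional[float]:
    """Per-word junk probability in the style of Graham's "A Plan for Spam".

    Returns None when the word has been seen too rarely to be used.
    Assumes n_good_msgs and n_junk_msgs are both greater than zero.
    """
    weighted_good = good_bias * good_count
    if weighted_good + junk_count < min_occurrences:
        return None  # too few sightings: the word is ignored for now
    good_ratio = min(1.0, weighted_good / n_good_msgs)
    junk_ratio = min(1.0, junk_count / n_junk_msgs)
    probability = junk_ratio / (good_ratio + junk_ratio)
    # Clamp so no single word is ever treated as absolute proof.
    return max(0.01, min(0.99, probability))


if __name__ == "__main__":
    # A word taught as junk three times is still ignored under this scheme.
    print(token_junk_probability(0, 3, n_good_msgs=100, n_junk_msgs=100))
    # After five junk sightings it starts to count.
    print(token_junk_probability(0, 5, n_good_msgs=100, n_junk_msgs=100))
```

Under this scheme a word like "Rolex" contributes nothing until it has been taught often enough, which would explain why newly taught words seem to have no effect at first.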
Pete
 

