Poco Forums • View topic - Bayesian filters operating? [Mod: Resolved]

Bayesian filters operating? [Mod: Resolved]

Posted: Fri Apr 15, 2005 3:27 am
by seabc
I get reasonably good spam filtering results with Pocomail (85% accuracy), but am puzzled. When I look at how Pocomail scores emails, I can see score components based on headers (all of which are driven by standard non-Bayesian filters, I think), but no score components for Bayesian filters, which are also turned on (at default settings, except that I have set the sensitivity level to high; and Pocomail has learned about 10,000 good words and 10,000 bad words). Further, when I deselect the standard non-Bayesian filters, I don't see any evidence of filtering at all. So, based on the results I see, I cannot tell whether the Bayesian filters are operating.

Can someone guide me to whatever I might be doing wrong? I apologize if the answer is already in the forum somewhere. After reading 50-60 posts and learning a lot but not finding the answer, I figured I had better just ask.

Thanks

Re: Bayesian filters operating?

Posted: Fri Apr 15, 2005 12:36 pm
by Eric
seabc wrote: Can someone guide me to whatever I might be doing wrong? I apologize if the answer is already in the forum somewhere. After reading 50-60 posts and learning a lot but not finding the answer, I figured I had better just ask.

[Image not preserved]

Posted: Sun Apr 17, 2005 8:49 am
by seabc
Yes - both are checked. That is why I am puzzled.

Posted: Sun Apr 17, 2005 10:20 am
by Eric
seabc wrote: Yes - both are checked. That is why I am puzzled
Really strange. :?

Could you close Poco and open the poco.ini file in Notepad or something similar?

Look for JunkFilterEnabled=1.
What does it say in your ini-file?
If it's 0, change it to 1 manually and save the changed ini-file.

Restart Poco and test it out to see if it works. :wink:
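
The manual check described above can also be scripted. This is only a rough sketch: it scans INI-style text for a key without assuming which section holds it, since the thread does not say where JunkFilterEnabled lives. The section name in the sample and the helper name `find_ini_values` are made up for illustration.

```python
def find_ini_values(ini_text: str, key: str) -> list[tuple[str, str]]:
    """Return (section, value) for every 'key=value' line whose key matches."""
    matches = []
    section = ""  # empty until the first [section] header is seen
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif "=" in line and not line.startswith(";"):
            name, value = line.split("=", 1)
            if name.strip().lower() == key.lower():
                matches.append((section, value.strip()))
    return matches


# Hypothetical sample text; the [Junk] section name is invented.
sample = "[Junk]\nJunkFilterEnabled=0\nOther=1\n"
print(find_ini_values(sample, "JunkFilterEnabled"))
```

To check a real installation, read the ini file's text (with Poco closed) and pass it in place of `sample`; a value of 0 is the case Eric describes.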

Posted: Sun Apr 17, 2005 1:12 pm
by seabc
It was set to zero, despite having both boxes checked, so I exited Pocomail, edited the file, and returned to wait for some spam. Here is the junk mail scoring section for the next piece of spam I received. This indicates nothing about Bayesian filtering?!

X-Poco-Score-Detail: +3 [X-MAILER=] (X-Mailer )
X-Poco-Score: +15
X-Poco-Score-Detail: +2 [FROM=%ADDRESSBOOKS%] (From %addressbooks%)
X-Poco-Scored: +15
X-Poco-Score-Exceeds: 10

The spam score is 10 higher than the two components shown above suggest because of another filter that adds 10 points to anything that comes to the address this one did, but that isn't from someone in the addressbook. (I only use that filtering rule for one of my two primary email addresses)

Do I need to restart the computer to have the revised ini file work? I'll try that, and then wait for more spam.
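
The arithmetic in the headers above can be reconciled with a short script. This is a sketch only: it treats X-Poco-Score as the total and X-Poco-Score-Exceeds as the junk cutoff, which is a reading of the header names and not something the thread confirms. The function name `summarize` is invented.

```python
import re

HEADERS = """\
X-Poco-Score-Detail: +3 [X-MAILER=] (X-Mailer )
X-Poco-Score: +15
X-Poco-Score-Detail: +2 [FROM=%ADDRESSBOOKS%] (From %addressbooks%)
X-Poco-Scored: +15
X-Poco-Score-Exceeds: 10
"""


def summarize(headers: str) -> dict:
    """Compare the listed score details against the reported total."""
    details = [
        int(value)
        for value in re.findall(r"^X-Poco-Score-Detail:\s*([+-]?\d+)", headers, re.M)
    ]
    total = int(re.search(r"^X-Poco-Score:\s*([+-]?\d+)", headers, re.M).group(1))
    # Assumption: this header carries the junk threshold.
    threshold = int(
        re.search(r"^X-Poco-Score-Exceeds:\s*([+-]?\d+)", headers, re.M).group(1)
    )
    return {
        "detail_sum": sum(details),
        "total": total,
        "unlisted": total - sum(details),  # points with no detail line
        "threshold": threshold,
    }


print(summarize(HEADERS))
```

The 10 unlisted points match the extra filter seabc describes, so none of the 15 points is left over for a Bayesian contribution.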

Posted: Sun Apr 17, 2005 10:50 pm
by Eric
seabc wrote: Do I need to restart the computer to have the revised ini file work? I'll try that, and then wait for more spam.
You don't need to restart your computer. Only restarting Poco should do the job. :)

Have you tried installing a test copy to see if that works?

I can't help you with the headers, because I know almost nothing about how the Bayesian filter works. :?
However, IIRC, it should mention something about the Bayes score.

Let me know how it goes with the test copy. :wink:

Posted: Mon Apr 18, 2005 8:48 am
by seabc
I am a little wary of installing a test copy of Pocomail. In the past, I have installed betas in temporary directories, at Slaven's suggestion. Each time I do, the new version of Pocomail becomes the one in my registry, I think, which can create an issue if I decide to stick with the original version. Is there some way to install in a new directory without creating this problem?

Posted: Mon Apr 18, 2005 8:56 am
by Eric
seabc wrote: I am a little wary of installing a test copy of Pocomail. In the past, I have installed betas in temporary directories, at Slaven's suggestion. Each time I do, the new version of Pocomail becomes the one in my registry, I think, which can create an issue if I decide to stick with the original version. Is there some way to install in a new directory without creating this problem?
I've also installed many betas, but never experienced any problems with it. I just delete the entire directory and afterwards use my registry cleaner 'RegSupreme Pro' to remove all traces. :)
Like you, I install each one into a new directory, named Poco34_build n° (n° being the build number).

As for the test version you can set the accounts to 'leave mail on server' and afterwards you can just download the messages into your original version.

It's what I did to check PocoMail & Barca during the betas. :wink:

Posted: Tue Apr 19, 2005 12:52 pm
by Michael
seabc wrote: Here is the junk mail scoring section for the next piece of spam I received. This indicates nothing about Bayesian filtering?!

X-Poco-Score-Detail: +3 [X-MAILER=] (X-Mailer )
X-Poco-Score: +15
X-Poco-Score-Detail: +2 [FROM=%ADDRESSBOOKS%] (From %addressbooks%)
X-Poco-Scored: +15
X-Poco-Score-Exceeds: 10

The spam score is 10 higher than the two components shown above suggest because of another filter that adds 10 points to anything that comes to the address this one did, but that isn't from someone in the addressbook. (I only use that filtering rule for one of my two primary email addresses)


When the Bayesian filter runs, it adds an "X-Poco-Score-Detail" header with text of the form
Code:
[%BAYES%=P=0;T=90;BIAS=+20]


If you have set the Good Mail Score to 0 in the Bayesian filters, then no scoring filter entry will be added when the Bayesian filters consider the message to be good (which, I suspect, is the case here).
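
One way to check a message for that entry is to search its headers for the %BAYES% marker and split out the fields. This is a sketch, not Poco's own logic: the thread does not explain what P, T, and BIAS mean, so the code only extracts them as strings. The name `parse_bayes_detail` is invented.

```python
import re

# Matches the bracketed entry quoted above, e.g. [%BAYES%=P=0;T=90;BIAS=+20]
BAYES_PATTERN = re.compile(r"\[%BAYES%=([^\]]*)\]")


def parse_bayes_detail(header_text: str):
    """Return the Bayes fields as a dict, or None if no Bayes entry is present."""
    match = BAYES_PATTERN.search(header_text)
    if match is None:
        return None
    fields = {}
    for part in match.group(1).split(";"):
        name, _, value = part.partition("=")
        fields[name.strip()] = value.strip()
    return fields


print(parse_bayes_detail("X-Poco-Score-Detail: [%BAYES%=P=0;T=90;BIAS=+20]"))
print(parse_bayes_detail("X-Poco-Score-Detail: +3 [X-MAILER=] (X-Mailer )"))
```

A result of None means only that no Bayes entry was written, which by the explanation above can also happen when the filter ran and judged the message good.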

Posted: Wed Apr 20, 2005 6:40 am
by seabc
That is interesting. Thanks

My Good Mail score is set at 0, the default, I believe. So that would explain why there might be no line showing the use of Bayesian filters. Does it have an impact on whether Bayesian filters even run?

I will try setting the Good Mail score a bit higher, and wait for some spam. Hopefully, it will show that the filters are running, and that I need to be more thoughtful in my training.

Posted: Wed Apr 20, 2005 8:11 am
by seabc
Now I can see that Bayesian filters are working. I cannot be sure if they were running before, though as far as I can recall, they have never had any visible impact on junk mail scores, for either messages that were categorized as good or as junk (based on headers). I will look for emails that are judged to be spam, and see if there is any indication of Bayesian filters.

Unfortunately, now that I have set the Good Score a bit above zero, Pocomail decided that my first two spams were good mail, so I need to look into retraining, and I guess I have to jigger the Good Score and Junk Score.

I wish there was better documentation on how to do this well. The support forums seem to have lots of advice, but nothing particularly concise.

Findings

Posted: Wed Apr 20, 2005 9:34 am
by seabc
Once I set the Good Score above zero, I can see that the Bayesian Filters do work for email judged to be either good or junk. Now "good messages" have their junk scores reduced by a point, and junk messages have their scores increased by 100. Neither of these outcomes happened before I changed the Good Score to be something other than zero.

So it appears to me that the default settings, which include a Good Score of 0, produce the result of making Bayesian Filters not work (on my machine at least, with whatever else I may have done), for good messages or spam.

Posted: Wed Apr 20, 2005 2:40 pm
by Michael
If you are using any of the non-Bayesian junk mail filtering features of Poco (such as banned senders, banned subject, etc.) you might want to examine the scores of the messages to see whether or not a score of 100 is appropriate for Bayesian-identified messages. This won't affect items that force mail to be classified as junk, but it will affect items that are designed to designate mail as good (allowed senders, allowed receivers and negative scores in the junkbody.txt file). It will also affect any filters where you decrease the junk score (or any scripts that do this).
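
The interaction described above can be illustrated with plain arithmetic. This is a toy model, not Poco's actual scoring code: it assumes the contributions are simply summed and compared against the cutoff of 10 seen in the earlier headers. The -20 for an allowed sender and the alternative Bayesian score of 25 are invented values.

```python
JUNK_THRESHOLD = 10  # taken from X-Poco-Score-Exceeds; assumed to be the cutoff


def is_junk(contributions: list[int]) -> bool:
    """Toy model: a message is junk when the summed score exceeds the cutoff."""
    return sum(contributions) > JUNK_THRESHOLD


ALLOWED_SENDER = -20  # hypothetical negative score meant to mark mail as good

# With a Bayesian junk score of 100, the negative score cannot rescue the message.
print(is_junk([100, ALLOWED_SENDER]))  # 80 > 10

# With a smaller, hypothetical Bayesian junk score of 25, it can.
print(is_junk([25, ALLOWED_SENDER]))  # 5 <= 10
```

Under these assumptions, a Bayesian junk score of 100 overrides any "good" adjustment smaller than 90 points, which is why it is worth checking against the other scores in use.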

Posted: Thu Apr 21, 2005 3:01 am
by seabc
Thanks for helping me get this resolved.

Now that I have responded to what appears to be an undocumented feature in PM (no Bayesian filtering if the Good Score is set to zero), I am finding that nearly every email is being classified as good by Bayesian Filtering. My training has resulted in filters that pass almost everything. Is there a good, concise description of how best to train PM so I will get a positive contribution from Bayesian filtering?