Poco Forums • View topic - Spam filtering app that works well with Pocomail?

Spam filtering app that works well with Pocomail?

General email topics, from anti-virus and anti-spam software to webmail and ISPs

Moderators: Eric, Tomas, robin

Postby jatkins679 » Sun Oct 03, 2004 3:03 pm

Howdy. I'm looking for a spam filtering application that will play nice with Pocomail. Anyone have any good experiences with one?

Thanks,

J
jatkins679
Drop-in Visitor
 
Posts: 11
Joined: Sat Aug 14, 2004 12:27 pm

Postby tribble » Sun Oct 03, 2004 3:42 pm

Well, I've had great luck with K9 but the question BEGS to be asked, what's wrong with the built-in filtering of PocoMail?
tribble
Poco Enthusiast
 
Posts: 430
Joined: Wed Jul 28, 2004 8:55 am

Postby speerga » Mon Oct 04, 2004 5:05 am

Let me second the recommendation of K9. Works well. Easy to set up.

As for that question begging to be asked -- I didn't open the thread, but I'll give the answer I've posted in several other places: Until Pocomail's Junk filtering can be done out of the box, why wouldn't someone looking for a good spam filter try K9 or one of the other spam filters?

I've tried, trained, combined tools until I'm blue in the face -- and the best I can ever get and sustain for spam filtering using Pocomail's built-in tools is about 80-90 percent. And generally that slips off as I continue to use Poco's tools instead of K9.

With K9, within 2 weeks I had it trained above 90 percent -- and it's never been below 98 percent in the last four months. I can't remember the last time K9 misidentified "good" mail as spam, either.

Gary Speer

P.S. Perhaps this belongs in the Junk Mail filtering topic, or in the E-mail related forum?
speerga
Resident Poster
 
Posts: 116
Joined: Wed Jul 28, 2004 1:49 pm
Location: Springfield, Missouri

Postby Eric » Mon Oct 04, 2004 6:10 am

speerga wrote:P.S. Perhaps this belongs in the Junk Mail filtering topic, or in the E-mail related forum?
You're right, Gary. It belongs to Email Hall. :wink:
Here are some links to free spam filtering apps:
- K9
- Spamihilator
- PopFile
- SpamPal
- Mailwasher

I've used all of them in the past, except Popfile, but I no longer need them. Even my Spam Blackout has been put out of service. Strict Bayesian in Poco is doing an excellent job with my spam. :D
Eric
 

Postby speerga » Mon Oct 04, 2004 6:21 am

Eric wrote:I've used all of them in the past, except Popfile, but I no longer need them. Even my Spam Blackout has been put out of service. Strict Bayesian in Poco is doing an excellent job with my spam. :D


All right, Eric. It probably isn't going to help me to ask this, as I've tried every idea anyone on these forums has mentioned (if I understood them?), but here goes: What are your settings in Pocomail's Junk Mail Filter to get those great results?? :shock:

I've tried everything mentioned that I was able to understand, and nothing's really worked well -- no combination of Bayesian plus other Poco options, and no strict Bayesian settings alone have even begun to match the 98 percent or so accuracy I get with K9.

But I WANT to use Poco's Junk Mail filtering because it's built-in and seems like it SHOULD work for me. About 85-90 percent of all the spam I get comes to one account (one I can't cancel because of business reasons). I guess I'd be in pretty good shape if I just created a filter to send the email from that account to Junk Mail and check it once in a while. But it really seems like some sensitivity settings and training of the Bayesian filter should work??

Gary Speer, weary of spam talk ( :lol: )
speerga
Resident Poster
 
Posts: 116
Joined: Wed Jul 28, 2004 1:49 pm
Location: Springfield, Missouri

Postby Eric » Mon Oct 04, 2004 6:51 am

:roll: Well, Gary, I had some time to try several settings during the Poco Beta.
speerga wrote:All right, Eric. It probably isn't going to help me to ask this, as I've tried every idea anyone on these forums has mentioned (if I understood them?), but here goes: What are your settings in Pocomail's Junk Mail Filter to get those great results?? :shock:
Here it goes:
- Sensitivity level set to High
- Bayesian Junk threshold: 0.99
- Good mail bias: 3.0
Until now Poco has learned 5439 junk words and 2603 good words.
Also, every newsletter I'm subscribed to (or going to subscribe to) is automatically filtered to the right mailbox. And I do have quite a lot of subscriptions. These HTML newsletters will almost always be considered spam if no filter is present.
Currently at 98.8 %. :D
Eric
 

Postby tribble » Mon Oct 04, 2004 8:40 am

The question begs to be asked as speerga's situation is not the same for everyone. For me, the current numbers are: 98.72% accuracy over 794 messages (I reset this when I installed the 2000 build last week).

My junk corpus is 55,709 words and good is 18,505.

I had >95% after four days when originally set up.

Tools -> Junk Mail Filtering:
Enable automatic Junk Mail filtering
High Sensitivity

On the General settings tab:
Filter Settings, Custom Sensitivity = Highest
Run standard non-Bayesian filters is selected.

On the Bayesian tab:
Run learning Bayesian filters is selected
Junk threshold = 0.99
Good mail bias = 2.0

After classifying a message give the following score:
Junk score 100
Good score -100
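
For readers wondering what a junk threshold of 0.99 does in practice, here is a minimal sketch in Python. It assumes the textbook naive Bayes rule for combining per-word junk probabilities; PocoMail's internals are not described in this thread, so `combined_junk_probability` and `classify` are hypothetical names, not Poco's own.

```python
from math import prod

JUNK_THRESHOLD = 0.99  # the "Junk threshold" value quoted in the settings above


def combined_junk_probability(word_probs):
    """Combine per-word junk probabilities, naive Bayes style.

    Assumption: this is the textbook combination rule, not necessarily
    the one PocoMail uses internally.
    """
    junk = prod(word_probs)
    good = prod(1.0 - p for p in word_probs)
    if junk + good == 0.0:
        return 0.5  # contradictory evidence (some p == 0 and some p == 1)
    return junk / (junk + good)


def classify(word_probs, threshold=JUNK_THRESHOLD):
    """Return "junk" only when the combined probability reaches the threshold."""
    if combined_junk_probability(word_probs) >= threshold:
        return "junk"
    return "good"


if __name__ == "__main__":
    print(classify([0.99, 0.95, 0.90]))  # several strongly junk-leaning words
    print(classify([0.60, 0.55, 0.70]))  # only mildly junk-leaning words
```

With a threshold this high, a message is marked junk only when the evidence is overwhelming, which trades some missed spam for fewer good messages landing in Junk Mail.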
tribble
Poco Enthusiast
 
Posts: 430
Joined: Wed Jul 28, 2004 8:55 am

Postby speerga » Mon Oct 04, 2004 9:42 am

I thank you, Eric and tribble, for giving us that information. You tempt me to try Pocomail's Junk Mail filtering one more time. :lol:

We'll see what happens.

Gary Speer

EDITED TO ASK: I note Eric's good mail bias is at 3.0 and tribble's is at 2.0.

Can you explain what this means? How it works? Why the differences? What should I try for mine? And how much difference will the 2.0 or 3.0 make? Thanks.
speerga
Resident Poster
 
Posts: 116
Joined: Wed Jul 28, 2004 1:49 pm
Location: Springfield, Missouri

Postby tribble » Mon Oct 04, 2004 9:49 am

I hope Eric or Jim or Slaven can answer your questions, speerga. All I know is, it works :-)
tribble
Poco Enthusiast
 
Posts: 430
Joined: Wed Jul 28, 2004 8:55 am

Postby Eric » Mon Oct 04, 2004 9:55 am

speerga wrote:EDITED TO ASK: I note Eric's good mail bias is at 3.0 and tribble's is at 2.0.
Can you explain what this means? How it works? Why the differences? What should I try for mine? And how much difference will the 2.0 or 3.0 make? Thanks.
I don't know much about this Bayesian stuff, but if you set the good mail bias higher, you increase the chance that junk gets classified as good mail, I think. :roll:
Help | Contents wrote:Good Mail Bias is also used internally during calculation of word probabilities. It will multiply the probability of any good words by the set value to give good words more power over junk words. This is another way to minimize false positives.
You'll have to play a bit with it, until you find the setting which works for you. :wink:
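
The Help text quoted above can be illustrated with a small Python sketch. It assumes a Paul Graham style word probability in which the good-mail evidence is multiplied by the bias before comparing it with the junk evidence; the function name and parameters are hypothetical, and Poco's real formula may differ.

```python
def word_junk_probability(junk_hits, good_hits, junk_msgs, good_msgs, good_bias=2.0):
    """Estimate how strongly a single word points to junk.

    junk_hits / good_hits: how often the word appeared in junk / good mail.
    junk_msgs / good_msgs: how many junk / good messages were trained on.
    good_bias: multiplier applied to the good-mail evidence (assumed to
    correspond to the "Good mail bias" setting).
    """
    junk_freq = junk_hits / junk_msgs if junk_msgs else 0.0
    good_freq = good_bias * good_hits / good_msgs if good_msgs else 0.0
    if junk_freq + good_freq == 0.0:
        return 0.5  # word never seen: no evidence either way
    return junk_freq / (junk_freq + good_freq)


if __name__ == "__main__":
    # Same word, seen equally often in junk and good mail, under three biases.
    for bias in (1.0, 2.0, 3.0):
        p = word_junk_probability(10, 10, 100, 100, good_bias=bias)
        print(f"good_bias={bias}: junk probability {p:.3f}")
```

Under this assumption, raising the bias from 2.0 to 3.0 lowers every word's junk probability, so fewer good messages are misfiled, at the cost of more junk slipping through.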
Eric
 

Postby speerga » Mon Oct 04, 2004 5:25 pm

Ah, well. Not very encouraging results. I closed K9 earlier today (yesterday, I guess now!) and re-enabled Pocomail's Bayesian filtering with the junk threshold set at 0.99 and the good mail bias at 2.0.

I'm working with a "good words" training around 32,000 and "bad words" of over 40,000.

For the first dozen or so emails after I started Poco's Bayesian junk filtering, I had 3 junk mails not put into Junk Mail and the other 8 or 9 correctly identified.

In the next few hours, I had ONE spam correctly identified and put in Junk Mail and 42 spam NOT correctly identified. After I went through all 42 clicking the button to identify each as spam and move it to the Junk Mail folder, two things happened: 1) I received two different "free Dell computer" spam which were almost identical to each other and which were almost identical to one correctly identified as spam earlier in the day -- but neither of those two were caught by the filter; and, 2) I had three "access violation" boxes pop up -- which only seems to happen when I've clicked to identify and move a lot of spam.

What a waste of time. Maybe some day Pocomail's Bayesian filter really will work without thousands of email and weeks of training. But at least for me that day hasn't even gotten close yet. I'm happy for those of you who get it working. I dunno how you do it or did it, but I'm happy for ya! :shock:

Gary Speer, going back to K9 first thing in the morning.
speerga
Resident Poster
 
Posts: 116
Joined: Wed Jul 28, 2004 1:49 pm
Location: Springfield, Missouri

Postby J-Mac » Tue Oct 12, 2004 5:46 pm

Hi Gary.

If you search, you can see all my posts where I spent a couple of months researching everyone's PM Junk filtering setups and then setting up my own BF in Pocomail. I could never get above the 85-88% level. When I tried going with the strict Bayesian, it actually got worse!

The odd thing is, several users are getting widely disparate results using what appear to be identical settings. I can't figure it out!

I just started using K9 about two and a half weeks ago. It got up to the mid-90's real quickly! Since then it seems to have good days, and better days. I'm hovering between 92 and 93%. Somehow I'm disappointed!

In reality, 93% is much better than I ever achieved with Pocomail's junk filtering, and much, much better than I ever achieved with iHateSpam from Sunbelt Software when I was using Outlook as my mail client!

I say I feel somewhat disappointed because I keep reading posts from you and several others who claim to get 98%+! I must not have my K9 set up for optimum performance.

It's currently trained with over 29,000 good words and 13,500 spam words. I have it configured to add [Spam] to the start of the subject, so I can then let Pocomail filter it to the Junk Mail folder. While I have PM's Junk Mail Filters disabled, I set up a regular filter to look for [Spam] in the subject line and move it to the Junk mail folder. I don't have a "Whitelist" or "Blacklist" set up, but I do have the "Use a DNS blackhole list to help identify spam" checked, and it's using the default sbl-xbl.spamhaus.org list server, and the "Days to cache IPs" is set at 5. I have no other "Tweaks" or features enabled.
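
The tag-then-filter setup described above can be sketched in Python. This is only an illustration of the rule "subject starts with [Spam], so move to Junk Mail"; it is not how K9 or Pocomail implement it, and `route_message` and the `In` folder name are hypothetical.

```python
from email import message_from_string

SPAM_TAG = "[Spam]"        # the tag K9 is configured to prepend to the subject
JUNK_FOLDER = "Junk Mail"  # destination folder named in the post
DEFAULT_FOLDER = "In"      # hypothetical name for the normal inbox


def route_message(raw_message):
    """Pick a folder based on whether the subject starts with the spam tag."""
    msg = message_from_string(raw_message)
    subject = (msg.get("Subject") or "").strip()
    if subject.startswith(SPAM_TAG):
        return JUNK_FOLDER
    return DEFAULT_FOLDER


if __name__ == "__main__":
    tagged = "Subject: [Spam] Free Dell computer\n\nClick here"
    normal = "Subject: Meeting notes\n\nSee attached"
    print(route_message(tagged))
    print(route_message(normal))
```

Matching only at the start of the subject keeps replies that merely quote a tagged subject line out of the junk folder.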

Is there anything else I can do to creep up toward that 98% figure?

Thanks!!
J-Mac
J-Mac
Poco Enthusiast
 
Posts: 356
Joined: Wed Jul 28, 2004 9:54 pm
Location: The Great State of Pennsylvania, in the Merry Old Land of Oz!

Postby speerga » Wed Oct 13, 2004 6:13 am

J-Mac wrote:I say I feel somewhat disappointed because I keep reading posts from you and several others who claim to get 98%+! I must not have my K9 set up for optimum performance.

It's currently trained with over 29,000 good words and 13,500 spam words. I have it configured to add [Spam] to the start of the subject, so I can then let Pocomail filter it to the Junk Mail folder. While I have PM's Junk Mail Filters disabled, I set up a regular filter to look for [Spam] in the subject line and move it to the Junk mail folder. I don't have a "Whitelist" or "Blacklist" set up, but I do have the "Use a DNS blackhole list to help identify spam" checked, and it's using the default sbl-xbl.spamhaus.org list server, and the "Days to cache IPs" is set at 5. I have no other "Tweaks" or features enabled.

Is there anything else I can do to creep up toward that 98% figure?

Thanks!!


Hmm. Sounds like K9's getting there. I honestly can't remember how long it took me to achieve the 98 percent + level I'm getting now. Probably a couple of weeks longer than I remember. :lol:

I know K9 initially, i.e., within a couple of weeks, shot up past 90 percent with 1 percent or fewer false positives (good mail misidentified as spam). But then it probably slowed down. I've been running it since mid-April most of the time -- except for a few weeks out while I tried on several occasions to go with Pocomail's Bayesian filtering.

I don't use the "black hole" feature; but I do use white list/black list features. I'm not sure how much those two features actually help, since most of the black listing is spam email addresses, which usually are fake and change from spam-to-spam anyway.

I would suggest you click the little "dog" in the corner of K9's screen and go to the home page and look it all over carefully. If you'll note on that page, they even have a user forum now. There's sometimes some helpful stuff there.

Sorry I don't have more specific help. Probably you'll see the stats get better as you continue using it, just as with all the Bayesian filters.

Gary Speer
speerga
Resident Poster
 
Posts: 116
Joined: Wed Jul 28, 2004 1:49 pm
Location: Springfield, Missouri

Postby tribble » Wed Oct 13, 2004 9:13 am

It would be VERY interesting to see PSI do an analysis on your poor Poco BF results versus those of us that get good results.

It seems to me, all things being somewhat equal, that Poco's BF should work equally well across the spectrum and if it doesn't, 1) you have an absolutely unique config/mail environment or 2) something isn't quite right with the Poco BF.

If K9 gives you good results, you should be able to obtain similar results with Poco BF, as many of us did. I used K9 for some time until the BF release in Poco. I was very happy with K9, but it required too much maintenance (clearing the logs and such). After just a few days with Poco BF my results went well above 90%, and I now retain over 98%.

I'd like to try a simple experiment. Could I send you guys my spam and good corpus, so you can replace your own and see if the results improve (or vice versa, as I receive, on average, 300 spam messages a day)?

I reset my results for each beta and final release and within a day or two am right back to >98%.

I have >56K junk words and >18K good words in my corpora, with 98.83% accuracy today.
tribble
Poco Enthusiast
 
Posts: 430
Joined: Wed Jul 28, 2004 8:55 am

Postby stuart_rogers » Sun Oct 24, 2004 10:29 am

I have just read this thread with interest. I currently use Mailwasher as I don't like the idea of downloading all emails in full (viruses included) and junking them on my computer. Yes, I do know that MW downloads part of each email, but it still seems a better way to be able to junk the stuff from the server.

Am I correct in understanding that Poco's BF system requires the whole email to be downloaded prior to processing?

Like most people, I would like to use all the features of the software I buy, but in this case it would seem that BF won't work and I'm left with pre-download filters, which need to be set up manually.
stuart_rogers
Drop-in Visitor
 
Posts: 14
Joined: Mon Oct 18, 2004 8:25 pm
