
List:       spamassassin-users
Subject:    Re: Bayes auto-learning a bad idea?
From:       Benny Pedersen <me () junc ! org>
Date:       2011-09-28 12:57:34
Message-ID: 8c21cea366e596b046d118c53aff0c81 () junc ! org

On Wed, 28 Sep 2011 14:30:32 +0200, Lars Jørgensen wrote:
> On 28-09-2011 13:20, Benny Pedersen wrote:
>>> I train Bayes manually on the borderline cases, but also have
>>> auto-learning enabled. Is that really a bad idea? Should I disable it,
>>> delete the bayes-databases and start over on manual-only learning?
>>
>> no training is always good
>
> Are you missing a comma? Do you mean "no, training is always good" or
> "no training is always good"?

no, just my boolean algebra and english are bad :)

>> what scores are you learning on? default is 0.1 and 12.0, i have
>> changed them here to -4 and 14
>
> Can't find any settings to that effect, so I guess I am using
> defaults. I have entered your settings in my config now.

perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold
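
as a rough sketch, assuming the two scores mentioned above map onto the
plugin's nonspam and spam thresholds, the local.cf lines would look something
like this (check the perldoc above for the option names on your version):

```
# local.cf: auto-learn as ham only below -4, as spam only above 14
bayes_auto_learn_threshold_nonspam -4
bayes_auto_learn_threshold_spam 14
```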

>
> Looking at
> 
> http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html#learning_options
> i see an option called "bayes_use_hapaxes" that promises
> significantly better hit-rates, but also increases database size by a
> factor of 8 to 10. What is the recommendation on this?

don't know for sure what is best there, i am using the default here

perldoc Mail::SpamAssassin::Plugin::Bayes
perldoc Mail::SpamAssassin::Conf

for 3.3.1 and above i add in local.cf

bayes_auto_learn_on_error 1

this reduces bayes poisoning and load, since auto-learning then only happens
when bayes disagreed with the auto-learn verdict

> If throughput
> is a factor in this decision, we are scanning about 60,000 to 90,000
> mails a day.

more than my server handles now

>
>> what plugins have you enabled ?
>
> DCC
> pyzor/razor
> SpamCop
> AutoLearnThreshold
> TextCat
> MIMEHeader
> ReplaceTags
> DKIM
> Check
> HTTPSMismatch
> URIDetail
> Bayes
> All the EvalTest plugins
> VBounce
> ImageInfo
> FreeMail
>
>> 3rd party rules or just default sa 3.3.2?
>
> Default and Sought Rules.

that should be safe enough not to cause bayes any problems

tip: if you want to restart bayes learning, you can do it like this:

sa-learn --dump magic

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)

and set each one to 200 more than the nham/nspam count listed in dump magic,
this ensures that bayes goes back into learning mode (it will not score mail
until it has learned that many more messages)
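
for illustration only, with made-up numbers: suppose dump magic reports
nspam = 1500 and nham = 2300. under that assumption the local.cf entries
would be roughly:

```
# local.cf: hypothetical counts, replace with your own dump magic values
# nspam 1500 + 200
bayes_min_spam_num 1700
# nham 2300 + 200
bayes_min_ham_num 2500
```

run sa-learn --dump magic again later to see when the counts have passed
these values.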

