List:       spamassassin-users
Subject:    bayes expiry and token count bug
From:       Kai Schaetzl <maillists () conactive ! com>
Date:       2008-09-28 16:31:15
Message-ID: VA.000033b8.09939681 () news ! conactive ! com
There must be a bug in the way the token reduction gets calculated.
I see this on several of my bayes databases.

Example (excerpts):

sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      62759          0  non-token data: nspam
0.000          0      43000          0  non-token data: nham
0.000          0    1796131          0  non-token data: ntokens

local.cf:
bayes_expiry_max_db_size 1000000

sa-learn --force-expire -D
[16049] dbg: bayes: expiry check keep size, 0.75 * max: 750000
[16049] dbg: bayes: token count: 0, final goal reduction size: -750000
[16049] dbg: bayes: reduction goal of -750000 is under 1,000 tokens, skipping expire
[16049] dbg: bayes: expiry completed

There are 1796131 tokens, but sa-learn thinks there are 0. Or am I 
misinterpreting this?
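
The debug output above implies some simple arithmetic; here is a minimal 
sketch of it (names are illustrative, not SpamAssassin's actual code), 
which reproduces the -750000 goal when the token count is wrongly read 
as 0:

```python
# Sketch of the reduction-goal arithmetic implied by the -D output.
# reduction_goal is a hypothetical helper, not a SpamAssassin function.

def reduction_goal(token_count, max_db_size):
    """Return (keep_size, goal) as the debug lines report them."""
    keep_size = int(0.75 * max_db_size)  # "expiry check keep size, 0.75 * max"
    goal = token_count - keep_size       # "final goal reduction size"
    return keep_size, goal

# With the real token count, the goal would be large enough to expire:
print(reduction_goal(1796131, 1000000))  # (750000, 1046131)

# With the bogus count of 0, the goal goes negative, so expiry is
# skipped ("reduction goal of -750000 is under 1,000 tokens"):
print(reduction_goal(0, 1000000))        # (750000, -750000)
```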

I can sometimes get it to start an expiry by changing 
bayes_expiry_max_db_size to some other value. E.g. on a database with 3.5 
million tokens it saw 0 tokens with a limit of 100,000, but saw the 
correct number of tokens when I changed the limit to 1,000,000. 
Unfortunately, the typical expiry failure then kicks in ("couldn't find a 
good delta atime, need more token difference, skipping expire").
As this has bugged me for quite some time: given that there appears to be 
a bug in the basic token count, is there perhaps also a bug in the expiry 
procedure itself?


Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com


