List:       kmail-devel
Subject:    [Bug 46826] Bayesian spam filter feature
From:       <lakeland () acm ! org>
Date:       2003-07-24 3:42:59

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
     
http://bugs.kde.org/show_bug.cgi?id=46826     




------- Additional Comments From lakeland@acm.org  2003-07-24 05:42 -------
Subject: Re:  Bayesian spam filter feature

On Mon, 21 Jul 2003 11:07, you wrote:

> > > I do not want to a spamassassin server running on my box. I do not want
> > > to install extra things just to make kmail work properly. I just want a
> > > default spam filter that can be run easily.
> >
> > Is installing kmail and bogofilter really more complex than installing
> > mozilla and its filters?
>
> Yes it is. You have to know that bogofilter is something that can
> plug-in into kmail (I've just learned something in fact), you have to
> search a bogofilter package for your distro/system and then install
> bogofilter. Yeah, very simple.

% apt-cache search spam | grep -i filter
amavisd-new - Interface between MTA and virus scanner/content filters
blackhole-exim - Spam filter - exim version
bogofilter - a fast Bayesian spam filter
crm114 - The Controllable Regex Mutilator and Spam Filter
ifile - Mail filter capable of learning
mailfilter - A program that filters your incoming e-mail to help remove spam.
pyzor - spam-catcher using a collaborative filtering network
razor - spam-catcher using a collaborative filtering network
spamassassin - Perl-based spam filter using text analysis
spamc - Client for perl-based spam filtering daemon
spamfilter - Filter spam from incoming mail
spamoracle - A statistical analysis spam filter based on Bayes' formula
spamoracle-byte - A statistical analysis spam filter based on Bayes' formula
spamprobe - a C++ Bayesian spam filter
blackhole-qmail - Spam filter - qmail version
qmail-qfilter - qmail-queue filter front end

% apt-get install bogofilter

Perhaps two minutes' work?

> My problem is that I don't want to install another program just to make
> kmail work, I don't have the root password on my workstation, and I
> simply don't have the time : searching for the documentation, reading
> the right documentation, and set up everything : half a day lost.

Then get your sysadmin to install a spam filter.  It just seems this problem 
is not related to kmail, but to the ease of installing software.

> It depends on who you think your users are. Basic users do not want to
> know anything about bogofilter, spamprobe, spamassassin or crm114. I am
> sure my girlfriend will be happy to know more about all these anti-spam
> tools and test them ;-)

*shrug*, then have a default (spamassassin I guess, since it doesn't require 
configuration, training, or correcting).

> Seriously, what is probably wanted by common users is just a simple simple
> spam filter that comes by default and moves the most annoying messages like
> "VIAGRA" "PENIS ENLARGEMENT" "COME TO NIGERIA" in a directory entitled
> "spam_is_here".

Simple spam filters do not work.  Seriously.  If they did then this _might_ be 
an acceptable solution.  Since even outlook has this, almost every spam is 
designed to avoid basic filtering.  If you don't have better spam filtering 
than this, you may as well not have spam filtering.

> Isn't that possible to have a default very basic Bayesian built-in

Yes, but.

a) Bayesian filters need training.  A training file _could_ be sent with kmail 
but it would add another 20MB to kmail's distribution size which would be 
unacceptable. crm114 is a bit of a winner here since it would only add 1MB.

b) Bayesian filters need to be constantly updated/corrected.  Static Bayesian 
filters are no better than spamassassin.  And in a year they'll be useless.

c) That means writing and maintaining a bayesian filter in kmail's code, which 
seems like unnecessary work duplication to me.

I think that most people checking email get their spam filtered by their mail 
provider.  The ISP runs spamc, and modifies the headers.  This makes 
filtering in kmail really trivial.  In my case I run my own mail server for 
myself and family.  I still do not do filtering in kmail because then I'd 
have to configure it for everyone.  Instead I have it run from exim (via 
procmail).  Procmail is also used to filter the spam into everyone's spam 
boxes.  So the only feature I really needed for kmail was correcting 
mistakes.

Getting back to your girlfriend example:

If the mail is spamc'ed by your ISP then you don't have a problem.  Kmail has 
been able to filter this for years.
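
To make it concrete what "filtering on the header" amounts to, here is a 
rough shell sketch of the same check.  It assumes the server adds an 
"X-Spam-Flag: YES" header (the usual SpamAssassin marker; your ISP may use 
something else), and is_tagged_spam is just a name made up for this example:

```shell
# Sketch only: succeeds if the message file was already tagged as spam upstream.
# Assumption: the server marks spam with "X-Spam-Flag: YES".
# is_tagged_spam is a hypothetical helper, not part of kmail or spamassassin.
is_tagged_spam() {
    # Look only at the header block, i.e. everything up to the first blank
    # line, so that text in the message body cannot trigger a match.
    sed '/^$/q' "$1" | grep -qi '^X-Spam-Flag:[[:space:]]*YES'
}

# Usage: is_tagged_spam message.eml && echo "move to spam folder"
```

A kmail filter rule on that header does the same job without any scripting.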

If your ISP doesn't run spamc but you do have root on localhost, then install 
bogofilter or similar, and copy the procmail example from the bogofilter 
manpage.  E.g. I have the following procmail recipe for my brother-in-law.  
As far as he is concerned, spam detection happens automatically and nothing 
is needed in kmail:
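	MAILDIR=$HOME/Maildir
	LOGFILE=$HOME/.procmail_log

	# Pipe every message through spamc as a filter (f) and wait for it (w).
	:0fw
	| /usr/bin/spamc

	# File tagged messages in the spam folder.  Stock SpamAssassin writes
	# "X-Spam-Status: Yes, ..."; adjust the pattern to match your setup.
	:0:
	* ^X-Spam-Status: SPAM
	.spam/

	# Everything else goes to the inbox.
	:0:
	./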

If your ISP doesn't run spamc, and you don't have root, and no spam tools are 
installed, then you have some hassle.  But downloading and installing 
bogofilter is less than an hour's work, and spamassassin is similar.

So, my personal opinion is there are very few people who need to have spam 
detection built into kmail instead of as an add-on.  Maintaining a good 
spam filter in kmail would be a lot of work, and the only people to benefit 
would be people without root who don't know how to install software, and 
people with root who don't know what software to install.  Most Linux 
distributions install spamassassin as part of a 'mail server' service, so 
most people don't need to install anything.

It could be better documented.  It could be made much easier to set up 
(automatic creation of spam folder, autodetection and integration of 
installed spam tools).  It could be better integrated (right click to mark as 
spam, or mark as non spam). But I don't think a new filter is needed.

Corrin
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE/GzN0i5A0ZsG8x8cRAn2+AJ4gGmK6RdjDO+nWkjolKtewBTLbkwCglh/A
Wsjo4xon+AZbzpQCwN7FSes=
=V7+e
-----END PGP SIGNATURE-----
_______________________________________________
KMail Developers mailing list
kmail@mail.kde.org
http://mail.kde.org/mailman/listinfo/kmail