'Re: Architectural problems shown by the anti spam wizard'

List:       kmail-devel
Subject:    Re: Architectural problems shown by the anti spam wizard
From:       Ingo Klöcker <kloecker () kde ! org>
Date:       2004-01-30 20:39:30
Message-ID: 200401302139.33243 () erwin ! ingo-kloecker ! de

On Friday 30 January 2004 02:33, Don Sanders wrote:
> On Thursday 29 January 2004 17:33, Andreas Gungl wrote:
> > Don Sanders wrote:
> > > Hi Andreas,
> > >
> > > On Thursday 29 January 2004 05:45, Andreas Gungl wrote:
> > >>Hi,
> > >>
> > >>for those who are interested I want to describe some problems I
> > >>came across while I worked on some details for the anti spam
> > >> wizard.
> > >
> > > I'll try to answer your questions as best I can. But I'm having
> > > difficulty because I don't understand the intended workflow for
> > > setting up the anti-spam stuff you have been working on.
> > >
> > > Don't get me wrong, it's a critical feature in my opinion, but I'm
> > > lacking clarity on how exactly it is meant to work from the end
> > > user's point of view. Is it intended to work like the Mozilla
> > > anti-spam stuff? (I haven't used that either, but I see lots of
> > > people saying good things about it).
> > >
> > > I don't understand what value the new spam/ham statuses have for
> > > instance.
> >
> > I'll try to explain it in short here:
> > The main problem is that KMail doesn't have built-in spam filtering,
> > in contrast to Mozilla. Installing Mozilla gives you all you need,
> > so you can rely on it alone. KMail wants to cooperate with
> > existing anti-spam tools. The "average user" (whoever that is) has
> > two problems IMO. She would have to check what tools are installed
> > (installation could be prepared by the distributors) and then
> > (perhaps more difficult) she would have to define filter rules to
> > let KMail properly cooperate with the tools.
> >
> > The wizard tries to find installed tools. Then it offers to create
> > filters that use the tools to detect spam mails (basically by
> > piping messages through the tools) and to handle spam messages,
> > e.g. by moving them to trash (by identifying related headers like
> > X-SpamFlag). Bayesian spam tools need to be trained, so the wizard
> > can create rules to let the tools learn ham and spam messages.
> > The filters are created based on information in a config file
> > (kmail.antispamrc, see CVS), so new tools can be added without
> > changing the code.
>
> Ok. I guess it makes sense for the user to have to explicitly
> activate spam filtering before it is used so a wizard that appears
> when the user first tries to classify spam or something makes sense.
>
> > Detection and handling rules are applied on incoming messages and
> > on manual filtering. Classification (learning) is done using ad-hoc
> > filters.
>
> To me ad-hoc filters refers to filters listed in the filter
> configuration dialog with the "Add this filter to the Apply Filter
> Actions menu" checkbox ticked.
>
> I'm not sure whether it is a good idea to involve those with spam
> filters.

Why do you think it's not a good idea? Isn't this a prime example for 
the usefulness of ad-hoc filters? AFAIU the purpose of ad-hoc filters 
is to provide a way for the user to apply filter actions to messages. 
So what's wrong with creating ad-hoc filters for classifying messages 
as spam or ham?

> An alternative would be to create two KActions 'Classify as spam',
> 'Classify as ham' in kmmainwidget, create two corresponding slots*
> KMMainWidget::slotClassifyAsSpam, KMMainWidget::slotClassifyAsHam,
> and in the kernel create two KMFilters, one to mark as spam, one to
> mark as ham. These KMFilters would be deleted and replaced by new
> KMFilters if spam tool options were changed (via the wizard or a
> configuration dialog or whatever).
>
> To make sure the KMFilters are applied to incoming messages the
> KMFilterMgr (which should eventually be obsolete) and the
> ActionScheduler (which is the replacement) could be updated.

I don't think that this is a good idea. The user should be able to
control when the spam filter is applied. For example, I filter first
for all KDE mailing lists and only then do I check the remaining
messages for spam. Since checking a message for being spam takes a long
time (several seconds) I don't want to check all messages for spam.
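
To illustrate the ordering argument, here is a minimal sketch (not KMail
code). Message, Rule, applyRules and countSpamChecks are hypothetical
names, and the "kde.org" list check is only a placeholder. It assumes
filters run in a user-defined order and that a matching filter can stop
further processing, so the expensive spam check only runs for messages
that no earlier filter has claimed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified message.
struct Message {
    std::string listId;  // empty if the mail did not come from a list
    std::string body;
};

// Hypothetical filter rule: a predicate plus a "stop here" flag.
struct Rule {
    std::function<bool(const Message&)> matches;
    bool stopOnMatch;
};

// Tries the rules in the user's order. Returns the index of the rule that
// stopped processing, or -1 if every rule was tried without stopping.
int applyRules(const std::vector<Rule>& rules, const Message& msg) {
    for (std::size_t i = 0; i < rules.size(); ++i) {
        if (rules[i].matches(msg) && rules[i].stopOnMatch)
            return static_cast<int>(i);
    }
    return -1;
}

// Counts how often the (expensive) spam check runs for a given inbox when
// the mailing-list rule is ordered before the spam rule.
int countSpamChecks(const std::vector<Message>& inbox) {
    int spamChecks = 0;
    std::vector<Rule> rules = {
        // Cheap rule first: claim list mail and stop.
        {[](const Message& m) {
             return m.listId.find("kde.org") != std::string::npos;
         },
         true},
        // Stand-in for piping the message through a spam tool.
        {[&spamChecks](const Message&) { ++spamChecks; return false; },
         false},
    };
    for (const Message& m : inbox)
        applyRules(rules, m);
    return spamChecks;
}
```

With two list mails and one other mail in the inbox, the spam check runs
only once. If the order were hardcoded so that the spam check always ran
first, it would run three times.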

Also I don't understand what the advantage of hardcoding the actions
would be. I mean, why did you invent ad-hoc filters if you now propose
to hardcode the spam classification action, although ad-hoc filters are
perfectly suited for this task?

> > You may end up with some (many?) unused or invalid action
> > entries for the toolbar in the XMLGUI resource file. As soon as
> > you have any new action named like such an entry, a toolbar button
> > will show up even if you never intended to create one.
>
> If each KMMainWidget has two fixed KActions for classifying as
> spam/ham then would this problem still exist?

Of course not. But the problem is of a general nature, i.e. it affects
all ad-hoc filters: create an ad-hoc filter, add it to your toolbar,
delete the ad-hoc filter, then create a new ad-hoc filter with the same
name. Result: the new ad-hoc filter will show up in the toolbar although
the user has never added it to the toolbar. So it has to be solved
anyway, regardless of whether we introduce hardcoded actions for
spam/ham classification.

> I'm not sure I understand, this problem is due to the QObject name of
> the spam classification KActions being variable, correct?

Yes and no. The problem is due to the QObject name of all ad-hoc filters
(not only of the spam classification KActions) because it's neither
unique (the user can create multiple ad-hoc filters with the same name)
nor constant over time (the user can rename ad-hoc filters).
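
A minimal sketch of both failure modes (not KMail code). ToolbarConfig
and AdHocFilter are hypothetical stand-ins for the toolbar section of the
XMLGUI file and for an ad-hoc filter, and the "filter_" prefix is an
assumption. It assumes only that the toolbar remembers action names and
that the action name is derived from the filter's name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the toolbar part of an XMLGUI file: it only
// remembers the names of the actions the user plugged in.
struct ToolbarConfig {
    std::set<std::string> actionNames;

    void plug(const std::string& actionName) { actionNames.insert(actionName); }
    bool shows(const std::string& actionName) const {
        return actionNames.count(actionName) > 0;
    }
};

// Hypothetical ad-hoc filter whose action name depends on the filter name.
struct AdHocFilter {
    std::string name;
    std::string actionName() const { return "filter_" + name; }
};

// Not unique: a new filter inherits the stale toolbar entry of a deleted
// filter that had the same name.
bool staleEntryResurfaces() {
    ToolbarConfig toolbar;
    std::vector<AdHocFilter> filters;

    filters.push_back({"Classify as spam"});
    toolbar.plug(filters.back().actionName());  // user adds the button
    assert(toolbar.shows(filters.back().actionName()));

    filters.pop_back();                         // filter deleted, toolbar
                                                // entry is left behind
    filters.push_back({"Classify as spam"});    // new filter, same name
    return toolbar.shows(filters.back().actionName());
}

// Not constant: renaming the filter changes the action name, so the
// button the user added no longer matches.
bool renamedFilterLosesButton() {
    ToolbarConfig toolbar;
    AdHocFilter filter{"Classify as spam"};
    toolbar.plug(filter.actionName());
    filter.name = "Mark as junk";               // user renames the filter
    return !toolbar.shows(filter.actionName());
}
```

Both functions return true in this model: the never-plugged filter shows
up, and the renamed filter loses its button.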

> > >>   The wizard plugs the action into the toolbar and to make the
> > >>change persistent it modifies the XMLGUI file. This is done by
> > >>manual manipulation of the XMLGUI file, currently there is no API
> > >>which would support automatic write back of the current config by
> > >>the toolbar itself. Of course the user will face the sync
> > >> problems described above sooner or later.
> > >
> > > Not completely sure I understand the sync problems.
> >
> > The toolbar does not change its XMLGUI config file when you plug
> > an ad-hoc filter based action into it. A button would show up after
> > you have plugged the action, but after a restart the button is lost
> > if you don't replug it.
> > The wizard tries to eliminate the restart problem by additionally
> > adding the actions to the resource file.
>
> So again this is due to the QObject name of the spam classification
> KActions being variable?

Not really. This is due to KDE lacking an API for manipulating and/or 
saving the toolbar.

> > >>Another problem might be based on
> > >>translation (i18n) issues when e.g. the user switches the
> > >> language, but I have no concrete scenario for this, it's just a
> > >> guess.
> > >
> > > If you look at the KAction constructor in
> > > initializeFilterActions() the QObject::name() is not i18n()'d so
> > > this should be invariant in the case of the user switching the
> > > language.
> >
> > The action names are based on the filter's name.
>
> So that would seem to be the problem, the action names should be
> fixed rather than dependent on the filter's name.

Exactly.

> > >>Let's say that we have a default action "mark as spam". The
> > >>appropriate filter should register to this action making the
> > >> action active. After that the user may configure the action
> > >> regarding the toolbar position. My conclusion was that this is
> > >> (up to now) a very special case. It's more complex to implement
> > >> than the solution above. And I still have no clear picture about
> > >> what this would imply for other parts of KMail.
> > >
> > > I think default actions make sense.
> >
> > The mark spam action would be by default in KMail.
>
> Good.

Not good. See my other reply.

> > Then, there
> > might be some code (the provider) which registers to it by saying,
> > hey I want to be called when the mark spam action is triggered. It
> > can be made inside the KMail code base e.g. by letting the user
> > associate a filter with an action (oh, I don't really want this) or
> > by even having a dcop interface for external plugins.
>
> So I think it makes sense to have a couple KMFilters in the KMKernel,
> or create a new SpamClassification class that contains a couple of
> KMFilters and have a reference to that in KMKernel.
>
> *The slotClassifyAsSpam, and slotClassifyAsHam could also be methods
> of this SpamClassification class, or whatever.

I think that this is completely unnecessary. Why should we special-case
spam handling? It works perfectly with the current filters and ad-hoc
filters. The problems that now rear their heads are of a general nature
and not specific to the spam handling case. It's just that Andreas'
work on the spam wizard brought those problems to the surface. But
fixing the spam handling by introducing a special class won't magically
fix the general problems.

To be honest, I'm not totally opposed to special-casing spam handling by
introducing two new actions (but only if those actions are capable of
supporting multiple spam tools at the same time). Independently of
this, the general problems have to be fixed as well. A solution for the
uniqueness and constancy problem of the action names (which I already
mentioned in my other message) would be to give all filters a unique
identifier. And the problem with manipulating the XMLGUI file would no
longer be a problem if we simply add the classification actions to the
default toolbar.
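
The unique identifier idea could look roughly like this sketch (not KMail
code). Filter, FilterRegistry and the "filter_action_" prefix are
hypothetical. It assumes the identifier is assigned once, never reused,
and stored together with the filter (persistence is not shown here).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical filter with a stable identifier next to its display name.
struct Filter {
    unsigned long id;   // assigned once, never changed, never reused
    std::string name;   // user-visible; may be renamed or duplicated

    // The action name depends only on the id, not on the display name.
    std::string actionName() const {
        return "filter_action_" + std::to_string(id);
    }
};

// Hypothetical registry that hands out identifiers. A real implementation
// would also have to save and restore the counter with the filter config.
class FilterRegistry {
public:
    Filter create(std::string name) {
        return Filter{nextId_++, std::move(name)};
    }

private:
    unsigned long nextId_ = 1;
};
```

Two filters with the same display name then get different action names,
and renaming a filter leaves its action name (and thus its toolbar
button) untouched.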

Regards,
Ingo
