'Re: [Clamav-devel] Subject: Pure Perl milter for use with clamd.'


List:       clamav-devel
Subject:    Re: [Clamav-devel] Subject: Pure Perl milter for use with clamd.
From:       "G.W. Haywood" <clamav-devel () jubileegroup ! co ! uk>
Date:       2019-08-25 14:44:18
Message-ID: alpine.DEB.2.11.1908231807350.21214 () mail6 ! jubileegroup ! co ! uk

Hi Micah,

On Fri, 23 Aug 2019, Micah Snyder wrote:

> This project sounds pretty cool.  It wouldn't be something I would
> want to maintain as a part of the clamav repository.  Personally I'm
> well versed in Python but I have almost no Perl experience.  I
> suspect that you'll want to maintain ownership anyways for the
> freedom it permits to add and change features as needed.

Thanks!  The main thing I'd want to avoid is having umpteen different
versions all over the place all getting out of step with each other.
I've seen that many times with other projects, including a couple of
milters.  It's messy and confusing for anybody who wants to use them.

Obviously you already have your hands full with the existing ClamAV
codebase, and I wouldn't want to add to the burden.  I still lean towards
keeping the masters on CPAN, as examples with the source for the
interface module.  If it only serves to get more people started with
their own ideas it will be useful.  If it becomes popular I'll think
about other ways of doing it, including some form of support space.

> If you do plan to maintain a "full" version with all the milters, as
> well as a cut-down version for just clam, you may want to make each
> milter into a separate module where the "full" one imports all of
> the modules (code reuse, vs duplication).

The "full" version is what I expect I'll be using for the foreseeable
future, so I'll maintain that at the very least.  The clamav milter
version won't need much work once it's settled in and I can backport
the odd improvement to it from the full version, which I'm doing now.

On the subject of modules you've touched on something that has been
bothering me for a while.  It would be great if you could just pick
which bits you wanted to use and then write something like

use xm_IPC;
use xm_GeoIP;
use xm_ASN;
use xm_DNSBL;
use xm_SPF;
use xm_greylist;
use xm_DKIM;
use xm_ARC;
use xm_tarpit;

but that's pie in the sky at the moment.  It would cost months of pain
at the very best, and with the amount of interdependence there is, both
between different callbacks and, within each callback, between (what
would be) the different modules, I think I'd spend the rest of my life
ironing out the surprises.  At the moment most of the functions can be
selected by command-line options and it's very likely to stay that way
unless someone (someone younger?) steps up.  Incidentally 'xm' stands
for "extensible milter", which means it will do more or less anything
you might want to do with mail.
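
For illustration only, selecting functions by command-line option can
be sketched in Perl with a table of code references.  The option names,
the handler bodies and the helper names enabled_features and
run_features are all hypothetical; nothing here is taken from the
milter itself.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical feature table: option name => handler.
# The handlers are placeholders that only describe what they would do.
my %FEATURES = (
    dnsbl    => sub { "DNSBL lookup for $_[0]" },
    spf      => sub { "SPF check for $_[0]" },
    greylist => sub { "greylist check for $_[0]" },
);

# Parse switches such as --spf or --no-spf from an argument list and
# return the names of the enabled features in sorted order.
sub enabled_features {
    my @args = @_;
    my %on;
    GetOptionsFromArray(\@args, \%on, map { "$_!" } sort keys %FEATURES)
        or die "bad options\n";
    return grep { $on{$_} } sort keys %FEATURES;
}

# Run every enabled handler against one client address.
sub run_features {
    my ($client_ip, @args) = @_;
    return map { $FEATURES{$_}->($client_ip) } enabled_features(@args);
}

print "$_\n" for run_features('192.0.2.1', '--spf', '--dnsbl');
```

A flat table like this only works when the handlers are independent of
each other, which is exactly what the interdependence described above
rules out for the real code.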

> ... If you host your code in a Github repository, you can make a
> pretty slick documentation site ... we migrated all of our
> documentation into Markdown hosted on github...

Thanks, I'll take a look at that.  There's a lot of documentation and
I intend to write more, and I don't have a really good way to present
it all at the moment.

> On a related topic, we have been discussing the idea of phasing in
> an HTTP server as a replacement for the TCP server in clamd.

Hmmmmmm.  Given the pressures on other development I wonder if you'll
have enough hands.  I'm in the "if it ain't broke, don't fix it" camp.
While I can see the attraction of off-loading some of the complexity
and maintenance, you have many outstanding issues, and not only is the
existing interface nice and simple, it also seems to be very reliable.
As a clamd user, I'm not sure what HTTP offers me that I'd especially
want.  About the only thing I'd ask for is a better grip on the state
of the databases used by clamd - e.g. something to load each one when
I wanted to load it, rather than all at once, plus maybe some kind of
an extended 'VERSIONCOMMANDS' instruction which would tell me the name
and timestamp of all the currently loaded database files.  But the fix
for #10979 is very much more important than these niceties, and, if I
may say so, long overdue.  I've merged the patch in attachment #7196
into 0.101.4, and I'm currently running both the unpatched and patched
versions of clamd side-by-side, scanning with both.  I'll let you know
if I find anything really interesting, but as I've mentioned it would
need bigger volumes of genuine mail than we see here to test it well.
I suppose I could let the spammers get further along the milter chain,
but that goes against the grain a bit. }:-)  Anyway, the patch *seems*
to be doing the right things; here's an UNpatched daemon getting PINGs
at the top of every minute on its TCP interface, around the time it's
reloading its databases:

Aug 23 10:09:01 mail6 root: PONG
Aug 23 10:10:01 mail6 root: PONG
Aug 23 10:11:01 mail6 clamd[32258]: SelfCheck: Database modification detected. Forcing reload.
Aug 23 10:11:03 mail6 clamd[32258]: Reading databases from /etc/mail/clamav
Aug 23 10:14:41 mail6 clamd[32258]: Database correctly reloaded (8905170 signatures)
Aug 23 10:14:01 mail6 root: PONG
Aug 23 10:12:01 mail6 root: PONG
Aug 23 10:13:01 mail6 root: PONG
Aug 23 10:11:01 mail6 root: PONG
Aug 23 10:15:01 mail6 root: PONG

Note the timestamps of the database reload span more than one minute,
and see the jumble of PONG replies after the reload completes.  This
jumble was a surprise - at least it was to me.  This daemon holds up
mail for three or four minutes while it's reloading its databases.
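
The length of the reload can be read off the two clamd log lines.  A
minimal Perl sketch, assuming syslog-style timestamps as above and a
reload that does not cross midnight (the sub names seconds_of_day and
reload_seconds are made up for this example):

```perl
use strict;
use warnings;

# Convert the HH:MM:SS field of a syslog line to seconds since midnight.
# Returns undef if the line does not start with a syslog timestamp.
sub seconds_of_day {
    my ($line) = @_;
    my ($h, $m, $s) = $line =~ /^\w{3}\s+\d+\s+(\d\d):(\d\d):(\d\d)\s/
        or return;
    return $h * 3600 + $m * 60 + $s;
}

# Seconds between "Reading databases" and "Database correctly reloaded",
# or undef if either line is missing.
sub reload_seconds {
    my @lines = @_;
    my ($start, $end);
    for my $line (@lines) {
        if ($line =~ /clamd\[\d+\]: Reading databases from/) {
            $start = seconds_of_day($line);
        }
        elsif ($line =~ /clamd\[\d+\]: Database correctly reloaded/) {
            $end = seconds_of_day($line);
        }
    }
    return unless defined $start && defined $end;
    return $end - $start;
}

my @log = (
    'Aug 23 10:11:03 mail6 clamd[32258]: Reading databases from /etc/mail/clamav',
    'Aug 23 10:14:41 mail6 clamd[32258]: Database correctly reloaded (8905170 signatures)',
);
printf "reload took %d seconds\n", reload_seconds(@log);   # 218
```

For the two lines shown that is 218 seconds, a little over three and a
half minutes.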

Here's the patched daemon doing the same thing:

Aug 24 09:32:01 mail6 root: PONG
Aug 24 09:33:01 mail6 root: PONG
Aug 24 09:34:01 mail6 root: PONG
Aug 24 09:34:01 mail6 clamd[17521]: SelfCheck: Database modification detected. Forcing reload.
Aug 24 09:34:01 mail6 clamd[17521]: Reading databases from /etc/mail/clamav
Aug 24 09:35:01 mail6 root: PONG
Aug 24 09:36:01 mail6 root: PONG
Aug 24 09:37:01 mail6 root: PONG
Aug 24 09:37:46 mail6 clamd[17521]: Database correctly reloaded (8903969 signatures)
Aug 24 09:38:01 mail6 root: PONG
Aug 24 09:39:01 mail6 root: PONG

The patched daemon replies to PINGs and will scan messages while it's
reloading its databases.  I haven't looked at what happens if, while
it's reloading, you run something like a recursive directory scan, but
then it doesn't normally do things like that here; it just scans mail.
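
A PING check of this kind can be sketched in Perl as follows.  This is
only an illustration, not the script that produced the logs above: the
sub name clamd_ping is hypothetical, and the host, port and timeout are
whatever the caller passes in.  It relies on clamd answering the
newline-delimited command "nPING" with "PONG".

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Send PING to a clamd TCP socket.  Returns the reply line without its
# trailing newline, or undef if the connection fails or nothing
# arrives within $timeout seconds.
sub clamd_ping {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,    # applies to the connect only
    ) or return;
    $sock->autoflush(1);
    print {$sock} "nPING\n" or return;

    # Wait for the reply so that a busy daemon cannot block us forever.
    return unless IO::Select->new($sock)->can_read($timeout);
    my $reply = <$sock>;
    close $sock;
    return unless defined $reply;
    chomp $reply;
    return $reply;
}
```

Run from cron once a minute with the reply passed to the system logger,
something like this would give one log line per check; a daemon that
blocks during a reload would show up as late or missing replies.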

-- 

73,
Ged.
_______________________________________________

clamav-devel mailing list
clamav-devel@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-devel

Please submit your patches to our Bugzilla: http://bugzilla.clamav.net

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml