
List:       spamassassin-users
Subject:    Re: fake base64 encoding
From:       John Wilcock <john () tradoc ! fr>
Date:       2017-02-02 15:52:22
Message-ID: 7a3511b5-48ef-74ba-5145-7f2139de1c1a () tradoc ! fr

On 02/02/2017 at 15:50, RW wrote:
> On Thu, 2 Feb 2017 05:43:24 -0500
> Kevin A. McGrail wrote:
...
>> I will score much higher since it is in the wild.  Can you throw a
>> spample up on pastebin?
> Perhaps text/html makes a big difference, but base64 encoded utf-8
> text is not uncommon these days - particularly outside North America.
>
> To score it higher you might want to include a "full" rule that checks
> for base64 encoding in the headers followed by illegal whitespace near
> the beginning of what should be the base64 text.


Indeed. In my (very small) corpus, I see lots of base64-encoded utf-8 
text/html parts of multipart messages, but very few non-multipart examples.

All of the latter really are base64-encoded, rather than plain text 
labelled as base64, but that may simply be due to the small size of my 
corpus. As it happens they are all spam, but I'm not convinced that 
hitting on any utf-8 text/html message that purports to be 
base64-encoded, regardless of whether it is actually base64 or not, is a 
good idea.
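
To make the distinction concrete, here is a rough sketch in Python (not a
SpamAssassin rule) of the narrower check: a non-multipart message that
declares base64 in its headers but whose body does not decode as strict
base64. The function name is_fake_base64 is hypothetical, and the sketch
assumes the raw message is available as a string; a real "full" rule would
have to express the same idea as a regex.

```python
import base64
import binascii
import email


def is_fake_base64(raw_message: str) -> bool:
    """Return True if a non-multipart message claims base64 but its body is not.

    Hypothetical helper for illustration only; not part of SpamAssassin.
    """
    msg = email.message_from_string(raw_message)

    # Only look at single-part messages, the case discussed above.
    if msg.is_multipart():
        return False

    cte = msg.get("Content-Transfer-Encoding", "").strip().lower()
    if cte != "base64":
        return False

    # get_payload() without decode=True returns the body as it was sent.
    body = msg.get_payload()

    # Line breaks are legal in a base64 body; anything else outside the
    # base64 alphabet (spaces, '<', '>', ...) is not.
    compact = body.replace("\r", "").replace("\n", "")
    try:
        base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return True
    return False
```

A message whose body is "aGVsbG8=" passes as genuine, while one whose body
is literal HTML under the same headers is flagged. This only hits on the
mislabelled case, not on every text/html message that uses base64.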

FWIW,
John
