
List:       spambayes-bugs
Subject:    [spambayes-bugs] [ spambayes-Bugs-1600821 ] Classifier
From:       noreply () sourceforge ! net (SourceForge ! net)
Date:       2006-11-21 23:59:47
Message-ID: E1GmfWl-0003GR-2L () sc8-sf-web1 ! sourceforge ! net

Bugs item #1600821, was opened at 2006-11-22 00:59
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=1600821&group_id=61702

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: imapfilter
Group: 1.0.1
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Ivan Vilata i Balaguer (ivilata)
Assigned to: Tony Meyer (anadelonbrin)
Summary: Classifier UnicodeDecodeError on wrong transfer encoding

Initial Comment:
Running ``sb_imapfilter.py`` 1.0.1 seems to raise the following
``UnicodeDecodeError`` while classifying a mail that declares a 7-bit content
transfer encoding but contains 8-bit characters::

    Traceback (most recent call last):
      File "/usr/bin/sb_imapfilter.py", line 924, in ?
        run()
      File "/usr/bin/sb_imapfilter.py", line 914, in run
        imap_filter.Filter()
      File "/usr/bin/sb_imapfilter.py", line 785, in Filter
        self.unsure_folder)
      File "/usr/bin/sb_imapfilter.py", line 703, in Filter
        evidence=True)
      File "/usr/lib/python2.4/site-packages/spambayes/classifier.py", line 190, in chi2_spamprob
        clues = self._getclues(wordstream)
      File "/usr/lib/python2.4/site-packages/spambayes/classifier.py", line 496, in _getclues
        clues.sort()
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1: ordinal not in range(128)

I'm attaching the mail that caused this.  I know it is not properly formatted, but
it is a legitimate mail produced by a popular MUA (Thunderbird 1.5).  Spam is surely
formatted even worse.

Someone talked about the same problem on the list:
http://www.mail-archive.com/spambayes at python.org/msg04543.html
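
For illustration, here is a minimal sketch of the kind of failure shown in the
traceback and one possible way to guard against it.  It is not SpamBayes code:
``normalize_token`` and ``sorted_tokens`` are hypothetical names, the Latin-1
fallback is an assumption, and the sketch is written for Python 3.  It also
assumes that the error comes from ``clues.sort()`` comparing byte strings that
hold 8-bit characters with Unicode strings, which in Python 2 triggers an
implicit ASCII decode.

```python
def normalize_token(token, fallback="latin-1"):
    """Return the token as text (hypothetical helper, not part of SpamBayes).

    Byte strings are decoded as ASCII first; if they contain 8-bit bytes,
    they are decoded with the fallback codec instead of raising.
    """
    if isinstance(token, str):
        return token
    try:
        return token.decode("ascii")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return token.decode(fallback)


def sorted_tokens(tokens):
    """Sort a mix of byte and text tokens after normalizing them to text."""
    return sorted(normalize_token(t) for t in tokens)


# The explicit equivalent of the implicit decode that Python 2 performs
# when a byte string is compared with a Unicode string.
try:
    b"a\xe2".decode("ascii")
except UnicodeDecodeError as exc:
    implicit_error = str(exc)

# A mix of text and byte tokens, one of them holding 8-bit bytes.
mixed = ["caf\xe9", b"\xe2\x82\xac", b"plain"]
result = sorted_tokens(mixed)
```

Normalizing every token to one string type before sorting avoids the mixed
comparison altogether; whether Latin-1 is the right fallback for the tokenizer
is a separate design decision.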

----------------------------------------------------------------------


