List:       tmda-users
Subject:    UnicodeDecodeError - problem identified. How to solve?
From:       "Thiago Lima" <thiagomadeira () gmail ! com>
Date:       2008-03-14 20:00:17
Message-ID: 9a30eb600803141300w709220dflb76a475d5a27c600 () mail ! gmail ! com

Hi,

I have been getting a UnicodeDecodeError in tmda-cgi for a long time. Every time a
user complains, I delete the offending message manually. Googling the
problem turned up no solution: nobody could describe the problem
correctly, and the development team could not solve it.

After analyzing many headers from messages that trigger the error, I think I
have identified the cause. TMDA-CGI exits with an error when the Subject uses
a different encoding than the one declared in the Content-Type header, or when
the 'charset' field of the Content-Type header is itself invalid.
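
The two failure modes described above can be reproduced outside TMDA. The following is a minimal sketch, not TMDA's code: it assumes Python 3 (which may differ from the Python version a given TMDA install runs), and the helper name `try_decode` is made up for illustration.

```python
# Latin-1 bytes for an accented subject, as a mailer might send them raw.
raw_subject = "Curso de Introdução".encode("latin-1")

def try_decode(data, charset):
    """Return the decoded text, or the name of the exception raised."""
    try:
        return data.decode(charset)
    except (UnicodeDecodeError, LookupError) as exc:
        return type(exc).__name__

# Declared charset does not match the bytes: invalid UTF-8 sequence.
mismatch = try_decode(raw_subject, "utf-8")

# Declared charset name is malformed: Python has no such codec.
bad_name = try_decode(raw_subject, "UTF-8--utf")

# Declared charset matches the bytes: decoding succeeds.
matching = try_decode(raw_subject, "latin-1")

print(mismatch, bad_name, matching)
```

A mismatch surfaces as UnicodeDecodeError, while an unknown charset name surfaces as LookupError, so code that only catches one of them can still crash on the other.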

Examples that crash TMDA-CGI:

-- this one declares utf-8 but uses latin-1 in the Subject.

Subject: Curso de Introdução ao mercado  de Açoes
Content-Type: text/html;
        charset="utf-8"


-- this one is UTF-8 encoded, but the charset name is malformed

Subject: =?UTF-8--utf?Q?Seu_pedido_(pensomail_antispam_-_confirme_mensagem_para_mb?=
 =?UTF-8--utf?Q?alasso@levysalomao.com.br.)_n=C3=A3o_foi_reconhecido!?=
Content-Type: text/plain; charset="UTF-8--utf"



-- I really don't know a "-1252" charset; I think it does not exist.

Content-Type: text/html; charset="-1252"
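
One possible way to tolerate headers like the samples above is to try the declared charset first and fall back to a short list of others instead of letting the exception escape. This is a sketch under assumptions, not the fix TMDA adopted: it is written for Python 3 using the standard `email.header.decode_header`, and the names `decode_bytes`, `safe_decode_header` and `FALLBACK_CHARSETS`, as well as the fallback order, are hypothetical.

```python
from email.header import decode_header

# Assumed fallback order; latin-1 goes last because it accepts any byte.
FALLBACK_CHARSETS = ("utf-8", "latin-1")

def decode_bytes(data, declared=None, fallbacks=FALLBACK_CHARSETS):
    """Decode bytes with the declared charset, then each fallback in turn."""
    for charset in (declared, *fallbacks):
        if not charset:
            continue
        try:
            return data.decode(charset)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown charset: try the next candidate
    # Last resort: never raise, mark undecodable bytes instead.
    return data.decode("ascii", errors="replace")

def safe_decode_header(value):
    """Decode an RFC 2047 header string without raising on bad charsets."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, str):  # header contained no encoded words
            parts.append(chunk)
        else:
            parts.append(decode_bytes(chunk, charset))
    return "".join(parts)

# Malformed charset name, UTF-8 payload (shortened from the second sample).
subject = safe_decode_header("=?UTF-8--utf?Q?n=C3=A3o_foi_reconhecido!?=")

# Raw latin-1 bytes under a declared utf-8 charset (first sample).
word = decode_bytes("Açoes".encode("latin-1"), "utf-8")

print(subject)
print(word)
```

Because latin-1 maps every byte to a character, the last fallback always succeeds. The result may be wrong characters for some messages, but the page would render instead of crashing.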



Does that make sense? Is there any way to fix it with those samples? Do you need
more samples? I can't program in Python; otherwise I'd look into the code myself.
:(


_____________________________________________
tmda-users mailing list (tmda-users@tmda.net)
http://tmda.net/lists/listinfo/tmda-users
