
List:       kde-bugs-dist
Subject:    Bug#19440: KMail and Japanese.
From:       Michael Häckel <Michael () Haeckel ! Net>
Date:       2001-01-31 21:34:16

On Wednesday, 31. January 2001 05:13, toyohiro@ksmplus.com wrote:
>
> $cd SOURCES/kdelibs/kdecore
> $diff -u charsets.config.orig charsets.config
> --- charsets.config.orig        Tue Jan 30 11:20:22 2001
> +++ charsets.config     Tue Jan 30 11:22:39 2001
> @@ -68,6 +68,13 @@
>  x-cp-1257=cp 1257
>  tis620=iso 8859-11
>  tis-620=iso 8859-11
> +jisx0201.1976-0=eucjp
> +jisx0208.1983-0=eucjp
> +jisx0208.1990-0=eucjp
> +jisx0208.1997-0=eucjp
> +jisx0212.1990-0=eucjp
> +jisx0213.2000-1=eucjp
> +jisx0213.2000-2=eucjp
>  windows-874=iso 8859-11
>  windows874=iso 8859-11
>  x-windows-874=iso 8859-11

Thanks for your detailed description. I think the charsets in kdelibs are
Lars Knoll's job. I posted it on kde-core-devel.
Unfortunately, even with your changes I was not able to get any Japanese
characters displayed, although I thought I had fonts for it installed
(jis-fixed and jis-gothic). Could you tell me where I can get a Japanese font,
or are these the wrong ones?
Usually I use a Unicode font, but it shows only squares where the Japanese
characters should be.

> diff -ur kmail.orig/kmmsgbase.cpp kmail/kmmsgbase.cpp
> --- kmail.orig/kmmsgbase.cpp    Wed Jan  3 00:30:16 2001
> +++ kmail/kmmsgbase.cpp Tue Jan 30 12:04:37 2001
> @@ -372,6 +372,9 @@
>         // decode base64 text
>         cstr = decodeBase64(str);
>        }
> +#if QT_VERSION < 224
> +      if(QString(charset).lower() == "shift_jis" ||
> +                              QString(charset).lower() == "shift-jis" )
> +          charset ="sjis"; // toyo
> +#endif
>        QTextCodec *codec = codecForName(charset);
>        if (!codec) codec = codecForName(KGlobal::locale()->charset());
>        if (codec) str = codec->toUnicode(cstr);

I did this a bit differently, because I had already written
KMMsgBase::codecForName for a similar problem. Changing it there should fix
a few more problems as well.
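
The idea of normalizing charset names in one place can be sketched without Qt
as follows. This is only an illustration using std::string; the function name
normalizeCharsetName is hypothetical, and the real code is the
KMMsgBase::codecForName change in the attached patch.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of KMail): maps a MIME charset name to the
// name that would be passed on to the codec lookup.
std::string normalizeCharsetName(std::string name)
{
  // Charset names are compared case-insensitively, so lowercase first.
  std::transform(name.begin(), name.end(), name.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

  // Both spellings of Shift-JIS map to the codec name "sjis".
  if (name == "shift_jis" || name == "shift-jis") return "sjis";

  // Replace every "windows" with "cp", e.g. "windows-1252" -> "cp-1252".
  const std::string from = "windows", to = "cp";
  for (std::string::size_type pos = 0;
       (pos = name.find(from, pos)) != std::string::npos; pos += to.size())
    name.replace(pos, from.size(), to);
  return name;
}
```

Every caller that resolves a codec through such a helper gets the alias
handling for free, which is why doing it there covers more cases than a fix
at a single call site.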

> diff -ur kmail.orig/kmreaderwin.cpp kmail/kmreaderwin.cpp
> --- kmail.orig/kmreaderwin.cpp  Fri Jan 26 06:46:28 2001
> +++ kmail/kmreaderwin.cpp       Tue Jan 30 13:00:58 2001
> @@ -904,6 +907,12 @@
>      } // if (!pgpMessage) then the message only looked similar to a pgp message
>      else htmlStr = mCodec->toUnicode(quotedHTML(aStr));
>    }
> +//  Mail header charset(iso-2022-jp) is using all most E-mail system in Japan.
> +//  ISO-2022-JP code consists of ESC(0x1b) character and 7Bit character which
> +//  used from '!' character to  '~' character.
> +//  JIS7 is header charset of iso-2022-jp.  toyo
> +  else if( QString(mCodec->name()) == "JIS7" )
> +    htmlStr += quotedHTML(mCodec->toUnicode(aStr));
>    else htmlStr += mCodec->toUnicode(quotedHTML(aStr));
>    mViewer->write(htmlStr);
>  }

Well, it's a hack, but it works. This ESC definitely also causes problems in
a few other places, and I had expected the charset problem to be mostly fixed.
Japanese really seems to need some special treatment.
Why can't you simply use UTF-8? :-)
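
Why the order of decoding and HTML quoting matters for ISO-2022-JP can be
shown with a small stand-alone sketch. escapeHtml is a hypothetical stand-in
and not KMail's quotedHTML, and the sample bytes are made up for illustration.

```cpp
#include <string>

// Minimal byte-wise HTML quoting (hypothetical stand-in for quotedHTML).
std::string escapeHtml(const std::string& in)
{
  std::string out;
  for (char c : in)
  {
    switch (c)
    {
      case '&': out += "&amp;";  break;
      case '<': out += "&lt;";   break;
      case '>': out += "&gt;";   break;
      case '"': out += "&quot;"; break;
      default:  out += c;
    }
  }
  return out;
}

// ESC $ B switches to two-byte JIS X 0208 mode, ESC ( B switches back to
// ASCII. The payload bytes 0x26 0x22 belong to one double-byte character,
// but to a byte-wise escaper they look like '&' and '"'.
const std::string kSample = "\033$B\x26\x22\033(B";
```

Quoting the raw bytes first turns the two payload bytes into "&amp;&quot;",
so the codec no longer sees the original character. Decoding to Unicode first
and quoting afterwards avoids this.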

> (C) Please get the addressbook file and save to
>     $HOME/.kde/share/apps/addressbook.
>
>     addressbook file is in http://www.ksmplus.com/~toyohiro/kde2.1
>
>     Please execute kmail and push AddressBook button.
>     Fig-1  http://www.ksmplus.com/~toyohiro/kde2.1/addressbook.gif
>
>     Please Composer button -> click "To:" field .
>     and press Control-Key + Left-arrow-Key and display the contents
>     of addressbook.  The messages did not display in Japanese.
>     Fig-2  http://www.ksmplus.com/~toyohiro/kde2.1/composer1.gif
>
>     Please apply the patch below, re-compile and execute kmail again.
>     Please Composer button -> click "To:" field .
>     and press Control-Key + Left-arrow-Key and display the contents
>     of addressbook.  The messages displayed in Japanese.
>     Fig-3  http://www.ksmplus.com/~toyohiro/kde2.1/composer2.gif
>
> diff -ur kmail.orig/kmcomposewin.cpp kmail/kmcomposewin.cpp
> --- kmail.orig/kmcomposewin.cpp Sat Jan 27 04:31:57 2001
> +++ kmail/kmcomposewin.cpp      Tue Jan 30 12:14:33 2001
> @@ -2427,7 +2427,8 @@
>
>    n=0;
>    if (!KMAddrBookExternal::useKAB())
> -    for (QString a=adb.first(); a; a=adb.next())
> +    for (QString a=QString::fromLocal8Bit(adb.first())
> +                      ; a; a=QString::fromLocal8Bit(adb.next()))
>        {
>         if (QString(a).find(s,0,false) >= 0)
>           {

Thanks for this patch, but the internal addressbook is obsolete. KAB or
abbrowser are recommended. Maybe we should definitely remove it someday.

Some time ago I committed a patch for the internal addressbook to support
non-Latin characters. I didn't test it very well myself at the time, so I just
tried it again. Well, it seems that even with your patch it does not work
correctly: if I exit KMail and start it again, the non-Latin characters are
converted to question marks.

Do you think it also works if I apply the attached patch (to KMail)? With it,
headers (Subject and address fields) should also work with iso-2022-jp, at
least if you don't make them too long.
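
The length restriction comes from quoting. In the encoder touched by the
attached patch, every quoted byte counts three times against a budget of 58
characters, and ESC is now one of the quoted bytes. A rough stand-alone sketch
of that accounting follows. The names kEspecials, needsQuoting and
bytesThatFit are hypothetical, and the loop is a simplification, not the code
from kmmsgbase.cpp.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Characters that are quoted; ESC (0x1b) is appended at the end, as in the
// attached patch.
static const char kEspecials[] = "()<>@,;:\"/[]?.= \033";

// True if the byte has to be written in quoted form.
bool needsQuoting(char c)
{
  if (static_cast<unsigned char>(c) >= 0x80) return true;  // 8-bit byte
  // strchr would also match the terminating NUL, so exclude it explicitly.
  return c != '\0' && std::strchr(kEspecials, c) != nullptr;
}

// Number of leading bytes of raw that fit into budget encoded characters,
// assuming a quoted byte costs three characters and any other byte one.
std::size_t bytesThatFit(const std::string& raw, std::size_t budget)
{
  std::size_t used = 0, n = 0;
  for (; n < raw.size(); ++n)
  {
    const std::size_t cost = needsQuoting(raw[n]) ? 3 : 1;
    if (used + cost > budget) break;
    used += cost;
  }
  return n;
}
```

Since each ISO-2022-JP escape sequence starts with ESC, a header that switches
character sets often uses up the budget faster than plain text does.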

Regards,
Michael Häckel
["japanese.diff" (text/plain)]

Index: kmmessage.cpp
===================================================================
RCS file: /home/kde/kdenetwork/kmail/kmmessage.cpp,v
retrieving revision 1.169
diff -u -3 -p -r1.169 kmmessage.cpp
--- kmmessage.cpp	2001/01/23 08:11:30	1.169
+++ kmmessage.cpp	2001/01/31 21:15:03
@@ -255,7 +255,10 @@ void KMMessage::fromString(const QString
   resultPos = (char*)result.data();
   if (strPos) for (; (ch=*strPos)!='\0'; strPos++)
   {
-    if ((ch>=' ' || ch=='\t' || ch=='\n' || ch<='\0')
+//  ISO-2022-JP is the mail header charset used by almost all e-mail systems
+//  in Japan. It consists of the ESC (0x1b) character and 7-bit characters
+//  in the range from '!' to '~', so ESC must not be stripped here.  toyo
+    if ((ch>=' ' || ch=='\t' || ch=='\n' || ch<='\0' || ch == 0x1b)
        && !(ch=='>' && aStr.mid(strPos-aStr.data()-1,6)=="\n>From"))
       *resultPos++ = ch;
   }
Index: kmmsgbase.cpp
===================================================================
RCS file: /home/kde/kdenetwork/kmail/kmmsgbase.cpp,v
retrieving revision 1.64
diff -u -3 -p -r1.64 kmmsgbase.cpp
--- kmmsgbase.cpp	2001/01/02 15:30:16	1.64
+++ kmmsgbase.cpp	2001/01/31 21:15:04
@@ -284,6 +284,8 @@ QString KMMsgBase::skipKeyword(const QSt
 QTextCodec* KMMsgBase::codecForName(const QString& _str)
 {
   if (_str.isEmpty()) return NULL;
+  if (_str.lower() == "shift_jis" || _str.lower() == "shift-jis")
+    return QTextCodec::codecForName("sjis");
   return QTextCodec::codecForName(_str.lower().replace(
     QRegExp("windows"), "cp") );
 }
@@ -402,7 +404,7 @@ const QString KMMsgBase::decodeRFC2047St
 
 
 //-----------------------------------------------------------------------------
-const char especials[17] = "()<>@,;:\"/[]?.= ";
+const char especials[18] = "()<>@,;:\"/[]?.= \033";
 
 const QString KMMsgBase::encodeRFC2047String(const QString& _str,
   const QString& charset)
@@ -428,7 +430,7 @@ const QString KMMsgBase::encodeRFC2047St
     while (cr < latinLen)
     {
       if (latin[cr] == 32) start = cr + 1;
-      if (latin[cr] < 0) break;
+      if (latin[cr] < 32) break;
       cr++;
     }
     if (cr < latinLen)
@@ -438,7 +440,7 @@ const QString KMMsgBase::encodeRFC2047St
       while (cr < latinLen)
       {
         /* The encoded word must be limited to 75 character */
-        for (i = 0; i < 16; i++) if (latin[cr] == especials[i]) numQuotes++;
+        for (i = 0; i < 17; i++) if (latin[cr] == especials[i]) numQuotes++;
         if (latin[cr] < 0) numQuotes++;
         /* Stop after 58 = 75 - 17 characters or at "<user@host..." */
         if (cr - start + 2 * numQuotes >= 58 || latin[cr] == 60) break;
@@ -456,7 +458,7 @@ const QString KMMsgBase::encodeRFC2047St
       while (pos < stop)
       {
         numQuotes = 0;
-        for (i = 0; i < 16; i++) if (latin[pos] == especials[i]) numQuotes = 1;
+        for (i = 0; i < 17; i++) if (latin[pos] == especials[i]) numQuotes = 1;
         if (latin[pos] < 0) numQuotes = 1;
         if (numQuotes)
         {
@@ -501,7 +503,7 @@ const QString KMMsgBase::encodeRFC2231St
   bool quote;
   while (*l)
   {
-    if (*l < 0) break;
+    if (*l < 32) break;
     l++;
   }
   if (!*l) return latin;
@@ -511,7 +513,7 @@ const QString KMMsgBase::encodeRFC2231St
   while (*l)
   {
     quote = *l < 0;
-    for (i = 0; i < 16; i++) if (*l == especials[i]) quote = true;
+    for (i = 0; i < 17; i++) if (*l == especials[i]) quote = true;
     if (quote)
     {
       result += "%";
Index: kmreaderwin.cpp
===================================================================
RCS file: /home/kde/kdenetwork/kmail/kmreaderwin.cpp,v
retrieving revision 1.236
diff -u -3 -p -r1.236 kmreaderwin.cpp
--- kmreaderwin.cpp	2001/01/25 21:46:28	1.236
+++ kmreaderwin.cpp	2001/01/31 21:15:08
@@ -904,6 +904,12 @@ void KMReaderWin::writeBodyStr(const QCS
     } // if (!pgpMessage) then the message only looked similar to a pgp message
     else htmlStr = mCodec->toUnicode(quotedHTML(aStr));
   }
+//  ISO-2022-JP is the mail header charset used by almost all e-mail systems
+//  in Japan. It consists of the ESC (0x1b) character and 7-bit characters
+//  in the range from '!' to '~', so decode before quoting HTML.
+//  JIS7 is the codec name for iso-2022-jp.  toyo
+  else if( QString(mCodec->name()) == "JIS7" )
+    htmlStr += quotedHTML(mCodec->toUnicode(aStr));
   else htmlStr += mCodec->toUnicode(quotedHTML(aStr));
   mViewer->write(htmlStr);
 }

