From kde-core-devel Fri Feb 18 18:00:27 2005
From: Mashrab Kuvatov
Date: Fri, 18 Feb 2005 18:00:27 +0000
To: kde-core-devel
Subject: Re: [PATCH] KSpell Unicode problem (BR#86940)
Message-Id: <200502181900.31549.kmashrab () sat ! physik ! uni-bremen ! de>
X-MARC-Message: https://marc.info/?l=kde-core-devel&m=110874994207670
MIME-Version: 1
Content-Type: multipart/mixed; boundary="--nextPart1780598.KK6VRWRiOX"

--nextPart1780598.KK6VRWRiOX
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Hi Lukáš,

first of all thanks for looking at the issue. Waldo, thanks too.

On Friday 18 February 2005 13:29, Lukáš Tinkl wrote:
> Hmmm, unfortunate situation... Mashrab's patch breaks with older aspells,
> that's what I feared. Furthermore, KSpell+ASpell5 currently can't cope with
> any utf8 text at all. Attached patch fixes that.

To my knowledge (correct me if I'm wrong) neither Aspell < 0.60 nor Ispell
supports utf8 spellchecking; they treat input/output as 8-bit encoded text.
That's why, actually, I was surprised to see UTF-8 in the list of encodings in
the control center. One could argue that it is possible to pass
--encoding=utf8 to Aspell-0.50, but the Aspell-0.50 documentation [1] says:

    encoding
    (string) The encoding the input text is in. Valid values are ``utf-8'',
    ``iso8859-*'', ``koi8-r'', ``viscii'', ``cp1252'', ``machine unsigned 16'',
    ``machine unsigned 32''. However, the aspell utility will currently only
    function correctly with 8-bit encodings. I hope to provide utf-8 support in
    the future.

What spellchecker are you using? If Ispell, are you sure it is not a wrapper
around Aspell (some distros dropped Ispell)? If you do not have Aspell-0.60,
how do you spellcheck utf8 texts (as you said, Aspell-0.50 cannot do that)?

Thanks for the patch, I'll definitely try it once I'm at home. However, I
doubt it solves the problem.

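A minimal sketch, in Python, of why UTF-8 text is awkward for a checker that
assumes one byte per character, as the Aspell-0.50 documentation quoted above
implies. The sample sentence and the helper names byte_offset and
byte_to_char_offset are made up for illustration; nothing here is taken from
the Aspell or KSpell sources, and it is not a claim about where the actual bug
is.

```python
# Illustration only: positions in UTF-8 text differ depending on whether
# they are counted in characters or in bytes. The helpers below are
# hypothetical and do not correspond to any Aspell or KSpell function.

def byte_offset(text: str, word: str) -> int:
    """Position of `word` as an 8-bit tool would see it (counted in bytes)."""
    return text.encode("utf-8").index(word.encode("utf-8"))

def byte_to_char_offset(text: str, byte_pos: int) -> int:
    """Convert a byte position in the UTF-8 form back to a character position."""
    return len(text.encode("utf-8")[:byte_pos].decode("utf-8"))

text = "Lukáš wrote a patch"
word = "patch"

char_pos = text.index(word)           # counted in characters
byte_pos = byte_offset(text, word)    # counted in UTF-8 bytes

print(char_pos, byte_pos, byte_to_char_offset(text, byte_pos))
```

With this sample text the character position of "patch" is 14 while its byte
position is 16, because á and š each take two bytes in UTF-8. A caller that
mixes the two kinds of position would point at the wrong place in the line.
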
Currently, the spellchecking itself, i.e. passing a word to Aspell and getting
suggestions, is working very well. No way to blame the backend. The problem,
IMHO, is in what happens with the suggested word. Namely, there is a variable
called posinline; from how I understand the code, it gives the position of the
word being checked in a string. Later, lastpos is calculated to figure out
which word to replace/highlight. The part of the code which I commented out
does crazy things, resulting in a wrong posinline.

Did anybody understand what I said? It seems it is not clear even to
myself. :-)

[1] http://aspell.net/0.50-doc/man-html/4_Customizing.html#SECTION00523000000000000000

Cheers,
Mashrab.

PS. I faked a reply by cutting & pasting from the archive, since I was not on
the list. Now I have subscribed.

-- 
Mashrab Kuvatov
Ph.D student
University of Bremen, IUP
Home-page: www.sat.uni-bremen.de/members/mashrab
PGP key: www.uni-bremen.de/~kmashrab/kmashrab.asc

--nextPart1780598.KK6VRWRiOX
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQBCFi0/iSmHahjHWuoRAuL7AKCB/OaAQT7RH0ieg4KC0t1B8SI8PwCghiRM
mxEcIeq/6RtOvivZ0L1gR+s=
=yix9
-----END PGP SIGNATURE-----

--nextPart1780598.KK6VRWRiOX--