List:       aspell-user
Subject:    [Aspell-user] small bug soundslike and non-ascii
From:       Pablo Saratxaga <pablo () mandriva ! com>
Date:       2005-10-21 11:29:40
Message-ID: 20051021112940.GD15917 () chanae ! alphanet ! ch


Kaixo!

I discovered that soundslike handles ASCII only: it converts
any non-ASCII character to some ASCII value.
For most of the existing *_phonet.dat files this doesn't matter, but
in some cases it does.

French and Walloon are an example of that.

For example, "c" and "ç" are very different,
"ca" sounds "KA", but "ça" sounds "SA";
however, current phonet code handles "c" and "ç" just the same;
as a result, "ça" is viewed as sounding "KA" too...
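
To make the collision concrete, here is a minimal Python sketch. It is not
aspell's phonet code: it assumes the ASCII conversion amounts to dropping
diacritics, and the three rules in TOY_RULES are hypothetical, chosen only
to reproduce the "KA"/"SA" example.

```python
import unicodedata

def strip_to_ascii(word: str) -> str:
    """Drop diacritics (assumed stand-in for the ASCII-only conversion)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Hypothetical toy rules, not taken from any real *_phonet.dat file.
TOY_RULES = {"ç": "S", "c": "K", "a": "A"}

def toy_soundslike(word: str, ascii_only: bool) -> str:
    """Map each letter through TOY_RULES, optionally stripping accents first."""
    # NFC keeps "ç" as a single code point so it can match the rule table.
    word = unicodedata.normalize("NFC", word.lower())
    if ascii_only:
        word = strip_to_ascii(word)
    return "".join(TOY_RULES.get(ch, ch.upper()) for ch in word)

print(toy_soundslike("ca", ascii_only=True))
print(toy_soundslike("ça", ascii_only=True))   # same code as "ca"
print(toy_soundslike("ça", ascii_only=False))  # the "ç" rule can now fire
```

With stripping enabled, the "ç" rule can never match, whatever the rule
file says.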

Another example is "e" vs "ê,é,è".
At the end of a word, "e" (without accent) is always mute,
eg: "livre" => "LIVR"
but not if it is accented, eg: "livré" => "LIVRE".
As a result, it is impossible to define useful soundslike
rules if they involve non-ASCII chars of the language.
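
A sketch of the kind of rule that becomes possible once accented letters
reach the rule matcher untouched. The function name and the two rules are
hypothetical and cover only the "livre"/"livré" example; this is plain
Python, not phonet rule syntax.

```python
import unicodedata

def toy_french_soundslike(word: str) -> str:
    """Toy rules: a final unaccented 'e' is mute; 'é', 'è', 'ê' sound 'E'."""
    # NFC so that accented letters are single code points.
    word = unicodedata.normalize("NFC", word.lower())
    if word.endswith("e"):      # matches plain "e" only, not "é"
        word = word[:-1]
    out = []
    for ch in word:
        if ch in ("é", "è", "ê"):
            out.append("E")
        else:
            out.append(ch.upper())
    return "".join(out)

print(toy_french_soundslike("livre"))
print(toy_french_soundslike("livré"))
```

If "é" were turned into "e" before this function ran, both words would
lose their final vowel and get the same code.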

(I also think that this makes it impossible to define soundslike rules
for languages in which non-ASCII letters are even more prominent,
or even exclusively used, like Czech, Esperanto, Russian,...)

The idea of matching fully accented chars with their "ASCII only" versions
is a good one, however, but the match may involve several chars
(eg: "ö" -> "oe" in German, and not "ö" -> "o").
The possibility of defining an "asciification" table could help
find better suggestions when spell checking unaccented,
ASCII-only text; that is particularly true for languages
that, for lack of proper computer support, were written in
ASCII for a long time, such as Esperanto and Romanian.
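
As an illustration only, such a per-language table could look like the
Python sketch below. The names ASCIIFY, asciify and candidates_for are
made up for this example and are not part of aspell. The German entries
follow the usual "ae/oe/ue/ss" transcription; the Esperanto entries assume
the x-convention, which is only one of the ASCII spellings in use. Only
lowercase letters are covered.

```python
# Hypothetical per-language "asciification" tables (lowercase only).
ASCIIFY = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"},
    "eo": {"ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux"},
}

def asciify(word: str, lang: str) -> str:
    """Replace each non-ASCII letter by its (possibly multi-char) ASCII form."""
    table = ASCIIFY[lang]
    return "".join(table.get(ch, ch) for ch in word)

def candidates_for(ascii_word: str, dictionary: list[str], lang: str) -> list[str]:
    """Return dictionary words whose asciified form equals the ASCII input."""
    return [w for w in dictionary if asciify(w, lang) == ascii_word]

print(asciify("schön", "de"))
print(asciify("ĉevalo", "eo"))
print(candidates_for("schoen", ["schon", "schön"], "de"))
```

Looking up "schoen" this way finds "schön" and not "schon", which a plain
"ö" -> "o" mapping could not distinguish.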

thanks

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.walon.org/pablo/		PGP Key available, key ID: 0xD9B85466
[you can write me in Walloon, Spanish, French, English, Catalan or Esperanto]
[min povas skribi en valona, esperanta, angla aux latinidaj lingvoj]
