
List:       aspell-user
Subject:    Re: [Aspell-user] Configuring spell check in mult language documents
From:       Kevin Atkinson <kevina () gnu ! org>
Date:       2011-07-09 4:25:09
Message-ID: alpine.BSF.2.00.1107082210340.31818 () bas ! flux ! utah ! edu



On Sat, 9 Jul 2011, Mahesh T. Pai wrote:

> Carlo Traverso said on Fri, Jul 08, 2011 at 08:24:54PM +0200,:
>
> > aspell list -l lang1 | aspell list -l lang2
>
> That would take the words out of their context, no?
>
> > I did not check the hindi dictionaries, but probably hindi accepts
> > both latin and hindi characters as word components (this is how
> > ancient greek, grc, does). The solution of your problem could be to
> > define a variant of hindi that only accepts hindi characters.
>
> AFAICT, no. Especially if you are putting that in the linguistic
> sense.

The linguistic sense is not relevant here. What matters is whether Aspell 
treats Latin characters as part of a word.  I just looked up how Hindi 
is configured in Aspell, and it does not.  Thus, the problem is 
that gedit is trying to check Latin words even though they are in a 
completely different script.  Naturally these words are not in the Hindi 
dictionary, so Aspell marks them as incorrect.  The best solution is not 
to try to check those words at all (which is what Aspell will do if you 
check the file from the command line).  The second-best solution is for 
Aspell to mark words without any word characters as correct, which I 
discussed in an earlier email.
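
As a rough illustration of the first option, a front end could skip any 
token that contains no characters from the dictionary's script before 
handing it to the checker.  This is only a sketch, not Aspell's or gedit's 
actual code: the function names are hypothetical, the Devanagari range 
(U+0900 to U+097F) is hard-coded as an assumption, and splitting on 
whitespace stands in for real tokenization.

```python
from typing import List

# Assumption for this sketch: "Hindi word characters" means the basic
# Unicode Devanagari block, U+0900..U+097F.
DEVANAGARI_FIRST = "\u0900"
DEVANAGARI_LAST = "\u097f"


def has_devanagari(token: str) -> bool:
    """Return True if the token contains at least one Devanagari character."""
    return any(DEVANAGARI_FIRST <= ch <= DEVANAGARI_LAST for ch in token)


def tokens_to_check(text: str) -> List[str]:
    """Keep only whitespace-separated tokens worth sending to a Hindi checker.

    Tokens made up entirely of other scripts (for example Latin) are
    dropped, so they are never reported as misspelled.
    """
    return [tok for tok in text.split() if has_devanagari(tok)]


if __name__ == "__main__":
    sample = "नमस्ते world gedit दुनिया"
    print(tokens_to_check(sample))
```

With this filter, the Latin tokens in the sample never reach the Hindi 
dictionary, so they cannot be flagged.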

> Hindi (and most Indic languages) use the 16 bit mapping in UTF-8
> encoding schema.
>
> I suspect that the difficulties mentioned by Kevin have more to do
> with aspell being "internally 8 bit", as Kevin put it some months back.

That has absolutely nothing to do with it.


