
List:       aspell-user
Subject:    Re: [Aspell-user] Problems with Arabic.
From:       Mohammed Sameer <msameer () foolab ! org>
Date:       2006-03-05 9:03:32
Message-ID: 20060305090332.GB4227 () home ! foolab ! org


On Sat, Mar 04, 2006 at 09:59:18PM -0700, Kevin Atkinson wrote:
> 
> 
> On Sun, 5 Mar 2006, Mohammed Sameer wrote:
> 
> >Hi,
> >
> >I've created a simple word list for Arabic.
> >It contains 40,000+ words, just to test aspell with Arabic.
> >
> >Everything looks fine with aspell from the command line, but 
> >when using abiword or any other graphical tool to generate the 
> >suggestions, I find that the suggestions are Latin letters, not Arabic 
> >words.
> >
> >I had to encode the files in ISO 8859-6 as aspell didn't accept UTF-8 
> >for the data files. I think this might be the source of the problem but 
> >I can't be sure.
> >
> >Now my question is: how can I force the output from libaspell to be 
> >UTF-8? I tried setting "data-encoding utf-8" in the ar.dat file, but it 
> >didn't work.
> 
> You can't really "force" Aspell to output UTF-8.  Aspell will output 
> whatever encoding the application asks it to.  If it asks for "utf-8" it 
> will get it.
> 

Abiword uses enchant, so I had a look at the enchant source code.
enchant asks libaspell to output UTF-8, but it's not working.

The point is that it works with other languages (otherwise people would have complained)
but not with Arabic. That's why I'm a bit lost.
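
To make the symptom concrete, here is a minimal Python sketch of one way Arabic
words could end up as Latin letters: ISO 8859-6 bytes being read as if they were
ISO 8859-1. This is only a guess at the mechanism, not something confirmed for
enchant or abiword, and the sample word is arbitrary rather than taken from the
word list.

```python
# Hypothetical illustration: the sample word is an assumption,
# not an entry from the actual Arabic word list.
word = "\u0643\u062a\u0627\u0628"  # Arabic "kitab" (book)

# Bytes as they would be stored in an ISO 8859-6 data file.
raw = word.encode("iso-8859-6")

# The same bytes misread as ISO 8859-1: every Arabic letter
# lands on an accented Latin letter.
misread = raw.decode("iso-8859-1")

print(raw)
print(misread)
```

If the suggestions seen in abiword look like runs of accented Latin letters,
that would be consistent with this kind of misreading, though other causes are
possible.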

> What encoding is the word list you used to generate the dictionary in? 
ISO 8859-6

> It needs to be in the same encoding that "data-encoding" specifies.  It 
> can be ISO 8859-6 and Aspell will still output UTF-8 when an application 
> asks for it, as Aspell will convert the output to UTF-8.
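
One way to check what an application actually receives is to classify the raw
bytes of a suggestion. The sketch below is a debugging aid under assumptions:
`describe_suggestion` is a hypothetical helper, not part of Aspell or enchant,
and it only looks at the basic Arabic Unicode block (U+0600 to U+06FF).

```python
def describe_suggestion(data: bytes) -> str:
    """Classify raw suggestion bytes (hypothetical helper, not an Aspell API)."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # ISO 8859-6 Arabic bytes are generally not valid UTF-8.
        return "not valid UTF-8 (possibly still ISO 8859-6)"
    # Look for at least one character in the basic Arabic block.
    if any("\u0600" <= ch <= "\u06ff" for ch in text):
        return "UTF-8 with Arabic letters"
    return "UTF-8 without Arabic letters"


# Correctly converted output.
print(describe_suggestion("\u0643\u062a\u0627\u0628".encode("utf-8")))
# Unconverted ISO 8859-6 bytes.
print(describe_suggestion(b"\xe3\xca\xc7\xc8"))
```

A result of "UTF-8 without Arabic letters" would suggest the bytes were
converted to UTF-8 from the wrong source encoding somewhere along the way.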

I've uploaded all the files here: http://www.foolab.org/aspell.tgz

Can you please have a look?

Here's how I generated the ar.rws file:
aspell --lang ar create master ./ar.rws < wordlist
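
For anyone reproducing this from a UTF-8 word list, the list has to be
converted to ISO 8859-6 before the command above. A minimal sketch, assuming
one word per line; `to_iso8859_6` is a hypothetical helper and not necessarily
how the uploaded files were produced.

```python
def to_iso8859_6(words):
    """Encode words as ISO 8859-6, collecting any that cannot be represented."""
    encoded, rejected = [], []
    for w in words:
        w = w.strip()
        if not w:
            continue  # skip blank lines
        try:
            encoded.append(w.encode("iso-8859-6"))
        except UnicodeEncodeError:
            # e.g. Persian letters that ISO 8859-6 does not contain
            rejected.append(w)
    return encoded, rejected


# Sample input (assumed, not from the real list): one Arabic word,
# one Persian letter outside ISO 8859-6, and one blank entry.
sample = ["\u0643\u062a\u0627\u0628", "\u067e", ""]
encoded, rejected = to_iso8859_6(sample)
print(len(encoded), "encoded;", len(rejected), "rejected")
```

Words that end up in `rejected` cannot be stored in an ISO 8859-6 data file at
all, so they would need to be dropped or handled separately.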

Many thanks,

-- 
GNU/Linux registered user #224950
Proud Egyptian GNU/Linux User Group <www.eglug.org> Admin.
Life powered by Debian, Homepage: www.foolab.org
--
Don't send me any attachment in Micro$oft (.DOC, .PPT) format please
Read http://www.gnu.org/philosophy/no-word-attachments.html
Preferable attachments: .PDF, .HTML, .TXT
Thanx for adding this text to Your signature


