
List:       aspell-user
Subject:    Re: [Aspell-user] using aspell for Arabic with diacritics
From:       Mohammed Sameer <msameer () foolab ! org>
Date:       2007-01-21 1:47:00
Message-ID: 20070121014659.GB2012 () home ! foolab ! org


On Thu, Jan 18, 2007 at 09:25:34AM +0100, Hugo Coolens wrote:
> Even though I know most written Arabic doesn't use diacritic marks, it 
> is very helpful for beginners. My problem is that I have a lot of words 
> written with diacritics, when trying to spellcheck them with aspell, 
> aspell refuses them as correct. I thought it would be just a matter of 
> telling aspell not to look at the diacritics as follows:
> cat tekst.ar |aspell -a -d ar --ignore-accents=true
> 
> but it seems not the right way to do this
> I also thought of using a filter something like:
> cat tekst.ar | tr 'harakaat' -d |aspell -a -d ar
> 
> but I don't know what to use for 'harakaat'
> 

I once wrote a small program to do this. I've cleaned it up a bit, and here it is:
http://home.foolab.org/cgi-bin/viewcvs.cgi/src/clean_arabic.c?rev=1.1&view=auto

You'll need glib 2.x.
To compile it: gcc -o clean_arabic clean_arabic.c `pkg-config glib-2.0 --cflags --libs`

Pass it a file as an argument; otherwise it reads from the standard input.
It writes the "cleaned" Arabic to the standard output.
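
For a rough idea of the approach, here is a minimal sketch in Python. It is
not the C program linked above, only an illustration of the same kind of
filter. It assumes the input is UTF-8 and that "harakaat" means the combining
marks U+064B through U+0652 (fathatan to sukun); other marks are left alone.
The name strip_harakat is made up for this example.

```python
import sys

# Assumption: the marks to drop are U+064B..U+0652 (fathatan, dammatan,
# kasratan, fatha, damma, kasra, shadda, sukun). Extend the range if your
# text uses other marks.
HARAKAT = {cp: None for cp in range(0x064B, 0x0653)}


def strip_harakat(text):
    """Return text with the Arabic short-vowel marks removed."""
    return text.translate(HARAKAT)


def main(argv):
    # Read the file named on the command line, or standard input otherwise.
    if len(argv) > 1:
        with open(argv[1], "rb") as f:
            data = f.read()
    else:
        data = sys.stdin.buffer.read()
    cleaned = strip_harakat(data.decode("utf-8"))
    sys.stdout.buffer.write(cleaned.encode("utf-8"))


if __name__ == "__main__":
    main(sys.argv)
```

Saved as, say, strip_harakat.py, it could sit in front of aspell like this:
python3 strip_harakat.py tekst.ar | aspell -a -d ar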

Good luck.

-- 
GNU/Linux registered user #224950
Proud Egyptian GNU/Linux User Group <www.eglug.org> Member.
Life powered by Debian, Homepage: www.foolab.org
--
Don't send me any attachment in Micro$oft (.DOC, .PPT) format please
Read http://www.gnu.org/philosophy/no-word-attachments.html
Preferable attachments: .PDF, .HTML, .TXT
Thanx for adding this text to Your signature
