List: aspell-user
Subject: Re: [Aspell-user] using aspell for Arabic with diacritics
From: Mohammed Sameer <msameer () foolab ! org>
Date: 2007-01-21 1:47:00
Message-ID: 20070121014659.GB2012 () home ! foolab ! org
On Thu, Jan 18, 2007 at 09:25:34AM +0100, Hugo Coolens wrote:
> Even though I know most written Arabic doesn't use diacritic marks, they
> are very helpful for beginners. My problem is that I have a lot of words
> written with diacritics, and when I try to spellcheck them with aspell,
> aspell does not accept them as correct. I thought it would just be a matter
> of telling aspell not to look at the diacritics, as follows:
> cat tekst.ar |aspell -a -d ar --ignore-accents=true
>
> but this does not seem to be the right way to do it.
> I also thought of using a filter, something like:
> cat tekst.ar | tr 'harakaat' -d |aspell -a -d ar
>
> but I don't know what to use for 'harakaat'
>
I once wrote a small program to do this. I've cleaned it up a bit, and here it is:
http://home.foolab.org/cgi-bin/viewcvs.cgi/src/clean_arabic.c?rev=1.1&view=auto
You'll need glib 2.x.
To compile it: gcc -o clean_arabic clean_arabic.c `pkg-config glib-2.0 --cflags --libs`
Pass it a file as an argument; otherwise it reads from standard input.
It writes the "cleaned" Arabic to standard output.
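
If you'd rather not compile anything, the same idea can be sketched in a few lines of Python. This is not a port of clean_arabic.c, just an illustration of the filter approach: it assumes the input is UTF-8 and that "harakaat" means the marks U+064B through U+0652 (fathatan to sukun) plus the superscript alef U+0670. The function name strip_harakat and the file name used below are made up for this example.

```python
#!/usr/bin/env python3
"""Remove Arabic harakat (short-vowel marks) from UTF-8 text."""
import sys

# Assumption: the marks to drop are U+064B..U+0652 (fathatan, dammatan,
# kasratan, fatha, damma, kasra, shadda, sukun) plus superscript alef U+0670.
# Adjust this set if your text uses other marks.
HARAKAT = {cp: None for cp in list(range(0x064B, 0x0653)) + [0x0670]}


def strip_harakat(text):
    """Return text with every code point listed in HARAKAT removed."""
    return text.translate(HARAKAT)


def main(argv):
    # Read the named file, or standard input when no file is given.
    if len(argv) > 1:
        with open(argv[1], "rb") as handle:
            data = handle.read()
    else:
        data = sys.stdin.buffer.read()
    # Decode and encode explicitly so the locale does not matter.
    cleaned = strip_harakat(data.decode("utf-8"))
    sys.stdout.buffer.write(cleaned.encode("utf-8"))


if __name__ == "__main__":
    main(sys.argv)
```

Saved as, say, strip_harakat.py, it would slot into the pipeline from the question like this: cat tekst.ar | python3 strip_harakat.py | aspell -a -d ar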
Good luck.
--
GNU/Linux registered user #224950
Proud Egyptian GNU/Linux User Group <www.eglug.org> Member.
Life powered by Debian, Homepage: www.foolab.org
--
Don't send me any attachment in Micro$oft (.DOC, .PPT) format please
Read http://www.gnu.org/philosophy/no-word-attachments.html
Preferable attachments: .PDF, .HTML, .TXT
Thanx for adding this text to Your signature