
List:       aspell-user
Subject:    Re: [aspell-user] Aspell Arabic support
From:       "Moe Elzubeir" <Elzubeir () cobaf ! coba ! unt ! edu>
Date:       2002-01-23 7:24:25
Message-ID: sc4e80a8.060 () gwia ! unt ! edu

>>> Kevin Atkinson  01/23/02 04:46AM wrote

>>Sorry for not responding.  When people tend to ask hard questions I tend
>>to mark the post as important with the hope I will get back to it.
>>Unfortunately I don't always.  BTW, I prefer these types of things get
>>posted to the mailing list as it will create a public record of our
>>conversation.

Ah, so it is a hard question ;) At least I wouldn't feel bad for not understanding
things then ;)

> Using 8-bit characters internally as far as Arabic is concerned is not a
> problem, although you would gain a lot more accuracy by considering
> double-width characters.

>>How is that?

Hrmm.. I take that back, I wasn't thinking ;) The ISO-8859-6 charset is completely
sufficient for internal storage, but not enough for display.
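
As a rough illustration of the storage point (a sketch using Python's standard "iso8859_6" codec, not anything from Aspell itself), each letter of the root k-t-b fits in a single byte and survives a round trip. The variable names are only for illustration.

```python
# Root k-t-b written with Unicode escapes: KAF, TEH, BEH.
root = "\u0643\u062a\u0628"

# Python's standard "iso8859_6" codec maps each of these letters to one byte.
encoded = root.encode("iso8859_6")
print(len(root), len(encoded))

# Decoding restores the original string, so nothing is lost in storage.
decoded = encoded.decode("iso8859_6")
print(decoded == root)
```

This says nothing about display, which is a separate problem from how the letters are stored.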


>>Are you talking in terms of affix compression?

Well, I am not sure how much affix compression helps with Arabic. So let me give
an example and you can tell me (since I am a little confused about that as well).

Let's take a verb root: ktb -> pronounced 'kataba'

k = root letter 1
t = root letter 2
b = root letter 3

In Arabic, to represent a 3-letter verb (most common, quad-lettered root verbs are
rare), we symbolize them with 'FEH','AIN','LAM' (pronounced fa3al) [where the 3 is
the ain, excuse my not-so-professional transliteration].

Derived from 'ktb' are many words, like:

mktb (pronounced 'maktab') -> represented as 'mf3l'

so far so good.. only adding a prefix.

ktabT (pronounced 'kitaba', where T is a 'TEH MARBOOTA') -> represented as 'f3ala'
    (pronounced as 'fi3ala')

That adds an 'ALEF' (a) to the middle of the root verb as well as a 'TEH MARBOOTA' (suffix).
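
The root-and-pattern idea above can be sketched in a few lines of Python. This is only an illustration in the same ad-hoc transliteration: the function name apply_pattern and the PLACEHOLDERS table are hypothetical, and the second pattern is written here as 'f3alT' (rather than 'f3ala') purely so that the T for TEH MARBOOTA lines up with 'ktabT'.

```python
# Placeholder symbols from the transliteration above: f, 3 and l stand for
# root letters 1, 2 and 3. Any other character in a pattern is copied as-is.
PLACEHOLDERS = {"f": 0, "3": 1, "l": 2}


def apply_pattern(root: str, pattern: str) -> str:
    """Fill a pattern such as 'mf3l' with the letters of a 3-letter root."""
    if len(root) != 3:
        raise ValueError("this sketch only handles three-letter roots")
    return "".join(
        root[PLACEHOLDERS[ch]] if ch in PLACEHOLDERS else ch
        for ch in pattern
    )


print(apply_pattern("ktb", "mf3l"))   # prefix only
print(apply_pattern("ktb", "f3alT"))  # infix ALEF plus TEH MARBOOTA suffix
```

A real implementation would need placeholders that cannot collide with ordinary letters; here a literal f or l in a pattern would be mistaken for a root slot.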

And the list goes on and on ;) Only for one verb. Most of the Arabic language is derived from
root verbs like this, and that is what makes it such a beautiful language (yet so complex
when computerized) ;) Words like office, book, writer, library, etc. are all derived from 'ktb'.

My question is, how does the affix compression come into play here?
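
To make the question concrete, here is a small sketch of a generic prefix/suffix expansion. It is an assumption about how affix rules work in general, not Aspell's actual affix format, and the expand helper is hypothetical. It reaches 'mktb' from the stored stem 'ktb', but not 'ktabT', because of the ALEF in the middle.

```python
def expand(stem, prefixes=("",), suffixes=("",)):
    """Return every prefix + stem + suffix combination."""
    return {p + stem + s for p in prefixes for s in suffixes}


# Store the bare root and allow an optional 'm' prefix and 'T' suffix.
forms = expand("ktb", prefixes=("", "m"), suffixes=("", "T"))
print(sorted(forms))
print("mktb" in forms)    # reachable with a prefix alone
print("ktabT" in forms)   # not reachable: the ALEF sits inside the stem

# The infixed form needs its own stored stem, 'ktab', before the suffix applies.
print(expand("ktab", suffixes=("T",)))
```

Under these assumptions each infixed pattern costs one extra stored stem per root, which is the part I am unsure affix compression can help with.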

>>OK. You got my attention ;)

Great, because I will need a lot of help ;) But I am very motivated and will spend hours upon
hours to get this working ;)

>>The current released version of Aspell/Pspell is now dead as far as
>>development is concerned; all of the new development is taking place on
>>the "New Aspell" which can be found at http://aspell.net/.  Browse the
>>announcement archive for more information as I have not set up a real web
>>page yet and what is currently there is not up to date.

I see. It would be nice to kill the other pages or simply re-direct to aspell.net.. then
again, I found the right place, others can search <evil grin>


Thanks
Mohammed Elzubeir


