List:       lucene-user
Subject:    Re: JLemmaGen project
From:       Dawid Weiss <dawid.weiss () gmail ! com>
Date:       2013-11-05 7:27:20
Message-ID: CAM21Rt8kmqpQwgEVXp=syS07q9Fk8eE4GfC_=Xfh0Wv3cv-f+w () mail ! gmail ! com

Hi Michal,

Pretty cool. Your work reminds me of what Leo Galambos did a while back:

http://link.springer.com/chapter/10.1007/978-3-540-39985-8_22

I believe his implementation is still available in the Egothor search
engine project.

Dawid



On Wed, Oct 23, 2013 at 5:17 PM, Michal Hlavac <hlavki@hlavki.eu> wrote:
> Hi,
> 
> I ported the LemmaGen lemmatizer project (http://lemmatise.ijs.si/) to Java;
> the original is written in C#. LemmaGen uses rules to lemmatize words.
> The algorithm is described here:
> http://lemmatise.ijs.si/Download/File/Documentation%23JournalPaper.pdf
> The project is licensed under GPLv3. Sources are hosted on Bitbucket:
> https://bitbucket.org/hlavki/jlemmagen
> 
> There is also the Lemmagen4j project, which uses more memory and comes
> without prebuilt trees.
> I also obtained licensed dictionaries to build rule trees for 15 languages.
> The dictionaries are licensed, but the prebuilt trees are not. You can also
> build your own dictionary.
> The project also contains a TokenFilter for Lucene/Solr.
> The project is not stable yet, but any feedback is appreciated.
> 
> Supported languages are:
> mlteast-bg - Bulgarian
> mlteast-cs - Czech
> mlteast-en - English
> mlteast-et - Estonian
> mlteast-fr - French
> mlteast-hu - Hungarian
> mlteast-mk - Macedonian
> mlteast-pl - Polish
> mlteast-ro - Romanian
> mlteast-ru - Russian
> mlteast-sk - Slovak
> mlteast-sl - Slovene
> mlteast-sr - Serbian
> mlteast-uk - Ukrainian
> 
> thanks, miso
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
