List: lucene-user
Subject: Re: JLemmaGen project
From: Dawid Weiss <dawid.weiss () gmail ! com>
Date: 2013-11-05 7:27:20
Message-ID: CAM21Rt8kmqpQwgEVXp=syS07q9Fk8eE4GfC_=Xfh0Wv3cv-f+w () mail ! gmail ! com
Hi Michal,
Pretty cool. Your work reminds me of what Leo Galambos did a while back:
http://link.springer.com/chapter/10.1007/978-3-540-39985-8_22
I believe his implementation is still available in the Egothor search
engine project.
Dawid
On Wed, Oct 23, 2013 at 5:17 PM, Michal Hlavac <hlavki@hlavki.eu> wrote:
> Hi,
>
> I ported the LemmaGen lemmatizer project (http://lemmatise.ijs.si/) to Java.
> It was originally written in C#. LemmaGen uses rules to lemmatize words.
> The algorithm is described here:
> http://lemmatise.ijs.si/Download/File/Documentation%23JournalPaper.pdf
> The project is licensed under GPLv3. Sources are hosted on Bitbucket:
> https://bitbucket.org/hlavki/jlemmagen
>
> There is also the Lemmagen4j project, which uses more memory and comes without
> prebuilt trees.
> I also obtained licensed dictionaries to build rule trees for 15 languages.
> The dictionaries are licensed, but the prebuilt trees are not. You can also build
> your own dictionary.
> The project also contains a TokenFilter for Lucene/Solr.
> The project is not stable yet, but any feedback is appreciated.
>
> Supported languages are:
> mlteast-bg - Bulgarian
> mlteast-cs - Czech
> mlteast-en - English
> mlteast-et - Estonian
> mlteast-fr - French
> mlteast-hu - Hungarian
> mlteast-mk - Macedonian
> mlteast-pl - Polish
> mlteast-ro - Romanian
> mlteast-ru - Russian
> mlteast-sk - Slovak
> mlteast-sl - Slovene
> mlteast-sr - Serbian
> mlteast-uk - Ukrainian
>
> thanks, miso
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>