List:       lucene-user
Subject:    identifier n-gram tokenizer
From:       Michal Hlavac <hlavki () hlavki ! eu>
Date:       2016-01-11 15:40:38
Message-ID: 5772664.ZvmmFlCMAI () hlavki

Hello,

I have published some token filters that can be used to tokenize certain kinds of
identifiers (e.g. IP addresses) into punctuation-delimited n-grams. I think they
need some optimization, but they work for now.

https://github.com/hlavki/lucene-analyzers
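
To illustrate what "punctuation-delimited n-grams" could mean for an IP address,
here is a minimal standalone sketch. It is not the code from the repository above
and does not use the Lucene API. The class name IdentifierNGrams, the method
ngrams and the minParts/maxParts parameters are made up for this example, and the
actual filters may emit different tokens or a different order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the API of the linked project.
class IdentifierNGrams {

    // Returns every run of minParts..maxParts consecutive alphanumeric parts,
    // keeping the original punctuation between the parts.
    static List<String> ngrams(String identifier, int minParts, int maxParts) {
        if (minParts < 1 || maxParts < minParts) {
            throw new IllegalArgumentException("need 1 <= minParts <= maxParts");
        }
        // Record start/end offsets of each alphanumeric part.
        List<int[]> spans = new ArrayList<>();
        Matcher m = Pattern.compile("\\p{Alnum}+").matcher(identifier);
        while (m.find()) {
            spans.add(new int[] {m.start(), m.end()});
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < spans.size(); i++) {
            for (int n = minParts; n <= maxParts && i + n <= spans.size(); n++) {
                // Slice from the start of part i to the end of part i + n - 1.
                out.add(identifier.substring(spans.get(i)[0], spans.get(i + n - 1)[1]));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [192, 192.168, 168, 168.1, 1, 1.10, 10]
        System.out.println(ngrams("192.168.1.10", 1, 2));
    }
}
```

Slicing the original string by offsets, rather than re-joining the parts, keeps
whatever delimiter the identifier actually used (".", ":", "-" and so on).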

You can find a usage example in the unit test:
https://github.com/hlavki/lucene-analyzers/blob/master/src/test/java/eu/hlavki/lucene/analysis/identifier/IdentifierNGramFilterTest.java


m.
