List: lucene-user
Subject: identifier n-gram tokenizer
From: Michal Hlavac <hlavki () hlavki ! eu>
Date: 2016-01-11 15:40:38
Message-ID: 5772664.ZvmmFlCMAI () hlavki
Hello,
I have published some token filters that tokenize certain kinds of identifiers (e.g. IP addresses) into
punctuation-delimited n-grams. They still need some optimization, but they work for now.
https://github.com/hlavki/lucene-analyzers
You can find an example of usage in the unit test:
https://github.com/hlavki/lucene-analyzers/blob/master/src/test/java/eu/hlavki/lucene/analysis/identifier/IdentifierNGramFilterTest.java
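To give a rough idea of what "punctuation-delimited n-grams" means here, below is a minimal plain-Java sketch. It does not use the Lucene API and is not the code from the repository; the class name IdentifierNGrams, the method ngrams and the minGram/maxGram parameters are made up for illustration, and the exact tokens the real filter emits may differ. It assumes an identifier is split into alphanumeric parts and that every run of minGram to maxGram consecutive parts is emitted with its original punctuation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the filter from the repository.
class IdentifierNGrams {

    /** Returns every run of minGram..maxGram consecutive alphanumeric parts of id. */
    static List<String> ngrams(String id, int minGram, int maxGram) {
        // Record the [start, end) offsets of each alphanumeric part.
        List<int[]> parts = new ArrayList<>();
        int i = 0;
        while (i < id.length()) {
            if (Character.isLetterOrDigit(id.charAt(i))) {
                int start = i;
                while (i < id.length() && Character.isLetterOrDigit(id.charAt(i))) {
                    i++;
                }
                parts.add(new int[] {start, i});
            } else {
                i++; // skip punctuation between parts
            }
        }

        // Emit substrings of the original text spanning n consecutive parts,
        // so the punctuation between the parts is preserved.
        List<String> out = new ArrayList<>();
        for (int from = 0; from < parts.size(); from++) {
            for (int n = minGram; n <= maxGram && from + n <= parts.size(); n++) {
                int start = parts.get(from)[0];
                int end = parts.get(from + n - 1)[1];
                out.add(id.substring(start, end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [192, 192.168, 168, 168.1, 1, 1.10, 10]
        System.out.println(ngrams("192.168.1.10", 1, 2));
    }
}
```

With this scheme a query for "192.168" can match the address above without wildcards, which is the kind of partial-identifier search the filters are aimed at.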
m.