List: lucene-user
Subject: Re: How to Perform a Full Text Search on a Number with Leading Zeros or Decimals?
From: Uwe Schindler <uwe () thetaphi ! de>
Date: 2013-06-28 19:39:42
Message-ID: 094658eb-db0a-467a-98d0-63f5875c318c () email ! android ! com
You can add a PatternReplaceFilter
(http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceFilter.html)
to replace tokens consisting only of digits with their variant that has the
leading zeroes removed.
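
A minimal sketch of the kind of pattern and replacement this would need,
shown with plain java.util.regex so it runs without Lucene on the classpath.
The class name LeadingZeroStripper and the method strip are hypothetical and
only for illustration. The regex is one possible choice, not the only one.
In a real analyzer chain the same Pattern and replacement string would be
handed to PatternReplaceFilter after the tokenizer; to my knowledge the 4.x
constructor takes (TokenStream, Pattern, String, boolean), but check the
Javadoc linked above.

```java
import java.util.regex.Pattern;

// Hypothetical helper: applies the same regex replacement that would be
// configured on a PatternReplaceFilter, but to a single token string.
final class LeadingZeroStripper {

    // Matches a token that consists only of digits and starts with at least
    // one zero. Group 1 captures the digits to keep (always at least one,
    // so "000" becomes "0" rather than an empty token).
    static final Pattern LEADING_ZEROS = Pattern.compile("^0+(\\d+)$");

    // Keep only the captured digits.
    static final String REPLACEMENT = "$1";

    static String strip(String token) {
        // Tokens that do not match the pattern are returned unchanged.
        return LEADING_ZEROS.matcher(token).replaceAll(REPLACEMENT);
    }
}
```

With this pattern, LeadingZeroStripper.strip("00000012345") gives "12345",
while a token such as "321.98" is left alone, so this only covers the
leading-zero half of the question. The same filter should be applied at query
time as well, so that a search for 00012345 is normalized the same way.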
Uwe
Jack Krupansky <jack@basetechnology.com> wrote:
> The user could use a regular expression query to match the numbers, but
> otherwise, you will have to write some specialized token filter to
> recognize numeric tokens and generate extra tokens at the same position
> for each token variant that you want to search for.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Todd Hunt
> Sent: Friday, June 28, 2013 2:18 PM
> To: java-user@lucene.apache.org
> Subject: How to Perform a Full Text Search on a Number with Leading Zeros or Decimals?
>
> I have an application that is indexing the text from various reports and
> forms that are generated from our core system. The reports will contain
> dollar amounts and various indexes that contain all numbers, but have
> leading zeros.
>
> If a document contains the following text, stored in one Lucene document
> field:
>
> "Account 00000012345 owes $321.98"
>
> What analyzer can be used to index this text and allow the user to find
> this document by searching on:
>
> 12345
>
> OR
>
> 321
>
> ???
>
> We are currently using a StandardAnalyzer which works well for most of our
> use cases, but not one like this.
>
> I realize that I could create my own token filter to convert any text that
> can be represented by an Integer or Long, with leading zeros or not, to a
> normal-looking integer without leading zeros. But I'd prefer to reuse an
> existing analyzer or technique to achieve the same results.
>
> Thank you.
>
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de