
List:       lucene-user
Subject:    Re: Need help in alphanumeric search
From:       Bhaskar <bhaskar1484 () gmail ! com>
Date:       2015-09-30 8:21:00
Message-ID: CACr=8G+FtjxcKQK_aFGNtuX4enhvFvrqMEU9mSqv6tYUANCJUQ () mail ! gmail ! com


Hi Uwe,

Wow!!! Thanks a lot. I changed to StandardAnalyzer and it is working now.
Thank you, thank you.

Regards,
Bhaskar

On Wed, Sep 30, 2015 at 12:23 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi Bhaskar,
> 
> the answer is very simple: Your analysis is not useful for the type of
> queries and data you are using. You are using SimpleAnalyzer in your
> search/indexing code:
> 
> 
> https://lucene.apache.org/core/5_3_1/analyzers-common/org/apache/lucene/analysis/core/SimpleAnalyzer.html
>  "An Analyzer that filters LetterTokenizer with LowerCaseFilter"
> 
> And LetterTokenizer does the following:
> 
> https://lucene.apache.org/core/5_3_1/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html
>  "A LetterTokenizer is a tokenizer that divides text at non-letters. That's
> to say, it defines tokens as maximal strings of adjacent letters, as
> defined by java.lang.Character.isLetter() predicate."
> 
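> So it creates a new token at every non-letter boundary. All non-letters
> are discarded (because they are treated as token boundaries). A value such
> as 123-0049 therefore produces no tokens at all, and ab23-090 is indexed
> only as "ab". So your queries containing digits can never match.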
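
A minimal sketch of the behaviour described above, in plain Java with no Lucene dependency. It only emulates "maximal runs of letters, lowercased" using Character.isLetter(); it is not Lucene's own code, and the class and method names (LetterSplitDemo, letterTokens, prefixMatches) are made up for illustration. The cpn values are the ones quoted further down in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LetterSplitDemo {

    // Emulation only: split at every non-letter and lowercase what remains.
    public static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token and is itself dropped.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Rough stand-in for a "prefix*" query: does any token start with the prefix?
    public static boolean prefixMatches(List<String> tokens, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        for (String token : tokens) {
            if (token.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] cpns = {"123-0049", "342-043", "ab23-090", "hedwsdg"};
        for (String cpn : cpns) {
            System.out.println(cpn + " -> " + letterTokens(cpn));
        }
        List<String> ab = letterTokens("ab23-090");
        System.out.println("ab*   matches: " + prefixMatches(ab, "ab"));
        System.out.println("ab23* matches: " + prefixMatches(ab, "ab23"));
    }
}
```

Under this emulation the two numeric values yield empty token lists and ab23-090 yields only [ab], which is consistent with the report that ab* and hed* find hits while 123*, 123-0049 and ab23* do not.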
> 
> I'd suggest first reading up on analysis and choosing an analyzer that
> better suits your underlying data and the queries you want to run. Maybe
> use WhitespaceAnalyzer or, better, StandardAnalyzer as a first step. Be
> sure to reindex your data before querying. The Analyzer used on the
> indexing side must be the same as the one used on the query side. If you
> want to use wildcards, you have to take more care, because wildcards are
> not really natural for a "full text search engine" and may cause
> inconsistent results.
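
The advice about using one analysis on both sides can be sketched in plain Java as well. This is an illustration under stated assumptions, not Lucene code: analyze() is a hypothetical helper that splits on whitespace and lowercases, and real analyzers such as StandardAnalyzer or WhitespaceAnalyzer tokenize differently in detail. The point is only that index terms and query terms come from the same function and that digits survive it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SharedAnalysisDemo {

    // Hypothetical analysis step: whitespace split plus lowercasing (assumption).
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // "Indexing": every stored value goes through analyze() exactly once.
    public static List<String> buildIndex(String... values) {
        List<String> index = new ArrayList<>();
        for (String value : values) {
            index.addAll(analyze(value));
        }
        return index;
    }

    // Term lookup: the query text goes through the SAME analyze() as the data.
    public static boolean matchesTerm(List<String> index, String query) {
        List<String> queryTerms = analyze(query);
        return !queryTerms.isEmpty() && index.containsAll(queryTerms);
    }

    // Prefix lookup, a rough stand-in for a "prefix*" wildcard query.
    public static boolean matchesPrefix(List<String> index, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        for (String term : index) {
            if (term.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuild the index after changing analyze(); old terms would be stale.
        List<String> index = buildIndex("123-0049", "342-043", "ab23-090", "hedwsdg");
        System.out.println("123*     -> " + matchesPrefix(index, "123"));
        System.out.println("123-0049 -> " + matchesTerm(index, "123-0049"));
        System.out.println("AB23*    -> " + matchesPrefix(index, "AB23"));
    }
}
```

With digits preserved and the same function applied on both sides, all three lookups that failed in the report succeed in this sketch. An index built with the old letter-only analysis would still contain the old terms, which is why reindexing is needed after switching.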
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> > -----Original Message-----
> > From: Bhaskar [mailto:bhaskar1484@gmail.com]
> > Sent: Wednesday, September 30, 2015 4:28 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: Need help in alphanumeric search
> > 
> > Hi Uwe,
> > 
> > Below is my indexing code:
> > 
> > public static final String INDEX_DIR = "c:/DBIndexAll/";
> > 
> > public static void main(String[] args) throws Exception {
> >     // Path indexDir = new Path(INDEX_DIR);
> >     final Path indexDir = Paths.get(INDEX_DIR);
> >     SimpleDBIndexer indexer = new SimpleDBIndexer();
> >     try {
> >         Class.forName(JDBC_DRIVER).newInstance();
> >         Connection conn = DriverManager.getConnection(CONNECTION_URL + DBNAME, USER_NAME, PASSWORD);
> >         SimpleAnalyzer analyzer = new SimpleAnalyzer();
> >         IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
> >         IndexWriter indexWriter = new IndexWriter(FSDirectory.open(indexDir), indexWriterConfig);
> >         System.out.println("Indexing to directory '" + indexDir + "'...");
> >         int indexedDocumentCount = indexer.indexDocs(indexWriter, conn);
> >         indexWriter.close();
> >         System.out.println(indexedDocumentCount + " records have been indexed successfully");
> >     } catch (Exception e) {
> >         e.printStackTrace();
> >     }
> > }
> > 
> > int indexDocs(IndexWriter writer, Connection conn) throws Exception {
> >     String sql = QUERY1;
> >     Statement stmt = conn.createStatement();
> >     ResultSet rs = stmt.executeQuery(sql);
> >     int i = 0;
> >     while (rs.next()) {
> >         Document d = new Document();
> >         d.add(new TextField("cpn", rs.getString("cpn"), Field.Store.YES));
> > 
> >         writer.addDocument(d);
> >         i++;
> >     }
> >     stmt.close();
> >     rs.close();
> > 
> >     return i;
> > }
> > 
> > 
> > Searching code:
> > 
> > public class SimpleDBSearcher {
> >     // PLASTRON
> >     private static final String LUCENE_QUERY = "SD*";
> >     private static final int MAX_HITS = 500;
> >     private static final String INDEX_DIR = "C:/DBIndexAll/";
> > 
> >     public static void main(String[] args) throws Exception {
> >         // File indexDir = new File(SimpleDBIndexer.INDEX_DIR);
> >         final Path indexDir = Paths.get(SimpleDBIndexer.INDEX_DIR);
> >         String query = LUCENE_QUERY;
> >         SimpleDBSearcher searcher = new SimpleDBSearcher();
> >         searcher.searchIndex(indexDir, query);
> >     }
> > 
> >     private void searchIndex(Path indexDir, String queryStr) throws Exception {
> >         Directory directory = FSDirectory.open(indexDir);
> >         System.out.println("The query string is " + queryStr);
> >         MultiFieldQueryParser queryParser =
> >                 new MultiFieldQueryParser(new String[] { "cpn" }, new StandardAnalyzer());
> >         IndexReader reader = DirectoryReader.open(directory);
> >         IndexSearcher searcher = new IndexSearcher(reader);
> >         queryParser.getAllowLeadingWildcard();
> > 
> >         Query query = queryParser.parse(queryStr);
> >         TopDocs topDocs = searcher.search(query, MAX_HITS);
> > 
> >         ScoreDoc[] hits = topDocs.scoreDocs;
> >         System.out.println(hits.length + " Record(s) Found");
> >         for (int i = 0; i < hits.length; i++) {
> >             int docId = hits[i].doc;
> >             Document d = searcher.doc(docId);
> >             System.out.println("\"cpn value is:\" " + d.get("cpn"));
> >         }
> >         if (hits.length == 0) {
> >             System.out.println("No Data Founds ");
> >         }
> >     }
> > }
> > 
> > 
> > Please help here, thanks in advance.
> > 
> > Regards,
> > Bhaskar
> > 
> > On Tue, Sep 29, 2015 at 3:47 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
> > 
> > > Hi Erick,
> > > 
> > > This mail was in Lucene's user mailing list. This is not about Solr,
> > > so user cannot provide his Solr config! :-) In any case, it would be
> > > good to get the Analyzer + code you use while indexing and also the
> > > code (+ Analyzer) that creates the query while searching.
> > > 
> > > Uwe
> > > 
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > http://www.thetaphi.de
> > > eMail: uwe@thetaphi.de
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Erick Erickson [mailto:erickerickson@gmail.com]
> > > > Sent: Monday, September 28, 2015 6:01 PM
> > > > To: java-user
> > > > Subject: Re: Need help in alphanumeric search
> > > > 
> > > > You need to supply the definitions of this field from your schema.xml
> > > > file, both the <field> and <fieldType>.
> > > > 
> > > > Additionally, please provide the results of the query you're trying
> > > > with &debug=true appended.
> > > > 
> > > > The adminUI/analysis page is very helpful in these situations as well.
> > > > Select the appropriate core from the drop-down on the left and you'll
> > > > see an "analysis" section appear that shows you exactly what happens
> > > > when the field is analyzed.
> > > > 
> > > > Best,
> > > > Erick
> > > > 
> > > > On Mon, Sep 28, 2015 at 5:01 AM, Bhaskar <bhaskar1484@gmail.com> wrote:
> > > > > Thanks, Ian, for the reply.
> > > > > 
> > > > > cpn values are like 123-0049, 342-043, ab23-090, hedwsdg
> > > > > 
> > > > > My application works when I search for the inputs below:
> > > > > 1) ab*
> > > > > 2) hedwsdg
> > > > > 3) hed*
> > > > > 
> > > > > but it does not work for:
> > > > > 1) 123*
> > > > > 2) 123-0049
> > > > > 3) ab23*
> > > > > 
> > > > > Note: if the search input contains a number, the search does not work.
> > > > > 
> > > > > Thanks in advance.
> > > > > 
> > > > > 
> > > > > On Mon, Sep 28, 2015 at 3:49 PM, Ian Lea <ian.lea@gmail.com> wrote:
> > > > > 
> > > > > > Hi
> > > > > > 
> > > > > > 
> > > > > > Can you provide a few examples of values of cpn that a) are and
> > > > > > b) are not being found, for indexing and searching.
> > > > > > 
> > > > > > You may also find some of the tips at
> > > > > > 
> > > > > > http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2F_incorrect_hits.3F
> > > > > > 
> > > > > > useful.
> > > > > > 
> > > > > > You haven't shown the code that created the IndexWriter so the
> > > > > > tip about using the same analyzer at index and search time might
> > > > > > be relevant.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > --
> > > > > > Ian.
> > > > > > 
> > > > > > 
> > > > > > On Mon, Sep 28, 2015 at 10:49 AM, Bhaskar <bhaskar1484@gmail.com> wrote:
> > > > > > > Hi,
> > > > > > > I am a beginner with Apache Lucene; I am using 5.3.1.
> > > > > > > I have created an index on a database result set. The indexed
> > > > > > > values include both alphanumeric values and plain strings. I am
> > > > > > > able to search for the strings, but I am not able to search for
> > > > > > > the alphanumeric values.
> > > > > > > 
> > > > > > > Can someone help me here?
> > > > > > > 
> > > > > > > Below is the indexing code...
> > > > > > > 
> > > > > > > int indexDocs(IndexWriter writer, Connection conn) throws Exception {
> > > > > > >     Statement stmt = conn.createStatement();
> > > > > > >     ResultSet rs = stmt.executeQuery(sql);
> > > > > > >     int i = 0;
> > > > > > >     while (rs.next()) {
> > > > > > >         Document d = new Document();
> > > > > > >         // System.out.println("cpn is" + rs.getString("cpn"));
> > > > > > >         // System.out.println("mpn is" + rs.getString("mpn"));
> > > > > > > 
> > > > > > >         d.add(new TextField("cpn", rs.getString("cpn"), Field.Store.YES));
> > > > > > > 
> > > > > > >         writer.addDocument(d);
> > > > > > >         i++;
> > > > > > >     }
> > > > > > > }
> > > > > > > 
> > > > > > > Searching code:
> > > > > > > 
> > > > > > > 
> > > > > > > private void searchIndex(Path indexDir, String queryStr) throws Exception {
> > > > > > >     Directory directory = FSDirectory.open(indexDir);
> > > > > > >     System.out.println("The query string is " + queryStr);
> > > > > > >     // MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
> > > > > > >     //         new String[] {"mpn"}, new StandardAnalyzer());
> > > > > > >     // IndexReader reader = IndexReader.open(directory);
> > > > > > >     IndexReader reader = DirectoryReader.open(directory);
> > > > > > >     IndexSearcher searcher = new IndexSearcher(reader);
> > > > > > >     Analyzer analyzer = new StandardAnalyzer();
> > > > > > >     analyzer.tokenStream("cpn", queryStr);
> > > > > > >     QueryParser parser = new QueryParser("cpn", analyzer);
> > > > > > >     parser.setDefaultOperator(Operator.OR);
> > > > > > >     parser.getAllowLeadingWildcard();
> > > > > > >     parser.setAutoGeneratePhraseQueries(true);
> > > > > > >     Query query = parser.parse(queryStr);
> > > > > > >     searcher.search(query, 100);
> > > > > > >     TopDocs topDocs = searcher.search(query, MAX_HITS);
> > > > > > > 
> > > > > > >     ScoreDoc[] hits = topDocs.scoreDocs;
> > > > > > >     System.out.println(hits.length + " Record(s) Found");
> > > > > > >     for (int i = 0; i < hits.length; i++) {
> > > > > > >         int docId = hits[i].doc;
> > > > > > >         Document d = searcher.doc(docId);
> > > > > > >         System.out.println("\"value is:\" " + d.get("cpn"));
> > > > > > >     }
> > > > > > >     if (hits.length == 0) {
> > > > > > >         System.out.println("No Data Founds ");
> > > > > > >     }
> > > > > > > }
> > > > > > > 
> > > > > > > 
> > > > > > > Thanks in advance.
> > > > > > > 
> > > > > > > --
> > > > > > > Keep Smiling....
> > > > > > > Thanks & Regards
> > > > > > > Bhaskar.
> > > > > > > Mobile:9866724142
> > > > > > 
> > > > > > -----------------------------------------------------------------
> > > > > > ---- To unsubscribe, e-mail:
> > > > > > java-user-unsubscribe@lucene.apache.org
> > > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > > > 
> > > > > > 


-- 
Keep Smiling....
Thanks & Regards
Bhaskar.
Mobile:9866724142


