'How to set individual boost factor to each word in a phrase query'


List:       lucene-user
Subject:    How to set individual boost factor to each word in a phrase query
From:       "Robichaud, Jean-Philippe" <Jean-Philippe.Robichaud () scansoft ! com>
Date:       2005-03-02 21:43:02
Message-ID: FBA9F803D33470489B9B4676A6B09F0B01239E90 () mt-exch1 ! montreal ! speechworks ! com

Hi everyone.

I've been working with Lucene a lot over the past few months for an
important project.  We use the raw score returned by Lucene (we created a
custom Similarity) as part of a confidence score calculation.  My problem
is exactly what the subject line of this email says: how do I set an
individual boost factor for each word in a phrase query?

I would like to handle the following situation:

The user asks for "some list of words".  For reasons that are not relevant
to this thread, I know that the query should be written as:
"some^0.81 list^0.12 of^0.5 words^0.99".  Sending this string to the query
parser simply returns garbage.  I could add each word manually to a
BooleanQuery and use the setBoost() member, but I really want to match the
"sentence", i.e. I don't want documents that do not respect the word order.
Also, I cannot really call explain() because of the CPU/IO resources it
takes, and because I can only look at a certain number of elements at the
top of the hits object.
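
To make the input concrete, here is a minimal sketch in plain Java of
splitting the boosted string above into its ordered words and their boosts.
It makes no Lucene calls.  The class BoostedPhrase and its parse() method
are hypothetical names used only for illustration, and a boost of 1.0 is
assumed for any word written without a "^" suffix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits "word^boost" tokens
// into two parallel lists while preserving the original word order.
class BoostedPhrase {
    final List<String> words = new ArrayList<>();
    final List<Float> boosts = new ArrayList<>();

    static BoostedPhrase parse(String text) {
        BoostedPhrase phrase = new BoostedPhrase();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields an empty phrase
            }
            int caret = token.lastIndexOf('^');
            if (caret < 0) {
                // Assumption: words without an explicit boost get 1.0.
                phrase.words.add(token);
                phrase.boosts.add(1.0f);
            } else {
                phrase.words.add(token.substring(0, caret));
                phrase.boosts.add(Float.parseFloat(token.substring(caret + 1)));
            }
        }
        return phrase;
    }

    public static void main(String[] args) {
        BoostedPhrase p =
            BoostedPhrase.parse("some^0.81 list^0.12 of^0.5 words^0.99");
        // Print each word with its boost, in phrase order.
        for (int i = 0; i < p.words.size(); i++) {
            System.out.println(p.words.get(i) + " -> " + p.boosts.get(i));
        }
    }
}
```

The word list keeps the order needed for a phrase match, and the boost list
carries the per-word weights; what I am missing is a way to hand both to a
single Lucene query.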

Any thoughts?

Thanks, 

Jp

