
List:       lucene-user
Subject:    Migration to Lucene 6.5 - Queries vs Filters
From:       Rilpa Jain <Rilpa.Jain () tradeweb ! com>
Date:       2017-07-18 20:26:27
Message-ID: A597E46153D8A042ACBFAB96DC848FD07ED940D6 () USOPWYMBX01 ! nyoffice ! tradeweb ! com


Hi,

We plan to migrate from Lucene 5.5 to 6.5. We have been using DocValuesTermsFilter extensively, which was deprecated in Lucene 5.5 and removed in Lucene 6.0.
The Javadoc says to use DocValuesTermsQuery with BooleanClause.Occur.FILTER instead. However, in our local tests, the time taken to search documents has increased with this change.

Below is one of the scenarios in our application -
We do a search within a search.

(Before migration, on Lucene 5.5)
1.      The first search is on a text field with discrete values. (There is no pattern to the values of this text field. Here terms[] ranges from 1 to 200k in size.) We use DocValuesTermsFilter and pass it as the Filter parameter to the search method.
2.      The second search is on the result of step 1. This could be either a TermQuery or a NumericRangeQuery, built as a query and passed as the query parameter to the search method.

(After migration to Lucene 6.5)
1.      The first search is on a text field with discrete values. (There is no pattern to the values of this text field. Here terms[] ranges from 1 to 200k in size.) We use DocValuesTermsQuery and add it to a BooleanQuery with Occur.FILTER.
2.      The second search is on the result of step 1. This could be either a TermQuery or a NumericRangeQuery, added to the BooleanQuery with Occur.MUST.
3.      The BooleanQuery is built and passed to the search method.
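
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the application's actual code: the class name FilteredSearchSketch, the method build, and its parameter names are hypothetical, and the filter field is assumed to be indexed with sorted (or sorted-set) doc values. It also assumes the Lucene 6.x jars, including the module that ships DocValuesTermsQuery, are on the classpath.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DocValuesTermsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper for illustration; names are not from the original message.
final class FilteredSearchSketch {

    // Combines a doc-values terms filter (step 1) with the main query (step 2)
    // into a single BooleanQuery (step 3).
    static BooleanQuery build(String filterField,
                              Collection<String> allowedValues,
                              Query mainQuery) {
        // Convert the allowed values (1 to ~200k entries in the scenario above)
        // into the BytesRef form that DocValuesTermsQuery accepts.
        List<BytesRef> terms = new ArrayList<>(allowedValues.size());
        for (String value : allowedValues) {
            terms.add(new BytesRef(value));
        }
        Query filter = new DocValuesTermsQuery(filterField, terms);

        return new BooleanQuery.Builder()
            .add(filter, BooleanClause.Occur.FILTER)  // must match, does not affect scoring
            .add(mainQuery, BooleanClause.Occur.MUST) // must match, contributes to the score
            .build();
    }
}
```

The resulting query would then be passed to IndexSearcher.search(query, n) in place of the old query-plus-Filter call.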

After migration, this query takes 5x-10x longer to execute than it did with DocValuesTermsFilter.

Is there a better class for building the query in our scenario than the one used above? Or is there anything that I am missing?
Any insights would help! Thanks.

