
List:       lucene-user
Subject:    Re: NumericDocValues vs SortedNumericDocValues
From:       Adrien Grand <jpountz () gmail ! com>
Date:       2018-02-06 9:00:55
Message-ID: CAPsWd+PZ-FixvO0f6h=GdMXC_u7C1ugh8jBNTqMFsJK86B_OBw () mail ! gmail ! com


For a single-valued double, create the field at index time with
`new NumericDocValuesField(myDoubleFieldName,
Double.doubleToLongBits(myDoubleValue))`, then sort at search time using
`new SortField(myDoubleFieldName, SortField.Type.DOUBLE)`.
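
To illustrate why the sort type should be DOUBLE rather than LONG, here is a minimal plain-Java sketch with no Lucene dependency. The class and method names (`DoubleBitsDemo`, `encode`, `decode`) are made up for illustration; only `Double.doubleToLongBits` and `Double.longBitsToDouble` come from the JDK. It shows that the encoding round-trips exactly, but that the raw bits of negative doubles compare in the wrong order when treated as signed longs, so the stored value has to be decoded back to a double before comparing.

```java
class DoubleBitsDemo {
    // Encode a double as the long that would be handed to NumericDocValuesField.
    static long encode(double value) {
        return Double.doubleToLongBits(value);
    }

    // Decode a stored long back into the original double.
    static double decode(long bits) {
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double[] values = {-2.0, -1.0, 0.5, 3.25};
        for (double v : values) {
            long bits = encode(v);
            // The round trip is exact: decode(encode(v)) == v.
            System.out.println(v + " -> " + bits + " -> " + decode(bits));
        }

        // As signed longs, the bits of -2.0 sort AFTER the bits of -1.0 ...
        System.out.println(Long.compare(encode(-2.0), encode(-1.0)) > 0);
        // ... while the decoded doubles sort the expected way.
        System.out.println(Double.compare(decode(encode(-2.0)), decode(encode(-1.0))) < 0);
    }
}
```

Both comparisons print `true`, which is why comparing the stored longs directly would mis-order negative values.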

SortedNumericDocValues is for storing multiple values per document. It has
"Sorted" in the name because it does not remember insertion order: at search
time it always returns the values of a given document in ascending order.
For a single-valued field it would be less efficient and harder to use than
NumericDocValues.
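
The "ascending order, not insertion order" behavior can be modeled in plain Java. This is only a sketch of the semantics, not Lucene's implementation: `SortedValuesSketch` and its methods are hypothetical names. It also assumes an order-preserving long encoding for doubles; to my understanding this is the same bit trick that Lucene's `NumericUtils.doubleToSortableLong` applies, which is the encoding normally used when doubles go into a multi-valued SortedNumericDocValuesField.

```java
import java.util.Arrays;

class SortedValuesSketch {
    // Order-preserving encoding: flips the low 63 bits of negative values so
    // that signed long order matches double order.
    static long toSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        return bits ^ ((bits >> 63) & 0x7fffffffffffffffL);
    }

    // Inverse of toSortableLong (the same XOR undoes itself).
    static double fromSortableLong(long encoded) {
        long bits = encoded ^ ((encoded >> 63) & 0x7fffffffffffffffL);
        return Double.longBitsToDouble(bits);
    }

    // Models one document's values: whatever order they were added in,
    // they come back in ascending order.
    static double[] valuesForDocument(double... inserted) {
        long[] encoded = new long[inserted.length];
        for (int i = 0; i < inserted.length; i++) {
            encoded[i] = toSortableLong(inserted[i]);
        }
        Arrays.sort(encoded); // insertion order is discarded here
        double[] result = new double[encoded.length];
        for (int i = 0; i < encoded.length; i++) {
            result[i] = fromSortableLong(encoded[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] values = valuesForDocument(3.5, -1.0, 2.0, -7.25);
        System.out.println(Arrays.toString(values));
    }
}
```

With a single value per document none of this sorting machinery buys anything, which is the point of preferring NumericDocValues in that case.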

On Mon, Feb 5, 2018 at 21:34, Tom Hirschfeld <tomhirschfeld@gmail.com>
wrote:

> Hello,
>
> I need to sort the results of a query based on a single double value for
> each document. I am not sure which of the NumericDocValues or the
> SortedNumericDocValues would be best to use. Each document will have a
> single value indexed. We are indexing ~100m - 1B documents and each query
> will sort about 200 results. My specific questions are, for our use case,
> how do these two fields differ in:
>
> 1) total index size
> 2) query time performance/impact on sorting
> 3) any other "gotchas" I may not have thought of yet
>
> Thanks for your time & assistance!
> Best,
> Tom Hirschfeld
>

