List: solr-dev
Subject: [jira] [Commented] (LUCENE-6645) BKD tree queries should use BitDocIdSet.Builder
From: "Michael McCandless (JIRA)" <jira () apache ! org>
Date: 2015-06-30 19:53:04
Message-ID: JIRA.12841751.1435685543000.62351.1435693984725 () Atlassian ! JIRA
[ https://issues.apache.org/jira/browse/LUCENE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608958#comment-14608958 ]
Michael McCandless commented on LUCENE-6645:
--------------------------------------------
The lat/lons to index are here:
http://people.apache.org/~mikemccand/latlon.subsetPlusAllLondon.txt.lzma
It uncompresses to ~1.9 GB.
Then run IndexAndSearchOpenStreetMaps.java in
luceneutil/src/main/perf. (You have to edit the hard-coded path to this
lat/lons input file.)
Run it first with createIndex uncommented, then comment it out
(you can just re-use that index to test searching).
When I run this on trunk I get 1.54 seconds for 225 "bboxes around
London", and with the patch 3.89 seconds, i.e. ~2.5X slower ...
> BKD tree queries should use BitDocIdSet.Builder
> -----------------------------------------------
>
> Key: LUCENE-6645
> URL: https://issues.apache.org/jira/browse/LUCENE-6645
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: LUCENE-6645.patch
>
>
> When I was iterating on BKD tree originally I remember trying to use
> this builder (which makes a sparse bit set at first and then upgrades
> to dense if enough bits get set) and being disappointed with its
> performance. I wound up just making a FixedBitSet every time, but this
> is obviously wasteful for small queries. It could be the perf was poor
> because I was always .or'ing in DISIs that had 512 - 1024 hits each
> time (the size of each leaf cell in the BKD tree)? I also had to make
> my own DISI wrapper around each leaf cell... maybe that was the source
> of the slowness, not sure. I also sort of wondered whether the
> SmallDocSet in spatial module (backed by a SentinelIntSet) might be
> faster ... though it'd need to be sorted in the end after building
> before returning to Lucene.
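
For readers unfamiliar with the "sparse at first, then upgrade to dense"
idea mentioned in the description, here is a minimal sketch in plain Java.
It is not Lucene's BitDocIdSet.Builder and says nothing about why that
builder was slow in this test. The class name SparseThenDenseBuilder, its
methods, and the maxDoc / 32 upgrade threshold are all assumptions made
for illustration (the threshold is roughly where an int buffer costs as
much memory as a bit set over maxDoc documents).

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical illustration only; NOT Lucene's BitDocIdSet.Builder. */
final class SparseThenDenseBuilder {
    private final int maxDoc;
    // Assumed threshold: number of buffered ids at which we switch to a bit set.
    private final int threshold;
    private int[] buffer = new int[16]; // sparse phase: unsorted doc ids
    private int size = 0;
    private BitSet dense = null;        // non-null once upgraded

    SparseThenDenseBuilder(int maxDoc) {
        this.maxDoc = maxDoc;
        this.threshold = Math.max(1, maxDoc >>> 5);
    }

    void add(int docId) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        if (dense == null && size == threshold) {
            upgrade(); // too many hits to stay sparse
        }
        if (dense != null) {
            dense.set(docId);
            return;
        }
        if (size == buffer.length) {
            buffer = Arrays.copyOf(buffer, Math.min(threshold, size * 2));
        }
        buffer[size++] = docId;
    }

    /** Copies the buffered ids into a bit set and drops the buffer. */
    private void upgrade() {
        dense = new BitSet(maxDoc);
        for (int i = 0; i < size; i++) {
            dense.set(buffer[i]);
        }
        buffer = null;
        size = 0;
    }

    boolean isDense() {
        return dense != null;
    }

    /** Returns the collected doc ids, sorted and without duplicates. */
    int[] build() {
        if (dense != null) {
            return dense.stream().toArray(); // already in increasing order
        }
        int[] sorted = Arrays.copyOf(buffer, size);
        Arrays.sort(sorted); // the sparse path must sort at the end
        int out = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (out == 0 || sorted[i] != sorted[out - 1]) {
                sorted[out++] = sorted[i];
            }
        }
        return Arrays.copyOf(sorted, out);
    }
}
```

In this sketch a small query never allocates a maxDoc-sized bit set, at
the price of a sort at the end of the sparse path; that is the same
trade-off the description raises for a set that has to be sorted before
it is returned.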
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org