'Re: How to access DocValues inside a customized collector?'


List:       lucene-user
Subject:    Re: How to access DocValues inside a customized collector?
From:       Lisheng Zhang <lz0522y2k () gmail ! com>
Date:       2018-09-21 18:36:26
Message-ID: CAFFu0ZNUtRfx7RrXrsAX8bQdiD0+OJgb3p1RatQc0x_UPxy1cA () mail ! gmail ! com


Thanks very much Uwe and Mikhail!

Your points are all well taken. So far it seems to work well; I will
test more to verify the details.

Lisheng

On Fri, Sep 21, 2018 at 3:54 AM Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi,
>
> In general your approach is right, but you have to do it correctly. It
> depends on the Collector subclass you are using. The simplest is to
> subclass SimpleCollector:
> https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/search/SimpleCollector.html
>
> There you have to override 2 methods:
>
> doSetNextReader(LeafReaderContext context): Here you call *once*
> context.reader().getBinaryDocValues(String field) and save the thing in a
> private member field "actReaderdocValues" of the collector (non-final).
>
> In collect(docId) you can then call actReaderdocValues.advanceExact(docId)
> and retrieve the value. As collect is always called "in order", it's safe to
> use advanceExact().
>
> Important: don't fetch a new docvalues instance on every call before
> calling advanceExact()! That is only needed for out-of-order access. In
> combination with a collector (like above) you get maximum performance, as
> everything is per leaf reader and in order.
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> > -----Original Message-----
> > From: Lisheng Zhang <lz0522y2k@gmail.com>
> > Sent: Friday, September 21, 2018 3:23 AM
> > To: java-user@lucene.apache.org
> > Subject: How to access DocValues inside a customized collector?
> >
> > We need to use binary DocValues (in a customized collector) that were
> > added during indexing. I first tested with the standard
> > TopScoreDocCollector, and it seems that we need to:
> >
> > LeafReaderContext => reader() => get binary iterator => advance to the
> > correct location
> >
> > Is this the correct way, or is there a better API? Since we are already
> > at that docId, it seems to me that the binary DocValues should be
> > readily available.
> >
> > Also, is there a way to inspect the indexed data directly? (Luke seems
> > obsolete, and Marple does not work with Lucene 7.4.0 yet.)
> >
> > Thanks very much for your help, Lisheng
>
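
For reference, the approach Uwe describes could look roughly like the sketch below. It is not code from the thread: it assumes Lucene 7.4 (lucene-core on the classpath), and the class name BinaryDocValuesCollector, the constructor argument, and the getValues() accessor are made up for illustration. Only the private field name follows Uwe's suggestion. In Lucene 8 and later, needsScores() is replaced by scoreMode(), so the last method would need to change there.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.SimpleCollector;
import org.apache.lucene.util.BytesRef;

// Hypothetical collector: gathers the binary doc value of every hit.
class BinaryDocValuesCollector extends SimpleCollector {

  private final String field;                        // name of the binary DocValues field
  private final List<BytesRef> values = new ArrayList<>();
  private BinaryDocValues actReaderDocValues;        // per-leaf iterator, deliberately non-final

  BinaryDocValuesCollector(String field) {
    this.field = field;
  }

  @Override
  protected void doSetNextReader(LeafReaderContext context) throws IOException {
    // Called once per leaf (segment): fetch the iterator here, not in collect().
    // May be null if this leaf has no binary doc values for the field.
    actReaderDocValues = context.reader().getBinaryDocValues(field);
  }

  @Override
  public void collect(int doc) throws IOException {
    // doc is relative to the current leaf and arrives in increasing order,
    // so the forward-only iterator can be advanced with advanceExact().
    if (actReaderDocValues != null && actReaderDocValues.advanceExact(doc)) {
      // Copy the bytes: the returned BytesRef may be reused by the iterator.
      values.add(BytesRef.deepCopyOf(actReaderDocValues.binaryValue()));
    }
  }

  @Override
  public boolean needsScores() {
    return false;                                    // only doc values are read, no scores
  }

  List<BytesRef> getValues() {
    return values;
  }
}
```

It would be used like any other collector, for example searcher.search(query, collector) followed by collector.getValues(), where the field name passed to the constructor is whatever name was used for the BinaryDocValuesField at index time.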

