List: lucene-dev
Subject: TermInfosReader optimisation?
From: Tony Bowden <tony-lucene () kasei ! com>
Date: 2004-03-31 9:07:12
Message-ID: 20040331090712.GA4699 () soto ! kasei ! com
An interesting thing has come up with Plucene:
The code for TermInfosReader.get has an optimisation so that in
sequential access it doesn't need to keep seeking:
  final synchronized TermInfo get(Term term) throws IOException {
    if (size == 0) return null;

    // optimize sequential access: first try scanning cached enum w/o seeking
    if (enum.term() != null                      // term is at or past current
        && ((enum.prev != null && term.compareTo(enum.prev) > 0)
            || term.compareTo(enum.term()) >= 0)) {
      int enumOffset = (enum.position/TermInfosWriter.INDEX_INTERVAL)+1;
      if (indexTerms.length == enumOffset        // but before end of block
          || term.compareTo(indexTerms[enumOffset]) < 0)
        return scanEnum(term);                   // no need to seek
    }

    // random-access: must seek
    seekEnum(getIndexOffset(term));
    return scanEnum(term);
  }
In the Perl version, this whole middle section slows everything down
considerably (by almost 50%). I'm not sure whether this is because the
bottlenecks are in different places in Perl vs Java, but I'm curious
as to what impact this optimisation has in the Java version.
I can't easily test it from here at the moment. Are there any
benchmarks on the effect of having that optimisation vs not having it?
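As a starting point for such a comparison, here is a minimal sketch of
the idea in Java. It is not the Lucene code: SequentialScanSketch, its
fields, the in-memory String array and the interval of 128 are all
assumptions made for illustration, and it counts seeks rather than
measuring time. It only shows what the shortcut saves when terms are
looked up in sorted order.

```java
import java.util.Arrays;

// Hypothetical, simplified in-memory model; NOT Lucene's TermInfosReader.
class SequentialScanSketch {
    static final int INDEX_INTERVAL = 128; // assumed value, for illustration only

    final String[] terms;      // all terms, sorted
    final String[] indexTerms; // every INDEX_INTERVAL-th term
    int position = -1;         // cursor of the cached enumerator, -1 = not positioned
    int seeks = 0;             // number of random-access seeks performed

    SequentialScanSketch(String[] sortedTerms) {
        terms = sortedTerms;
        int n = (sortedTerms.length + INDEX_INTERVAL - 1) / INDEX_INTERVAL;
        indexTerms = new String[n];
        for (int i = 0; i < n; i++) indexTerms[i] = sortedTerms[i * INDEX_INTERVAL];
    }

    // Returns the position of the term, or -1 if it is absent.
    int get(String term, boolean useShortcut) {
        if (terms.length == 0) return -1;
        // Sequential-access shortcut: the term is at or past the cursor ...
        if (useShortcut && position >= 0 && term.compareTo(terms[position]) >= 0) {
            int enumOffset = position / INDEX_INTERVAL + 1;
            // ... but before the end of the current block.
            if (enumOffset == indexTerms.length
                    || term.compareTo(indexTerms[enumOffset]) < 0) {
                return scan(term); // no need to seek
            }
        }
        // Random access: must seek to the start of the right block.
        seek(indexOffset(term));
        return scan(term);
    }

    // Binary search for the last index term that is <= the target.
    int indexOffset(String term) {
        int i = Arrays.binarySearch(indexTerms, term);
        return i >= 0 ? i : Math.max(-i - 2, 0);
    }

    void seek(int block) {
        position = block * INDEX_INTERVAL;
        seeks++;
    }

    // Advance the cursor until it reaches a term >= the target (or the last term).
    int scan(String term) {
        while (position < terms.length - 1 && term.compareTo(terms[position]) > 0) {
            position++;
        }
        return terms[position].equals(term) ? position : -1;
    }

    public static void main(String[] args) {
        String[] terms = new String[1000];
        for (int i = 0; i < terms.length; i++) terms[i] = String.format("t%05d", i);
        SequentialScanSketch with = new SequentialScanSketch(terms);
        SequentialScanSketch without = new SequentialScanSketch(terms);
        for (String t : terms) {
            with.get(t, true);
            without.get(t, false);
        }
        System.out.println("seeks with shortcut:    " + with.seeks);
        System.out.println("seeks without shortcut: " + without.seeks);
    }
}
```

In this toy model a seek is just an assignment, so the shortcut's extra
comparisons may well cost more than they save. Whether it pays off in
the real reader presumably depends on how expensive seekEnum is
relative to those comparisons, which could differ between Perl and Java.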
Thanks,
Tony