List: lucene-user
Subject: Re: Lucene 3.6.0 Index Size
From: kiwi clive <kiwi_clive () yahoo ! com>
Date: 2012-10-26 20:27:40
Message-ID: 1351283260.19689.YahooMailNeo () web120702 ! mail ! ne1 ! yahoo ! com
Hi Vitaly,
Your hunch is correct: yes, there are unmerged segments left over. To get
indexing throughput, we use multiple threads on the writer, flushing to disk
periodically, and the writer can stay open for some time (until the last thread
terminates). After an optimize, however, the index is closed. Thanks for the
advice; I need to revisit the merging section of the application.
Clive
________________________________
From: Vitaly Funstein <vfunstein@gmail.com>
To: java-user@lucene.apache.org
Sent: Friday, October 26, 2012 8:13 PM
Subject: Re: Lucene 3.6.0 Index Size
One thing to keep in mind is that the default merge policy has changed in
3.6 from 2.3.2 (I'm almost certain of that). So it's just a hunch but you
may have some unmerged segments left over at the end. Try calling
IndexWriter.close(true) after you're done indexing.
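One rough way to test the unmerged-segments hunch before changing the indexing code is to look at the index directory itself. The sketch below uses only the JDK, not Lucene. It assumes the index is in compound (cfs) format, as in the original question, where each segment is generally packed into one `.cfs` file, so the number of `.cfs` files is only an approximation of the segment count. The class name `SegmentFileCount` and its helper are hypothetical, chosen for illustration.

```java
import java.io.File;

/** Hypothetical helper: summarizes the files in a Lucene index directory. */
final class SegmentFileCount {

    /** Number of compound segment files and total bytes in the directory. */
    static final class Summary {
        final int compoundFiles;
        final long totalBytes;

        Summary(int compoundFiles, long totalBytes) {
            this.compoundFiles = compoundFiles;
            this.totalBytes = totalBytes;
        }
    }

    /** Counts *.cfs files and sums the size of every regular file in indexDir. */
    static Summary summarize(File indexDir) {
        File[] files = indexDir.listFiles();
        if (files == null) {
            throw new IllegalArgumentException("Not a readable directory: " + indexDir);
        }
        int compound = 0;
        long bytes = 0L;
        for (File f : files) {
            if (!f.isFile()) {
                continue; // ignore subdirectories
            }
            bytes += f.length();
            // Assumption: one .cfs file per segment (compound format).
            if (f.getName().endsWith(".cfs")) {
                compound++;
            }
        }
        return new Summary(compound, bytes);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java SegmentFileCount <index-dir>");
            return;
        }
        Summary s = summarize(new File(args[0]));
        System.out.println("compound segment files: " + s.compoundFiles);
        System.out.println("total bytes on disk:    " + s.totalBytes);
    }
}
```

Running this against the 2.3.2 and 3.6.0 indexes and comparing the two counts should show whether the larger index simply holds many more segments. If so, waiting for merges on close, as suggested above, is the first thing to try.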
On Fri, Oct 26, 2012 at 10:50 AM, kiwi clive <kiwi_clive@yahoo.com> wrote:
> Hello.
>
> We have an index that, when created using Lucene 2.3.2, has a size of about
> 4G.
>
> Creating the same index (with the same parameters) with Lucene 3.6.0
> results in an 11G index.
>
> Could someone shed some light on why the index is so much larger, given
> the same data and the same parameters?
>
> I realize this is a large version jump, but a near tripling in index size
> does not seem a step in the right direction to me ;-)
>
> I am using cfs format.
>
> Thanks,
> Clive
>