
List:       lucene-user
Subject:    Re: Lucene 3.6.0 Index Size
From:       kiwi clive <kiwi_clive () yahoo ! com>
Date:       2012-10-26 20:27:40
Message-ID: 1351283260.19689.YahooMailNeo () web120702 ! mail ! ne1 ! yahoo ! com


Hi Vitaly,

Your hunch is correct: there are unmerged segments left over. To get
indexing throughput, we use multiple threads on a single writer, flushing to
disk periodically, so the writer can stay open for some time (until the last
thread terminates). After an optimize, though, the index is closed. Thanks for
the advice; I need to revisit the merging section of the application.

Clive




________________________________
 From: Vitaly Funstein <vfunstein@gmail.com>
To: java-user@lucene.apache.org 
Sent: Friday, October 26, 2012 8:13 PM
Subject: Re: Lucene 3.6.0 Index Size
 
One thing to keep in mind is that the default merge policy has changed in
3.6 from 2.3.2 (I'm almost certain of that). So it's just a hunch but you
may have some unmerged segments left over at the end. Try calling
IndexWriter.close(true) after you're done indexing.
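
One way to check the hunch before changing any writer code is to look at how
many segments are sitting in the index directory and how big each one is. The
following is only a minimal sketch using plain java.io, not a Lucene API. It
assumes the usual 3.x file naming, where per-segment files start with an
underscore followed by the segment name (for example _0.cfs or _0_1.del) and
the commit files are named segments_N and segments.gen. The class and method
names (SegmentSizes, bytesPerSegment, segmentName) are made up for
illustration.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups the files in an index directory by segment name.
class SegmentSizes {

    // Total bytes on disk per segment name ("_0", "_1", ...), sorted by name.
    static Map<String, Long> bytesPerSegment(File indexDir) {
        Map<String, Long> sizes = new TreeMap<String, Long>();
        File[] files = indexDir.listFiles();
        if (files == null) {
            return sizes; // not a directory, or not readable
        }
        for (File f : files) {
            String name = f.getName();
            // Assumption: only per-segment files start with '_'. This skips
            // segments_N, segments.gen and write.lock.
            if (!f.isFile() || !name.startsWith("_")) {
                continue;
            }
            String segment = segmentName(name);
            Long previous = sizes.get(segment);
            sizes.put(segment, (previous == null ? 0L : previous) + f.length());
        }
        return sizes;
    }

    // "_0.cfs" -> "_0", "_a_1.del" -> "_a": cut at the first '.' or '_'
    // that follows the leading underscore.
    static String segmentName(String fileName) {
        for (int i = 1; i < fileName.length(); i++) {
            char c = fileName.charAt(i);
            if (c == '.' || c == '_') {
                return fileName.substring(0, i);
            }
        }
        return fileName;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        Map<String, Long> sizes = bytesPerSegment(dir);
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue() + " bytes");
        }
        System.out.println(sizes.size() + " segment(s)");
    }
}
```

Run it against the index directory once indexing has finished. Many segments
of similar size would be consistent with merges not having completed, while a
single large segment would suggest the extra size comes from somewhere else.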

On Fri, Oct 26, 2012 at 10:50 AM, kiwi clive <kiwi_clive@yahoo.com> wrote:

> Hello.
> 
> We have an index that, when created using Lucene 2.3.2, has a size of about
> 4G.
> 
> Creating the same index (with the same parameters) with lucene 3.6.0
> results in an 11G index.
> 
> Could someone shed some light into why the index is so much larger, given
> the same data and the same parameters?
> 
> I realize this is a large version jump but a doubling in index size does
> not seem a step in the right direction to me ;-)
> 
> I am using cfs format.
> 
> Thanks,
> Clive
> 


