List: lucene-user
Subject: Re: Static index, fastest way to do forceMerge
From: Dawid Weiss <dawid.weiss () gmail ! com>
Date: 2018-11-30 11:01:54
Message-ID: CAM21Rt_BjgPD3554Lg7CtaFozQ6A7CPte2ooHX9YJ9BtphR6vA () mail ! gmail ! com
Just FYI: I implemented a quick and dirty PoC to see how it would
work. Not much of a difference on my machine (since postings merging
dominates everything else). It's an interesting problem how to split
the work up to saturate all available resources though (CPU and I/O).
https://issues.apache.org/jira/browse/LUCENE-8580
Dawid
On Fri, Nov 2, 2018 at 10:17 PM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>
> Thanks for chipping in, Toke. A ~1TB index is impressive.
>
> Back of the envelope says reading & writing 900GB in 8 hours is
> 2*900GB/(8*60*60s) = 64MB/s. I don't remember the interface for our
> SSD machine, but even with SATA II this is only ~1/5th of the possible
> fairly sequential IO throughput. So for us at least, NVMe drives are
> not needed to have single-threaded CPU as bottleneck.
>
> The mileage will vary depending on the CPU -- if it can merge the data
> from multiple files at once fast enough then it may theoretically
> saturate the bandwidth... but I agree we also seem to be CPU bound on
> these N-to-1 merges; a regular SSD is enough.
>
> > And +1 to the issue BTW.
>
> I agree. Finer granularity here would be a win even in the
> regular "merge is a low-priority citizen" case. At least that's what I
> tend to think. And if there are spare CPUs, the gain would be
> terrific.
>
> Dawid
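
For anyone who wants to redo the back-of-the-envelope estimate quoted
above with their own numbers, here is a minimal sketch. The class and
method names are made up for illustration, 1 GB is taken as 1024 MB
(which is what yields the quoted 64MB/s), and the 300 MB/s SATA II
figure is an assumed nominal rate, not something measured in this
thread.

```java
import java.util.Locale;

public class MergeThroughput {

    // Sustained throughput (MB/s) needed if the whole index is read once
    // and written once within the given time. Assumes 1 GB = 1024 MB.
    static double requiredMBPerSec(double indexGB, double hours) {
        double totalMB = 2 * indexGB * 1024;   // read + write
        double seconds = hours * 60 * 60;
        return totalMB / seconds;
    }

    public static void main(String[] args) {
        double needed = requiredMBPerSec(900, 8);  // figures from the quoted message
        double sata2MBPerSec = 300;                // assumption: nominal SATA II rate
        System.out.printf(Locale.ROOT,
                "needed: %.0f MB/s, fraction of SATA II: %.2f%n",
                needed, needed / sata2MBPerSec);
    }
}
```

With the quoted figures this gives 64 MB/s, roughly one fifth of the
assumed SATA II rate, which matches the point that the disk is far
from being the bottleneck.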