
List:       solr-user
Subject:    Re: How might one search for dupe IDs other than faceting on the ID field?
From:       Shawn Heisey <solr () elyograg ! org>
Date:       2013-07-30 18:24:08
Message-ID: 51F804C8.8010605 () elyograg ! org

On 7/30/2013 12:16 PM, Dotan Cohen wrote:
> To search for duplicate IDs, I am running the following query:
> select?q=*:*&facet=true&facet.field=id&rows=0
>
> However, since upgrading from Solr 4.1 to Solr 4.3 I am receiving
> OutOfMemoryError errors instead of the desired facet:

<snip>

> Might there be a less resource-intensive way to get this information?

Add &facet.method=enum to the query URL.  This will cause Solr to
enumerate the terms in the field on every query rather than load the
field into the field cache, which takes a lot of memory when nearly
every value is unique, as with an ID field.  Solr 4.1 was probably
very close to running out of memory as well.
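
For illustration only, here is a minimal Python sketch of building that
request URL.  The helper name and the base URL in the usage line are
hypothetical, and facet.mincount=2 is an addition of mine (not part of
the original query) so that only IDs occurring more than once come back:

```python
from urllib.parse import urlencode


def build_dupe_id_query(base_url, field="id"):
    """Return a /select URL that facets on `field` with facet.method=enum.

    `base_url` is the core or collection URL (hypothetical here).
    """
    params = [
        ("q", "*:*"),
        ("rows", 0),               # no documents needed, only facet counts
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),  # enumerate terms instead of using the field cache
        ("facet.mincount", 2),     # assumption: only report values seen 2+ times
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and core name; substitute your own.
    print(build_dupe_id_query("http://localhost:8983/solr/collection1"))
```

Note that facet.limit defaults to 100, so if you expect more duplicate
IDs than that you would need to raise it or page with facet.offset.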

If you have enough OS disk cache for your index, the enum method should 
not cause an enormous slowdown.  If you don't have enough OS disk cache, 
then it can make the facets run very slowly.

Thanks,
Shawn
