List:       solr-user
Subject:    Re: Solr Indexing Performance
From:       Tomás_Fernández_Löbbe <tomasflobbe () gmail ! com>
Date:       2011-01-31 15:13:32
Message-ID: AANLkTi=BAAZZtnxZb0x-G9qep14aJuyEiiWec_R2RdGd () mail ! gmail ! com

Well, I would say that the best way to be sure is to benchmark different
configurations.
As far as I know, such a big RAM buffer size is usually not recommended. The
default is 32 MB, and you probably won't see any improvement beyond 128 MB.
The same goes for the mergeFactor: I know that a larger merge factor is better
for indexing, but 50 sounds like a lot. Anyway, as I said before, the best
thing to do is benchmark different configurations and see which one works
better for you.
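
To compare runs, it helps to turn the DIH status message into a single
throughput number. Here is a minimal Python sketch of that, assuming the
"Time Taken" value is laid out as hours:minutes:seconds:milliseconds (as in
the 0:2:3:880 from your status message). The function names are only
illustrative; they are not part of Solr or DIH.

```python
def parse_time_taken(value):
    """Convert a DIH 'Time Taken' string to seconds.

    Assumes the layout hours:minutes:seconds:milliseconds, e.g. '0:2:3:880'.
    """
    parts = value.strip().split(":")
    if len(parts) != 4:
        raise ValueError("expected h:m:s:ms, got %r" % value)
    hours, minutes, seconds, millis = (int(p) for p in parts)
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0


def docs_per_second(documents, time_taken):
    """Indexing throughput for one import run."""
    return documents / parse_time_taken(time_taken)


def fastest(runs):
    """Return the label of the run with the highest throughput.

    `runs` maps a label of your choosing to (documents, time_taken).
    """
    return max(runs, key=lambda label: docs_per_second(*runs[label]))


if __name__ == "__main__":
    # The only measured run so far: 49 documents in 0:2:3:880.
    print(round(docs_per_second(49, "0:2:3:880"), 3))
```

For the run you reported (49 documents in 0:2:3:880) this works out to
roughly 0.4 documents per second, which gives you a baseline to compare
against after each configuration change.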

Have you tried assigning less memory to the JVM? That would leave more
memory available to the OS.

Tomás

On Sun, Jan 30, 2011 at 1:54 AM, Darx Oman <darxoman@gmail.com> wrote:

> Hi guys
>
>
>
> I'm running a Solr instance (trunk) on my dev server to test my
> configuration.  I'm doing a DIH full import to index 49 PDF files with
> their corresponding database records.  Both the PDF files and the database
> are local to the server.
>
> *Server : *
>
> ·         Windows 2008 R2
>
> ·         MS SQL server 2008 R2
>
> ·         16 core processor
>
> ·         16 GB ram
>
> *Tomcat (7.0.5) : *
>
> ·         Set JAVA_OPTS = %JAVA_OPTS%  -Xms1024M  -Xmx8192M
>
> *Solrconfig:*
>
> ·         Main index configurations
>    <ramBufferSize>2048</ramBufferSize>
>    <mergeFactor>50</mergeFactor>
>
> *DIH configuration:*
>
> ·         2 data sources defined  jdbcDataSource and BinFileDataSource
>
> ·         One main entity with 3 sub entities
>
> <entity dataSource="myJdbc" …>
>
>    <entity dataSource="myBinFile" …> </entity>
>
>    <entity dataSource="myJdbc" …> </entity>
>
>    <entity dataSource="myJdbc" …> </entity>
>
> </entity>
>
> ·         Total schema fields are 8, three of which are text type and
> multivalued.
>
> *My DIH import Status Messages:*
>
> ·         Total Requests made to DataSource = 99
>
> ·         Total Rows Fetched = 2124
>
> ·         Total Documents Processed = 49
>
> ·         Time Taken = 0:2:3:880
>
> Is this time reasonable, or can it be improved?
>

