'Re: [PERFORM] "Select * " on 12-18M row table from remote machine thru JDBC - Performance nose-dives'


List:       pgsql-performance
Subject:    Re: [PERFORM] "Select * " on 12-18M row table from remote machine thru JDBC - Performance nose-dives
From:       Deron <fecastle () gmail ! com>
Date:       2012-09-28 15:10:26
Message-ID: CAF3Lvs5temEvkkao-Cs2giU6MUqm=X7N+t0DJjLqsq2X4T+i4g () mail ! gmail ! com

I think the best advice is to go back to basics: use tools like sar and
top, and look at the logs.  Changing random settings on both the client and
the server looks like guessing.  I find it unlikely that the changes you
made (JDBC fetch size and shared buffers) caused the effects you noticed.
Determine whether the bottleneck is I/O, CPU, or network.  Put all your
settings back the way they were.  If the DB did not change, then look at
the OS and the network.
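
Inside the Java app, one way to tell the database and network side apart
from the Lucene side is to time the fetch and the per-row work separately.
Below is a rough sketch, assuming plain java.sql and the PostgreSQL JDBC
driver; FetchTimer, RowHandler, and scan are made-up names for illustration,
not anything from your app.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original app: splits the wall-clock time of
// each batch into "waiting on the database" and "handling the row".
class FetchTimer {
    /** Stand-in for the per-row work, e.g. adding a document to the Lucene index. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    private final int batchSize;
    private final List<String> lines = new ArrayList<>();
    private long fetchNanos;
    private long handleNanos;
    private long rowsInBatch;
    private long totalRows;

    FetchTimer(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Adds one row's timings and reports whenever a batch fills up. */
    void record(long fetchDeltaNanos, long handleDeltaNanos) {
        fetchNanos += fetchDeltaNanos;
        handleNanos += handleDeltaNanos;
        rowsInBatch++;
        totalRows++;
        if (rowsInBatch == batchSize) {
            flush();
        }
    }

    /** Prints the totals for the current (possibly partial) batch, then resets them. */
    void flush() {
        if (rowsInBatch == 0) {
            return;
        }
        String line = "rows=" + totalRows
                + " fetch_ms=" + (fetchNanos / 1_000_000)
                + " handle_ms=" + (handleNanos / 1_000_000);
        System.out.println(line);
        lines.add(line);
        fetchNanos = 0;
        handleNanos = 0;
        rowsInBatch = 0;
    }

    /** Report lines emitted so far, in order. */
    List<String> lines() {
        return lines;
    }

    /** Streams the query and times ResultSet.next() separately from the handler. */
    long scan(Connection conn, String sql, int fetchSize, RowHandler handler)
            throws SQLException {
        // Assumption: PostgreSQL JDBC driver, which only fetches in chunks of
        // fetchSize when autocommit is off (otherwise it reads the whole result).
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            st.setFetchSize(fetchSize);
            long start = System.nanoTime();
            try (ResultSet rs = st.executeQuery(sql)) {
                // The first chunk arrives during executeQuery, so count it as fetch time.
                fetchNanos += System.nanoTime() - start;
                while (true) {
                    long t0 = System.nanoTime();
                    boolean more = rs.next();
                    long t1 = System.nanoTime();
                    if (!more) {
                        break;
                    }
                    handler.handle(rs);
                    record(t1 - t0, System.nanoTime() - t1);
                }
            }
        }
        conn.commit(); // end the read-only transaction opened for the cursor
        flush();
        return totalRows;
    }
}
```

Setting the batch size equal to the fetch size gives roughly one report line
per round trip to the server.  If fetch_ms jumps around the 10 million row
mark while handle_ms stays flat, the delay is on the database or network
side, and sar and top on both machines should narrow it down further.  If
handle_ms grows instead, the database is not the problem.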

Deron
On Sep 28, 2012 6:53 AM, "antthelimey" <antthelimey@gmail.com> wrote:

> On machine 1 - a table that contains between 12 and 18 million rows
> On machine 2 - a Java app that calls Select * on the table, and writes it
> into a Lucene index
> 
> Originally had a fetchSize of 10,000 and would take around 38 minutes for
> 12 million, 50 minutes for 16ish million to read it all & write it all
> back out as the Lucene index
> 
> One day it started taking 4 hours. If something changed, we don't know
> what it was
> 
> We tracked it down to, after 10 million or so rows, the Fetch to get the
> next 10,000 rows from the DB goes from like 1 second to 30 seconds, and
> stays there
> 
> After spending a week of two devs & a DBA trying to solve this, we
> eventually "solved" it by upping the FetchRowSize in the JDBC call to
> 50,000
> 
> It was performing well enough again for a few weeks
> 
> then...one day... it started taking 4 hours again
> 
> we tried upping shared_buffers from 16GB to 20GB
> 
> And last night... it took 7 hours
> 
> we are using PGSQL 9.1
> 
> does anyone have ANY ideas?!
> 
> thanks much

