[prev in list] [next in list] [prev in thread] [next in thread] 

List:       pgsql-performance
Subject:    Re: [PERFORM] Less rows -> better performance?
From:       "Christian GRANDIN" <christian.grandin () gmail ! com>
Date:       2008-07-21 14:00:23
Message-ID: 1568f9ad0807210700k78d55744mcd36838df5b78e8e () mail ! gmail ! com
[Download RAW message or body]

Hi,

Reducing the amount of data will only have effect on table scan or index
scan. If your queries are selective and optimized, it will have no effect.

Before looking for solutions, the first thing to do is to understand what's
happen.

If you already know the queries then explain them. Otherwise, you must log
duration with the log_statement and log_min_duration parameters in the
postgresql.conf.

Before this, you must at least run VACUUM ANALYZE on the database to collect
actual statistics and have current explain plans.

Best regards.

Christian

2008/7/21 Richard Huxton <dev@archonet.com>

> Andreas Hartmann wrote:
>
>>
>> Here's some info about the actual amount of data:
>>
>> SELECT pg_database.datname,
>> pg_size_pretty(pg_database_size(pg_database.datname)) AS size
>> FROM pg_database where pg_database.datname = 'vvz_live_1';
>>
>>    datname    |  size
>> ---------------+---------
>>  vvz_live_1    | 2565 MB
>>
>> I wonder why the actual size is so much bigger than the data-only dump -
>> is this because of index data etc.?
>>
>
> I suspect Guillame is right and you've not been vacuuming. That or you've
> got a *LOT* of indexes. If the database is only 27MB dumped, I'd just
> dump/restore it.
>
> Since the database is read-only it might be worth running CLUSTER on the
>  main tables if there's a sensible ordering for them.
>
>  What in particular is slow?
>>>
>>
>> There's no particular bottleneck (at least that we're aware of). During
>> the first couple of days after the beginning of the semester the application
>> request processing tends to slow down due to the high load (many students
>> assemble their schedule). The customer upgraded the hardware (which already
>> helped a lot), but they asked us to find further approaches to performance
>> optimiziation.
>>
>
> 1. Cache sensibly at the application (I should have thought there's plenty
> of opportunity here).
> 2. Make sure you're using a connection pool and have sized it reasonably
> (try 4,8,16 see what loads you can support).
> 3. Use prepared statements where it makes sense. Not sure how you'll manage
> the interplay between this and connection pooling in JDBC. Not a Java man
> I'm afraid.
>
> If you're happy with the query plans you're looking to reduce overheads as
> much as possible during peak times.
>
> 4. Offload more of the processing to clients with some fancy ajax-ed
> interface.
> 5. Throw in a spare machine as an app server for the first week of term.
>  Presumably your load is 100 times average at this time.
>
> --
>  Richard Huxton
>  Archonet Ltd
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

[Attachment #3 (text/html)]

<div dir="ltr">Hi,<br><br>Reducing the amount of data will only have effect on table \
scan or index scan. If your queries are selective and optimized, it will have no \
effect.<br><br>Before looking for solutions, the first thing to do is to understand \
what&#39;s happen. <br> <br>If you already know the queries then explain them. \
Otherwise, you must log duration with the log_statement and log_min_duration \
parameters in the postgresql.conf.<br><br>Before this, you must at least run VACUUM \
ANALYZE on the database to collect actual statistics and have current explain \
plans.<br> <br>Best regards.<br><br>Christian<br><br><div \
class="gmail_quote">2008/7/21 Richard Huxton <span dir="ltr">&lt;<a \
href="mailto:dev@archonet.com">dev@archonet.com</a>&gt;</span><br><blockquote \
class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt \
0pt 0.8ex; padding-left: 1ex;"> <div class="Ih2E3d">Andreas Hartmann wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); \
margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <br>
Here&#39;s some info about the actual amount of data:<br>
<br>
SELECT pg_database.datname,<br>
pg_size_pretty(pg_database_size(pg_database.datname)) AS size<br>
FROM pg_database where pg_database.datname = &#39;vvz_live_1&#39;;<br>
<br>
 &nbsp; &nbsp;datname &nbsp; &nbsp;| &nbsp;size<br>
---------------+---------<br>
&nbsp;vvz_live_1 &nbsp; &nbsp;| 2565 MB<br>
<br>
I wonder why the actual size is so much bigger than the data-only dump - is this \
because of index data etc.?<br> </blockquote>
<br></div>
I suspect Guillame is right and you&#39;ve not been vacuuming. That or you&#39;ve got \
a *LOT* of indexes. If the database is only 27MB dumped, I&#39;d just dump/restore \
it.<br> <br>
Since the database is read-only it might be worth running CLUSTER on the &nbsp;main \
tables if there&#39;s a sensible ordering for them.<div class="Ih2E3d"><br> <br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); \
margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><blockquote class="gmail_quote" \
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; \
padding-left: 1ex;">

What in particular is slow?<br>
</blockquote>
<br>
There&#39;s no particular bottleneck (at least that we&#39;re aware of). During the \
first couple of days after the beginning of the semester the application request \
processing tends to slow down due to the high load (many students assemble their \
schedule). The customer upgraded the hardware (which already helped a lot), but they \
asked us to find further approaches to performance optimiziation.<br>

</blockquote>
<br></div>
1. Cache sensibly at the application (I should have thought there&#39;s plenty of \
opportunity here).<br> 2. Make sure you&#39;re using a connection pool and have sized \
it reasonably (try 4,8,16 see what loads you can support).<br> 3. Use prepared \
statements where it makes sense. Not sure how you&#39;ll manage the interplay between \
this and connection pooling in JDBC. Not a Java man I&#39;m afraid.<br> <br>
If you&#39;re happy with the query plans you&#39;re looking to reduce overheads as \
much as possible during peak times.<br> <br>
4. Offload more of the processing to clients with some fancy ajax-ed interface.<br>
5. Throw in a spare machine as an app server for the first week of term.  &nbsp; \
&nbsp;Presumably your load is 100 times average at this time.<br><font \
color="#888888"> <br>
-- <br>
 &nbsp;Richard Huxton<br>
 &nbsp;Archonet Ltd</font><div><div></div><div class="Wj3C7c"><br>
<br>
-- <br>
Sent via pgsql-performance mailing list (<a \
href="mailto:pgsql-performance@postgresql.org" \
target="_blank">pgsql-performance@postgresql.org</a>)<br> To make changes to your \
subscription:<br> <a href="http://www.postgresql.org/mailpref/pgsql-performance" \
target="_blank">http://www.postgresql.org/mailpref/pgsql-performance</a><br> \
</div></div></blockquote></div><br><br> </div>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic