List: pgsql-performance
Subject: Re: Please help! Query jumps from 1s -> 4m
From: James Thompson <james () jthompson ! dev>
Date: 2020-05-13 14:17:03
Message-ID: CABoe=cTRMXFBTJ93nVk=JyPsFLTfxBZ+emGQU_wzPxiX40tMQA () mail ! gmail ! com
Just to follow up on this...
Tried increasing the stats targets last week and re-analyzing, but the query
was just as bad.
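For reference, bumping the per-column targets looked roughly like this
(column names taken from the plans quoted below; the target value is
illustrative — the default is 100 and the maximum is 10000):

```sql
-- Raise the sample size ANALYZE uses for the columns in the index cond.
ALTER TABLE table1 ALTER COLUMN col20 SET STATISTICS 10000;
ALTER TABLE table1 ALTER COLUMN col8  SET STATISTICS 10000;
-- Re-collect statistics so the new targets take effect.
ANALYZE table1;
```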
Ended up increasing the prepareThreshold to prevent server-side prepares
for now (and thus the later switch to generic plans). This 'fixed' the
issue and had no noticeable negative effect on our workloads.
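For anyone else hitting this: prepareThreshold is a pgjdbc driver setting,
not a server one, and can be set on the connection URL. A sketch (host and
database names are placeholders) — either raising it above the number of
times a statement gets executed, or setting it to 0, which disables
server-side prepared statements in pgjdbc entirely, avoids the switch:

```
jdbc:postgresql://dbhost:5432/mydb?prepareThreshold=0
```

(pgjdbc's default is 5; separately, the server only considers a generic
plan after a prepared statement's first five executions, which is why the
slowdown appeared from the sixth execution on.)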
I still don't understand why the plan being off makes the query so much
slower in this case (the plans I shared in the last email don't look too
different, and I don't see how the filter alone can add two minutes of
execution time to an index-only scan). If anyone has thoughts on what could
be happening I'd be very interested to hear them, but the main performance
problem is effectively solved.
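In case it's useful, the custom-to-generic switch can be reproduced from
psql alone, without the JDBC driver in the picture (a simplified sketch —
the statement below is a stand-in, not our real query):

```sql
-- The server plans the first five executions with the actual parameter
-- values (custom plans), then may switch to a cached generic plan if its
-- estimated cost looks no higher than the average custom-plan cost.
PREPARE q(bigint) AS
  SELECT uniqueid FROM table1 WHERE col20 = $1;

EXPLAIN (ANALYZE, BUFFERS) EXECUTE q(12345);  -- repeat six or more times
DEALLOCATE q;
```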
Thanks all for the valuable help getting to the bottom of what was
happening.
On Tue, 5 May 2020 at 22:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> James Thompson <james@jthompson.dev> writes:
> > The slowness occurs when the prepared statement changes to a generic plan.
>
> > Initial plan:
> > -> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> >    table1alias2 (cost=0.56..2549.70 rows=70 width=36) (actual
> >    time=1.901..45.256 rows=65000 loops=1)
> >      Output: table1alias2.uniqueid
> >      Index Cond: ((table1alias2.col20 = '12345'::bigint) AND
> >        (table1alias2.col8 = ANY ('{c5986b02-3a02-4639-8147-f286972413ba,...
> >        98ed24b1-76f5-4b0e-bb94-86cf13a4809c}'::text[])))
> >      Heap Fetches: 10
> >      Buffers: shared hit=5048
>
> > after 5 executions of the statement:
> > -> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> >    table1alias2 (cost=0.56..17.23 rows=1 width=36) (actual
> >    time=125.344..126877.822 rows=65000 loops=1)
> >      Output: table1alias2.uniqueid
> >      Index Cond: (table1alias2.col20 = $1001)
> >      Filter: ((table1alias2.col8)::text = ANY ((ARRAY[$1, ...,
> >        $1000])::text[]))
> >      Rows Removed by Filter: 2670023
> >      Heap Fetches: 428
> >      Buffers: shared hit=45933 read=42060 dirtied=4
>
> Yeah, this is a dynamic we've seen before. The rowcount estimate, and
> hence the cost estimate, for the plan with explicit parameter values is
> way off; but the estimate for the generic plan is even more way off,
> causing the system to falsely decide that the latter is cheaper.
>
> I've speculated about refusing to believe generic cost estimates if they
> are more than epsilon less than the concrete cost estimate, but it's not
> quite clear how that should work or whether it'd have its own failure modes.
>
> The one thing that is totally clear is that these rowcount estimates are
> crappy. Can you improve them by increasing the stats target for that
> table? Maybe with less-garbage-y inputs, the system would make the right
> plan choice here.
>
> regards, tom lane
>