
List:       postgresql-general
Subject:    Re: Help needed to understand query planner regression with incremental sort
From:       "Henrik Peinar (nodeSWAT.com)" <henrik () nodeswat ! com>
Date:       2023-05-31 20:45:20
Message-ID: CAEwi-F_HrFf5ug_FyWLB=Vtb7L_j1QiuXjY65Dw+0oRwoWR2gA () mail ! gmail ! com

Thank you for taking the time to look this through.

After reading your answer it makes perfect sense. I was thrown off by the
fact that once the regression started, it happened on every execution of
the same test query I was using; I failed to think of trying different
search keywords to see whether that changed the query plan or the
estimates.

I'll investigate this further, try to clean up the indexes, and make sure
the filter condition has a suitable index to use.

And yes, the schema attachment does seem a bit off: I think the DDL copy
didn't include the column ordering inside the composite indexes, which is
why they appear identical.

Thank you very much for your time.

With best,
Henrik.



On Wed, May 31, 2023 at 6:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Henrik Peinar (nodeSWAT.com)" <henrik@nodeswat.com> writes:
> > *Quick description: *After upgrading our Aurora PostgreSQL engine from
> > v11 to v15, one of our often-run SQL queries started taking tens of
> > seconds instead of running sub-100ms. EXPLAIN ANALYZE showed that the
> > query planner had switched to using incremental sort. Running ANALYZE
> > on the table fixed the issue temporarily, but it came back after a few
> > days.
>
> Hmm.  I think it's quite accidental that you get one plan over the other,
> because it looks like the key difference is something the planner doesn't
> account for.  In the fast case you have
>
>         ->  Index Scan using products_pinned_at_listed_at_ix on products
>             (cost=0.43..3929423.12 rows=26523 width=1463)
>             (actual time=2.325..32.872 rows=61 loops=1)
>               Filter: ((desc_tsv @@ '''nike'':*'::tsquery) AND (status = 'published'::prod_status))
>               Rows Removed by Filter: 3376
>
> The index is only being used to produce ordered output here: the filter
> condition isn't related to the index.  And what we see is that the
> query is fast because the desired rows are found in the first 3376+61
> rows visited in this direction.  Meanwhile in the slow case you have
>
>               ->  Index Scan Backward using products_pinned_at_ix on products
>                   (cost=0.43..1172249.47 rows=26139 width=1460)
>                   (actual time=5.263..5203.180 rows=18411 loops=1)
>                     Filter: ((desc_tsv @@ '''nike'':*'::tsquery) AND (status = 'published'::prod_status))
>                     Rows Removed by Filter: 5415895
>
> Again, the filter condition isn't exploiting the index, we're just using
> the index to (partially) satisfy the ORDER BY.  This one takes a long time
> because it has to trawl through 5415895+61 rows before reaching the LIMIT.
>
> So AFAICS, the runtime differential is mostly/entirely because the rows
> satisfying the filter condition are located near one end of the range of
> pinned_at.  That is not a correlation that the planner knows anything
> about, so it's unable to see that these two ways of scanning the table
> will have significantly different costs.  Moreover, I think you'd be
> mistaken to draw any great conclusions from this specific example,
> because with some other search term(s) the results might be totally
> different due to the required rows not falling in the same place.
>
> What I'd think about if this type of query is important is to set up
> an index that can actually be used to satisfy the filter condition,
> so that you're not forcing it into "scan the whole table till you
> find the rows you want".  It looks like you already have such an
> index, ie a GIN index on the desc_tsv column, although I don't trust
> that your schema attachment is actually accurate because if it is
> then you have a bunch of duplicative indexes.  You might try
> dropping the other indexes to see if you can coerce the planner
> into using that one, and then seeing what the cost estimate is.
>
>                         regards, tom lane
>
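
As a concrete sketch of the index suggested above (the table and column
names are taken from the plans in the thread; the index name and the exact
query shape are assumptions inferred from those plans, not confirmed DDL):

    -- A GIN index lets the planner satisfy the tsquery filter directly,
    -- instead of walking the whole table in pinned_at order:
    CREATE INDEX products_desc_tsv_gin_ix ON products USING gin (desc_tsv);

    -- With that index available, the query would be expected to use a
    -- bitmap scan on the GIN index and sort only the matching rows:
    EXPLAIN ANALYZE
    SELECT *
    FROM products
    WHERE desc_tsv @@ to_tsquery('nike:*')
      AND status = 'published'
    ORDER BY pinned_at DESC
    LIMIT 100;

Comparing the resulting cost estimate against the index-scan plans quoted
above should show whether the planner will pick it up once the duplicate
indexes are gone.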
