[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgresql-general
Subject:    Re: Strange performance degregation in sql function (PG11.1)
From:       Pavel Stehule <pavel.stehule () gmail ! com>
Date:       2019-05-24 3:36:29
Message-ID: CAFj8pRCS_wJwcPs42sMKjuR6AsmakGu1Chbu2a8-Rbv6AviYhg () mail ! gmail ! com
[Download RAW message or body]

čt 23. 5. 2019 v 23:38 odesílatel Alastair McKinley <
a.mckinley@analyticsengines.com> napsal:

> Hi Andrew,
>
> Thanks for your in-depth response.  I found that adding the stable
> qualifier didn't solve the issue unfortunately.  I actually encountered the
> same issue (or at least extremely similar) today and made a bit more
> progress on defining it.
>
> I have a carefully indexed and optimised query that runs in about 2
> seconds with this structure and signature.
>
>     create or replace function example_function(param1 int, param2 int)
> returns setof custom_type as
>     $$
>         with a_few_ctes ()
>         select * from big_table where col1 = param1 and col2 = param2;
>     $$ language sql stable;
>
> This function works as expected when executed from a psql client.  I am
> calling this function via another framework (Postgrest) that executes the
> function using the following pattern:
>
>     with args as (
>         select json_to_record($1) as (param1 int,param2 int)
>     ),
>     output as (
>         select *
>         from example_function(
>             param1 := (select param1 from args),
>             param2 := (select param2 from args)
>         )
>     )
>     select * from output;
>
> Running this query with the args coming from the CTE resulted in my query
> running for many tens of minutes before I gave up.  Changing the underlying
> function to use plpgsql fixes the issue and the query runs as expected.
> Both versions work as expected when called directly, but the SQL version
> does not when called with the args coming from the CTE as shown above.
>
> The new function signature is
>
>     create or replace function example_function(param1 int, param2 int)
> returns setof custom_type as
>     $$
>         with a_few_ctes ()
>         return query select * from big_table where col1 = param1 and col2
> = param2;
>     $$ language plpgsql stable;
>
> I haven't been able to check the bad query plan with auto_explain as the
> query doesn't seem to finish.
>
> So to summarise, changing a stable SQL function to a stable plpgsql
> function when called with function args from a CTE fixes a huge performance
> issue of uncertain origin.  I hope someone can offer some suggestions as
> this has really confused me.
>

SQL functions are fast when they are inlined, but when are not inlined,
then they are significantly slower than plpgsql.

I am not sure, long time I didn't see this code. If I remember well, SQL
functions doesn't cache plans - so creates and lost plans every time.



> Best regards,
>
> Alastair
> ------------------------------
> *From:* Andrew Gierth <andrew@tao11.riddles.org.uk>
> *Sent:* 19 May 2019 03:48
> *To:* Alastair McKinley
> *Cc:* pgsql-general@lists.postgresql.org
> *Subject:* Re: Strange performance degregation in sql function (PG11.1)
>
> >>>>> "Alastair" == Alastair McKinley <a.mckinley@analyticsengines.com>
> writes:
>
>  Alastair> Hi all,
>
>  Alastair> I recently experienced a performance degradation in an
>  Alastair> operational system that I can't explain. I had a function
>  Alastair> wrapper for a aggregate query that was performing well using
>  Alastair> the expected indexes with the approximate structure as shown
>  Alastair> below.
>
>  Alastair> create or replace function example_function(param1 int, param2
> int) returns setof custom_type as
>  Alastair> $$
>  Alastair>     select * from big_table where col1 = param1 and col2 =
> param2;
>  Alastair> $$ language sql;
>
> This function isn't inlinable due to missing a STABLE qualifier; that's
> a pretty big issue.
>
> Without inlining, the function will be run only with generic plans,
> which means that the decision about index usage will be made without
> knowledge of the parameter values.
>
> Was your actual function inlinable? See
> https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
>
> You can get the query plan of a non-inlined function using the
> auto_explain module (with its log_nested_statements option). The query
> plan of non-inlined function calls is not otherwise shown by EXPLAIN.
>
>  Alastair> After creating two new indexes on this table to support a
>  Alastair> different use case during a migration, this unchanged
>  Alastair> function reduced in performance by several orders of
>  Alastair> magnitude. Running the query inside the function manually on
>  Alastair> the console however worked as expected and the query plan did
>  Alastair> not appear to have changed.
>
> But when you run it manually, you'll get a custom plan, based on the
> parameter values.
>
>  Alastair> On a hunch I changed the structure of the function to the
>  Alastair> structure below and immediately the query performance
>  Alastair> returned to the expected baseline.
>
>  Alastair> create or replace function example_function(param1 int, param2
> int) returns setof custom_type as
>  Alastair> $$
>  Alastair> BEGIN
>  Alastair>     return query execute format($query$
>  Alastair>         select * from big_table where col1 = %1$L and col2 = %1$
>  Alastair>     $query$,param1,param2);
>  Alastair> END;
>  Alastair> $$ language plpgsql;
>
> Using EXECUTE in plpgsql will get you a custom plan every time (though
> you really should have used EXECUTE USING rather than interpolating the
> parameters into the query string).
>
> I suggest looking into the inlining question first.
>
> --
> Andrew (irc:RhodiumToad)
>

[Attachment #3 (text/html)]

<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" \
class="gmail_attr">čt 23. 5. 2019 v  23:38 odesílatel Alastair McKinley &lt;<a \
href="mailto:a.mckinley@analyticsengines.com">a.mckinley@analyticsengines.com</a>&gt; \
napsal:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px \
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">




<div dir="ltr">
<div><span>Hi Andrew,<br>
</span></div>
<div><br>
</div>
<div>Thanks for your in-depth response.   I found that adding the stable qualifier \
didn&#39;t solve the issue unfortunately.   I actually encountered the same issue (or \
at least extremely similar) today and made a bit more progress on defining it.   <br>
</div>
<div><br>
</div>
<div>I have a carefully indexed and optimised query that runs in about 2 seconds with \
this structure and signature.<br> </div>
<div><br>
</div>
<div>      create or replace function example_function(param1 int, param2 int) \
returns setof custom_type as<br> </div>
<div>      $$</div>
<div>               with a_few_ctes ()<br>
</div>
<div>            select * from big_table where col1 = param1 and col2 = param2;<br>
</div>
<div>      $$ language sql stable;<br>
</div>
<div><br>
</div>
<div>This function works as expected when executed from a psql client.   I am calling \
this function via another framework (Postgrest) that executes the function using the \
following pattern:<br> </div>
<div><br>
</div>
<div>      with args as (<br>
</div>
<div>            select json_to_record($1) as (param1 int,param2 int)<br>
</div>
<div>      ),<br>
</div>
<div>      output as (<br>
</div>
<div>            select * <br>
</div>
<div>            from example_function(<br>
</div>
<div>                  param1 := (select param1 from args),<br>
</div>
<div>                  param2 := (select param2 from args)<br>
</div>
<div>            )<br>
</div>
<div>      )<br>
</div>
<div>      select * from output;<br>
</div>
<div><br>
</div>
<div>Running this query with the args coming from the CTE resulted in my query \
running for many tens of minutes before I gave up.   Changing the underlying function \
to use plpgsql fixes the issue and the query runs as expected.   Both versions work \
as expected  when called directly, but the SQL version does not when called with the \
args coming from the CTE as shown above.<br> </div>
<div><br>
</div>
<div>The new function signature is<br>
</div>
<div><br>
</div>
<div>      create or replace function example_function(param1 int, param2 int) \
returns setof custom_type as<br> </div>
<div>      $$</div>
<div>               with a_few_ctes ()<br>
</div>
<div>            return query select * from big_table where col1 = param1 and col2 = \
param2;<br> </div>
<div>      $$ language plpgsql stable;<br>
</div>
<div><br>
</div>
<div>I haven&#39;t been able to check the bad query plan with auto_explain as the \
query doesn&#39;t seem to finish.<br> </div>
<div><br>
</div>
<div><span>So to summarise, changing a stable SQL function to a stable plpgsql \
function when called with function args from a CTE fixes a huge performance issue of \
uncertain origin.   I hope someone can offer some suggestions as this has really \
confused me.<br></span></div></div></blockquote><div><br></div><div>SQL functions are \
fast when they are inlined, but when are not inlined, then they are significantly \
slower than plpgsql.</div><div><br></div><div>I am not sure, long time I didn&#39;t \
see this code. If I remember well, SQL functions doesn&#39;t cache plans - so creates \
and lost plans every time.</div><div><br></div><div><br></div><blockquote \
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid \
rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><span> </span></div>
<div><span><br>
</span></div>
<div><span>Best regards,</span></div>
<div><span><br>
</span></div>
<div><span>Alastair<br>
</span></div>
<div id="gmail-m_4200886512980612535appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="gmail-m_4200886512980612535divRplyFwdMsg" dir="ltr"><font \
style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Andrew \
Gierth &lt;<a href="mailto:andrew@tao11.riddles.org.uk" \
target="_blank">andrew@tao11.riddles.org.uk</a>&gt;<br> <b>Sent:</b> 19 May 2019 \
03:48<br> <b>To:</b> Alastair McKinley<br>
<b>Cc:</b> <a href="mailto:pgsql-general@lists.postgresql.org" \
target="_blank">pgsql-general@lists.postgresql.org</a><br> <b>Subject:</b> Re: \
Strange performance degregation in sql function (PG11.1)</font> <div>  </div>
</div>
<div class="gmail-m_4200886512980612535BodyFragment"><font size="2"><span \
style="font-size:11pt"> <div \
class="gmail-m_4200886512980612535PlainText">&gt;&gt;&gt;&gt;&gt; \
&quot;Alastair&quot; == Alastair McKinley &lt;<a \
href="mailto:a.mckinley@analyticsengines.com" \
target="_blank">a.mckinley@analyticsengines.com</a>&gt; writes:<br> <br>
  Alastair&gt; Hi all,<br>
<br>
  Alastair&gt; I recently experienced a performance degradation in an<br>
  Alastair&gt; operational system that I can&#39;t explain. I had a function<br>
  Alastair&gt; wrapper for a aggregate query that was performing well using<br>
  Alastair&gt; the expected indexes with the approximate structure as shown<br>
  Alastair&gt; below.<br>
<br>
  Alastair&gt; create or replace function example_function(param1 int, param2 int) \
returns setof custom_type as<br>  Alastair&gt; $$<br>
  Alastair&gt;         select * from big_table where col1 = param1 and col2 = \
param2;<br>  Alastair&gt; $$ language sql;<br>
<br>
This function isn&#39;t inlinable due to missing a STABLE qualifier; that&#39;s<br>
a pretty big issue.<br>
<br>
Without inlining, the function will be run only with generic plans,<br>
which means that the decision about index usage will be made without<br>
knowledge of the parameter values.<br>
<br>
Was your actual function inlinable? See<br>
<a href="https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions" \
target="_blank">https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions</a><br> \
<br> You can get the query plan of a non-inlined function using the<br>
auto_explain module (with its log_nested_statements option). The query<br>
plan of non-inlined function calls is not otherwise shown by EXPLAIN.<br>
<br>
  Alastair&gt; After creating two new indexes on this table to support a<br>
  Alastair&gt; different use case during a migration, this unchanged<br>
  Alastair&gt; function reduced in performance by several orders of<br>
  Alastair&gt; magnitude. Running the query inside the function manually on<br>
  Alastair&gt; the console however worked as expected and the query plan did<br>
  Alastair&gt; not appear to have changed.<br>
<br>
But when you run it manually, you&#39;ll get a custom plan, based on the<br>
parameter values.<br>
<br>
  Alastair&gt; On a hunch I changed the structure of the function to the<br>
  Alastair&gt; structure below and immediately the query performance<br>
  Alastair&gt; returned to the expected baseline.<br>
<br>
  Alastair&gt; create or replace function example_function(param1 int, param2 int) \
returns setof custom_type as<br>  Alastair&gt; $$<br>
  Alastair&gt; BEGIN<br>
  Alastair&gt;         return query execute format($query$<br>
  Alastair&gt;                 select * from big_table where col1 = %1$L and col2 = \
%1$<br>  Alastair&gt;         $query$,param1,param2);<br>
  Alastair&gt; END;<br>
  Alastair&gt; $$ language plpgsql;<br>
<br>
Using EXECUTE in plpgsql will get you a custom plan every time (though<br>
you really should have used EXECUTE USING rather than interpolating the<br>
parameters into the query string).<br>
<br>
I suggest looking into the inlining question first.<br>
<br>
-- <br>
Andrew (irc:RhodiumToad)<br>
</div>
</span></font></div>
</div>

</blockquote></div></div>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic