'Re: Understanding query planner cpu usage'

List:       postgresql-general
Subject:    Re: Understanding query planner cpu usage
From:       Lucas Fairchild-Madar <lucas.madar () gmail ! com>
Date:       2018-02-22 20:04:16
Message-ID: CAJmoq7N0eQtk_FLWLfYTLsRdu2Grq-E0m686RPPfbfR0PcbK1Q () mail ! gmail ! com

On Wed, Feb 21, 2018 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > What is the planner doing when trying to find the current live max
> > value of the column?
>
> It's trying to estimate whether a mergejoin will be able to stop short of
> reading all the tuples from the other side of the join.  (For instance,
> if you've got 1,3,5 on one side, and 1,4,5,7,8,9,19 on the other, the
> second input doesn't have to be read past "7" because once we run off the
> end of the first input, we know we couldn't see any matches later on the
> second input.  So the planner wants to compare the ending key value on
> each side to the key distribution on the other side, to see what this might
> save.)  Now, that's a unidirectional question for any particular mergejoin
> plan, so that for any one cost estimate it's only going to need to look at
> one end of the key range.  But I think it will consider merge joins with
> both sort directions, so that both ends of the key range will get
> investigated in this way.  I might be wrong though; it's late and I've
> not looked at that code in awhile ...
>
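To make the 1,3,5 vs. 1,4,5,7,8,9,19 example above concrete, here is a
minimal Python sketch of the early stop. It is not PostgreSQL's executor or
planner code; the function name merge_join_probe is made up for illustration,
and it assumes both inputs are sorted ascending and that keys on the first
(left) side are unique. It only shows the run-time behaviour the planner is
trying to estimate.

```python
def merge_join_probe(left, right):
    """Merge-join two ascending lists and report how much of `right` was read.

    Illustrative only: assumes keys in `left` are unique
    (duplicates in `right` are fine).
    """
    matches = []
    i = j = 0
    deepest = -1  # highest index of `right` that was actually examined
    while i < len(left) and j < len(right):
        deepest = max(deepest, j)
        if left[i] == right[j]:
            matches.append(left[i])
            j += 1  # stay on the same left key; right may repeat it
        elif left[i] < right[j]:
            i += 1  # left key is behind, advance the left input
        else:
            j += 1  # right key is behind, advance the right input
    # Once `left` is exhausted the loop ends; the rest of `right` is never read.
    return matches, right[:deepest + 1]


matches, read = merge_join_probe([1, 3, 5], [1, 4, 5, 7, 8, 9, 19])
print(matches)  # [1, 5]
print(read)     # [1, 4, 5, 7]
```

The second input is read up to 7 and no further, so 8, 9 and 19 are never
touched. How much gets skipped depends on where the last key of one side
falls in the other side's key distribution, which is why the end of the key
range matters to the cost estimate.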

I'm thinking the least painful solution here might be to set
enable_mergejoin = false for this particular query only (for example with
SET LOCAL inside the transaction that runs it), since the rows being joined
are quite sparse.
