'[HACKERS] Performance improvement for joins where outer side is unique'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgresql-general
Subject:    [HACKERS] Performance improvement for joins where outer side is unique
From:       David Rowley <dgrowleyml () gmail ! com>
Date:       2014-12-31 13:47:49
Message-ID: CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


Hi,

I've been hacking a bit at the join code again... This time I've been
putting some effort into  optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:

create table t1 (id int primary key);
create table t2 (id int primary key);

And query such as:

select * from t1 left outer join t2 on t1.id=t2.id;

It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)

In all of our join algorithms in the executor, if the join type is SEMI,
 we skip to the next outer row once we find a matching inner row. This is
because we don't want to allow duplicate rows in the inner side to
duplicate outer rows in the result set. Obviously this is required per SQL
spec. I believe we can also skip to the next outer row in this case when
we've managed to prove that no other row can possibly exist that matches
the current outer row, due to a unique index or group by/distinct clause
(for subqueries).

I've attached a patch which aims to prove this idea works. Although so far
I've only gotten around to implementing this for outer joins.

Since code already existed in analyzejoin.c which aimed to determine if a
join to a relation was unique on the join's condition, the patch is pretty
much just some extra plumbing and a small rewire of analyzejoin.c, which
just aims to get the "unique_inner" bool value down to the join node.

The performance improvement is somewhere around 5-10% for hash joins, and a
bit less for merge join. In theory it could be 50% for nested loop joins.
In my life in the OLTP world, these "unique joins" pretty much cover all
joins that ever happen ever. Perhaps the OLAP world is different, so from
my point of view this is going to be a very nice performance gain.

I'm seeing this patch (or a more complete version) as the first step to a
whole number of other planner improvements:

A couple of examples of those are:

1.
explain select * from sales s inner join product p on p.productid =
s.productid order by s.productid,p.name;

The current plan for this looks like:
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=149.80..152.80 rows=1200 width=46)
   Sort Key: s.productid, p.name
   ->  Hash Join  (cost=37.00..88.42 rows=1200 width=46)
         Hash Cond: (s.productid = p.productid)
         ->  Seq Scan on sales s  (cost=0.00..31.40 rows=2140 width=8)
         ->  Hash  (cost=22.00..22.00 rows=1200 width=38)
               ->  Seq Scan on product p  (cost=0.00..22.00 rows=1200
width=38)

But in reality we could have executed this with a simple merge join using
the PK of product (productid) to provide presorted input. The extra sort
column on p.name is redundant due to productid being unique.

The UNION planning is crying out for help for cases like this. Where it
could provide sorted input on a unique index, providing the union's
targetlist contained all of the unique index's columns, and we also managed
to find an index on the other part of the union on the same set of columns,
then we could perform a Merge Append and a Unique. This would cause a
signification improvement in execution time for these types of queries, as
the current planner does an append/sort/unique, which especially sucks for
plans with a LIMIT clause.

I think this should solve some of the problems that Kyotarosan encountered
in his episode of dancing with indices over here ->
http://www.postgresql.org/message-id/20131031.194310.212107585.horiguchi.kyotaro@lab.ntt.co.jp
where he was unable to prove that he could trim down sort nodes once all of
the columns of a unique index had been seen in the order by due to not
knowing if joins were going to duplicate the outer rows.

2.
It should also be possible to reuse a join in situations like:
create view product_sales as select p.productid,sum(s.qty) soldqty from
product p inner join sales s group by p.productid;

Where the consumer does:

select ps.*,p.current_price from product_sales ps inner join product p on
ps.productid = p.productid;

Here we'd currently join the product table twice, both times on the same
condition, but we could safely not bother doing that if we knew that the
join could never match more than 1 row on the inner side. Unfortunately I
deal with horrid situations like this daily, where people have used views
from within views, and all the same tables end up getting joined multiple
times :-(

Of course, both 1 and 2 should be considered separately from the attached
patch, this was just to show where it might lead.

Performance of the patch are as follows:


Test case:
create table t1 (id int primary key);
create table t2 (id int primary key);

insert into t1 select x.x from generate_series(1,1000000) x(x);
insert into t2 select x.x from generate_series(1,1000000) x(x);

vacuum analyze;

Query:
select count(t2.id) from t1 left outer join t2 on t1.id=t2.id;

Values are in transactions per second.

JOIN type Patched  Unpatched  new runtime
hash  1.295764  1.226188  94.63%
merge       1.786248  1.776605  99.46%
nestloop 0.465356  0.443015  95.20%

Things improve a bit more when using a varchar instead of an int:

hash 1.198821 1.102183 91.94%
I've attached the full benchmark results so as not to spam the thread.

Comments are welcome

Regards

David Rowley

[Attachment #5 (text/html)]

<div dir="ltr">Hi,<div><br></div><div>I&#39;ve been hacking a bit at the join code \
again... This time I&#39;ve been putting some effort into   optimising the case where \
the inner side of the join is known to be unique.</div><div>For example, given the \
tables:</div><div><br></div><div><div>create table t1 (id int primary \
key);</div><div>create table t2 (id int primary \
key);</div></div><div><br></div><div>And query such \
as:</div><div><br></div><div><div>select * from t1 left outer join t2 on <a \
href="http://t1.id">t1.id</a>=<a \
href="http://t2.id">t2.id</a>;</div></div><div><br></div><div>It is possible to \
deduce at planning time that &quot;for each row in the outer relation, only 0 or 1 \
rows can exist in the inner relation&quot;, (inner being \
t2)</div><div><br></div><div>In all of our join algorithms in the executor, if the \
join type is SEMI,   we skip to the next outer row once we find a matching inner row. \
This is because we don&#39;t want to allow duplicate rows in the inner side to \
duplicate outer rows in the result set. Obviously this is required per SQL spec. I \
believe we can also skip to the next outer row in this case when we&#39;ve managed to \
prove that no other row can possibly exist that matches the current outer row, due to \
a unique index or group by/distinct clause (for \
subqueries).</div><div><br></div><div>I&#39;ve attached a patch which aims to prove \
this idea works. Although so far I&#39;ve only gotten around to implementing this for \
outer joins.</div><div><br></div><div>Since code already existed in analyzejoin.c \
which aimed to determine if a join to a relation was unique on the join&#39;s \
condition, the patch is pretty much just some extra plumbing and a small rewire of \
analyzejoin.c, which just aims to get the &quot;unique_inner&quot; bool value down to \
the join node.</div><div><br></div><div>The performance improvement is somewhere \
around 5-10% for hash joins, and a bit less for merge join. In theory it could be 50% \
for nested loop joins.</div><div>In my life in the OLTP world, these &quot;unique \
joins&quot; pretty much cover all joins that ever happen ever. Perhaps the OLAP world \
is different, so from my point of view this is going to be a very nice performance \
gain.</div><div><br></div><div>I&#39;m seeing this patch (or a more complete version) \
as the first step to a whole number of other planner \
improvements:</div><div><br></div><div>A couple of examples of those \
are:</div><div><br></div><div>1.</div><div><div>explain select * from sales s inner \
join product p on p.productid = s.productid order by s.productid,<a \
href="http://p.name">p.name</a>;</div><div><br></div><div>The current plan for this \
looks like:</div><div>                                                     QUERY \
PLAN</div><div>--------------------------------------------------------------------------------</div><div> \
Sort   (cost=149.80..152.80 rows=1200 width=46)</div><div>     Sort Key: s.productid, \
<a href="http://p.name">p.name</a></div><div>     -&gt;   Hash Join   \
(cost=37.00..88.42 rows=1200 width=46)</div><div>              Hash Cond: \
(s.productid = p.productid)</div><div>              -&gt;   Seq Scan on sales s   \
(cost=0.00..31.40 rows=2140 width=8)</div><div>              -&gt;   Hash   \
(cost=22.00..22.00 rows=1200 width=38)</div><div>                       -&gt;   Seq \
Scan on product p   (cost=0.00..22.00 rows=1200 \
width=38)</div></div><div><br></div><div>But in reality we could have executed this \
with a simple merge join using the PK of product (productid) to provide presorted \
input. The extra sort column on <a href="http://p.name">p.name</a> is redundant due \
to productid being unique.</div><div><br></div><div>The UNION planning is crying out \
for help for cases like this. Where it could provide sorted input on a unique index, \
providing the union&#39;s targetlist contained all of the unique index&#39;s columns, \
and we also managed to find an index on the other part of the union on the same set \
of columns, then we could perform a Merge Append and a Unique. This would cause a \
signification improvement in execution time for these types of queries, as the \
current planner does an append/sort/unique, which especially sucks for plans with a \
LIMIT clause.</div><div><br></div><div>I think this should solve some of the problems \
that Kyotarosan encountered in his episode of dancing with indices over here -&gt;   \
<a href="http://www.postgresql.org/message-id/20131031.194310.212107585.horiguchi.kyot \
aro@lab.ntt.co.jp">http://www.postgresql.org/message-id/20131031.194310.212107585.horiguchi.kyotaro@lab.ntt.co.jp</a> \
where he was unable to prove that he could trim down sort nodes once all of the \
columns of a unique index had been seen in the order by due to not knowing if joins \
were going to duplicate the outer rows.  </div><div><br></div><div>2.</div><div>It \
should also be possible to reuse a join in situations like:</div><div>create view \
product_sales as select p.productid,sum(s.qty) soldqty from product p inner join \
sales s group by p.productid;</div><div><br></div><div>Where the consumer \
does:</div><div><br></div><div>select ps.*,p.current_price from product_sales ps \
inner join product p on ps.productid = p.productid;</div><div><br></div><div>Here \
we&#39;d currently join the product table twice, both times on the same condition, \
but we could safely not bother doing that if we knew that the join could never match \
more than 1 row on the inner side. Unfortunately I deal with horrid situations like \
this daily, where people have used views from within views, and all the same tables \
end up getting joined multiple times :-(</div><div><br></div><div>Of course, both 1 \
and 2 should be considered separately from the attached patch, this was just to show \
where it might lead.</div><div><br></div><div>Performance of the patch are as \
follows:</div><div><br></div><div><br></div><div>Test case:</div><div><div>create \
table t1 (id int primary key);</div><div>create table t2 (id int primary \
key);</div><div><br></div><div>insert into t1 select x.x from \
generate_series(1,1000000) x(x);</div><div>insert into t2 select x.x from \
generate_series(1,1000000) x(x);</div><div><br></div><div>vacuum \
analyze;</div></div><div><br></div><div>Query:</div><div><div>select count(<a \
href="http://t2.id">t2.id</a>) from t1 left outer join t2 on <a \
href="http://t1.id">t1.id</a>=<a \
href="http://t2.id">t2.id</a>;</div></div><div><br></div><div>Values are in \
transactions per second.</div><div><br></div><div><div>JOIN type<span class="" \
style="white-space:pre">	</span>Patched<span style="white-space:pre">     \
</span>Unpatched<span style="white-space:pre">    </span>new \
runtime</div><div>hash<span style="white-space:pre">          </span>1.295764<span \
style="white-space:pre">    </span>1.226188<span style="white-space:pre">      \
</span>94.63%</div><div>merge           1.786248<span style="white-space:pre">    \
</span>1.776605<span style="white-space:pre">      \
</span>99.46%</div><div>nestloop<span class="" \
style="white-space:pre">	</span>0.465356<span style="white-space:pre">    \
</span>0.443015<span style="white-space:pre">      \
</span>95.20%</div></div><div><br></div><div>Things improve a bit more when using a \
varchar instead of an int:</div><div><br></div><div><table border="0" cellpadding="0" \
cellspacing="0" width="303" style="border-collapse:collapse;width:228pt"><tbody><tr \
height="20" style="height:15pt">  <td height="20" class="" width="78" \
style="height:15pt;width:59pt">hash</td>  <td align="right" width="64" \
style="width:48pt">1.198821</td>  <td align="right" width="74" \
style="width:56pt">1.102183</td>  <td class="" align="right" width="87" \
style="width:65pt">91.94%</td></tr></tbody></table><br></div><div>I&#39;ve attached \
the full benchmark results so as not to spam the \
thread.</div><div><br></div><div>Comments are \
welcome</div><div><br></div><div>Regards</div><div><br></div><div>David \
Rowley</div></div>


["unijoin_benchmark.txt" (text/plain)]

create table t1 (id int primary key);
create table t2 (id int primary key);

insert into t1 select x.x from generate_series(1,1000000) x(x);
insert into t2 select x.x from generate_series(1,1000000) x(x);

vacuum analyze;


select count(t2.id) from t1 left outer join t2 on t1.id=t2.id;

** Patched:
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 77
latency average: 779.221 ms
tps = 1.269234 (including connections establishing)
tps = 1.269637 (excluding connections establishing)


** Unpatched
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 70
latency average: 857.143 ms
tps = 1.152724 (including connections establishing)
tps = 1.153080 (excluding connections establishing)

** 10 min run: Patched:
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 600 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 778
latency average: 771.208 ms
tps = 1.295764 (including connections establishing)
tps = 1.295800 (excluding connections establishing)

** 10 min run: Unpatched:
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 600 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 705
latency average: 851.064 ms
tps = 1.173946 (including connections establishing)
tps = 1.173984 (excluding connections establishing)


SET enable_hashjoin = off;

** 1 min run (Patched) using merge join
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 108
latency average: 555.556 ms
tps = 1.785677 (including connections establishing)
tps = 1.786248 (excluding connections establishing)

** 1 min run (Unpatched) using merge join
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 107
latency average: 560.748 ms
tps = 1.776052 (including connections establishing)
tps = 1.776605 (excluding connections establishing)

SET enable_mergejoin = off;

** 1 min run (Patched) nested loop join
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 28
latency average: 2142.857 ms
tps = 0.465196 (including connections establishing)
tps = 0.465356 (excluding connections establishing)

** 1 min run (Unpatched) nested loop join
D:\Postgres\install\bin>pgbench -f d:\unijoin.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 27
latency average: 2222.222 ms
tps = 0.442878 (including connections establishing)
tps = 0.443015 (excluding connections establishing)


New Query:

create table s1 (id varchar(32) primary key);
insert into s1 select x.x::varchar from generate_series(1,1000000) x(x);
create table s2 (id varchar(32) primary key);
insert into s2 select x.x::varchar from generate_series(1,1000000) x(x);
vacuum analyze;

select count(s2.id) from s1 left outer join s2 on s1.id=s2.id;

** 1 min run (Patched) hash join
D:\Postgres\install\bin>pgbench -f d:\unijoin2.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 73
latency average: 821.918 ms
tps = 1.198489 (including connections establishing)
tps = 1.198821 (excluding connections establishing)

** 1 min run (Unpatched) Hash join
D:\Postgres\install\bin>pgbench -f d:\unijoin2.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 67
latency average: 895.522 ms
tps = 1.101832 (including connections establishing)
tps = 1.102183 (excluding connections establishing)

["unijoins_v1.patch" (application/octet-stream)]

diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 7eec3f3..5350ab5 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -306,10 +306,12 @@ ExecHashJoin(HashJoinState *node)
 					}
 
 					/*
-					 * In a semijoin, we'll consider returning the first
-					 * match, but after that we're done with this outer tuple.
+					 * In a semijoin or if the planner found that no more than
+					 * 1 inner row will match a single outer row, we'll
+					 * consider returning the first match, but after that we're
+					 * done with this outer tuple.
 					 */
-					if (node->js.jointype == JOIN_SEMI)
+					if (node->js.jointype == JOIN_SEMI || node->js.unique_inner)
 						node->hj_JoinState = HJ_NEED_NEW_OUTER;
 
 					if (otherqual == NIL ||
@@ -469,6 +471,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 		ExecInitExpr((Expr *) node->join.plan.qual,
 					 (PlanState *) hjstate);
 	hjstate->js.jointype = node->join.jointype;
+	hjstate->js.unique_inner = node->join.unique_inner;
 	hjstate->js.joinqual = (List *)
 		ExecInitExpr((Expr *) node->join.joinqual,
 					 (PlanState *) hjstate);
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fdf2f4c..6122b47 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -832,10 +832,12 @@ ExecMergeJoin(MergeJoinState *node)
 					}
 
 					/*
-					 * In a semijoin, we'll consider returning the first
-					 * match, but after that we're done with this outer tuple.
+					 * In a semijoin or if the planner found that no more than
+					 * 1 inner row will match a single outer row, we'll
+					 * consider returning the first match, but after that we're
+					 * done with this outer tuple.
 					 */
-					if (node->js.jointype == JOIN_SEMI)
+					if (node->js.jointype == JOIN_SEMI || node->js.unique_inner )
 						node->mj_JoinState = EXEC_MJ_NEXTOUTER;
 
 					qualResult = (otherqual == NIL ||
@@ -1503,6 +1505,7 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 		ExecInitExpr((Expr *) node->join.plan.qual,
 					 (PlanState *) mergestate);
 	mergestate->js.jointype = node->join.jointype;
+	mergestate->js.unique_inner = node->join.unique_inner;
 	mergestate->js.joinqual = (List *)
 		ExecInitExpr((Expr *) node->join.joinqual,
 					 (PlanState *) mergestate);
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 6cdd4ff..a89be06 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -247,10 +247,11 @@ ExecNestLoop(NestLoopState *node)
 			}
 
 			/*
-			 * In a semijoin, we'll consider returning the first match, but
-			 * after that we're done with this outer tuple.
+			 * In a semijoin or if the planner found that no more than 1 inner
+			 * row will match a single outer row, we'll consider returning the
+			 * first match, but after that we're done with this outer tuple.
 			 */
-			if (node->js.jointype == JOIN_SEMI)
+			if (node->js.jointype == JOIN_SEMI || node->js.unique_inner)
 				node->nl_NeedNewOuter = true;
 
 			if (otherqual == NIL || ExecQual(otherqual, econtext, false))
@@ -327,6 +328,7 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
 		ExecInitExpr((Expr *) node->join.plan.qual,
 					 (PlanState *) nlstate);
 	nlstate->js.jointype = node->join.jointype;
+	nlstate->js.unique_inner = node->join.unique_inner;
 	nlstate->js.joinqual = (List *)
 		ExecInitExpr((Expr *) node->join.joinqual,
 					 (PlanState *) nlstate);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a737d7d..1ea346a 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -639,6 +639,7 @@ CopyJoinFields(const Join *from, Join *newnode)
 	CopyPlanFields((const Plan *) from, (Plan *) newnode);
 
 	COPY_SCALAR_FIELD(jointype);
+	COPY_SCALAR_FIELD(unique_inner);
 	COPY_NODE_FIELD(joinqual);
 }
 
@@ -1938,6 +1939,7 @@ _copySpecialJoinInfo(const SpecialJoinInfo *from)
 	COPY_SCALAR_FIELD(jointype);
 	COPY_SCALAR_FIELD(lhs_strict);
 	COPY_SCALAR_FIELD(delay_upper_joins);
+	COPY_SCALAR_FIELD(unique_inner);
 	COPY_NODE_FIELD(join_quals);
 
 	return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 14e190a..f6672c6 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -796,6 +796,7 @@ _equalSpecialJoinInfo(const SpecialJoinInfo *a, const SpecialJoinInfo *b)
 	COMPARE_SCALAR_FIELD(jointype);
 	COMPARE_SCALAR_FIELD(lhs_strict);
 	COMPARE_SCALAR_FIELD(delay_upper_joins);
+	COMPARE_SCALAR_FIELD(unique_inner);
 	COMPARE_NODE_FIELD(join_quals);
 
 	return true;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index e99d416..e764bda 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -32,12 +32,33 @@
 #include "utils/lsyscache.h"
 
 /* local functions */
+static bool join_is_unique_join(PlannerInfo *root, SpecialJoinInfo *sjinfo);
 static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 					  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 
+void
+mark_unique_joins(PlannerInfo *root, List *joinlist)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->join_info_list)
+	{
+		SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
+
+		/* it may have already been marked by a previous call. */
+		if (sjinfo->unique_inner)
+			continue;
+
+		/*
+		 * Determine if the inner side of the join can produce at most 1 row
+		 * for any single outer row.
+		 */
+		sjinfo->unique_inner = join_is_unique_join(root, sjinfo);
+	}
+}
 
 /*
  * remove_useless_joins
@@ -91,6 +112,16 @@ restart:
 		root->join_info_list = list_delete_ptr(root->join_info_list, sjinfo);
 
 		/*
+		 * The removal of the join could have caused push down quals to
+		 * have been removed, which could previously have stopped us
+		 * from marking the join as a unique join. We'd better make
+		 * another pass over each join to make sure otherwise we may fail
+		 * to remove other joins which had a join condition on the join
+		 * that we just removed.
+		 */
+		mark_unique_joins(root, joinlist);
+
+		/*
 		 * Restart the scan.  This is necessary to ensure we find all
 		 * removable joins independently of ordering of the join_info_list
 		 * (note that removal of attr_needed bits may make a join appear
@@ -135,36 +166,21 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 	return false;				/* no good for these input relations */
 }
 
-/*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
- *	  it will just duplicate its left input.
- *
- * This is true for a left join for which the join condition cannot match
- * more than one inner-side row.  (There are other possibly interesting
- * cases, but we don't have the infrastructure to prove them.)  We also
- * have to check that the inner side doesn't generate any variables needed
- * above the join.
- */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+join_is_unique_join(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
 	Query	   *subquery = NULL;
 	Relids		joinrelids;
-	List	   *clause_list = NIL;
 	ListCell   *l;
-	int			attroff;
+	List	   *clause_list = NIL;
 
-	/*
-	 * Must be a non-delaying left join to a single baserel, else we aren't
-	 * going to be able to do anything with it.
-	 */
-	if (sjinfo->jointype != JOIN_LEFT ||
-		sjinfo->delay_upper_joins)
+	/* if there's any delayed upper joins then we'll need to punt. */
+	if (sjinfo->delay_upper_joins)
 		return false;
 
+	/* if there's more than 1 relation involved then punt */
 	if (!bms_get_singleton_member(sjinfo->min_righthand, &innerrelid))
 		return false;
 
@@ -174,9 +190,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 		return false;
 
 	/*
-	 * Before we go to the effort of checking whether any innerrel variables
-	 * are needed above the join, make a quick check to eliminate cases in
-	 * which we will surely be unable to prove uniqueness of the innerrel.
+	 * Before we go to the effort of pulling out the join condition's columns,
+	 * make a quick check to eliminate cases in which we will surely be unable
+	 * to prove uniqueness of the innerrel.
 	 */
 	if (innerrel->rtekind == RTE_RELATION)
 	{
@@ -207,53 +223,6 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
 
 	/*
-	 * We can't remove the join if any inner-rel attributes are used above the
-	 * join.
-	 *
-	 * Note that this test only detects use of inner-rel attributes in higher
-	 * join conditions and the target list.  There might be such attributes in
-	 * pushed-down conditions at this join, too.  We check that case below.
-	 *
-	 * As a micro-optimization, it seems better to start with max_attr and
-	 * count down rather than starting with min_attr and counting up, on the
-	 * theory that the system attributes are somewhat less likely to be wanted
-	 * and should be tested last.
-	 */
-	for (attroff = innerrel->max_attr - innerrel->min_attr;
-		 attroff >= 0;
-		 attroff--)
-	{
-		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
-			return false;
-	}
-
-	/*
-	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
-	 * that will be used above the join.  We only need to fail if such a PHV
-	 * actually references some inner-rel attributes; but the correct check
-	 * for that is relatively expensive, so we first check against ph_eval_at,
-	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
-	 * non-lateral references.  Note that if the PHV's syntactic scope is just
-	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
-	 */
-	foreach(l, root->placeholder_list)
-	{
-		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
-
-		if (bms_is_subset(phinfo->ph_needed, joinrelids))
-			continue;			/* PHV is not used above the join */
-		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
-			return false;		/* it references innerrel laterally */
-		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
-			continue;			/* it definitely doesn't reference innerrel */
-		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
-			return false;		/* there isn't any other place to eval PHV */
-		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
-						innerrel->relids))
-			return false;		/* it does reference innerrel */
-	}
-
-	/*
 	 * Search for mergejoinable clauses that constrain the inner rel against
 	 * either the outer rel or a pseudoconstant.  If an operator is
 	 * mergejoinable then it behaves like equality for some btree opclass, so
@@ -368,6 +337,94 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	return false;
 }
 
+/*
+ * join_is_removable
+ *	  Check whether we need not perform this special join at all, because
+ *	  it will just duplicate its left input.
+ *
+ * This is true for a left join for which the join condition cannot match
+ * more than one inner-side row.  (There are other possibly interesting
+ * cases, but we don't have the infrastructure to prove them.)  We also
+ * have to check that the inner side doesn't generate any variables needed
+ * above the join.
+ */
+static bool
+join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+{
+	int			innerrelid;
+	RelOptInfo *innerrel;
+	Relids		joinrelids;
+	int			attroff;
+	ListCell   *l;
+
+	/*
+	 * We can only remove LEFT JOINs where the inner side is unique on the
+	 * join condition.
+	 */
+	if (sjinfo->jointype != JOIN_LEFT ||
+		sjinfo->unique_inner == false)
+		return false;
+
+	if (!bms_get_singleton_member(sjinfo->min_righthand, &innerrelid))
+		return false;
+
+	innerrel = find_base_rel(root, innerrelid);
+
+	Assert(innerrel->reloptkind == RELOPT_BASEREL);
+
+	/* Compute the relid set for the join we are considering */
+	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
+
+	/*
+	 * We can't remove the join if any inner-rel attributes are used above the
+	 * join.
+	 *
+	 * Note that this test only detects use of inner-rel attributes in higher
+	 * join conditions and the target list.  There might be such attributes in
+	 * pushed-down conditions at this join, too.  We check that case below.
+	 *
+	 * As a micro-optimization, it seems better to start with max_attr and
+	 * count down rather than starting with min_attr and counting up, on the
+	 * theory that the system attributes are somewhat less likely to be wanted
+	 * and should be tested last.
+	 */
+	for (attroff = innerrel->max_attr - innerrel->min_attr;
+		 attroff >= 0;
+		 attroff--)
+	{
+		if (!bms_is_subset(innerrel->attr_needed[attroff], joinrelids))
+			return false;
+	}
+
+	/*
+	 * Similarly check that the inner rel isn't needed by any PlaceHolderVars
+	 * that will be used above the join.  We only need to fail if such a PHV
+	 * actually references some inner-rel attributes; but the correct check
+	 * for that is relatively expensive, so we first check against ph_eval_at,
+	 * which must mention the inner rel if the PHV uses any inner-rel attrs as
+	 * non-lateral references.  Note that if the PHV's syntactic scope is just
+	 * the inner rel, we can't drop the rel even if the PHV is variable-free.
+	 */
+	foreach(l, root->placeholder_list)
+	{
+		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
+
+		if (bms_is_subset(phinfo->ph_needed, joinrelids))
+			continue;			/* PHV is not used above the join */
+		if (bms_overlap(phinfo->ph_lateral, innerrel->relids))
+			return false;		/* it references innerrel laterally */
+		if (!bms_overlap(phinfo->ph_eval_at, innerrel->relids))
+			continue;			/* it definitely doesn't reference innerrel */
+		if (bms_is_subset(phinfo->ph_eval_at, innerrel->relids))
+			return false;		/* there isn't any other place to eval PHV */
+		if (bms_overlap(pull_varnos((Node *) phinfo->ph_var->phexpr),
+						innerrel->relids))
+			return false;		/* it does reference innerrel */
+	}
+
+	return true;
+}
+
 
 /*
  * Remove the target relid from the planner's data structures, having
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 8f9ae4f..00c746c 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -132,12 +132,12 @@ static BitmapOr *make_bitmap_or(List *bitmapplans);
 static NestLoop *make_nestloop(List *tlist,
 			  List *joinclauses, List *otherclauses, List *nestParams,
 			  Plan *lefttree, Plan *righttree,
-			  JoinType jointype);
+			  JoinType jointype, bool unique_inner);
 static HashJoin *make_hashjoin(List *tlist,
 			  List *joinclauses, List *otherclauses,
 			  List *hashclauses,
 			  Plan *lefttree, Plan *righttree,
-			  JoinType jointype);
+			  JoinType jointype, bool unique_inner);
 static Hash *make_hash(Plan *lefttree,
 		  Oid skewTable,
 		  AttrNumber skewColumn,
@@ -152,7 +152,7 @@ static MergeJoin *make_mergejoin(List *tlist,
 			   int *mergestrategies,
 			   bool *mergenullsfirst,
 			   Plan *lefttree, Plan *righttree,
-			   JoinType jointype);
+			   JoinType jointype, bool unique_inner);
 static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
 		  AttrNumber *sortColIdx, Oid *sortOperators,
 		  Oid *collations, bool *nullsFirst,
@@ -2189,7 +2189,8 @@ create_nestloop_plan(PlannerInfo *root,
 							  nestParams,
 							  outer_plan,
 							  inner_plan,
-							  best_path->jointype);
+							  best_path->jointype,
+							  best_path->unique_inner);
 
 	copy_path_costsize(&join_plan->join.plan, &best_path->path);
 
@@ -2483,7 +2484,8 @@ create_mergejoin_plan(PlannerInfo *root,
 							   mergenullsfirst,
 							   outer_plan,
 							   inner_plan,
-							   best_path->jpath.jointype);
+							   best_path->jpath.jointype,
+							   best_path->jpath.unique_inner);
 
 	/* Costs of sort and material steps are included in path cost already */
 	copy_path_costsize(&join_plan->join.plan, &best_path->jpath.path);
@@ -2609,7 +2611,8 @@ create_hashjoin_plan(PlannerInfo *root,
 							  hashclauses,
 							  outer_plan,
 							  (Plan *) hash_plan,
-							  best_path->jpath.jointype);
+							  best_path->jpath.jointype,
+							  best_path->jpath.unique_inner);
 
 	copy_path_costsize(&join_plan->join.plan, &best_path->jpath.path);
 
@@ -3714,7 +3717,8 @@ make_nestloop(List *tlist,
 			  List *nestParams,
 			  Plan *lefttree,
 			  Plan *righttree,
-			  JoinType jointype)
+			  JoinType jointype,
+			  bool unique_inner)
 {
 	NestLoop   *node = makeNode(NestLoop);
 	Plan	   *plan = &node->join.plan;
@@ -3725,6 +3729,7 @@ make_nestloop(List *tlist,
 	plan->lefttree = lefttree;
 	plan->righttree = righttree;
 	node->join.jointype = jointype;
+	node->join.unique_inner = unique_inner;
 	node->join.joinqual = joinclauses;
 	node->nestParams = nestParams;
 
@@ -3738,7 +3743,8 @@ make_hashjoin(List *tlist,
 			  List *hashclauses,
 			  Plan *lefttree,
 			  Plan *righttree,
-			  JoinType jointype)
+			  JoinType jointype,
+			  bool unique_inner)
 {
 	HashJoin   *node = makeNode(HashJoin);
 	Plan	   *plan = &node->join.plan;
@@ -3750,6 +3756,7 @@ make_hashjoin(List *tlist,
 	plan->righttree = righttree;
 	node->hashclauses = hashclauses;
 	node->join.jointype = jointype;
+	node->join.unique_inner = unique_inner;
 	node->join.joinqual = joinclauses;
 
 	return node;
@@ -3798,7 +3805,8 @@ make_mergejoin(List *tlist,
 			   bool *mergenullsfirst,
 			   Plan *lefttree,
 			   Plan *righttree,
-			   JoinType jointype)
+			   JoinType jointype,
+			   bool unique_inner)
 {
 	MergeJoin  *node = makeNode(MergeJoin);
 	Plan	   *plan = &node->join.plan;
@@ -3814,6 +3822,7 @@ make_mergejoin(List *tlist,
 	node->mergeStrategies = mergestrategies;
 	node->mergeNullsFirst = mergenullsfirst;
 	node->join.jointype = jointype;
+	node->join.unique_inner = unique_inner;
 	node->join.joinqual = joinclauses;
 
 	return node;
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 93484a0..b794d8e 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -175,6 +175,12 @@ query_planner(PlannerInfo *root, List *tlist,
 	fix_placeholder_input_needed_levels(root);
 
 	/*
+	 * Analyze all joins and mark which ones can produce at most 1 row
+	 * for each outer row.
+	 */
+	mark_unique_joins(root, joinlist);
+
+	/*
 	 * Remove any useless outer joins.  Ideally this would be done during
 	 * jointree preprocessing, but the necessary information isn't available
 	 * until we've built baserel data structures and classified qual clauses.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 319e8b2..e6bb11f 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1722,6 +1722,7 @@ create_nestloop_path(PlannerInfo *root,
 								  &restrict_clauses);
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->jointype = jointype;
+	pathnode->unique_inner = sjinfo->unique_inner;
 	pathnode->outerjoinpath = outer_path;
 	pathnode->innerjoinpath = inner_path;
 	pathnode->joinrestrictinfo = restrict_clauses;
@@ -1779,6 +1780,7 @@ create_mergejoin_path(PlannerInfo *root,
 								  &restrict_clauses);
 	pathnode->jpath.path.pathkeys = pathkeys;
 	pathnode->jpath.jointype = jointype;
+	pathnode->jpath.unique_inner = sjinfo->unique_inner;
 	pathnode->jpath.outerjoinpath = outer_path;
 	pathnode->jpath.innerjoinpath = inner_path;
 	pathnode->jpath.joinrestrictinfo = restrict_clauses;
@@ -1847,6 +1849,7 @@ create_hashjoin_path(PlannerInfo *root,
 	 */
 	pathnode->jpath.path.pathkeys = NIL;
 	pathnode->jpath.jointype = jointype;
+	pathnode->jpath.unique_inner = sjinfo->unique_inner;
 	pathnode->jpath.outerjoinpath = outer_path;
 	pathnode->jpath.innerjoinpath = inner_path;
 	pathnode->jpath.joinrestrictinfo = restrict_clauses;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41b13b2..c0192b7 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1560,6 +1560,7 @@ typedef struct JoinState
 {
 	PlanState	ps;
 	JoinType	jointype;
+	bool		unique_inner;	/* Inner side will match 0 or 1 outer rows */
 	List	   *joinqual;		/* JOIN quals (in addition to ps.qual) */
 } JoinState;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 48203a0..61a1898 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -524,9 +524,12 @@ typedef struct CustomScan
 /* ----------------
  *		Join node
  *
- * jointype:	rule for joining tuples from left and right subtrees
- * joinqual:	qual conditions that came from JOIN/ON or JOIN/USING
- *				(plan.qual contains conditions that came from WHERE)
+ * jointype:		rule for joining tuples from left and right subtrees
+ * unique_inner:	true if the planner discovered that each inner tuple
+ *					can only match at most 1 outer tuple. Proofs include
+ *					UNIQUE indexes and GROUP BY/DISTINCT clauses etc.
+ * joinqual:		qual conditions that came from JOIN/ON or JOIN/USING
+ *					(plan.qual contains conditions that came from WHERE)
  *
  * When jointype is INNER, joinqual and plan.qual are semantically
  * interchangeable.  For OUTER jointypes, the two are *not* interchangeable;
@@ -541,6 +544,7 @@ typedef struct Join
 {
 	Plan		plan;
 	JoinType	jointype;
+	bool		unique_inner;	/* Inner side will match 0 or 1 outer rows */
 	List	   *joinqual;		/* JOIN quals (in addition to plan.qual) */
 } Join;
 
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 7116496..da28b46 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1028,6 +1028,8 @@ typedef struct JoinPath
 
 	JoinType	jointype;
 
+	bool		unique_inner;	/* inner side of join is unique */
+
 	Path	   *outerjoinpath;	/* path for the outer side of the join */
 	Path	   *innerjoinpath;	/* path for the inner side of the join */
 
@@ -1404,6 +1406,7 @@ typedef struct SpecialJoinInfo
 	JoinType	jointype;		/* always INNER, LEFT, FULL, SEMI, or ANTI */
 	bool		lhs_strict;		/* joinclause is strict for some LHS rel */
 	bool		delay_upper_joins;		/* can't commute with upper RHS */
+	bool		unique_inner;	/* inner side is unique on join condition */
 	List	   *join_quals;		/* join quals, in implicit-AND list format */
 } SpecialJoinInfo;
 
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 3fdc2cb..c4710de 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -121,6 +121,7 @@ extern RestrictInfo *build_implied_join_equality(Oid opno,
 /*
  * prototypes for plan/analyzejoins.c
  */
+extern void mark_unique_joins(PlannerInfo *root, List *joinlist);
 extern List *remove_useless_joins(PlannerInfo *root, List *joinlist);
 extern bool query_supports_distinctness(Query *query);
 extern bool query_is_distinct_for(Query *query, List *colnos, List *opids);


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic