[prev in list] [next in list] [prev in thread] [next in thread] 

List:       pgsql-patches
Subject:    [PATCHES] WIP Join Removal
From:       Simon Riggs <simon () 2ndQuadrant ! com>
Date:       2008-08-31 9:52:52
Message-ID: 1220176372.4371.118.camel () ebony ! 2ndQuadrant
[Download RAW message or body]

As discussed on 26 June, "Join Removal/Vertical Partitioning", here's a
patch to remove joins in certain circumstances.

Tested and blind reviewed, but this is complex and subtle enough I
welcome and expect your comments on corner cases and missed
complications. (Lord knows, I've been down a few blind alleys writing it
to date...)

Patch works, but there's a bit I haven't finished yet - checking unique
indexes. So patch is marked WIP for now. Depending upon how long
commitfest lasts I may have a fully working version. (Yes, I know its
only 1 hours work, but haven't worked out how yet...)

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

["join_removal.v4.patch" (join_removal.v4.patch)]

Index: src/backend/optimizer/path/joinrels.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/optimizer/path/joinrels.c,v
retrieving revision 1.94
diff -c -r1.94 joinrels.c
*** src/backend/optimizer/path/joinrels.c	17 Aug 2008 19:40:11 -0000	1.94
--- src/backend/optimizer/path/joinrels.c	31 Aug 2008 09:41:53 -0000
***************
*** 467,472 ****
--- 467,568 ----
  	return true;
  }
  
+ /*
+  * consider_join_removal_path
+  *	   Determine whether a proposed join is removable, and if this results
+  *	   in a cost reduction for this join, remove the join.
+  *
+  *	   Callers *must* have already determined that the relation
+  *	   is on the non-outer side of an outer join.
+  *
+  *	   Note that removal is correct *even* if there are restrictions on the
+  *	   checkrel. That is because missing rows don't alter the overall result,
+  *	   so removing a few more makes no difference at all.
+  *	   XXX should we check that they are all immutable before we ignore them?
+  */
+ static void
+ consider_join_removal_path(RelOptInfo *keeprel, RelOptInfo *checkrel, RelOptInfo *joinrel)
+ {
+ 	bool	removable = false;
+ 	int		attrno;
+ 	int		minattr = 0;
+ 
+ 	if (checkrel->rtekind != RTE_RELATION)
+ 		return;
+ 
+ 	/*
+ 	 * Check that the query targetlist does not contain any non-junk
+ 	 * target entries for the relation. If it does, we cannot remove join.
+ 	 */
+ 	if (checkrel->min_attr > 0)
+ 		minattr = checkrel->min_attr;
+ 
+ 	for (attrno = minattr; attrno < checkrel->max_attr; attrno++)
+ 	{
+ 		if (!bms_is_empty(checkrel->attr_needed[attrno]))
+ 			return;
+ 	}
+ 
+ 	/*
+ 	 * Check that each row of keeprel joins to exactly one row of checkrel. 
+ 	 * So we check whether the columns of the join condition exactly match
+ 	 * a unique index on checkrel. It must be an *exact* match, no more, 
+ 	 * no less. If less, we would fail to prove uniqueness; if more we would 
+ 	 * effectively have an extra restriction condition that must be applied.
+ 	 */
+ 	if (checkrel->indexlist)
+ 	{
+ 		ListCell   *l;
+ 
+ 		foreach(l, checkrel->indexlist)
+ 		{
+ 			IndexOptInfo *info = lfirst(l);
+ 
+ 			/*
+ 			 * Stop searching as soon as we find a matching unique index.
+ 			 * We don't care if the index allows nulls, we only care
+ 			 * if the join cannot return more than one row for keeprel.
+ 			 *
+ 			 * XXX Seems not worth searching partial indexes because those
+ 			 * are only usable with a appropriate restriction, so the
+ 			 * only way they could ever be usable is if there was a
+ 			 * restriction that exactly matched the index predicate,
+ 			 * which is possible but generally unlikely.
+ 			 *
+ 			 * XXX Expressional indexes would only be useful if the 
+ 			 * join condition contained that expression: also unlikely.
+ 			 */
+ 			if (info->unique && !info->indexprs && !info->indpred)
+ 			{
+ 				removable = true;
+ 
+ 				/* check joinrel->joininfo 
+ 				 * but also somehow consider equivalence classes
+ 				 * if (joinrel->has_eclass_joins)
+ 				 */
+ 
+ 				/* if matches then return true */
+ 			}
+ 		}
+ 	}
+ 
+ 	/*
+ 	 * We can now remove join by pulling up child plan from the keeprel.
+ 	 * This needs to be done considering costs, since its possible for
+ 	 * a nested inner indexscan plan to be cheaper. So it isn't
+ 	 * always desirable to remove the join.
+ 	 *
+ 	 * XXX we keep resjunk attributes in the targetlist of keeprel
+ 	 * for now, but these seem to be potentially removable also
+ 	 */
+ 	if (removable && 
+ 		joinrel->cheapest_total_path < keeprel->cheapest_total_path)
+ 	{
+ 		elog(LOG, "join removed");
+ 		joinrel->pathlist = keeprel->pathlist;
+ 		joinrel->joininfo = keeprel->baserestrictinfo;
+ 	}
+ }
  
  /*
   * make_join_rel
***************
*** 597,602 ****
--- 693,699 ----
  			add_paths_to_joinrel(root, joinrel, rel2, rel1,
  								 JOIN_RIGHT, sjinfo,
  								 restrictlist);
+ 			consider_join_removal_path(rel1, rel2, joinrel);
  			break;
  		case JOIN_FULL:
  			if (is_dummy_rel(rel1) && is_dummy_rel(rel2))


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic