List:       postgresql-general
Subject:    Re: [GENERAL] why hash on the primary key?
From:       "Robert Haas" <robertmhaas () gmail ! com>
Date:       2008-11-29 16:42:33
Message-ID: 603c8f070811290842r214628a8h5f1503b44a07c037 () mail ! gmail ! com

>>> Well, that just says your cost parameters need a bit of adjustment
>>> if you'd like the planner to get the crossover point exactly right.
>
>> Any sense of which ones might be worth fiddling with?
>
> random_page_cost, effective_cache_size, maybe the cpu_xxx parameters.

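For anyone who wants to experiment with those, the per-session fiddling
I've been doing looks roughly like this (the table and column names below
are just placeholders, and the values are a starting point rather than a
recommendation):

    -- make random I/O look cheaper and describe the OS cache to the planner
    SET random_page_cost = 2.0;          -- default is 4.0
    SET effective_cache_size = '2GB';    -- size this to the actual cache
    -- then compare the chosen plan and timings against the defaults
    EXPLAIN ANALYZE
    SELECT f.col1, d.col2
      FROM fact f
      JOIN dim d ON d.id = f.dim_id
     WHERE f.id = 12345;
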
I fiddled with this some more on a more complex query with more than
20 joins.  It appears that the culprit is from_collapse_limit.  If I
raise from_collapse_limit, most of the hash joins turn into Nested
Loops or Nested Loop Left Joins with Index Scans.  The estimated cost
drops to about 20% of the original plan's, but planning takes so much
longer that the total runtime ends up about 60% higher.  This is
really the pits.
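
For the archives, the experiment was nothing more exotic than this, with
our real generated query in place of the placeholder (the table names
here are made up):

    -- the default is 8; beyond that the planner stops folding sub-SELECTs
    -- into the parent query's join search, which constrains the join order
    SET from_collapse_limit = 30;
    EXPLAIN ANALYZE
    SELECT t1.a, t2.b, t3.c
      FROM t1
      JOIN t2 ON t2.t1_id = t1.id
      JOIN t3 ON t3.t2_id = t2.id
      -- ...plus the rest of the 20+ joins in the real query
     WHERE t1.id = 12345;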

I had hoped that the work Simon Riggs was doing on join removal would
hit 8.4, but that doesn't seem likely at this point.  The problem is
that the application is a web app where the user can select which
columns they want to see.  Most of the time only ~6-8 of the 40+
available columns are selected, and the joins needed to generate the
unselected columns could be tossed (which would hopefully both avoid
unnecessary join steps and speed up planning of the remaining joins).
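
Schematically, the sort of join that could be thrown away looks like
this (placeholder schema): the user didn't ask for any customer columns,
and since customer.id is unique the left join can't change the row
count, so join removal could drop it entirely:

    -- generated query when only order columns are selected; the join to
    -- customer contributes nothing to the output and cannot duplicate or
    -- eliminate rows, so it is a candidate for removal
    SELECT o.order_date, o.total
      FROM orders o
      LEFT JOIN customer c ON c.id = o.customer_id
     WHERE o.order_date > '2008-01-01';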

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general