[prev in list] [next in list] [prev in thread] [next in thread] 

List:       pgsql-performance
Subject:    Re: creating hash indexes
From:       Peter Geoghegan <pg () bowt ! ie>
Date:       2022-12-14 20:28:47
Message-ID: CAH2-WzkGNid0VNCL7Sq+=F6fnzWzvDeuvQLi1MKpJfgpzwoe8Q () mail ! gmail ! com
[Download RAW message or body]

On Wed, Dec 14, 2022 at 12:03 PM Rick Otten <rottenwindfish@gmail.com> wrot=
e:
> Assuming I can live with the slower inserts, is there any parameter in pa=
rticular I can tweak that would make the time it takes to create the hash i=
ndex closer to the btree index creation time?  In particular if I wanted to=
 try this on a several billion row table in a busy database?

No. B-Tree index builds are parallelized, and are far better optimized
in general.

> -  As long as the index fits in memory, varchar btree isn't really that m=
uch slower in postgresql 14 (the way it was a few years ago), so we'll prob=
ably just live with that for the forseeable future given the complexity of =
changing things at the moment.

The other things to consider are 1.) the index size after retail
inserts, 2.) the index size following some number of updates and
deletes.

Even if you just had plain inserts for your production workload, the
picture will not match your test case (which I gather just looked at
the index size after a CREATE INDEX ran). I think that B-Tree indexes
will still come out ahead if you take this growth into account, and by
quite a bit, but probably not due to any effect that your existing test cas=
e
exercises.

B-Tree indexes are good at accommodating unpredictable growth, without
ever getting terrible performance on any metric of interest. So it's
not just that they tend to have better performance on average than
hash indexes (though they do); it's that they have much more
*predictable* performance characteristics as conditions change.

--
Peter Geoghegan


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic