[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgis-devel
Subject:    Re: [postgis-devel] cost of some functions (e.g. st_within) might be too high after 3ddc11483b
From:       Paul Ramsey <pramsey () cleverelephant ! ca>
Date:       2022-10-20 21:32:54
Message-ID: 8A7E8FB6-97E6-46C3-A35E-F3935D219B61 () cleverelephant ! ca
[Download RAW message or body]

Appreciated. I'm not sure what tests to suggest however, it's quite hard to pin down \
specific workloads. Like, "would like to get parallel plans for CPU bound things", \
but... we don't have a fixed test set for that, and even if we did it would just be \
onen workload. P

> On Oct 18, 2022, at 7:34 PM, Bruce Rindahl <bruce.rindahl@gmail.com> wrote:
> 
> I am willing to set up a test environment on a arm64 machine to run tests for this. \
> I did something like this to test the advantage of a distributed database vs stand \
> alone.  If you could set up a matrix on the requirements (function, PG/PostGIS \
> versions, input data, etc.) I could play with it.  It would be a raspberry pi 4, 64 \
> bit, and 4 or 8GB memory.  Doing it via Jenkins would be also possible.  Advantage \
> on the pi is it is slow enough that improvements should be obvious. 
> On Tue, Oct 18, 2022, 3:49 PM Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> Going to track this here https://trac.osgeo.org/postgis/ticket/5259
> 
> On Tue, Oct 4, 2022 at 7:26 PM Regina Obe <lr@pcorp.us> wrote:
> > 
> > > On Mon, Oct 3, 2022 at 9:59 AM Tomas Vondra
> > > <tomas.vondra@2ndquadrant.com> wrote:
> > > 
> > > > 
> > > > On 9/29/22 23:11, Paul Ramsey wrote:
> > > > > So, historically, we have had these these functions like
> > > > > ST_Contains() that have an implicit index operation within them, so
> > > > > you could do
> > > > > 
> > > > > SELECT * FROM foo WHERE ST_Contains(geom, ...)
> > > > > 
> > > > > and you'd get an efficient plan, even without adding the magical
> > > > > index operator (&&). This was being friendly to our users and
> > > > > following the standard for spatial SQL to the letter.
> > > > > 
> > > > > This was accomplished with a SQL function ST_Contains that wrapped
> > > > > up the C function _ST_Contains() and the magic operator. This trick
> > > > > worked for a long time indeed (and was, in fact, the only solution)
> > > > > but it did force us to not "correctly" cost our (really expensive,
> > > > > compared to other database functions) spatial functions. When we
> > > > > pushed up costs for functions, the planner began to get angry.
> > > > 
> > > > > /*
> > > > > * High costs can only be used for PostGIS 3/PgSQL 12
> > > > > * where the support functions have been used in
> > > > > * place of index SQL inlining.
> > > > > * See https://trac.osgeo.org/postgis/ticket/3675
> > > > > * for sideffects of costing inlined SQL.
> > > > > */
> > > > > #if POSTGIS_PGSQL_VERSION >= 120
> > > > > #define _COST_DEFAULT COST 1
> > > > > #define _COST_LOW COST 50
> > > > > #define _COST_MEDIUM COST 500
> > > > > #define _COST_HIGH COST 10000
> > > > > #else
> > > > > #define _COST_DEFAULT COST 1
> > > > > #define _COST_LOW COST 1
> > > > > #define _COST_MEDIUM COST 10
> > > > > #define _COST_HIGH COST 10
> > > > > #endif
> > > > > 
> > > > > With PgSQL 12, we were no longer in-lining, and we could bump up
> > > > > those costs without any apparently downside, with the upside that we
> > > > > even got more parallel plans for things like joins and so on. So we
> > > > > did that.
> > > > > 
> > > > > Notwithstanding that there were some attempts to get metrics on the
> > > > > cost of some of our functions, we didn't really do anything
> > > > > generating true metrics behind these numbers, except in an
> > > > > order-of-magnitude sense, sort of. So there's no moral reason not to
> > > > > jimmy with them. What we lack is any sort of way of cross-comparing
> > > > > the upsides and downsides of various arbitrary numbers. Maybe we
> > > > > should but HIGH_COST in half? Why not? I don't know? Why? Because in
> > > > > this case it seems to fit better.
> > > > > 
> > > > 
> > > > Maybe. I agree our costing model is not really exact science, so maybe
> > > > tweaking it a bit is the right thing to do. OTOH once the costs get
> > > > into production it's generally harder because it may change behavior
> > > > (even if the negative cases are rare).
> > > > 
> > > > > Since the actual cost of these high cost functions is in fact
> > > > > incredibly variable (depending on size of inputs) there's no actual
> > > > > right answer. Doing some extra work on function level costing could
> > > > > potentially make things less variable, but we'd still have an opaque
> > > > > function cost multiplier in the equation.
> > > > > 
> > > > 
> > > > Yeah.
> > > > 
> > > > > I'm tempted to just bump the value down, and see what happens,
> > > > > navigating by sonar, as it were, since we lack any other basis for
> > > > > these numbers at the moment.
> > > 
> > > Any objections to cutting the HIGH value in half? Honestly we could cut
> > all
> > > values in half. In main? In 3.3? 3.2? comments?
> > > 
> > > P
> > > 
> > I'd feel more comfortable if we did it only for 3.4 and perhaps do a
> > backport later.  I'd like to create some queries in my performance suite to
> > try to exercise some of these and would be easier if I can compare 3.4 and
> > 3.3, which are largely still the same.
> > 
> > I would also half the MEDIUM cost one.
> > 
> > 
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel@lists.osgeo.org
> > https://lists.osgeo.org/mailman/listinfo/postgis-devel
> _______________________________________________
> postgis-devel mailing list
> postgis-devel@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
> _______________________________________________
> postgis-devel mailing list
> postgis-devel@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel

_______________________________________________
postgis-devel mailing list
postgis-devel@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/postgis-devel


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic