[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgis-users
Subject:    Re: [postgis-users] Changed intersection test speed after PG 11 -> 12 and PostGIS 2.5 -> 3 upgrade
From:       Tom Kazimiers <tom () voodoo-arts ! net>
Date:       2020-02-04 21:01:45
Message-ID: 20200204210145.nz45w6f5wrzk4ute () dewberry ! localdomain
[Download RAW message or body]

> > Thanks for the report.  I've ticketed as a bug - 
> > https://trac.osgeo.org/postgis/ticket/4635

Thanks for filing the ticket Regina!

On Tue, Feb 04, 2020 at 11:49:44AM -0800, Paul Ramsey wrote:
> OK, cleaning this out a little.
> Ticket for reference https://trac.osgeo.org/postgis/ticket/4635#comment:2

"Good" to know that you seem to be able to reproduce this slow-down.

> First, drop all the rest of your testing SQL and just test the contents 
> of the bb_edge CTE. That seems to be sufficient, and it's the only 
> spatial part.

Good point, I should have restricted it to that from beginning.

> Second, for testing 3.0, drop the use of the &&& operator and see what happens with \
> just the ST_DWithin3D function call, which should be throwing an index op in there \
> implicitly. 
> SELECT Count(*) FROM (
> SELECT te.id, te.parent_id
> FROM treenode_edge te
> WHERE ST_3DDWithin(te.edge, ST_MakePolygon(ST_MakeLine(ARRAY[
> ST_MakePoint(471548.0,  290140.0,    160420.0),
> ST_MakePoint(542460.0, 290140.0,    160420.0),
> ST_MakePoint(542460.0, 330140.0, 160420.0),
> ST_MakePoint(471548.0,  330140.0, 160420.0),
> ST_MakePoint(471548.0,  290140.0,    160420.0)]::geometry[])),
> 20.0)
> ) a;
> 
> We are definitely seeing different plans in that CTE.

This is what I get in the test database with 12/3:

  Aggregate  (cost=506.38..506.41 rows=1 width=8) (actual time=1.530..1.530 rows=1 \
loops=1)  Buffers: shared hit=198
    ->  Index Scan using treenode_edge_gix on treenode_edge te  (cost=0.53..506.29 \
rows=33 width=0) (actual time=0.168..1.512 rows=143 loops=1)  Index Cond: (edge &&& \
st_expand('0103000080010000000500000000000000F0C71C410000000070B5114100000000209503410 \
0000000F88D20410000000070B51141000000002095034100000000F88D204100000000702614410000000 \
02095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, \
'20'::double precision))  Filter: ((project_id = 1) AND st_3ddwithin(edge, \
'0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88 \
D20410000000070B51141000000002095034100000000F88D2041000000007026144100000000209503410 \
0000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, \
'20'::double precision))  Buffers: shared hit=198
  Planning Time: 0.654 ms
  Execution Time: 1.583 ms

Dropping the &&& operator still allows the use of the index it seems and 
is noticeably faster. (In my original query, I used both &&& and 
ST_3DDWithin, because I wanted to reduce the number of false positives 
that I would get with only one of them in some densely populated areas.  
The &&& would make sure every result object bounding box intersects with 
the query bounding box and the ST_3DDWithin would ensure I don't get 
edges that are actually only passing by somewhere close, but not 
actually intersecting the field of view. The line/geometry used with 
ST_3DDWithin is in fact a flat cut through the center of the query 
bounding box. Therefore I believe I still need the &&& in production, 
unfortunately.)

This is what I get for the above query with the test database in the 
11/2.5 setup:

  Finalize Aggregate  (cost=10302.14..10302.17 rows=1 width=8) (actual \
time=43.245..43.245 rows=1 loops=1)  Buffers: shared hit=3768
    ->  Gather  (cost=10301.91..10302.14 rows=2 width=8) (actual time=43.125..47.544 \
rows=3 loops=1)  Workers Planned: 2
          Workers Launched: 2
          Buffers: shared hit=3768
          ->  Partial Aggregate  (cost=9301.91..9301.94 rows=1 width=8) (actual \
time=27.190..27.190 rows=1 loops=3)  Buffers: shared hit=3768
                ->  Parallel Bitmap Heap Scan on treenode_edge te  \
(cost=693.46..9299.66 rows=899 width=0) (actual time=4.058..27.175 rows=48 loops=3)  \
Recheck Cond: (edge && \
'0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C \
71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C09503410 \
0000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
  Filter: ((project_id = 1) AND \
('0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F8 \
8D20410000000070B51141000000002095034100000000F88D204100000000702614410000000020950341 \
00000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry \
&& st_expand(edge, '20'::double precision)) AND _st_3ddwithin(edge, \
'0103000080010000000500000000000000F0C71C410000000070B51141000000002095034100000000F88 \
D20410000000070B51141000000002095034100000000F88D2041000000007026144100000000209503410 \
0000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, \
'20'::double precision))  Rows Removed by Filter: 10800
                      Heap Blocks: exact=2171
                      Buffers: shared hit=3768
                      ->  Bitmap Index Scan on treenode_edge_2d_gist  \
(cost=0.00..692.92 rows=32352 width=0) (actual time=5.430..5.430 rows=32543 loops=1)  \
Index Cond: (edge && \
'0103000080010000000500000000000000A0C71C410000000020B51141000000008094034100000000A0C \
71C4100000000C0261441000000008094034100000000208E204100000000C026144100000000C09503410 \
0000000208E20410000000020B5114100000000C095034100000000A0C71C410000000020B511410000000080940341'::geometry)
  Buffers: shared hit=261
  Planning Time: 22.836 ms
  Execution Time: 47.885 ms

> We are definitely seeing different plans in that CTE.

Out of curiosity, what is the plan you are seeing?

> Testing for 12/3:
> 
> 12/3 is picking an index scan on the spatial index, which is turning out to be \
> slower. You can adjust the COST of the ST_3DDWithin down: default cost is 10000.
> Where does the plan change as you adjust it down?
> ALTER FUNCTION ST_3DDWithin COST 9000;

Once I set it to 608 and lower, this is what I get:

  Aggregate  (cost=59.66..59.69 rows=1 width=8) (actual time=1.436..1.436 rows=1 \
loops=1)  Buffers: shared hit=183
    ->  Bitmap Heap Scan on treenode_edge te  (cost=2.18..59.58 rows=33 width=0) \
(actual time=1.055..1.418 rows=143 loops=1)  Filter: ((project_id = 1) AND \
st_3ddwithin(edge, '0103000080010000000500000000000000F0C71C410000000070B5114100000000 \
2095034100000000F88D20410000000070B51141000000002095034100000000F88D204100000000702614 \
41000000002095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, \
'20'::double precision))  Heap Blocks: exact=119
          Buffers: shared hit=183
          ->  Bitmap Index Scan on treenode_edge_gix  (cost=0.00..2.17 rows=19 \
width=0) (actual time=1.027..1.027 rows=143 loops=1)  Index Cond: (edge &&& \
st_expand('0103000080010000000500000000000000F0C71C410000000070B5114100000000209503410 \
0000000F88D20410000000070B51141000000002095034100000000F88D204100000000702614410000000 \
02095034100000000F0C71C410000000070261441000000002095034100000000F0C71C410000000070B511410000000020950341'::geometry, \
'20'::double precision))  Buffers: shared hit=64
  Planning Time: 0.703 ms
  Execution Time: 1.516 ms

The planner doesn't come up with any other plan if I set it to anything 
lower than 608.

> A big change between 2.5 and 3.0 was costing on spatial functions, so 
> seeing different plans is not surprising, especially since the 
> mechanisms for building the plan have changed so much (SQL wrapper 
> functions vs Pg12 support functions). It's possible that with the cost 
> of the function set so high, and actually being seen by the planner 
> now, we're getting an index scan that, in this case, is less efficient 
> than the alternative.

I see, thanks for the explanation. In my the 12/3 version of my 
production database (50 million edges), I can't reproduce the cost based 
plan change from above though. No matter how low I set the function cost 
there, I always get the spatial scan in the plan.

Cheers,
Tom
_______________________________________________
postgis-users mailing list
postgis-users@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/postgis-users


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic