[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgis-devel
Subject:    Re: [postgis-devel] Query Optimizer on outdb rasters
From:       Bborie Park <dustymugs () gmail ! com>
Date:       2015-06-19 1:48:31
Message-ID: CAKVfRvH54G-YgzEjMO-t8KQKArt11WAeFni7p_EKOkXdNf+4gg () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


Heh. Super late. But, I believe the problem has something to do with the
out-db and how it plays with PostgreSQL's cache, or more specifically, the
lack there of. In-db, once warm, is essentially using some cache
(PostgreSQL or the OS).

Out-db is a constant hit. How is your out-db stored? Local store? Shared
file-system like NFS? If shared, using some caching?

-bborie

On Tue, May 19, 2015 at 1:53 PM, David Haynes II <dahaynes@umn.edu> wrote:

> Hello,
> 
> I have a question about the query optimizer and its performance on spatial
> datasets, specifically rasters. My use case is rather unique, the
> application that I am developing allows researchers to request
> summarizations of various geographic boundaries around the world. Therefore
> our raster datasets are global. We are conducting some benchmarks for our
> system and we noticed something unexpected. Hopefully we are doing
> something wrong.
> 
> The query is the same except the first is run on a raster (46gigs) in
> outdb and the second is the same raster (46gigs) stored indb.
> The raster is multibanded (13), with each band representing one entire
> MODIS global scene. A single year of MODIS is approximately 3.6 gigs.
> 
> The outdb is being out performed by indb, because the query optimizer gets
> smarter. In the explain syntax you can all see all the extra pieces that
> are brought in with oudb.
> 
> with poly as
> ( SELECT gid, label as name, ST_Transform(geom, 6842) as geom  FROM
> us_counties )
> , rast_select as
> ( SELECT r.rid as id, s.name , ST_CLIP(r.rast, s.geom) as rast from
> rasters.glc2000 as r inner join poly as s on ST_Intersects(r.rast, s.geom) )
> select r.id, r.name, ST_Count(r.rast, 1, True)
> 
> 
> QUERY PLAN With Outdb
> 
> --------------------------------------------------------------------------------------------------
>  Sort   (cost=93911.29..93926.80 rows=6204 width=254)
> Sort Key: r.id, r.name
> CTE poly
> ->  Seq Scan on us_counties  (cost=0.00..112.86 rows=3109 width=62247)
> CTE rast_select
> ->  Nested Loop  (cost=0.28..76131.41 rows=62033 width=1086)
> ->  CTE Scan on poly s  (cost=0.00..62.18 rows=3109 width=250)
> ->  Index Scan using modis_rast_gist on modis r_1
> (cost=0.28..24.40 rows=2 width=836)
> Index Cond: ((rast)::geometry && s.geom)
> Filter: _st_intersects(s.geom, rast, NULL::integer)
> ->  HashAggregate  (cost=17214.16..17276.20 rows=6204 width=254)
> ->  CTE Scan on rast_select r  (cost=0.00..1240.66 rows=62033
> width=254)
> 
> QUERY PLAN With Indb
> 
> -------------------------------------------------------------------------------------------------------------
>  Sort   (cost=69547.29..69562.80 rows=6204 width=254)
> Sort Key: r.id, r.name
> CTE poly
> ->  Seq Scan on us_counties  (cost=0.00..112.86 rows=3109 width=62247)
> CTE rast_select
> ->  Nested Loop  (cost=0.28..51767.41 rows=62033 width=272)
> ->  CTE Scan on poly s  (cost=0.00..62.18 rows=3109 width=250)
> ->  Index Scan using modis_noout_rast_gist on modis_noout r_1
> (cost=0.28..16.56 rows=2 width=22)
> Index Cond: ((rast)::geometry && s.geom)
> Filter: _st_intersects(s.geom, rast, NULL::integer)
> ->  HashAggregate  (cost=17214.16..17276.20 rows=6204 width=254)
> ->  CTE Scan on rast_select r  (cost=0.00..1240.66 rows=62033
> width=254)
> --
> David Haynes, Ph.D.
> Research Associate Terra Populus
> Minnesota Population Center
> www.terrapop.org
> 
> _______________________________________________
> postgis-devel mailing list
> postgis-devel@lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
> 


[Attachment #5 (text/html)]

<div dir="ltr">Heh. Super late. But, I believe the problem has something to do with \
the out-db and how it plays with PostgreSQL&#39;s cache, or more specifically, the \
lack there of. In-db, once warm, is essentially using some cache (PostgreSQL or the \
OS).<div><br></div><div>Out-db is a constant hit. How is your out-db stored? Local \
store? Shared file-system like NFS? If shared, using some \
caching?</div><div><br></div><div>-bborie</div></div><div \
class="gmail_extra"><br><div class="gmail_quote">On Tue, May 19, 2015 at 1:53 PM, \
David Haynes II <span dir="ltr">&lt;<a href="mailto:dahaynes@umn.edu" \
target="_blank">dahaynes@umn.edu</a>&gt;</span> wrote:<br><blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex"><div dir="ltr"><div><div>Hello,</div><div><br></div><div>I \
have a question about the query optimizer and its performance on spatial datasets, \
specifically rasters. My use case is rather unique, the application that I am \
developing allows researchers to request summarizations of various geographic \
boundaries around the world. Therefore our raster datasets are global. We are \
conducting some benchmarks for our system and we noticed something unexpected. \
Hopefully we are doing something wrong.</div><div><br></div><div>The query is the \
same except the first is run on a raster (46gigs) in outdb and the second is the same \
raster (46gigs) stored indb.</div><div>The raster is multibanded (13), with each band \
representing one entire MODIS global scene. A single year of MODIS is approximately \
3.6 gigs.</div><div><br></div><div>The outdb is being out performed by indb, because \
the query optimizer gets smarter. In the explain syntax you can all see all the extra \
pieces that are brought in with oudb.  </div><div><br></div><div>with poly as  \
</div><div>( SELECT gid, label as name, ST_Transform(geom, 6842) as geom   FROM \
us_counties )</div><div>, rast_select as</div><div>( SELECT r.rid as id, <a \
href="http://s.name" target="_blank">s.name</a> , ST_CLIP(r.rast, s.geom) as rast \
from rasters.glc2000 as r inner join poly as s on ST_Intersects(r.rast, s.geom) \
)</div><div>select <a href="http://r.id" target="_blank">r.id</a>, <a \
href="http://r.name" target="_blank">r.name</a>, ST_Count(r.rast, 1, \
True)</div><div><span style="white-space:pre-wrap">										</span>     \
</div><div><span style="white-space:pre-wrap">										</span>     </div><div>QUERY \
PLAN With Outdb                                                                  \
<span style="white-space:pre-wrap">	</span></div><div>--------------------------------------------------------------------------------------------------<span \
style="white-space:pre-wrap">	</span></div><div>Sort    (cost=93911.29..93926.80 \
rows=6204 width=254)<span style="white-space:pre-wrap">	</span></div><div>   Sort \
Key: <a href="http://r.id" target="_blank">r.id</a>, <a href="http://r.name" \
target="_blank">r.name</a><span style="white-space:pre-wrap">	</span></div><div>   \
CTE poly<span style="white-space:pre-wrap">	</span></div><div>      -&gt;   Seq Scan \
on us_counties   (cost=0.00..112.86 rows=3109 width=62247)<span \
style="white-space:pre-wrap">	</span></div><div>   CTE rast_select<span \
style="white-space:pre-wrap">	</span></div><div>      -&gt;   Nested Loop   \
(cost=0.28..76131.41 rows=62033 width=1086)<span \
style="white-space:pre-wrap">	</span></div><div>               -&gt;   CTE Scan on \
poly s   (cost=0.00..62.18 rows=3109 width=250)<span \
style="white-space:pre-wrap">	</span></div><div>               -&gt;   Index Scan \
using modis_rast_gist on modis r_1   (cost=0.28..24.40 rows=2 width=836)<span \
style="white-space:pre-wrap">	</span></div><div>                        Index Cond: \
((rast)::geometry &amp;&amp; s.geom)<span \
style="white-space:pre-wrap">	</span></div><div>                        Filter: \
_st_intersects(s.geom, rast, NULL::integer)<span \
style="white-space:pre-wrap">	</span></div><div>   -&gt;   HashAggregate   \
(cost=17214.16..17276.20 rows=6204 width=254)<span \
style="white-space:pre-wrap">	</span></div><div>            -&gt;   CTE Scan on \
rast_select r   (cost=0.00..1240.66 rows=62033 width=254)<span \
style="white-space:pre-wrap">		</span></div><div><br></div><div>QUERY PLAN With Indb  \
<span style="white-space:pre-wrap">	</span></div><div>-------------------------------- \
-----------------------------------------------------------------------------<span \
style="white-space:pre-wrap">	</span></div><div>Sort    (cost=69547.29..69562.80 \
rows=6204 width=254)<span style="white-space:pre-wrap">	</span></div><div>   Sort \
Key: <a href="http://r.id" target="_blank">r.id</a>, <a href="http://r.name" \
target="_blank">r.name</a><span style="white-space:pre-wrap">	</span></div><div>   \
CTE poly<span style="white-space:pre-wrap">	</span></div><div>      -&gt;   Seq Scan \
on us_counties   (cost=0.00..112.86 rows=3109 width=62247)<span \
style="white-space:pre-wrap">	</span></div><div>   CTE rast_select<span \
style="white-space:pre-wrap">	</span></div><div>      -&gt;   Nested Loop   \
(cost=0.28..51767.41 rows=62033 width=272)<span \
style="white-space:pre-wrap">	</span></div><div>               -&gt;   CTE Scan on \
poly s   (cost=0.00..62.18 rows=3109 width=250)<span \
style="white-space:pre-wrap">	</span></div><div>               -&gt;   Index Scan \
using modis_noout_rast_gist on modis_noout r_1   (cost=0.28..16.56 rows=2 \
width=22)<span style="white-space:pre-wrap">	</span></div><div>                       \
Index Cond: ((rast)::geometry &amp;&amp; s.geom)<span \
style="white-space:pre-wrap">	</span></div><div>                        Filter: \
_st_intersects(s.geom, rast, NULL::integer)<span \
style="white-space:pre-wrap">	</span></div><div>   -&gt;   HashAggregate   \
(cost=17214.16..17276.20 rows=6204 width=254)<span \
style="white-space:pre-wrap">	</span></div><div>            -&gt;   CTE Scan on \
rast_select r   (cost=0.00..1240.66 rows=62033 width=254)</div></div><span \
class="HOEnZb"><font color="#888888">-- <br><div><div dir="ltr">David Haynes, \
Ph.D.<div>Research Associate Terra Populus</div><div>Minnesota Population \
Center</div><div><a href="http://www.terrapop.org" \
target="_blank">www.terrapop.org</a></div></div></div> </font></span></div>
<br>_______________________________________________<br>
postgis-devel mailing list<br>
<a href="mailto:postgis-devel@lists.osgeo.org">postgis-devel@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel" \
rel="noreferrer" target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel</a><br></blockquote></div><br></div>




_______________________________________________
postgis-devel mailing list
postgis-devel@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic