[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgresql-general
Subject:    Re: [GENERAL] small table, huge table, and a join = slow and tough query. cake inside!
From:       "Ed Fialkowski" <edfialk () gmail ! com>
Date:       2008-05-30 15:58:33
Message-ID: f4c7e92f0805300858n5893ca25t1c39950df0e873c1 () mail ! gmail ! com
[Download RAW message or body]

Hey guys, thanks so much for all the help.  I had never seen the HAVING
clause anywhere.  That solved so many problems.

As for speed, I do not have indexing (most of my sql experience so far has
been on fairly 'small' tables, which I guess would be considered tiny for
everyone else).

test=# VACUUM nei_area_val;
VACUUM
test=# ANALYZE nei_area_val;
ANALYZE

and..I think one of those two helped out quite a bit.  Here's an explain
analyze for one of my more complicated queries: (nei_area_val = "huge",
nei_area = "small")

test=# EXPLAIN ANALYZE SELECT nei_area.fips, AsText(nei_area.the_geom) as
fs_text_geom, nei_area.name, nei_area_val.strpollutantcode,
SUM(nei_area_val.dblemissionnumericvalue_ton) FROM nei_area INNER JOIN
nei_area_val ON nei_area.fips = nei_area_val.fips WHERE
nei_area_val.strpollutantcode='CO' AND the_geom &&
SetSRID('BOX3D(-100.000000 40.000000,-90.000000 50.000000)'::box3d, 4269)
and intersects(the_geom, SetSRID('BOX3D(-100.000000 40.000000,-90.000000 5
0.000000)'::box3d, 4269)) GROUP BY nei_area.fips, nei_area.the_geom,
nei_area.name, nei_area_val.strpollutantcode HAVING
SUM(nei_area_val.dblemissionnumericvalue_ton) > 500;

QUERY
PLAN

-------------------------------------------------------------------------------------- \
-------------------------------------------------------------------------------------- \
-------------------------------------------------------------------------------------- \
-------------------------------------------------------------------------------------- \
---------------------------------------------------------------------------------------------------------------------------------------------------
  GroupAggregate  (cost=106998.22..107035.89 rows=39 width=3797) (actual
time=4712.708..5743.313 rows=230 loops=1)
   Filter: (sum(dblemissionnumericvalue_ton) > 500::double precision)
   ->  Sort  (cost=106998.22..107003.49 rows=2108 width=3797) (actual
time=4708.411..5330.771 rows=15679 loops=1)
         Sort Key: nei_area.fips, nei_area.the_geom, nei_area.name,
nei_area_val.strpollutantcode
         ->  Hash Join  (cost=30627.47..103430.84 rows=2108 width=3797)
(actual time=2555.057..3938.329 rows=15679 loops=1)
               Hash Cond: (nei_area_val.fips = nei_area.fips)
               ->  Seq Scan on nei_area_val  (cost=0.00..72346.21
rows=116288 width=25) (actual time=46.964..2446.264 rows=122885 loops=1)
                     Filter: (strpollutantcode = 'CO'::text)
               ->  Hash  (cost=30626.84..30626.84 rows=50 width=3781)
(actual time=1193.834..1193.834 rows=415 loops=1)
                     ->  Seq Scan on nei_area  (cost=0.00..30626.84 rows=50
width=3781) (actual time=1038.950..1187.324 rows=415 loops=1)
                           Filter: ((the_geom &&
'0103000020AD100000010000000500000000000000000059C0000000000000444000000000000059C0000 \
000000000494000000000008056C0000000000000494000000000008056C0000000000000444000000000000059C00000000000004440'::geometry)
 AND intersects(the_geom,
'0103000020AD100000010000000500000000000000000059C0000000000000444000000000000059C0000 \
000000000494000000000008056C0000000000000494000000000008056C0000000000000444000000000000059C00000000000004440'::geometry))
  Total runtime: 5762.061 ms
(12 rows)



5.7 seconds!  I can live with that! So, maybe it was just the vacuum I
needed?

Anyway, I don't have indexing, I'll read about it and try not to screw
anything up when I add it, but seriously thanks so much all of you!

-Ed


[Attachment #3 (text/html)]

Hey guys, thanks so much for all the help.&nbsp; I had never seen the HAVING clause \
anywhere.&nbsp; That solved so many problems.<br><br>As for speed, I do not have \
indexing (most of my sql experience so far has been on fairly &#39;small&#39; tables, \
which I guess would be considered tiny for everyone else).<br> <br>test=# VACUUM \
nei_area_val;<br> VACUUM<br>
test=# ANALYZE nei_area_val;<br>ANALYZE<br><br>and..I think one of those two helped \
out quite a bit.&nbsp; Here&#39;s an explain analyze for one of my more complicated \
queries: (nei_area_val = &quot;huge&quot;, nei_area = &quot;small&quot;)<br> \
<br>test=# EXPLAIN ANALYZE SELECT nei_area.fips, AsText(nei_area.the_geom) as \
fs_text_geom, <a href="http://nei_area.name">nei_area.name</a>, \
nei_area_val.strpollutantcode, SUM(nei_area_val.dblemissionnumericvalue_ton) FROM \
nei_area INNER JOIN nei_area_val ON nei_area.fips = nei_area_val.fips WHERE \
nei_area_val.strpollutantcode=&#39;CO&#39; AND the_geom &amp;&amp; \
SetSRID(&#39;BOX3D(-100.000000 40.000000,-90.000000 50.000000)&#39;::box3d, 4269) and \
intersects(the_geom, SetSRID(&#39;BOX3D(-100.000000 40.000000,-90.000000 5<br> \
0.000000)&#39;::box3d, 4269)) GROUP BY nei_area.fips, nei_area.the_geom, <a \
href="http://nei_area.name">nei_area.name</a>, nei_area_val.strpollutantcode HAVING \
SUM(nei_area_val.dblemissionnumericvalue_ton) &gt; \
500;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
QUERY PLAN&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs \
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n \
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
                <br>
-------------------------------------------------------------------------------------- \
-------------------------------------------------------------------------------------- \
-------------------------------------------------------------------------------------- \
-------------------------------------------------------------------------------------- \
---------------------------------------------------------------------------------------------------------------------------------------------------<br>
 &nbsp;GroupAggregate&nbsp; (cost=106998.22..107035.89 rows=39 width=3797) (actual \
time=4712.708..5743.313 rows=230 loops=1)<br>&nbsp;&nbsp; Filter: \
(sum(dblemissionnumericvalue_ton) &gt; 500::double precision)<br>&nbsp;&nbsp; \
-&gt;&nbsp; Sort&nbsp; (cost=106998.22..107003.49 rows=2108 width=3797) (actual \
time=4708.411..5330.771 rows=15679 loops=1)<br> \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Sort Key: nei_area.fips, \
nei_area.the_geom, <a href="http://nei_area.name">nei_area.name</a>, \
nei_area_val.strpollutantcode<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
-&gt;&nbsp; Hash Join&nbsp; (cost=30627.47..103430.84 rows=2108 width=3797) (actual \
time=2555.057..3938.329 rows=15679 loops=1)<br> \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
Hash Cond: (nei_area_val.fips = \
nei_area.fips)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
-&gt;&nbsp; Seq Scan on nei_area_val&nbsp; (cost=0.00..72346.21 rows=116288 width=25) \
(actual time=46.964..2446.264 rows=122885 \
loops=1)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
Filter: (strpollutantcode = &#39;CO&#39;::text)<br> \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
-&gt;&nbsp; Hash&nbsp; (cost=30626.84..30626.84 rows=50 width=3781) (actual \
time=1193.834..1193.834 rows=415 \
loops=1)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
-&gt;&nbsp; Seq Scan on nei_area&nbsp; (cost=0.00..30626.84 rows=50 width=3781) \
(actual time=1038.950..1187.324 rows=415 loops=1)<br> \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
Filter: ((the_geom &amp;&amp; \
&#39;0103000020AD100000010000000500000000000000000059C0000000000000444000000000000059C \
0000000000000494000000000008056C0000000000000494000000000008056C0000000000000444000000000000059C00000000000004440&#39;::geometry) \
AND intersects(the_geom, \
&#39;0103000020AD100000010000000500000000000000000059C0000000000000444000000000000059C \
0000000000000494000000000008056C0000000000000494000000000008056C0000000000000444000000000000059C00000000000004440&#39;::geometry))<br>
 &nbsp;Total runtime: 5762.061 ms<br>(12 rows)<br><br><br><br>5.7 seconds!&nbsp; I \
can live with that! So, maybe it was just the vacuum I needed?&nbsp; <br><br>Anyway, \
I don&#39;t have indexing, I&#39;ll read about it and try not to screw anything up \
when I add it, but seriously thanks so much all of you!<br> <br>-Ed<br>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic