[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgis-users
Subject:    RE: [postgis-users] point-in-polygon SQL sentence performance
From:       "Obe, Regina" <robe.dnd () cityofboston ! gov>
Date:       2008-06-30 19:09:14
Message-ID: 53F9CF533E1AA14EA1F8C5C08ABC08D20197A0F2 () ZDND ! DND ! boston ! cob
[Download RAW message or body]

--===============0376696906==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C8DAE4.C98BD357"

This is a multi-part message in MIME format.



Pere,
 
I'm still not clear why you are creating an individual user_points table for each and \
deleting it.  I think I understand a little better, but I'm still guessing a bit.  
Below is how I would do it without creating a temp table for each user.  By the way \
ST_Contains is a simpler way of achieving for A && B And Contains(A,B)  
INSERT INTO user_points(geom, specie,genus,code, userid, the_geom)
SELECT p.geom, p.specie, p.genus, poly.code, p.userid, p.the_geom
FROM points  p INNER JOIN polygon poly ON ST_Contains(poly.the_geom, p.the_geom)
WHERE p.userid = '$user'  
 
(You can leave the WHERE out if you plan to do it for all users - it will save you \
the trouble to just do it in one step.  Although sounds like you maybe creating these \
temp tables for the logged in user)  
 
You may also want to replace ST_Contains with ST_Intersects (I'm nost sure about \
performance differences).  although not sure how all these work in long lat since \
they assume cartesian rather than spherical.  I think it should be fine.  
 
--Next Polygon part - I am assuming genus, species etc are counts (or are those just \
words)  at each point location and you want for each user  to count the total of each \
they captured in each  (you would use SUM not COUNT for this - this is just a guess \
on my part - they may be actual genus, species etc - in which case just throw out \
                those fields)
--To make this efficient, you will want to put a btree index on user_points.code and \
polygon.poly_code  
INSERT INTO user_polygon (userid, polygon_code,the_geom, numrecords,numgenus, \
numspecies) SELECT p.userid, polygon.polygon_code, polygon.the_geom, COUNT(p.code),
SUM(genus), SUM(species)
FROM user_points p
               INNER JOIN polygon ON p.code = polygon.poly_code
GROUP BY p.userid, polygon.polygon_code, polygon.the_geom;
 
---If you don't really need user_points except for rolling up to user_polygon, then \
you can forget about all I said above and just do a one statement sql  
INSERT INTO user_polygon (userid, polygon_code,the_geom, numrecords,numgenus, \
numspecies) SELECT p.userid, polygon.polygon_code, polygon.the_geom, COUNT(p.code),
SUM(genus), SUM(species)
FROM user_points p
               INNER JOIN polygon ON ST_Contains(polygon.the_geom, p.the_geom)
GROUP BY p.userid, polygon.polygon_code, polygon.the_geom;
 
(normally I would frown on putting the_geom in a group by since it does an extent \
like grouping, but in this case I think its fine because your polygon.polygon_code \
will guarantee uniqueness. So the extra the_geom is a necessary but redundant \
grouping)  
IF you still need to do an update  - some how you took my example and rewoded it so \
it does something completely different from what I had intended.  It should read like \
below not what you reiterated (e.g.  standard form is UPDATE sometable1 
             FROM sometable2 
      WHERE sometable1.somefield1 = sometable2.somefield2  
 
my sometable2 in this case is a subselect aliased as u
 
(SELECT
          COUNT(user_point.code) as cnt, user_point.code  
            FROM user_point  GROUP BY
user_point.code) as u  ;
 
---
 
  UPDATE user_polygon SET numtax = u.cnt 
FROM  (SELECT
          COUNT(user_point.code) as cnt, user_point.code  
            FROM user_point  GROUP BY
user_point.code) as u  
WHERE u.code = user_polygon.code;

 
Hope that helps,
Regina
 
 
 
 

________________________________

From: postgis-users-bounces@postgis.refractions.net on behalf of pere roca
Sent: Mon 6/30/2008 12:40 PM
To: postgis-users@postgis.refractions.net
Subject: RE: [postgis-users] point-in-polygon SQL sentence performance





   dear Regina,

   first, thanks to take some time.
   yes, it's a little complicated to explain; I will try again:
  -My webapp lets user insert point data; all users insert this data in the
same table (point)
  -I have several polygon tables in postGIS (to cross with point data)

I thought about:

Updating an unique,exclusive user_point table with fields: geom, specie,
genus... and code. All fields except the code are extracted from point table
using $user as identifier ($user comes to PHP from user identification). The
code for each point is extracted using CONTAINS and && sentence (see the PHP
script) from polygon table.

A user_polygon table will have the geom, id... and fields like
number_of_records (see SET numtax below), number_genera that require
counting how many records fall inside EACH polygon. This fields are filled
with the COUNT sentences taking into consideration the previous point table
updating.

I've tried the sentence below, following your code:
  UPDATE user_polygon SET numtax = (select u.cnt from (SELECT
COUNT(user_point.code) as cnt, user_point.code  FROM user_point  GROUP BY
user_point.code) as u  WHERE u.code = user_polygon.code);";

The sentence inside (select u.cnt...) goes very very fast, but "loses"  a
lot of time (also takes 1min 20sec) comparing WHERE u.code =
user_polygon.code. I think this is the problem, but there is some solution
to this? isn't the main idea good? everything goes fast except this last
part.

After executing all the SQL sentences, data from  user_point and
user_polygon is deleted (but not the table!).

Some answers:
-"1) Are you using inherited tables and having a different table for each
user?". No inherited tables (was wrong using ONLY parameter). A user = a
point and a polygon table (info deleted each time session expires)
-"How big is your user_polygon table". My testings are based on a 20 Mb
"Toast table size", 544 Kb Table size. It has 1991 records.

  Thanks again
 
  Pere Roca

Obe, Regina     DND\MIS wrote:
> 
> Pere,
> 
> Somehow I think your whole code is needlessly complicated, but I still
> don't have a clear idea what you are trying to do and looking at your
> code is actually more confusing than anything.  Maybe try to restate
> exactly what you are trying to do minus the code. 
> 
> I'm guessing you are trying to do this (by the way wrapping a (SELECT
> ...) like that is very non-standard)
> 
> INSERT INTO user_polygon (userid, polygon_code,the_geom)
> SELECT
> distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom
> &&
> GeometryFromText('".$row['astext']."',4326);
> 
> 
> 1) Are you using inherited tables and having a different table for each
> user?  I'm assuming that is why you are doing
> 
> UPDATE ONLY
> 
> rather than UPDATE  (UPDATE ONLY really only makes sense if you are
> using inherited tables and don't want the update to cascade to child
> tables, but then if its inherited - its still silly since you are just
> updating child tables anyway)
> 
> 
> 2) Where is that $user coming from.  Make sure you know where that $user
> is coming from and you properly sanitize it otherwise its a prime target
> for SQL Injection attack.
> 
> 3) How big is your user_polygon table?  Speed would depend on that
> 
> - this particular statement you have is inefficient - because it is
> either doing a correlated subselect for every record or its updating
> every record to the same value. Since you are not prefixing things - its
> hard for me to tell what exactly this is doing.  This I suppose would
> make sense if you are creating a different table for each user
> 
> UPDATE ONLY user_polygon SET numreg = (SELECT COUNT (code) FROM
> user_points
> WHERE name = polygon_code);
> 
> but its generally more efficient to write a sub select
> 
> UPDATE user_polygon SET numreg = u.cnt
> FROM (SELECT COUNT(up.code) as cnt, up.name
> FROM user_point up
> GROUP BY up.name) u
> WHERE u.name = user_polygon.polygon_code;
> 
> 
> 3) I have no idea what the point of the first bounding box insert you
> have is for
> since ST_intersects, ST_Contains already have bounding box checks in
> them.  So I suspect this step you have is is not necessary
> 
> For this part
> 
> SELECT
> distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom
> &&
> GeometryFromText('".$row['astext']."',4326)
> 
> 
> 4) Again this statement
> UPDATE ONLY user_polygon SET userid ='$user';
> 
> Is updating every record in user_polygon to the current user.  So if you
> are looping thru a user set which I can only guess you are, this
> statement is overwritting the previous updates.
> 
> If again you have one table per user, then this is fine - but its
> cheaper to do everything in your insert rather than doing an update
> after the fact.  Updates tend to be slower than inserts.
> 
> Hope that helps a bit,
> Regina
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: postgis-users-bounces@postgis.refractions.net
> [mailto:postgis-users-bounces@postgis.refractions.net] On Behalf Of pere
> roca
> Sent: Monday, June 30, 2008 4:25 AM
> To: postgis-users@postgis.refractions.net
> Subject: [postgis-users] point-in-polygon SQL sentence performance
> 
> 
> hi!
> 
> I wanna make a point-in-polygon operation but takes looong time (about 4
> minutes!) and the CPU consuming of the server
> becomes huge when performing this operation for 15000 points /
> polygons:
> 
> I need to count how many user points fall in each polygon and perform
> some
> operations based on that.
> We initially have a point_table that stores multi-user point data. We
> extract from there only this-user-data and create new tables:
> user_points
> and user_polygon. To the first one we will insert the code of the
> polygon in
> which it falls; counting how many many times the same polygon_code is
> repeated, we will calculate number-points_in_polygon...
> The important code is commented below (calling SQL from PHP):
> 
> //we use the points bbox to narrow the future spatial queries (will only
> select polygons falling inside this bbox; from 1991 initial polygons,
> only
> 85 are from now on used); it is useful when points are close each other
> (if
> there is an outlayer then bbox is not useful...)
> 
> $extent="select astext(extent(points_table.the_geom)) from points_table
> where userid='$user'";
> $extent_result=pg_query($extent);
> 
> while ($row = pg_fetch_array($extent_result, NULL, PGSQL_ASSOC))
> {
> //creates a NEW polygon table that will store only polygon code and
> polygon
> geometry that fall INSIDE the point's bbox
> $poligon_sql2="INSERT INTO user_polygon (polygon_code,the_geom)
> (SELECT
> distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom
> &&
> GeometryFromText('".$row['astext']."',4326));
> UPDATE ONLY user_polygon SET userid ='$user';
> 
> //we previously created a new user point table, where we insert some
> info.
> The table is created because we are taking data from a multi-user table
> and
> now we want a single user-->a single point table
> 
> INSERT INTO user_points (genus, species, name, code, usuario, the_geom)
> (SELECT genus, specie, codigo, oid, userid, the_geom FROM points WHERE
> userid = '$user');
> 
> //in this user POINT table we set the code from the polygon's table
> where
> the point falls inside;
> //I expected this one to be the "big" operation , but it's not (takes
> "only"
> 1 min 18 sec)
> 
> UPDATE ONLY user_points SET code = (SELECT polygon_code FROM
> user_polygon
> WHERE user_polygon.the_geom && user_points.the_geom AND CONTAINS
> (user_polygon.the_geom,user_points.the_geom));
> 
> //updating user polygon table, counting how many times do we have a
> polygon
> code in our POINT table --> number of points in each polygon
> // this updating operation takes 1 min 20 seconds!
> 
> UPDATE ONLY user_polygon SET numreg = (SELECT COUNT (code) FROM
> user_points
> WHERE name = polygon_code);
> 
> // this updating operation also takes about 1 min 20 seconds!
> 
> UPDATE ONLY user_polygon SET numtax = (SELECT COUNT (genus) FROM (SELECT
> DISTINCT (genus) FROM user_points WHERE code = polygon_code) AS foo);
> 
> All important data is indexed; Any ideas to get better performance?
> 
> thanks!
> 
> Pere Roca
> EDIT project (http://edit.csic.es/edit_geo/prototype/edit.html)
> 
> 
> --
> View this message in context:
> http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp181908
> 54p18190854.html
> Sent from the PostGIS - User mailing list archive at Nabble.com.
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> 

--
View this message in context: \
http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp18190854p18199519.html
 Sent from the PostGIS - User mailing list archive at Nabble.com.

_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users




-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.


[Attachment #3 (text/html)]

<HTML dir=ltr><HEAD><TITLE>RE: [postgis-users] point-in-polygon SQL sentence \
performance</TITLE> <META http-equiv=Content-Type content="text/html; \
charset=unicode"> <META content="MSHTML 6.00.2900.3354" name=GENERATOR></HEAD>
<BODY>
<DIV id=idOWAReplyText10229 dir=ltr>
<DIV dir=ltr><FONT face=Arial color=#000000 size=2>Pere,</FONT></DIV>
<DIV dir=ltr><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT face=Arial color=#000000 size=2>I'm still not clear why you are \
creating an individual user_points table for each and deleting it.&nbsp; I think I \
understand a little better, but I'm still guessing a bit.</FONT></DIV> <DIV \
dir=ltr><FONT face=Arial color=#000000 size=2></FONT>&nbsp;</DIV> <DIV dir=ltr><FONT \
face=Arial color=#000000 size=2>Below is how I would do it without creating a temp \
table for each user.&nbsp; By the way ST_Contains is&nbsp;a simpler&nbsp;way of \
achieving&nbsp;for A &amp;&amp; B And Contains(A,B)</FONT></DIV> <DIV dir=ltr><FONT \
face=Arial size=2></FONT>&nbsp;</DIV> <DIV dir=ltr><FONT face=Arial color=#000000 \
size=2>INSERT INTO user_points(<FONT face="Times New Roman">geom, specie,genus,code, \
userid, the_geom)</FONT></FONT></DIV> <DIV dir=ltr><FONT size=2>SELECT p.geom, \
p.specie, p.genus, poly.code, p.userid, p.the_geom</FONT></DIV> <DIV dir=ltr><FONT \
size=2>FROM&nbsp;points &nbsp;p </FONT><FONT size=2>INNER JOIN polygon poly ON \
ST_Contains(poly.the_geom, p.the_geom)</FONT></DIV> <DIV dir=ltr><FONT size=2>WHERE \
p.userid = '$user'&nbsp; </FONT></DIV> <DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2>(You can leave the WHERE out if you plan to do it for all \
users - it will save you the trouble to just do it in one step.&nbsp; Although sounds \
like you maybe creating these temp tables for the logged in user)</FONT></DIV> <DIV \
dir=ltr><FONT size=2></FONT>&nbsp;</DIV> <DIV dir=ltr><FONT \
size=2></FONT>&nbsp;</DIV> <DIV dir=ltr><FONT size=2>You may also want to replace \
ST_Contains with ST_Intersects (I'm nost sure about performance differences).&nbsp; \
although not sure how all these work in long lat since they assume cartesian rather \
than spherical.&nbsp; I think it should be fine.</FONT></DIV> <DIV dir=ltr><FONT \
size=2></FONT>&nbsp;</DIV> <DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2>--Next Polygon part - I am assuming genus, species etc are \
counts (or are those just words)&nbsp; at each point location and you want for \
each</FONT></DIV> <DIV dir=ltr><FONT size=2>user &nbsp;to count the total of each \
they captured in&nbsp;each &nbsp;(you would use SUM not COUNT for this - this is just \
a guess on my part - they may be actual genus, species etc - in which case just throw \
out those fields)</FONT></DIV> <DIV dir=ltr><FONT size=2>--To make this efficient, \
you will want to put a btree index on user_points.code and \
polygon.poly_code</FONT></DIV> <DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2>INSERT INTO user_polygon (userid, polygon_code,the_geom, \
numrecords,numgenus, numspecies)<BR>SELECT p.userid, polygon.polygon_code, \
polygon.the_geom, COUNT(p.code),</FONT></DIV> <DIV dir=ltr><FONT size=2>SUM(genus), \
SUM(species)</FONT></DIV> <DIV dir=ltr><FONT size=2>FROM user_points p</FONT></DIV>
<DIV dir=ltr><FONT size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;INNER JOIN polygon ON p.code = polygon.poly_code</FONT></DIV> <DIV \
dir=ltr><FONT size=2>GROUP BY p.userid, polygon.polygon_code, \
polygon.the_geom;</FONT></DIV> <DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2>---If you don't really need user_points except for rolling \
up to user_polygon, then you can forget about all I said above and just do a one \
statement sql</FONT></DIV> <DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2>
<DIV dir=ltr><FONT size=2>INSERT INTO user_polygon (userid, polygon_code,the_geom, \
numrecords,numgenus, numspecies)<BR>SELECT p.userid, polygon.polygon_code, \
polygon.the_geom, COUNT(p.code),</FONT></DIV> <DIV dir=ltr><FONT size=2>SUM(genus), \
SUM(species)</FONT></DIV> <DIV dir=ltr><FONT size=2>FROM user_points p</FONT></DIV>
<DIV dir=ltr><FONT size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;INNER JOIN polygon ON ST_Contains(polygon.the_geom, p.the_geom)</FONT></DIV> \
<DIV dir=ltr><FONT size=2>GROUP BY p.userid, polygon.polygon_code, \
polygon.the_geom;</FONT></DIV> <DIV dir=ltr>&nbsp;</DIV>
<DIV dir=ltr>(normally I would frown on putting the_geom in a group by since it does \
an extent like grouping, but in this</DIV> <DIV dir=ltr>case I think its fine because \
your polygon.polygon_code will guarantee uniqueness. So the extra the_geom is a \
necessary but redundant grouping)</DIV> <DIV dir=ltr>&nbsp;</DIV>
<DIV dir=ltr>IF you still need to do an update&nbsp;&nbsp;- some how you took my \
example and rewoded it so it does something completely different from what I had \
intended.&nbsp; It should read like below not what you reiterated (e.g.&nbsp; \
standard form is</DIV> <DIV dir=ltr>UPDATE sometable1 </DIV>
<DIV dir=ltr>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
FROM sometable2 </DIV> <DIV dir=ltr>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; WHERE \
sometable1.somefield1 = sometable2.somefield2&nbsp; </DIV> <DIV dir=ltr>&nbsp;</DIV>
<DIV dir=ltr>my sometable2 in this case is a subselect aliased as u</DIV>
<DIV dir=ltr>&nbsp;</DIV>
<DIV dir=ltr>(SELECT<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
COUNT(user_point.code) as cnt, user_point.code&nbsp; </DIV> <DIV \
dir=ltr>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM \
user_point&nbsp; GROUP BY<BR>user_point.code) as u&nbsp; ;</DIV> <DIV \
dir=ltr>&nbsp;</DIV> <DIV dir=ltr>---</DIV>
<DIV dir=ltr>&nbsp;</DIV>
<DIV dir=ltr>&nbsp; UPDATE user_polygon SET numtax = u.cnt&nbsp;</DIV>
<DIV dir=ltr>FROM &nbsp;(SELECT<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
COUNT(user_point.code) as cnt, user_point.code&nbsp; </DIV> <DIV \
dir=ltr>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM \
user_point&nbsp; GROUP BY<BR>user_point.code) as u&nbsp; </DIV> <DIV dir=ltr>WHERE \
u.code = user_polygon.code;<BR></DIV> <DIV dir=ltr>&nbsp;</DIV>
<DIV dir=ltr>Hope that helps,</DIV>
<DIV dir=ltr>Regina</DIV></FONT></DIV>
<DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr><FONT size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr>&nbsp;</DIV></DIV>
<DIV dir=ltr><BR>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> postgis-users-bounces@postgis.refractions.net \
on behalf of pere roca<BR><B>Sent:</B> Mon 6/30/2008 12:40 PM<BR><B>To:</B> \
postgis-users@postgis.refractions.net<BR><B>Subject:</B> RE: [postgis-users] \
point-in-polygon SQL sentence performance<BR></FONT><BR></DIV> <DIV><BR><BR>
<P><FONT size=2>&nbsp;&nbsp; dear Regina,<BR><BR>&nbsp;&nbsp; first, thanks to take \
some time.<BR>&nbsp;&nbsp; yes, it's a little complicated to explain; I will try \
again:<BR>&nbsp; -My webapp lets user insert point data; all users insert this data \
in the<BR>same table (point)<BR>&nbsp; -I have several polygon tables in postGIS (to \
cross with point data)<BR><BR>I thought about:<BR><BR>Updating an unique,exclusive \
user_point table with fields: geom, specie,<BR>genus... and code. All fields except \
the code are extracted from point table<BR>using $user as identifier ($user comes to \
PHP from user identification). The<BR>code for each point is extracted using CONTAINS \
and &amp;&amp; sentence (see the PHP<BR>script) from polygon table.<BR><BR>A \
user_polygon table will have the geom, id... and fields like<BR>number_of_records \
(see SET numtax below), number_genera that require<BR>counting how many records fall \
inside EACH polygon. This fields are filled<BR>with the COUNT sentences taking into \
consideration the previous point table<BR>updating.<BR><BR>I've tried the sentence \
below, following your code:<BR>&nbsp; UPDATE user_polygon SET numtax = (select u.cnt \
from (SELECT<BR>COUNT(user_point.code) as cnt, user_point.code&nbsp; FROM \
user_point&nbsp; GROUP BY<BR>user_point.code) as u&nbsp; WHERE u.code = \
user_polygon.code);";<BR><BR>The sentence inside (select u.cnt...) goes very very \
fast, but "loses"&nbsp; a<BR>lot of time (also takes 1min 20sec) comparing WHERE \
u.code =<BR>user_polygon.code. I think this is the problem, but there is some \
solution<BR>to this? isn't the main idea good? everything goes fast except this \
last<BR>part.<BR><BR>After executing all the SQL sentences, data from&nbsp; \
user_point and<BR>user_polygon is deleted (but not the table!).<BR><BR>Some \
answers:<BR>-"1) Are you using inherited tables and having a different table for \
each<BR>user?". No inherited tables (was wrong using ONLY parameter). A user = \
a<BR>point and a polygon table (info deleted each time session expires)<BR>-"How big \
is your user_polygon table". My testings are based on a 20 Mb<BR>"Toast table size", \
544 Kb Table size. It has 1991 records.<BR><BR>&nbsp; Thanks \
again<BR>&nbsp;<BR>&nbsp; Pere Roca<BR><BR>Obe, Regina&nbsp;&nbsp;&nbsp;&nbsp; \
DND\MIS wrote:<BR>&gt;<BR>&gt; Pere,<BR>&gt;<BR>&gt; Somehow I think your whole code \
is needlessly complicated, but I still<BR>&gt; don't have a clear idea what you are \
trying to do and looking at your<BR>&gt; code is actually more confusing than \
anything.&nbsp; Maybe try to restate<BR>&gt; exactly what you are trying to do minus \
the code.&nbsp;<BR>&gt;<BR>&gt; I'm guessing you are trying to do this (by the way \
wrapping a (SELECT<BR>&gt; ...) like that is very non-standard)<BR>&gt;<BR>&gt; \
INSERT INTO user_polygon (userid, polygon_code,the_geom)<BR>&gt; SELECT<BR>&gt; \
distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom<BR>&gt; \
&amp;&amp;<BR>&gt; GeometryFromText('".$row['astext']."',4326);<BR>&gt;<BR>&gt;<BR>&gt; \
1) Are you using inherited tables and having a different table for each<BR>&gt; \
user?&nbsp; I'm assuming that is why you are doing<BR>&gt;<BR>&gt; UPDATE \
ONLY<BR>&gt;<BR>&gt; rather than UPDATE&nbsp; (UPDATE ONLY really only makes sense if \
you are<BR>&gt; using inherited tables and don't want the update to cascade to \
child<BR>&gt; tables, but then if its inherited - its still silly since you are \
just<BR>&gt; updating child tables anyway)<BR>&gt;<BR>&gt;<BR>&gt; 2) Where is that \
$user coming from.&nbsp; Make sure you know where that $user<BR>&gt; is coming from \
and you properly sanitize it otherwise its a prime target<BR>&gt; for SQL Injection \
attack.<BR>&gt;<BR>&gt; 3) How big is your user_polygon table?&nbsp; Speed would \
depend on that<BR>&gt;<BR>&gt;&nbsp; - this particular statement you have is \
inefficient - because it is<BR>&gt; either doing a correlated subselect for every \
record or its updating<BR>&gt; every record to the same value. Since you are not \
prefixing things - its<BR>&gt; hard for me to tell what exactly this is doing.&nbsp; \
This I suppose would<BR>&gt; make sense if you are creating a different table for \
each user<BR>&gt;<BR>&gt; UPDATE ONLY user_polygon SET numreg = (SELECT COUNT (code) \
FROM<BR>&gt; user_points<BR>&gt; WHERE name = polygon_code);<BR>&gt;<BR>&gt; but its \
generally more efficient to write a sub select<BR>&gt;<BR>&gt; UPDATE user_polygon \
SET numreg = u.cnt<BR>&gt; FROM (SELECT COUNT(up.code) as cnt, up.name<BR>&gt; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM \
user_point up<BR>&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; GROUP BY up.name) u<BR>&gt; WHERE u.name = \
user_polygon.polygon_code;<BR>&gt;<BR>&gt;<BR>&gt; 3) I have no idea what the point \
of the first bounding box insert you<BR>&gt; have is for<BR>&gt; since ST_intersects, \
ST_Contains already have bounding box checks in<BR>&gt; them.&nbsp; So I suspect this \
step you have is is not necessary<BR>&gt;<BR>&gt; For this part<BR>&gt;<BR>&gt; \
SELECT<BR>&gt; distinct(polygon_code),the_geom FROM polygon where \
polygon."."the_geom<BR>&gt; &amp;&amp;<BR>&gt; \
GeometryFromText('".$row['astext']."',4326)<BR>&gt;<BR>&gt;<BR>&gt; 4) Again this \
statement<BR>&gt; UPDATE ONLY user_polygon SET userid ='$user';<BR>&gt;<BR>&gt; Is \
updating every record in user_polygon to the current user.&nbsp; So if you<BR>&gt; \
are looping thru a user set which I can only guess you are, this<BR>&gt; statement is \
overwritting the previous updates.<BR>&gt;<BR>&gt; If again you have one table per \
user, then this is fine - but its<BR>&gt; cheaper to do everything in your insert \
rather than doing an update<BR>&gt; after the fact.&nbsp; Updates tend to be slower \
than inserts.<BR>&gt;<BR>&gt; Hope that helps a bit,<BR>&gt; \
Regina<BR>&gt;<BR>&gt;<BR>&gt;<BR>&gt;<BR>&gt;&nbsp;<BR>&gt;<BR>&gt; -----Original \
Message-----<BR>&gt; From: postgis-users-bounces@postgis.refractions.net<BR>&gt; [<A \
href="mailto:postgis-users-bounces@postgis.refractions.net">mailto:postgis-users-bounces@postgis.refractions.net</A>] \
On Behalf Of pere<BR>&gt; roca<BR>&gt; Sent: Monday, June 30, 2008 4:25 AM<BR>&gt; \
To: postgis-users@postgis.refractions.net<BR>&gt; Subject: [postgis-users] \
point-in-polygon SQL sentence performance<BR>&gt;<BR>&gt;<BR>&gt; hi!<BR>&gt;<BR>&gt; \
I wanna make a point-in-polygon operation but takes looong time (about 4<BR>&gt; \
minutes!) and the CPU consuming of the server<BR>&gt;&nbsp; becomes huge when \
performing this operation for 15000 points /<BR>&gt; polygons:<BR>&gt;<BR>&gt; I need \
to count how many user points fall in each polygon and perform<BR>&gt; some<BR>&gt; \
operations based on that.<BR>&gt; We initially have a point_table that stores \
multi-user point data. We<BR>&gt; extract from there only this-user-data and create \
new tables:<BR>&gt; user_points<BR>&gt; and user_polygon. To the first one we will \
insert the code of the<BR>&gt; polygon in<BR>&gt; which it falls; counting how many \
many times the same polygon_code is<BR>&gt; repeated, we will calculate \
number-points_in_polygon...<BR>&gt; The important code is commented below (calling \
SQL from PHP):<BR>&gt;<BR>&gt; //we use the points bbox to narrow the future spatial \
queries (will only<BR>&gt; select polygons falling inside this bbox; from 1991 \
initial polygons,<BR>&gt; only<BR>&gt; 85 are from now on used); it is useful when \
points are close each other<BR>&gt; (if<BR>&gt; there is an outlayer then bbox is not \
useful...)<BR>&gt;<BR>&gt; $extent="select astext(extent(points_table.the_geom)) from \
points_table<BR>&gt; where userid='$user'";<BR>&gt; \
$extent_result=pg_query($extent);<BR>&gt;<BR>&gt; while ($row = \
pg_fetch_array($extent_result, NULL, PGSQL_ASSOC))<BR>&gt; {<BR>&gt; //creates a NEW \
polygon table that will store only polygon code and<BR>&gt; polygon<BR>&gt; geometry \
that fall INSIDE the point's bbox<BR>&gt;&nbsp;&nbsp;&nbsp; $poligon_sql2="INSERT \
INTO user_polygon (polygon_code,the_geom)<BR>&gt; (SELECT<BR>&gt; \
distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom<BR>&gt; \
&amp;&amp;<BR>&gt; GeometryFromText('".$row['astext']."',4326));<BR>&gt; UPDATE ONLY \
user_polygon SET userid ='$user';<BR>&gt;<BR>&gt; //we previously created a new user \
point table, where we insert some<BR>&gt; info.<BR>&gt; The table is created because \
we are taking data from a multi-user table<BR>&gt; and<BR>&gt; now we want a single \
user--&gt;a single point table<BR>&gt;<BR>&gt; INSERT INTO user_points (genus, \
species, name, code, usuario, the_geom)<BR>&gt; (SELECT genus, specie, codigo, oid, \
userid, the_geom FROM points WHERE<BR>&gt; userid = '$user');<BR>&gt;<BR>&gt; //in \
this user POINT table we set the code from the polygon's table<BR>&gt; where<BR>&gt; \
the point falls inside;<BR>&gt; //I expected this one to be the "big" operation , but \
it's not (takes<BR>&gt; "only"<BR>&gt; 1 min 18 sec)<BR>&gt;<BR>&gt; UPDATE ONLY \
user_points SET code = (SELECT polygon_code FROM<BR>&gt; user_polygon<BR>&gt; WHERE \
user_polygon.the_geom &amp;&amp; user_points.the_geom AND CONTAINS<BR>&gt; \
(user_polygon.the_geom,user_points.the_geom));<BR>&gt;<BR>&gt; //updating user \
polygon table, counting how many times do we have a<BR>&gt; polygon<BR>&gt; code in \
our POINT table --&gt; number of points in each polygon<BR>&gt; // this updating \
operation takes 1 min 20 seconds!<BR>&gt;<BR>&gt; UPDATE ONLY user_polygon SET numreg \
= (SELECT COUNT (code) FROM<BR>&gt; user_points<BR>&gt; WHERE name = \
polygon_code);<BR>&gt;<BR>&gt; // this updating operation also takes about 1 min 20 \
seconds!<BR>&gt;<BR>&gt; UPDATE ONLY user_polygon SET numtax = (SELECT COUNT (genus) \
FROM (SELECT<BR>&gt; DISTINCT (genus) FROM user_points WHERE code = polygon_code) AS \
foo);<BR>&gt;<BR>&gt;&nbsp;&nbsp;&nbsp; All important data is indexed; Any ideas to \
get better performance?<BR>&gt;<BR>&gt;&nbsp;&nbsp;&nbsp; \
thanks!<BR>&gt;&nbsp;<BR>&gt;&nbsp;&nbsp;&nbsp; Pere Roca<BR>&gt;&nbsp;&nbsp;&nbsp; \
EDIT project (<A href="http://edit.csic.es/edit_geo/prototype/edit.html">http://edit.csic.es/edit_geo/prototype/edit.html</A>)<BR>&gt;<BR>&gt;<BR>&gt; \
--<BR>&gt; View this message in context:<BR>&gt; <A \
href="http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp181908">http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp181908</A><BR>&gt; \
54p18190854.html<BR>&gt; Sent from the PostGIS - User mailing list archive at \
Nabble.com.<BR>&gt;<BR>&gt; _______________________________________________<BR>&gt; \
postgis-users mailing list<BR>&gt; postgis-users@postgis.refractions.net<BR>&gt; <A \
href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR>&gt; \
-----------------------------------------<BR>&gt; The substance of this message, \
including any attachments, may be<BR>&gt; confidential, legally privileged and/or \
exempt from disclosure<BR>&gt; pursuant to Massachusetts law. It is intended<BR>&gt; \
solely for the addressee. If you received this in error, please<BR>&gt; contact the \
sender and delete the material from any computer.<BR>&gt;<BR>&gt; \
_______________________________________________<BR>&gt; postgis-users mailing \
list<BR>&gt; postgis-users@postgis.refractions.net<BR>&gt; <A \
href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.re \
fractions.net/mailman/listinfo/postgis-users</A><BR>&gt;<BR>&gt;<BR><BR>--<BR>View \
this message in context: <A \
href="http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp18190854p18199 \
519.html">http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp18190854p18199519.html</A><BR>Sent \
from the PostGIS - User mailing list archive at \
Nabble.com.<BR><BR>_______________________________________________<BR>postgis-users \
mailing list<BR>postgis-users@postgis.refractions.net<BR><A \
href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.re \
fractions.net/mailman/listinfo/postgis-users</A><BR></FONT></P></DIV></BODY></HTML>



_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

--===============0376696906==--


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic