[prev in list] [next in list] [prev in thread] [next in thread]
List: postgis-users
Subject: Re: [postgis-users] Geocode function failswhencallingnormalize_address
From: "Paragon Corporation" <lr () pcorp ! us>
Date: 2012-09-29 4:09:21
Message-ID: 60A94759222C47ED9A898A4F827C6923 () O
[Download RAW message or body]
[Attachment #2 (multipart/alternative)]
Robert,
Your normalize_address syntax is no good as that will call normalize_address
numerous times per address (because of the .*) .
Compare with speed of:
SELECT (normalize_address(address)) As addy FROM respondents_addr WHERE
address IS NOT NULL LIMIT 1000;
There are a lot of reasons sadly mostly with teh nomralize logic and not one
we've had time to go thru. The best bet is to loop thru each and see which
one is holding things up.
We use pgScript for that and just set the limit to 1 offset i or something
like that.
There are some fixes we've put in 2.1.0 that we haven't yet backported to
2.0. So 2.0.1 is already kind of antiquated as it is also based on tiger
2010 rather than 2011.
You can send us the 100 off list, and time permitting we'll take a look or
maybe a quick scan we'll see something that sticks out. Kind of strapped
for time at moment.
Mike Pease did a good job of itemizing some issues.
e.g.
http://trac.osgeo.org/postgis/ticket/1614
http://trac.osgeo.org/postgis/ticket/1669
You may want to see if you suffer from any of those.
Thanks,
Regina and Leo
http://www.postgis.us
_____
From: postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] On Behalf Of
Robert_Clift@doh.state.fl.us
Sent: Friday, September 28, 2012 4:04 PM
To: postgis-users@postgis.refractions.net
Subject: Re: [postgis-users] Geocode function
failswhencallingnormalize_address
Dear Leo and Regina (and everyone else too):
Thanks very much for your detailed reply. I'm just now to the point where
I've implemented your recommendations, but no positive outcome yet. I
simplified the addresses to address the regex expression variable error,
performed a soft upgrade to 2.0.1, and cleaned up my query (both with and
without the use of pgScript).
New PostGIS version:
"POSTGIS="2.0.1 r9979" GEOS="3.3.5-CAPI-1.7.5" PROJ="Rel. 4.8.0, 6 March
2012" GDAL="GDAL 1.9.1, released 2012/05/15" LIBXML="2.7.8" ...etc.
New Query:
SELECT a.rid, a.respondent, a.s_year, Coalesce((g.geo).rating,-1) "rating",
(g.geo).geomout "the_geom"
FROM respondents_addr a
LEFT JOIN
(
SELECT respondent, s_year, (geocode(address, 1)) "geo"
FROM respondents_addr
ORDER BY rid LIMIT 100
) g
ON a.respondent = g.respondent
AND a.s_year = g.s_year
WHERE a.address IS NOT NULL
ORDER BY rid LIMIT 100;
Same Error Message:
ERROR: invalid regular expression: quantifier operand invalid
CONTEXT: PL/pgSQL function "normalize_address" line 386 at assignment
PL/pgSQL function "geocode" line 10 at assignment
And yet the following query still returns good results in 43320 ms:
SELECT (normalize_address(address)).* FROM respondents_addr WHERE address IS
NOT NULL LIMIT 1000;
Wait. As I typed the last bit I realized the normalization test query is not
ordered the same way as the geocode query.
This fails with the exact same error:
SELECT (normalize_address(address)).* FROM respondents_addr WHERE address IS
NOT NULL ORDER BY rid LIMIT 100;
I'm not very familiar with regular expressions. Is anyone willing to look at
these first 100 addresses, sent off-list, and tell me which are tripping me
up and why?
Thanks,
Rob
_____
From: postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] On Behalf Of Paragon
Corporation
Sent: Thursday, September 06, 2012 8:14 PM
To: 'PostGIS Users Discussion'
Subject: Re: [postgis-users] Geocode function fails
whencallingnormalize_address
Rob,
Regarding your normalize issue, it's probably one record causing it. I
thought we fixed some of these issues in 2.0.1 and 2.1.0SVN so you might
want to upgrade your script to the 2.0.1 -- there is an upgrade script you
can use packaged in the tar ball.
If that still doesn't fix your issue, look for variables often found in
regex expressions in your address (things like (, ) and . for example might
be throwing it off and we might not be escaping right. Hard to tell which
one until your narrow down to the one bad record.
You might have to run each update as a single to see which one fails
Couple of other tips
1) you don't want to geocode 100K records all at once since it has to
complete as a single transaction. You'll need to do it in batches. We use
pgScript for that batch processing so we can just run in a pgAdmin window.
though any scripting tool of your choice would do.
http://www.postgresonline.com/journal/archives/181-pgAdmin-pgScript.html
2) Don't do (geocode(address)).* That often calls geocode for each field
in output so would slow down your processing n fold (e.g. you'd have like 5
calls instead of 1)
You really want to do geocode(address) As geo
(geo).geomout etc.
See example in docs --
http://www.postgis.org/documentation/manual-svn/Geocode.html -- we do a
left join to guarantee that even if no records returned for an address, we
can stamp the address as attempted with a -1 rating
I think we thought about this after we wrote the book :(
-- SHAMELESS PLUG STARTS HERE ---
On bright side, we just signed our contract for Second Edition of PostGIS in
Action and have started writing it :) more on that later. that will cover
newer enhancements in geocoder, raster, topology, PostGIS 2.0-2.1 (basically
at least 2.0 and 2.1 changes) and PostgreSQL 9.1-9.3
We'll be posting the new Table Of Contents soon here, and let people know
when they can start purchasing
http://www.postgis.us
Those who buy the new book, is our understanding, will automatically get
electronic copies of the first edition, so even if you buy early, you'll get
our drafts as we write them and also have the older book to work with.
-- END SHAMELESS PLUG
3) The book was written before we introduced the limit, so again look at:
http://www.postgis.org/documentation/manual-svn/Geocode.html (feature is
available in 2.0 as well)
Use: geocode(address,1)
for faster performance and then you don't even need the DISTINCT ON since
you'll get back at most one answer and the best one.
Hope that helps,
Leo and Regina
http://www.postgis.us
_____
From: postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] On Behalf Of
Robert_Clift@doh.state.fl.us
Sent: Thursday, September 06, 2012 1:01 PM
To: postgis-users@postgis.refractions.net
Subject: [postgis-users] Geocode function fails when
callingnormalize_address
Hi All:
My batch geocode of 100K records is failing with error messages that refer
to an operand in the normalize_address function. I based my single-column
UPDATE on the multi-column example in "PostGIS In Action" and have tweaked
the syntax only to achieve slight variation in the error message.
The query is something like:
UPDATE respondents_addr SET the_geom =
g.geomout
FROM (SELECT DISTINCT ON (respondent, s_year) respondent, s_year,
(geocode(address)).*
FROM respondents_addr As ra
WHERE ra.address IS NOT NULL
ORDER BY respondent, s_year, rating) As g
WHERE g.respondent = respondents_addr.respondent
AND g.s_year = respondents_addr.s_year;
The most frequent error text is:
ERROR: invalid regular expression: quantifier operand invalid
CONTEXT: PL/pgSQL function "normalize_address" line 386 at assignment
PL/pgSQL function "geocode" line 10 at assignment
********** Error **********
ERROR: invalid regular expression: quantifier operand invalid
SQL state: 2201B
Context: PL/pgSQL function "normalize_address" line 386 at assignment
PL/pgSQL function "geocode" line 10 at assignment
Also the query runs for more than an hour, (xp box with 3.5 GB ram,
PosgreSQL 9.0.6, PostGIS 2.0.0 (might be the problem?)) and no matter how
the error message changes it always references the normalization function,
so I tried:
SELECT (normalize_address(address)).* FROM respondents_addr WHERE address IS
NOT NULL LIMIT 1000;
With the LIMIT it ran quickly and returned the expected result, so I tried:
SELECT (normalize_address(address)).* FROM respondents_addr WHERE address IS
NOT NULL LIMIT 5000;
Bumping the LIMIT up reproduced the error from the geocode attempts. I'd say
"aha" if only I knew why.
I checked the list archives and googled several combinations of words from
the error but alas, to no avail. I suspect that I've either repeated a
simple syntax error or that my setup is somehow off -- any help will be
appreciatedand I might as well close with results of PostGIS_full_version():
POSTGIS="2.0.0 r9605" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.8.0, 6 March
2012" GDAL="GDAL 1.9.0, released 2011/12/29" LIBXML="2.7.8"
LIBJSON="UNKNOWN" TOPOLOGY RASTER
Thank you,
Rob
[Attachment #5 (text/html)]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Geocode function fails when calling normalize_address</TITLE>
<META content="text/html; charset=us-ascii" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 9.00.8112.16450"></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>Robert,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>Your normalize_address syntax is no good as that will call
normalize_address numerous times per address (because of the .*) .
</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>Compare with speed of:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><SPAN
class=463023319-28092012><FONT color=#0000ff size=2 face=Arial>SELECT
(normalize_address(address)) As addy FROM respondents_addr WHERE address IS
NOT NULL LIMIT 1000;</FONT></SPAN></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>There are a lot of reasons sadly mostly with teh nomralize
logic and not one we've had time to go thru. The best bet is to loop thru
each and see which one is holding things up.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>We use pgScript for that and just set the limit to 1 offset i
or something like that.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>There are some fixes we've put in 2.1.0 that we haven't yet
backported to 2.0. So 2.0.1 is already kind of antiquated as it is also
based on tiger 2010 rather than 2011.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>You can send us the 100 off list, and time permitting we'll
take a look or maybe a quick scan we'll see something that sticks out.
Kind of strapped for time at moment.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>Mike Pease did a good job of itemizing some
issues.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>e.g.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial><A
href="http://trac.osgeo.org/postgis/ticket/1614">http://trac.osgeo.org/postgis/ticket/1614</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial><A
href="http://trac.osgeo.org/postgis/ticket/1669">http://trac.osgeo.org/postgis/ticket/1669</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>You may want to see if you suffer from any of
those.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>Thanks,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial>Regina and Leo</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial><A
href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=051125903-29092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV><BR>
<DIV dir=ltr lang=en-us class=OutlookMessageHeader align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Robert_Clift@doh.state.fl.us<BR><B>Sent:</B> Friday, September 28, 2012 4:04
PM<BR><B>To:</B> postgis-users@postgis.refractions.net<BR><B>Subject:</B> Re:
[postgis-users] Geocode function
failswhencallingnormalize_address<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>Dear Leo and Regina (and everyone else
too):</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>Thanks very much for your detailed reply. I'm just now
to the point where I've implemented your recommendations, but no positive
outcome yet. I simplified the addresses to address the regex
expression variable error, performed a soft upgrade to 2.0.1, and cleaned up my
query (both with and without the use of pgScript).</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial></FONT> </DIV>
<DIV dir=ltr align=left><SPAN class=463023319-28092012><FONT size=2
face=Arial>New PostGIS version:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial>"POSTGIS="2.0.1 r9979"
GEOS="3.3.5-CAPI-1.7.5" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.1,
released 2012/05/15" LIBXML="2.7.8" <SPAN
class=463023319-28092012>...etc.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>New Query:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=463023319-28092012>SELECT a.rid, a.respondent, a.s_year,
Coalesce((g.geo).rating,-1) "rating", (g.geo).geomout "the_geom"<BR>FROM
respondents_addr a<BR>LEFT JOIN<BR> (<BR> SELECT
respondent, s_year, (geocode(address, 1)) "geo"<BR> FROM
respondents_addr<BR> ORDER BY rid LIMIT 100<BR> )
g<BR>ON a.respondent = g.respondent<BR>AND a.s_year = g.s_year<BR>WHERE
a.address IS NOT NULL<BR>ORDER BY rid LIMIT 100;</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>Same Error Message:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=463023319-28092012>ERROR: invalid regular expression: quantifier
operand invalid<BR>CONTEXT: PL/pgSQL function "normalize_address" line 386
at assignment<BR>PL/pgSQL function "geocode" line 10 at
assignment</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>And yet the following query still returns good results
in 43320 ms:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=463023319-28092012>SELECT (normalize_address(address)).* FROM
respondents_addr WHERE address IS NOT NULL LIMIT 1000;</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>Wait. As I typed the last bit I realized the
normalization test query is not ordered the same way as the geocode
query.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>This fails with the exact same
error:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN class=463023319-28092012>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=463023319-28092012>SELECT (normalize_address(address)).* FROM
respondents_addr WHERE address IS NOT NULL ORDER BY rid LIMIT
100;</SPAN></FONT></DIV></SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>I'm not very familiar with regular expressions. Is
anyone willing to look at these first 100 addresses, sent off-list, and tell me
which are tripping me up and why?</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>Thanks,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT size=2 face=Arial><SPAN
class=463023319-28092012>Rob</SPAN></FONT><BR></DIV>
<DIV dir=ltr lang=en-us class=OutlookMessageHeader align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Paragon Corporation<BR><B>Sent:</B> Thursday, September 06, 2012 8:14
PM<BR><B>To:</B> 'PostGIS Users Discussion'<BR><B>Subject:</B> Re:
[postgis-users] Geocode function fails
whencallingnormalize_address<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>Rob,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>Regarding your normalize issue, it's probably one record
causing it. I thought we fixed some of these issues in 2.0.1 and 2.1.0SVN
so you might want to upgrade your script to the 2.0.1 -- there is an upgrade
script you can use packaged in the tar ball.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>If that still doesn't fix your issue, look for variables often
found in regex expressions in your address (things like (, ) and . for
example might be throwing it off and we might not be escaping right. Hard
to tell which one until your narrow down to the one bad
record.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>You might have to run each update as a single to see which one
fails</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>Couple of other tips</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>1) you don't want to geocode 100K records all at once since it
has to complete as a single transaction. You'll need to do it in
batches. We use pgScript for that batch processing so we can just run
in a pgAdmin window. though any scripting tool of your choice would
do.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial><A
href="http://www.postgresonline.com/journal/archives/181-pgAdmin-pgScript.html">http:/ \
/www.postgresonline.com/journal/archives/181-pgAdmin-pgScript.html</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>2) Don't do <FONT color=#000000>(geocode(address)).*<FONT
size=3 face="Times New Roman"> That often calls geocode for each
field in output so would slow down your processing n fold (e.g. you'd have like
5 calls instead of 1)</FONT></FONT></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>You really want to do geocode(address) As geo</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>(geo).geomout etc. </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012>See example in docs -- <A
href="http://www.postgis.org/documentation/manual-svn/Geocode.html">http://www.postgis.org/documentation/manual-svn/Geocode.html</A> \
-- we do a left join to guarantee that even if no records returned for an
address, we can stamp the address as attempted with a -1 rating</SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012>I think we thought about
this after we wrote the book :(</SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>-- SHAMELESS PLUG STARTS HERE ---</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>On bright side, we just signed our contract for Second Edition of
PostGIS in Action and have started writing it :) more on that later.
that will cover newer enhancements in geocoder, raster, topology, PostGIS
2.0-2.1 (basically at least 2.0 and 2.1 changes) and PostgreSQL 9.1-9.3
</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>We'll be posting the new Table Of Contents soon here, and let people
know when they can start purchasing</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial><A
href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>Those who buy the new book, is our understanding, will automatically
get electronic copies of the first edition, so even if you buy early, you'll get
our drafts as we write them and also have the older book to work
with.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>-- END SHAMELESS PLUG</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial>3) The book was written before we introduced the limit, so
again look at:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><A
href="http://www.postgis.org/documentation/manual-svn/Geocode.html">http://www.postgis.org/documentation/manual-svn/Geocode.html</A><FONT \
size=2 face=Arial> (feature is available in 2.0 as
well)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>Use: geocode(address,1)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>for faster performance and then you don't even need the DISTINCT ON
since you'll get back at most one answer and the best one.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>Hope that helps,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial>Leo and Regina</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial><A
href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT color=#0000ff
size=2 face=Arial></FONT></SPAN> </DIV><BR>
<DIV dir=ltr lang=en-us class=OutlookMessageHeader align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Robert_Clift@doh.state.fl.us<BR><B>Sent:</B> Thursday, September 06, 2012
1:01 PM<BR><B>To:</B> postgis-users@postgis.refractions.net<BR><B>Subject:</B>
[postgis-users] Geocode function fails when
callingnormalize_address<BR></FONT><BR></DIV>
<DIV></DIV><!-- Converted from text/rtf format -->
<P><FONT size=2 face=Arial>Hi All:</FONT> </P>
<P><FONT size=2 face=Arial>My batch geocode of 100K records is failing with
error messages that refer to an operand in the normalize_address function. I
based my single-column UPDATE on the multi-column example in "PostGIS In Action"
and have tweaked the syntax only to achieve slight variation in the error
message.</FONT></P>
<P><FONT size=2 face=Arial>The query is something like:</FONT> </P><BR>
<P><FONT size=2 face=Arial>UPDATE respondents_addr SET the_geom =</FONT>
<BR><FONT size=2 face=Arial> g.geomout</FONT> <BR><FONT size=2
face=Arial> FROM (SELECT DISTINCT ON (respondent, s_year)
respondent, s_year, (geocode(address)).*</FONT> <BR><FONT size=2
face=Arial> FROM respondents_addr As ra</FONT>
<BR><FONT size=2 face=Arial>
WHERE ra.address IS NOT NULL</FONT> <BR><FONT size=2
face=Arial> ORDER BY respondent, s_year, rating)
As g</FONT> <BR><FONT size=2 face=Arial> WHERE
g.respondent = respondents_addr.respondent</FONT> <BR><FONT size=2
face=Arial> AND g.s_year =
respondents_addr.s_year;</FONT> </P><BR>
<P><FONT size=2 face=Arial>The most frequent error text is:</FONT> </P><BR>
<P><FONT size=2 face=Arial>ERROR: invalid regular expression: quantifier
operand invalid</FONT> <BR><FONT size=2 face=Arial>CONTEXT: PL/pgSQL
function "normalize_address" line 386 at assignment</FONT> <BR><FONT size=2
face=Arial>PL/pgSQL function "geocode" line 10 at assignment</FONT> <BR><FONT
size=2 face=Arial>********** Error **********</FONT> <BR><FONT size=2
face=Arial>ERROR: invalid regular expression: quantifier operand invalid</FONT>
<BR><FONT size=2 face=Arial>SQL state: 2201B</FONT> <BR><FONT size=2
face=Arial>Context: PL/pgSQL function "normalize_address" line 386 at
assignment</FONT> <BR><FONT size=2 face=Arial>PL/pgSQL function "geocode" line
10 at assignment</FONT> </P><BR>
<P><FONT size=2 face=Arial>Also the query runs for more than an hour, (xp box
with 3.5 GB ram, PosgreSQL 9.0.6, PostGIS 2.0.0 (might be the problem?)) and no
matter how the error message changes it always references the normalization
function, so I tried:</FONT></P><BR>
<P><FONT size=2 face=Arial>SELECT (normalize_address(address)).* FROM
respondents_addr WHERE address IS NOT NULL LIMIT 1000;</FONT> </P><BR>
<P><FONT size=2 face=Arial>With the LIMIT it ran quickly and returned the
expected result, so I tried:</FONT> </P><BR>
<P><FONT size=2 face=Arial>SELECT (normalize_address(address)).* FROM
respondents_addr WHERE address IS NOT NULL LIMIT 5000;</FONT> </P><BR>
<P><FONT size=2 face=Arial>Bumping the LIMIT up reproduced the error from the
geocode attempts. I'd say "aha" if only I knew why.</FONT> <BR><FONT size=2
face=Arial>I checked the list archives and googled several combinations of words
from the error but alas, to no avail. I suspect that I've either repeated a
simple syntax error or that my setup is somehow off -- any help will be
appreciatedand I might as well close with results of
PostGIS_full_version():</FONT></P><BR>
<P><FONT size=2 face=Arial>POSTGIS="2.0.0 r9605" GEOS="3.3.3-CAPI-1.7.4"
PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.0, released 2011/12/29"
LIBXML="2.7.8" LIBJSON="UNKNOWN" TOPOLOGY RASTER</FONT></P><BR>
<P><FONT size=2 face=Arial>Thank you,</FONT> <BR><FONT size=2
face=Arial>Rob</FONT> </P></BODY></HTML>
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic