[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgis-users
Subject:    Re: [postgis-users] Geocode function fails whencallingnormalize_address
From:       <Robert_Clift () doh ! state ! fl ! us>
Date:       2012-09-28 20:04:11
Message-ID: 8F8517020108854FBA3C766A8711815E053DB7E5 () dit00smapo01 ! doh ! ad ! state ! fl ! us
[Download RAW message or body]

--===============1951747532909521849==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01CD9DB4.6D4F1088"

This is a multi-part message in MIME format.


Dear Leo and Regina (and everyone else too):
 
Thanks very much for your detailed reply. I'm just now to the point
where I've implemented your recommendations, but no positive outcome
yet. I simplified the addresses to address the regex expression variable
error, performed a soft upgrade to 2.0.1, and cleaned up my query (both
with and without the use of pgScript).
 
New PostGIS version:
"POSTGIS="2.0.1 r9979" GEOS="3.3.5-CAPI-1.7.5" PROJ="Rel. 4.8.0, 6 March
2012" GDAL="GDAL 1.9.1, released 2012/05/15" LIBXML="2.7.8" ...etc.
 
New Query:
SELECT a.rid, a.respondent, a.s_year, Coalesce((g.geo).rating,-1)
"rating", (g.geo).geomout "the_geom"
FROM respondents_addr a
LEFT JOIN
  (
    SELECT respondent, s_year, (geocode(address, 1)) "geo"
    FROM respondents_addr
    ORDER BY rid LIMIT 100
  ) g
ON a.respondent = g.respondent
AND a.s_year = g.s_year
WHERE a.address IS NOT NULL
ORDER BY rid LIMIT 100;
 
Same Error Message:
ERROR:  invalid regular expression: quantifier operand invalid
CONTEXT:  PL/pgSQL function "normalize_address" line 386 at assignment
PL/pgSQL function "geocode" line 10 at assignment
 
And yet the following query still returns good results in 43320 ms:
SELECT (normalize_address(address)).* FROM respondents_addr WHERE
address IS NOT NULL LIMIT 1000;
 
Wait. As I typed the last bit I realized the normalization test query is
not ordered the same way as the geocode query.
This fails with the exact same error:
SELECT (normalize_address(address)).* FROM respondents_addr WHERE
address IS NOT NULL ORDER BY rid LIMIT 100;
 
I'm not very familiar with regular expressions. Is anyone willing to
look at these first 100 addresses, sent off-list, and tell me which are
tripping me up and why?
 
Thanks,
Rob

________________________________

From: postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] On Behalf Of
Paragon Corporation
Sent: Thursday, September 06, 2012 8:14 PM
To: 'PostGIS Users Discussion'
Subject: Re: [postgis-users] Geocode function fails
whencallingnormalize_address


Rob,
 
Regarding your normalize issue, it's probably one record causing it.  I
thought we fixed some of these issues in 2.0.1 and 2.1.0SVN so you might
want to upgrade your script to the 2.0.1 -- there is an upgrade script
you can use packaged in the tar ball.
If that still doesn't fix your issue, look for variables often found in
regex expressions in your address (things like (, )  and . for example
might be throwing it off and we might not be escaping right.  Hard to
tell which one until your narrow down to the one bad record.
You might have to run each update as a single to see which one fails
 
 
Couple of other tips
1) you don't want to geocode 100K records all at once since it has to
complete as a single transaction.  You'll need to do it in batches.  We
use pgScript for that batch processing so we can just run in a pgAdmin
window.  though any scripting tool of your choice would do.
http://www.postgresonline.com/journal/archives/181-pgAdmin-pgScript.html
 
2) Don't do (geocode(address)).*   That often calls geocode for each
field in output so would slow down your processing n fold (e.g. you'd
have like 5 calls instead of 1)
 
You really want to do geocode(address) As geo
 
(geo).geomout etc. 
 
See example in docs --
http://www.postgis.org/documentation/manual-svn/Geocode.html  -- we do a
left join to guarantee that even if no records returned for an address,
we can stamp the address as attempted with a -1 rating
I think we thought about this after we wrote the book :(
 
-- SHAMELESS PLUG STARTS HERE ---
On bright side, we just signed our contract for Second Edition of
PostGIS in Action and have started writing it :)  more on that later.
that will cover newer enhancements in geocoder, raster, topology,
PostGIS 2.0-2.1 (basically at least 2.0 and 2.1 changes)  and PostgreSQL
9.1-9.3 
 
We'll be posting the new Table Of Contents soon here, and let people
know when they can start purchasing
http://www.postgis.us
 
Those who buy the new book, is our understanding, will automatically get
electronic copies of the first edition, so even if you buy early, you'll
get our drafts as we write them and also have the older book to work
with.
 
-- END SHAMELESS PLUG
 
 
3) The book was written before we introduced the limit, so again look
at:
 
http://www.postgis.org/documentation/manual-svn/Geocode.html  (feature
is available in 2.0 as well)
 
Use: geocode(address,1)
 
for faster performance and then you don't even need the DISTINCT ON
since you'll get back at most one answer and the best one.
 
Hope that helps,
Leo and Regina
http://www.postgis.us
 
 
 
 
 

________________________________

From: postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] On Behalf Of
Robert_Clift@doh.state.fl.us
Sent: Thursday, September 06, 2012 1:01 PM
To: postgis-users@postgis.refractions.net
Subject: [postgis-users] Geocode function fails when
callingnormalize_address



Hi All: 

My batch geocode of 100K records is failing with error messages that
refer to an operand in the normalize_address function. I based my
single-column UPDATE on the multi-column example in "PostGIS In Action"
and have tweaked the syntax only to achieve slight variation in the
error message.

The query is something like: 


UPDATE respondents_addr SET the_geom = 
   g.geomout 
   FROM (SELECT DISTINCT ON (respondent, s_year) respondent, s_year,
(geocode(address)).* 
      FROM respondents_addr As ra 
         WHERE ra.address IS NOT NULL 
      ORDER BY respondent, s_year, rating) As g 
      WHERE g.respondent = respondents_addr.respondent 
      AND g.s_year = respondents_addr.s_year; 


The most frequent error text is: 


ERROR:  invalid regular expression: quantifier operand invalid 
CONTEXT:  PL/pgSQL function "normalize_address" line 386 at assignment 
PL/pgSQL function "geocode" line 10 at assignment 
********** Error ********** 
ERROR: invalid regular expression: quantifier operand invalid 
SQL state: 2201B 
Context: PL/pgSQL function "normalize_address" line 386 at assignment 
PL/pgSQL function "geocode" line 10 at assignment 


Also the query runs for more than an hour, (xp box with 3.5 GB ram,
PosgreSQL 9.0.6, PostGIS 2.0.0 (might be the problem?)) and no matter
how the error message changes it always references the normalization
function, so I tried:


SELECT (normalize_address(address)).* FROM respondents_addr WHERE
address IS NOT NULL LIMIT 1000; 


With the LIMIT it ran quickly and returned the expected result, so I
tried: 


SELECT (normalize_address(address)).* FROM respondents_addr WHERE
address IS NOT NULL LIMIT 5000; 


Bumping the LIMIT up reproduced the error from the geocode attempts. I'd
say "aha" if only I knew why. 
I checked the list archives and googled several combinations of words
from the error but alas, to no avail. I suspect that I've either
repeated a simple syntax error or that my setup is somehow off -- any
help will be appreciatedand I might as well close with results of
PostGIS_full_version():


POSTGIS="2.0.0 r9605" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.8.0, 6 March
2012" GDAL="GDAL 1.9.0, released 2011/12/29" LIBXML="2.7.8"
LIBJSON="UNKNOWN" TOPOLOGY RASTER


Thank you, 
Rob 


[Attachment #3 (text/html)]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Geocode function fails when calling normalize_address</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.21314" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>Dear Leo and Regina (and everyone else 
too):</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>Thanks very much for your detailed reply. I'm just now 
to the point where I've implemented your recommendations, but no positive 
outcome yet. I&nbsp;simplified the addresses to&nbsp;address the regex 
expression variable error, performed a soft upgrade to 2.0.1, and cleaned up my 
query (both with and without the use of pgScript).</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=463023319-28092012><FONT face=Arial 
size=2>New PostGIS version:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2>"POSTGIS="2.0.1 r9979" 
GEOS="3.3.5-CAPI-1.7.5" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.1, 
released 2012/05/15" LIBXML="2.7.8"&nbsp;<SPAN 
class=463023319-28092012>...etc.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>New Query:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN 
class=463023319-28092012>SELECT a.rid, a.respondent, a.s_year, 
Coalesce((g.geo).rating,-1) "rating", (g.geo).geomout "the_geom"<BR>FROM 
respondents_addr a<BR>LEFT JOIN<BR>&nbsp; (<BR>&nbsp;&nbsp;&nbsp; SELECT 
respondent, s_year, (geocode(address, 1)) "geo"<BR>&nbsp;&nbsp;&nbsp; FROM 
respondents_addr<BR>&nbsp;&nbsp;&nbsp; ORDER BY rid LIMIT 100<BR>&nbsp; ) 
g<BR>ON a.respondent = g.respondent<BR>AND a.s_year = g.s_year<BR>WHERE 
a.address IS NOT NULL<BR>ORDER BY rid LIMIT 100;</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>Same Error Message:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN 
class=463023319-28092012>ERROR:&nbsp; invalid regular expression: quantifier 
operand invalid<BR>CONTEXT:&nbsp; PL/pgSQL function "normalize_address" line 386 
at assignment<BR>PL/pgSQL function "geocode" line 10 at 
assignment</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>And yet the following query still returns good results 
in 43320 ms:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN 
class=463023319-28092012>SELECT (normalize_address(address)).* FROM 
respondents_addr WHERE address IS NOT NULL LIMIT 1000;</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>Wait. As I typed the last bit I realized the 
normalization test query is not ordered the same way as the geocode 
query.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>This fails with the exact same 
error:</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN class=463023319-28092012>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN 
class=463023319-28092012>SELECT (normalize_address(address)).* FROM 
respondents_addr WHERE address IS NOT NULL ORDER BY rid LIMIT 
100;</SPAN></FONT></DIV></SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>I'm not very familiar with regular expressions. Is 
anyone willing to look at these first 100 addresses, sent off-list, and tell me 
which are tripping me up and why?</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012></SPAN></FONT>&nbsp;</DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>Thanks,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial size=2><SPAN 
class=463023319-28092012>Rob</SPAN></FONT><BR></DIV>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> 
postgis-users-bounces@postgis.refractions.net 
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of 
</B>Paragon Corporation<BR><B>Sent:</B> Thursday, September 06, 2012 8:14 
PM<BR><B>To:</B> 'PostGIS Users Discussion'<BR><B>Subject:</B> Re: 
[postgis-users] Geocode function fails 
whencallingnormalize_address<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>Rob,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>Regarding your normalize issue, it's probably one record 
causing it.&nbsp; I thought we fixed some of these issues in 2.0.1 and 2.1.0SVN 
so you might want to upgrade your script to the 2.0.1 -- there is an upgrade 
script you can use packaged in the tar ball.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>If that still doesn't fix your issue, look for variables 
often found in regex expressions in your address (things like (, )&nbsp; and . 
for example might be throwing it off and we might not be escaping right.&nbsp; 
Hard to tell which one until your narrow down to the one bad 
record.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>You might have to run each update as a single to see which 
one fails</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>Couple of other tips</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>1) you don't want to geocode 100K records all at once since 
it has to complete as a single transaction.&nbsp; You'll need to do it in 
batches.&nbsp;&nbsp;We use pgScript for that batch processing so we can just run 
in a pgAdmin window.&nbsp; though any scripting tool of your choice would 
do.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2><A 
href="http://www.postgresonline.com/journal/archives/181-pgAdmin-pgScript.html">http:/ \
/www.postgresonline.com/journal/archives/181-pgAdmin-pgScript.html</A></FONT></SPAN></DIV>
 <DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>2) Don't do <FONT color=#000000>(geocode(address)).*<FONT 
face="Times New Roman" size=3>&nbsp;&nbsp; That often calls geocode for each 
field in output so would slow down your processing n fold (e.g. you'd have like 
5 calls instead of 1)</FONT></FONT></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>You really want to do geocode(address) As geo</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>(geo).geomout etc. </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012>See example in docs -- <A 
href="http://www.postgis.org/documentation/manual-svn/Geocode.html">http://www.postgis.org/documentation/manual-svn/Geocode.html</A>&nbsp; \
                
-- we do a left join to guarantee that even if no records returned for an 
address, we can stamp the address as attempted with a -1 rating</SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012>I think we thought about 
this after we wrote the book :(</SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>-- SHAMELESS PLUG STARTS HERE ---</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>On bright side, we just signed our contract for Second Edition of PostGIS 
in Action and have started writing it :)&nbsp; more on that later.&nbsp; that 
will cover newer enhancements in geocoder, raster, topology, PostGIS 2.0-2.1 
(basically at least 2.0 and 2.1 changes) &nbsp;and PostgreSQL 9.1-9.3 
</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>We'll be posting the new Table Of Contents soon here, and let people know 
when they can start purchasing</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2><A 
href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>Those who buy the new book, is our understanding, will automatically get 
electronic copies of the first edition, so even if you buy early, you'll get our 
drafts as we write them and also have the older book to work 
with.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>-- END SHAMELESS PLUG</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2>3) The book was written before we introduced the limit, so 
again look at:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><A 
href="http://www.postgis.org/documentation/manual-svn/Geocode.html">http://www.postgis.org/documentation/manual-svn/Geocode.html</A><FONT \
 face=Arial size=2>&nbsp; (feature is available in 2.0 as 
well)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>Use: geocode(address,1)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>for faster performance and then you don't even need the DISTINCT ON since 
you'll get back at most one answer and the best one.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>Hope that helps,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2>Leo and Regina</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2><A 
href="http://www.postgis.us">http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>
<DIV dir=ltr align=left><SPAN class=618372819-06092012><FONT face=Arial 
color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> 
postgis-users-bounces@postgis.refractions.net 
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of 
</B>Robert_Clift@doh.state.fl.us<BR><B>Sent:</B> Thursday, September 06, 2012 
1:01 PM<BR><B>To:</B> postgis-users@postgis.refractions.net<BR><B>Subject:</B> 
[postgis-users] Geocode function fails when 
callingnormalize_address<BR></FONT><BR></DIV>
<DIV></DIV><!-- Converted from text/rtf format -->
<P><FONT face=Arial size=2>Hi All:</FONT> </P>
<P><FONT face=Arial size=2>My batch geocode of 100K records is failing with 
error messages that refer to an operand in the normalize_address function. I 
based my single-column UPDATE on the multi-column example in "PostGIS In Action" 
and have tweaked the syntax only to achieve slight variation in the error 
message.</FONT></P>
<P><FONT face=Arial size=2>The query is something like:</FONT> </P><BR>
<P><FONT face=Arial size=2>UPDATE respondents_addr SET the_geom =</FONT> 
<BR><FONT face=Arial size=2>&nbsp;&nbsp; g.geomout</FONT> <BR><FONT face=Arial 
size=2>&nbsp;&nbsp; FROM (SELECT DISTINCT ON (respondent, s_year) respondent, 
s_year, (geocode(address)).*</FONT> <BR><FONT face=Arial 
size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM respondents_addr As ra</FONT> 
<BR><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
WHERE ra.address IS NOT NULL</FONT> <BR><FONT face=Arial 
size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ORDER BY respondent, s_year, rating) As 
g</FONT> <BR><FONT face=Arial size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; WHERE 
g.respondent = respondents_addr.respondent</FONT> <BR><FONT face=Arial 
size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; AND g.s_year = 
respondents_addr.s_year;</FONT> </P><BR>
<P><FONT face=Arial size=2>The most frequent error text is:</FONT> </P><BR>
<P><FONT face=Arial size=2>ERROR:&nbsp; invalid regular expression: quantifier 
operand invalid</FONT> <BR><FONT face=Arial size=2>CONTEXT:&nbsp; PL/pgSQL 
function "normalize_address" line 386 at assignment</FONT> <BR><FONT face=Arial 
size=2>PL/pgSQL function "geocode" line 10 at assignment</FONT> <BR><FONT 
face=Arial size=2>********** Error **********</FONT> <BR><FONT face=Arial 
size=2>ERROR: invalid regular expression: quantifier operand invalid</FONT> 
<BR><FONT face=Arial size=2>SQL state: 2201B</FONT> <BR><FONT face=Arial 
size=2>Context: PL/pgSQL function "normalize_address" line 386 at 
assignment</FONT> <BR><FONT face=Arial size=2>PL/pgSQL function "geocode" line 
10 at assignment</FONT> </P><BR>
<P><FONT face=Arial size=2>Also the query runs for more than an hour, (xp box 
with 3.5 GB ram, PosgreSQL 9.0.6, PostGIS 2.0.0 (might be the problem?)) and no 
matter how the error message changes it always references the normalization 
function, so I tried:</FONT></P><BR>
<P><FONT face=Arial size=2>SELECT (normalize_address(address)).* FROM 
respondents_addr WHERE address IS NOT NULL LIMIT 1000;</FONT> </P><BR>
<P><FONT face=Arial size=2>With the LIMIT it ran quickly and returned the 
expected result, so I tried:</FONT> </P><BR>
<P><FONT face=Arial size=2>SELECT (normalize_address(address)).* FROM 
respondents_addr WHERE address IS NOT NULL LIMIT 5000;</FONT> </P><BR>
<P><FONT face=Arial size=2>Bumping the LIMIT up reproduced the error from the 
geocode attempts. I'd say "aha" if only I knew why.</FONT> <BR><FONT face=Arial 
size=2>I checked the list archives and googled several combinations of words 
from the error but alas, to no avail. I suspect that I've either repeated a 
simple syntax error or that my setup is somehow off -- any help will be 
appreciatedand I might as well close with results of 
PostGIS_full_version():</FONT></P><BR>
<P><FONT face=Arial size=2>POSTGIS="2.0.0 r9605" GEOS="3.3.3-CAPI-1.7.4" 
PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.0, released 2011/12/29" 
LIBXML="2.7.8" LIBJSON="UNKNOWN" TOPOLOGY RASTER</FONT></P><BR>
<P><FONT face=Arial size=2>Thank you,</FONT> <BR><FONT face=Arial 
size=2>Rob</FONT> </P></BODY></HTML>



_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

--===============1951747532909521849==--

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic