[prev in list] [next in list] [prev in thread] [next in thread]
List: postgresql-sql
Subject: Re: [SQL] must appear in GROUP by clause issue
From: George Woodring <george.woodring () iglass ! net>
Date: 2012-07-02 13:01:12
Message-ID: CACi+J=Rm8q+JK-9X_MPZje4_tRx_6CqGfCyBZ2OeFyeEGNYntQ () mail ! gmail ! com
[Download RAW message or body]
This was helpful, thank you. The issue was that one of the tables had a
"name" column added. Changing the alias name in the query so that it was
unique was the solution.
Thanks for the help,
George
On Sat, Jun 30, 2012 at 12:34 AM, Lee Hachadoorian <
Lee.Hachadoorian+L@gmail.com> wrote:
> On Fri, Jun 29, 2012 at 4:02 PM, George Woodring
> <george.woodring@iglass.net> wrote:
> > I have 2 (8.4.11) servers that I am testing the following query:
> >
> > SELECT count(*),
> > maptrunc(cpeloc.lat, 4.5)::text || maptrunc(cpeloc.long,
> > 4.5)::text AS name,
> > AVG(cpeloc.lat) AS lt,
> > AVG(cpeloc.long) AS lng
> > FROM cable_billing JOIN cpeloc USING(mac) LEFT JOIN davic USING(mac)
> > WHERE (status = 0 OR status is null) AND (cpeloc.lat is not null AND
> > cpeloc.lat !=0 AND cpeloc.long is not null AND cpeloc.long != 0)
> > GROUP BY name ORDER BY name;
> >
> > On the first server the query works,
> >
> > count | name | lt | lng
> > -------+-----------------+---------------------+----------------------
> > 1 | 43.0425-94.2295 | 43.0429410000000000 | -94.2299740000000000
> > 1 | 43.0525-94.260 | 43.0526200000000000 | -94.2603800000000000
> > 1 | 43.054-94.224 | 43.0543150000000000 | -94.2244750000000000
> > (51 rows)
> >
> > On the second server I get an error.
> >
> > ERROR: column "cpeloc.lat" must appear in the GROUP BY clause or be
> used in
> > an aggregate function
> > LINE 1: select count(*), maptrunc(cpeloc.lat, 4.5)::text || maptrunc...
> >
> > I was thinking the one server that works the maptrunc function was
> thought
> > of as an agg function, but they are both defined the same
> >
> > \df maptrunc
> > List of functions
> > Schema | Name | Result data type | Argument data types | Type
> > --------+----------+------------------+---------------------+--------
> > public | maptrunc | numeric | numeric, numeric | normal
> > (1 row)
> >
> > The only difference between the 2 servers are that the one that works
> > returns 51 rows and the one that does not I would expect to return 12000
> > rows.
> >
> > I can make the one that does not work functional by changing
> >
> > maptrunc(cpeloc.lat)::text --> max(maptrunc(cpeloc.lat))::text
> > maptrunc(cpeloc.long)::text --> max(maptrunc(cpeloc.long))::text
> >
> > however, that one breaks on the first server with the error
> >
> > ERROR: aggregates not allowed in GROUP BY clause
> > LINE 1: select count(*), max(maptrunc(cpeloc.lat, 4.5))::text || max...
> >
> > Any suggestions would be appreciated.
> >
> > George Woodring
> > --
> > iGLASS Networks
> > www.iglass.net
>
> Are you sure the queries are the same on both servers? I'm not sure how
>
> SELECT max(…) || max (…) AS name
> …
> GROUP BY name
>
> could ever work without leading to the "aggregates not allowed in
> GROUP BY clause".
>
> I *think* this could also happen if for some reason a "name" column
> was added to one of the tables on Server2 but not on Server1. Server1
> groups by name as an output column. On Server2, name is interpreted as
> an input column. Once you add max(…), cpeloc is now part of an
> aggregate expression, but the GROUP BY actually operates over the
> input column. Switch back to Server1, if name is not in the table
> definition it is interpreted as an output column and GROUP BY max(…)
> fails.
>
> All of this is kind of spitting in the dark without table defs and
> maybe the definition of maptrunc(), although at a guess I would say
> that it looks like you are doing is you are trying to join locations
> within a tolerance. maptrunc() truncates your latititude and longitude
> measurement to within some acceptable error, and you are averaging all
> candidate lat/longs to come up with "the" location.
>
> Assuming it's not just differences in table defs between the servers,
> I would suggest simplifying the query to debug it. Since lat and long
> are both in cpelong, and the only other column mentioned is status, at
> least one of the other tables (cable_billing and davic) is
> unnecessary. Replace GROUP BY name ORDER BY name with GROUP BY 2 ORDER
> BY 2. Consider generating name within a subquery before aggregating,
> i.e.:
>
> SELECT count(*),
> name,
> AVG(cpeloc.lat) AS lt,
> AVG(cpeloc.long) AS lng
> FROM (
> SELECT
> maptrunc(cpeloc.lat, 4.5)::text || maptrunc(cpeloc.long,
> 4.5)::text AS name,
> cpeloc.lat,
> cpeloc.long
> FROM
> cable_billing JOIN cpeloc USING(mac) LEFT JOIN davic
> USING(mac)
> WHERE COALESCE(status, 0) = 0
> AND COALESCE(cpeloc.lat, 0) !=0
> AND COALESCE(cpeloc.long, 0) != 0
> )
> GROUP BY name ORDER BY name;
>
> I would also inspect the subquery, ordered by name, to see if anything
> weird leaps out at you in terms of the adjacent rows that *should* be
> grouped. As an aside, I replaced the (x = 0 OR x IS NULL) and converse
> constructions with COALESCE() functions, which IMO are a bit easier to
> read.
>
> Hope this is at all helpful.
>
> --Lee
>
> --
> Lee Hachadoorian
> PhD, Earth & Environmental Sciences (Geography)
> Research Associate, CUNY Center for Urban Research
> http://freecity.commons.gc.cuny.edu/
>
--
iGLASS Networks
www.iglass.net
[Attachment #3 (text/html)]
This was helpful, thank you. The issue was that one of the tables had a \
"name" column added. Changing the alias name in the query so that it was \
unique was the solution.<div><br></div><div>Thanks for the help,</div> \
<div>George<br><br><div class="gmail_quote">On Sat, Jun 30, 2012 at 12:34 AM, Lee \
Hachadoorian <span dir="ltr"><<a href="mailto:Lee.Hachadoorian+L@gmail.com" \
target="_blank">Lee.Hachadoorian+L@gmail.com</a>></span> wrote:<br> <blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Fri, Jun 29, 2012 at \
4:02 PM, George Woodring<br> <<a \
href="mailto:george.woodring@iglass.net">george.woodring@iglass.net</a>> \
wrote:<br> > I have 2 (8.4.11) servers that I am testing the following query:<br>
><br>
> SELECT count(*),<br>
> maptrunc(cpeloc.lat, 4.5)::text || maptrunc(cpeloc.long,<br>
> 4.5)::text AS name,<br>
> AVG(cpeloc.lat) AS lt,<br>
> AVG(cpeloc.long) AS lng<br>
> FROM cable_billing JOIN cpeloc USING(mac) LEFT JOIN davic USING(mac)<br>
> WHERE (status = 0 OR status is null) AND (cpeloc.lat is not null AND<br>
> cpeloc.lat !=0 AND cpeloc.long is not null AND cpeloc.long != 0)<br>
> GROUP BY name ORDER BY name;<br>
><br>
> On the first server the query works,<br>
><br>
> count | name | lt | lng<br>
> -------+-----------------+---------------------+----------------------<br>
> 1 | 43.0425-94.2295 | 43.0429410000000000 | -94.2299740000000000<br>
> 1 | 43.0525-94.260 | 43.0526200000000000 | -94.2603800000000000<br>
> 1 | <a href="tel:43.054-94.224" value="+14305494224">43.054-94.224</a> | \
43.0543150000000000 | -94.2244750000000000<br> > (51 rows)<br>
><br>
> On the second server I get an error.<br>
><br>
> ERROR: column "cpeloc.lat" must appear in the GROUP BY clause or be \
used in<br> > an aggregate function<br>
> LINE 1: select count(*), maptrunc(cpeloc.lat, 4.5)::text || maptrunc...<br>
><br>
> I was thinking the one server that works the maptrunc function was thought<br>
> of as an agg function, but they are both defined the same<br>
><br>
> \df maptrunc<br>
> List of functions<br>
> Schema | Name | Result data type | Argument data types | Type<br>
> --------+----------+------------------+---------------------+--------<br>
> public | maptrunc | numeric | numeric, numeric | normal<br>
> (1 row)<br>
><br>
> The only difference between the 2 servers are that the one that works<br>
> returns 51 rows and the one that does not I would expect to return 12000<br>
> rows.<br>
><br>
> I can make the one that does not work functional by changing<br>
><br>
> maptrunc(cpeloc.lat)::text --> max(maptrunc(cpeloc.lat))::text<br>
> maptrunc(cpeloc.long)::text --> max(maptrunc(cpeloc.long))::text<br>
><br>
> however, that one breaks on the first server with the error<br>
><br>
> ERROR: aggregates not allowed in GROUP BY clause<br>
> LINE 1: select count(*), max(maptrunc(cpeloc.lat, 4.5))::text || max...<br>
><br>
> Any suggestions would be appreciated.<br>
><br>
> George Woodring<br>
> --<br>
> iGLASS Networks<br>
> <a href="http://www.iglass.net" target="_blank">www.iglass.net</a><br>
<br>
</div></div>Are you sure the queries are the same on both servers? I'm not sure \
how<br> <br>
SELECT max(…) || max (…) AS name<br>
…<br>
GROUP BY name<br>
<br>
could ever work without leading to the "aggregates not allowed in<br>
GROUP BY clause".<br>
<br>
I *think* this could also happen if for some reason a "name" column<br>
was added to one of the tables on Server2 but not on Server1. Server1<br>
groups by name as an output column. On Server2, name is interpreted as<br>
an input column. Once you add max(…), cpeloc is now part of an<br>
aggregate expression, but the GROUP BY actually operates over the<br>
input column. Switch back to Server1, if name is not in the table<br>
definition it is interpreted as an output column and GROUP BY max(…)<br>
fails.<br>
<br>
All of this is kind of spitting in the dark without table defs and<br>
maybe the definition of maptrunc(), although at a guess I would say<br>
that it looks like you are doing is you are trying to join locations<br>
within a tolerance. maptrunc() truncates your latititude and longitude<br>
measurement to within some acceptable error, and you are averaging all<br>
candidate lat/longs to come up with "the" location.<br>
<br>
Assuming it's not just differences in table defs between the servers,<br>
I would suggest simplifying the query to debug it. Since lat and long<br>
are both in cpelong, and the only other column mentioned is status, at<br>
least one of the other tables (cable_billing and davic) is<br>
unnecessary. Replace GROUP BY name ORDER BY name with GROUP BY 2 ORDER<br>
BY 2. Consider generating name within a subquery before aggregating,<br>
i.e.:<br>
<br>
SELECT count(*),<br>
<div class="im"> name,<br>
AVG(cpeloc.lat) AS lt,<br>
AVG(cpeloc.long) AS lng<br>
</div>FROM (<br>
SELECT<br>
<div class="im"> maptrunc(cpeloc.lat, 4.5)::text || \
maptrunc(cpeloc.long, 4.5)::text AS name,<br> </div> cpeloc.lat,<br>
cpeloc.long<br>
<div class="im"> FROM<br>
cable_billing JOIN cpeloc USING(mac) LEFT JOIN davic USING(mac)<br>
</div> WHERE COALESCE(status, 0) = 0<br>
AND COALESCE(cpeloc.lat, 0) !=0<br>
AND COALESCE(cpeloc.long, 0) != 0<br>
)<br>
<div class="im">GROUP BY name ORDER BY name;<br>
<br>
</div>I would also inspect the subquery, ordered by name, to see if anything<br>
weird leaps out at you in terms of the adjacent rows that *should* be<br>
grouped. As an aside, I replaced the (x = 0 OR x IS NULL) and converse<br>
constructions with COALESCE() functions, which IMO are a bit easier to<br>
read.<br>
<br>
Hope this is at all helpful.<br>
<span class="HOEnZb"><font color="#888888"><br>
--Lee<br>
<br>
--<br>
Lee Hachadoorian<br>
PhD, Earth & Environmental Sciences (Geography)<br>
Research Associate, CUNY Center for Urban Research<br>
<a href="http://freecity.commons.gc.cuny.edu/" \
target="_blank">http://freecity.commons.gc.cuny.edu/</a><br> \
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br>iGLASS \
Networks<br><a href="http://www.iglass.net">www.iglass.net</a><br> </div>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic