'Re: [SQL] CASE returning multiple values (was SQL Help)'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgresql-sql
Subject:    Re: [SQL] CASE returning multiple values (was SQL Help)
From:       C F <tacnaboyz () yahoo ! com>
Date:       2003-05-30 18:01:13
[Download RAW message or body]

I was afraid someone was going to ask that :)
Okay, I'll do my best at explaining where I'm coming from....
I'm working on a mapping application it is user-configurable.  What this means (as it \
pertains to this disucssion) is that the through a configuration file, the user is \
able to define the rules that the application will use to determine which geometries \
*and* attributes to pull from PG at various scales.  These 'rules' that the user \
defines also contain other specifics such as how to symbolize the geometry, how to \
label the geometry etc.  All of these parameters can either be hard coded by the user \
into the configuration file, or they can define an expression that will be used to \
dynamically pull it from the database.  On top of all of this, we have two more \
levels of queries. So... at the very top level, the user can define an expression \
that will determine that everything queried from this table will match these \
criteria... this is my WHERE clause. Then below this level, various rules can be \
defined... each rule can have another definition that evaluates into a SQL \
expression.  Now, I could make each rule an entirely separate query, but for one, \
they all share the exact same top level WHERE clause, and two, there could \
potentially be many many rules which I would think would cause severe performance \
issues.  Let me give you an example... Let's say we're mapping cities of the United \
States based on population...  In other words, I want to symbolize the cities on the \
map based on population (larger symbol for larger populations, smaller symbol for \
smaller populations, etc).  I also want to show the city names of the larger cities \
*only*.  So, what the client application (client to PostgreSQL) needs is; the city \
location, which rules evaluate to true, and the city names of those larger cities \
(defined by a rule).  
We have a table of cities of the world.  So the top level filter (that all rules will \
share) is, "COUNTRY = 'USA'".  
Rule 1 says that cities with a population over 1,000,000 will have a large symbol and \
be labeled with the city name.  So the sql could look like this... select longitude, \
latitude, city_name from city where country = 'USA' and population > 1000000;  
... seems easy enough, but remember we can have an infinite number of rules (not \
really inifinite, but you get the point). So....  
Rule2 says that cities with a population under 1,000,000 will have a small symbol \
(note, we do not care about the city name here).  So, by itself, the SQL could look \
like this... select longitude, latitude from city where country = 'USA' and \
population < 1000000;  
Okay, for this simple example, I would have no problem doing two different queries \
(this example is extremely simplified compared to what is possible/likely).  But what \
if the user wanted to give a different symbol for every population in 100,000 \
increments?  If our range of populations was 100,000 to 5,000,000 that would be 50 \
queries!  Not only would it be 50 queries, but it would be 50 queries using a nearly \
identical WHERE clause.  So I thought it would be more efficient to combine the \
queries into something like the following...  
select
  longitude,
  latitutde,
  (case when population > 1000000 then true else false end) as rule1,
  (case when population > 1000000 then city_name end) as label1,
  (case when population < 1000000 then true else false end) as rule2
where
  country = 'USA' 
;
 
I could just only concern the SQL with the boolean values for the rules, and return \
all city names, and let the application simply discard them, but that seems like not \
a good thing to do for very large resultsets (and again, this is overly simplified, \
we could have many such columns full of uncessary data being returned).  And by the \
way, that query cannot be written as something like...  (case when population > \
                1000000 then 'rule1' when population < 1000000 then 'rule2' end) as \
                rules
... because the rules are NOT mutually exclusive, there can many positives.
 

Anyway, hopefully I didn't leave anything important out.  It sounds like there's no \
obvious solution to avoiding multiple evaluations of the test expressions.  The rules \
are relatively static once the config files are read in, so I could conceivably \
create stored procedures with a bunch of IF statements at that time.  However, I'm \
not sure if in PG there is a way to dynamically populate the resulting recordset on \
the fly.  I can think of 10 different ways accomplish what I'm trying to do, but \
hopefully someone has some ideas on what would be the best performing. Sorry if it's \
information overload, but you tried to answer my questions, so I thought I should at \
least try to answer yours :)

Any thoughts much appreciated.





You could write a set returning function, but you'd just end up doing the same 
thing. Can you explain what it is you're trying to acheive - real 
fields/schemas etc?

-- 
Richard Huxton


---------------------------------
Do you Yahoo!?
Free online calendar with sync to Outlook(TM).


[Attachment #3 (text/html)]

<DIV>
<DIV>I was afraid someone was going to ask that :)</DIV>
<DIV>Okay, I'll do my best at explaining where I'm coming from....</DIV>
<DIV>I'm working on a mapping application it is user-configurable.&nbsp; What this \
means (as it pertains to this disucssion) is that the through a configuration file, \
the user is able to define the rules that the application will use to determine which \
geometries *and* attributes&nbsp;to pull from PG at various scales.&nbsp; These \
'rules' that the user defines also contain other specifics such as how to symbolize \
the geometry, how to label the geometry etc.&nbsp; All of these parameters can either \
be hard coded by the user into the configuration file, or they can define an \
expression that will be used to dynamically pull it from the database.&nbsp; On top \
of all of this, we have two more levels of queries.</DIV> <DIV>So... at the very top \
level, the user can define an expression that will determine that everything queried \
from this table will match these criteria... this is my WHERE clause.</DIV> <DIV>Then \
below this level, various rules can be defined... each rule can have another \
definition that evaluates into a SQL expression.&nbsp; Now, I could make each rule an \
entirely separate query, but for one, they all share the exact same top level WHERE \
clause, and two, there could potentially be many many rules which I would think would \
cause severe performance issues.&nbsp; Let me give you an example...</DIV> <DIV>Let's \
say we're mapping cities of the United States based on population...&nbsp; In other \
words, I want to symbolize the cities on the map based on population (larger symbol \
for larger populations, smaller symbol for smaller populations, etc).&nbsp; I also \
want to show the city names of the larger cities *only*.&nbsp; So, what the client \
application (client to PostgreSQL) needs is; the city location, which rules evaluate \
to true, and the city names of those larger cities (defined by a rule).</DIV> \
<DIV>&nbsp;</DIV> <DIV>We have a table of cities of the world.&nbsp; So the top level \
filter (that all rules will share) is, "COUNTRY = 'USA'".</DIV> <DIV>&nbsp;</DIV>
<DIV>Rule 1 says that cities with a population over 1,000,000 will have a large \
symbol and be labeled with the city name.&nbsp; So the sql could look&nbsp;like \
this...</DIV> <DIV>select longitude, latitude, city_name from city where country = \
'USA' and population &gt; 1000000;</DIV> <DIV>&nbsp;</DIV>
<DIV>... seems easy enough, but remember we can have an infinite number of rules (not \
really inifinite, but you get the point). So....</DIV> <DIV>&nbsp;</DIV>
<DIV>Rule2 says that cities with a population under 1,000,000 will have a small \
symbol (note, we do not care about the city name here).&nbsp; So, by itself, the SQL \
could look like this...</DIV> <DIV>select longitude, latitude from city where country \
= 'USA' and population&nbsp;&lt; 1000000;</DIV> <DIV>&nbsp;</DIV>
<DIV>Okay, for this simple example, I would have no problem doing two different \
queries (this example is extremely simplified compared to what is \
possible/likely).&nbsp; But what if the user wanted to give a different symbol for \
every population in 100,000 increments?&nbsp; If our range of populations was 100,000 \
to 5,000,000 that would be 50 queries!&nbsp; Not only would it be 50 queries, but it \
would be 50 queries using a nearly identical WHERE clause.&nbsp; So I thought it \
would be more efficient to combine the queries into something like the \
following...</DIV> <DIV>&nbsp;</DIV>
<DIV>select</DIV>
<DIV>&nbsp; longitude,</DIV>
<DIV>&nbsp; latitutde,</DIV>
<DIV>&nbsp; (case when population &gt; 1000000 then true else false end) as \
rule1,</DIV> <DIV>&nbsp; (case when population&nbsp;&gt; 1000000 then city_name end) \
as label1,</DIV> <DIV>&nbsp; (case when population&nbsp;&lt; 1000000 then true else \
false end) as rule2</DIV> <DIV>where</DIV>
<DIV>&nbsp; country = 'USA' </DIV>
<DIV>;</DIV>
<DIV>&nbsp;</DIV>
<DIV>I could just only concern the SQL with the boolean values for the rules, and \
return all city names, and let the application simply discard them, but that seems \
like not a good thing to do for very large resultsets (and again, this is overly \
simplified, we could have many such columns full of uncessary data being \
returned).&nbsp; And by the way, that query cannot be written as something \
like...</DIV> <DIV>
<DIV>&nbsp; (case when population &gt; 1000000 then 'rule1' when population&nbsp;&lt; \
1000000&nbsp;then&nbsp;'rule2' end) as rules</DIV> <DIV>... because the rules are NOT \
mutually exclusive, there can many positives.</DIV> <DIV>&nbsp;</DIV></DIV>
<DIV>Anyway,&nbsp;hopefully I didn't leave anything important out.&nbsp; It sounds \
like there's no obvious solution to avoiding multiple evaluations of the test \
expressions.&nbsp; The rules are relatively static once the config files are read in, \
so I could conceivably create stored procedures with a bunch of IF statements at that \
time.&nbsp; However, I'm not sure if in PG there is a way to dynamically populate the \
resulting&nbsp;recordset on the fly.&nbsp; I can think of 10 different ways \
accomplish what I'm trying to do, but hopefully someone has some ideas on what would \
be the best performing.</DIV> <DIV>Sorry if it's information overload, but you tried \
to answer my questions, so I thought I should at least&nbsp;try to answer yours \
:)</DIV> <DIV><BR>Any thoughts much appreciated.<BR><BR><BR></DIV>
<BLOCKQUOTE style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px \
solid"><BR><BR>You could write a set returning function, but you'd just end up doing \
the same <BR>thing. Can you explain what it is you're trying to acheive - real \
<BR>fields/schemas etc?<BR><BR>-- <BR>Richard Huxton</BLOCKQUOTE></DIV><p><hr SIZE=1> \
Do you Yahoo!?<br> Free <a \
href="http://us.rd.yahoo.com/mail_us/tag/*http://calendar.yahoo.com">online \
calendar</a> with sync to Outlook(TM).



[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic