'Re: [postgis-users] Coverages and PostGIS wiki page'

List:       postgis-users
Subject:    Re: [postgis-users] Coverages and PostGIS wiki page
From:       Bryce L Nordgren <bnordgren () fs ! fed ! us>
Date:       2011-07-22 21:04:40
Message-ID: OF00E23ECB.4024490D-ON872578D5.006D319F-002578D5.0073C838 () fs ! fed ! us

I added geomval to the concept map on the wiki page. Also added a section 
on it.

postgis-users-bounces@postgis.refractions.net wrote on 07/22/2011 05:48:24 
PM:

> Some observations:
> [...]

I think you mistook my intent. I want to show people that if they intend 
to do the "obvious" thing, meaning to think of a table as a raster 
coverage the same way that a table has been taken to be a vector coverage 
for years, there are a few caveats. 

How you think of the table depends on what you're doing with the table. 

> This distinction disappear if you do something like this on a raster
> coverage to produce a very different kind of raster coverage being 
> something very new in the GIS world and having, really, no 
> distinction with a vector coverage: CREATE TABLE rastcov AS SELECT 
> ST_AsRaster((gv).geom, 1), (gv).val FROM (ST_DumpAsPolygons(rast) 
> gv) foo. (Note that ST_AsRaster() is still to be implemented). The 
> result is a raster coverage, composed of a series of rows acting 
> exactly like a vector coverage but in which the geometries are 
> stored as small (or bigger) rasters. You might prefer to work with 
> this kind of object if you really want your raster coverage to have 
> the exact same characteristics of a vector coverage (one geometric 
> area per row value).

So essentially you're proposing to create one raster for every distinct 
pixel value in the raster, each raster being a "mask" containing the 
locations of its one value? And you want to select the distinct pixel 
values first by creating polygons which outline the pixels for each value, 
then converting each individual geometry back into a mask containing only 
"nodata" or the value in question.

I think it might be faster to use ST_Reclass(), provided there were some 
"ST_DistinctValues(raster)" method to list the values up front.

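To make the "one mask per distinct value" idea concrete, here is a rough 
sketch in Python rather than SQL. The raster is modeled as nested lists, 
and NODATA, distinct_values() and mask_for() are names I made up for 
illustration; none of them are PostGIS functions.

```python
# Illustrative model only: a raster is a list of rows of pixel values.
# NODATA and every function name here are assumptions, not PostGIS API.
NODATA = None


def distinct_values(raster):
    """Return the sorted distinct pixel values, ignoring NODATA."""
    return sorted({v for row in raster for v in row if v is not NODATA})


def mask_for(raster, value):
    """Reclassify: keep `value` where it occurs, NODATA everywhere else."""
    return [[v if v == value else NODATA for v in row] for row in raster]


def masks_by_value(raster):
    """Build one mask raster per distinct value, with no polygon step."""
    return {value: mask_for(raster, value) for value in distinct_values(raster)}


raster = [
    [1, 1, 2],
    [1, 3, 2],
]
masks = masks_by_value(raster)
```

Each entry of masks has the same dimensions as the input and holds either 
NODATA or the one value in question.
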
Could I request a "ST_AsRaster(setof geomval) returns raster" which acts 
as the inverse of ST_DumpAsPolygons? So if you did 
ST_AsRaster(ST_DumpAsPolygons(rast)), you'd get your original raster back? 
That's more along the lines of what I'd expect. That way, you can take the 
raster apart into geometries, edit them, then get a new raster back. I was 
kind of assuming that was what ST_AsRaster did...
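
The round trip I have in mind can be sketched like this. It is Python, not 
SQL, and it cheats: each pixel is dumped as its own (cell, value) pair 
instead of being merged into polygons. dump_cells() and as_raster() are 
hypothetical names, not the real functions.

```python
# Illustrative model only; not the PostGIS implementation.
NODATA = None


def dump_cells(raster):
    """Stand-in for a dump: yield ((col, row), value) per non-NODATA pixel."""
    for r, row in enumerate(raster):
        for c, value in enumerate(row):
            if value is not NODATA:
                yield (c, r), value


def as_raster(cells, width, height):
    """Inverse of dump_cells: burn each pair back onto an empty grid."""
    raster = [[NODATA] * width for _ in range(height)]
    for (c, r), value in cells:
        raster[r][c] = value
    return raster


original = [[5, 5, NODATA], [7, NODATA, 7]]
cells = list(dump_cells(original))

# Taking the raster apart and rebuilding it gives the original back.
rebuilt = as_raster(cells, width=3, height=2)

# Edit step: change every 7 to 8 before rebuilding.
edited = as_raster([(cell, 8 if v == 7 else v) for cell, v in cells], 3, 2)
```

Note that in this sketch the width and height have to be passed in 
separately, because the dumped pairs do not carry the grid definition.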

> "The third observation is that a "raster coverage" does not need to 
> be backed by a table at all. A single raster item could provide all 
> of the data required." Totally agree, but storing a raster coverage 
> in PostgreSQL as a multi row table give some major advantages: 1) 
> You get fast gist indexing of the tiles for (almost) free, 2) You 
> can store coverage of (theoretically) 32 TB, 3) You don't have to 
> provide tiles for all the covered area, 4) tiles can overlap, 5) 
> Tiles can have different dimensions, 6) implementation was relatively 
> easy.

The point was that any of the "aggregation" functions (or mapalgebra) 
could produce something for display or analysis by returning an item with 
the "raster" data type. It need not be stored in a table before being 
displayed or used in the next processing step.

Note that if one is trying to consider a table to be a single raster 
broken into rows, #3, #4, and #5 are disadvantages. Extra care must be 
taken to constrain the flexibility those items provide. Calling attention 
to this fact is the point of the wiki page.

The reason is that the "raster coverage" itself has metadata which 
pertains to the coverage as a whole. If a "row raster" is to be considered 
to belong to a "raster coverage", it must conform to the "raster coverage 
metadata", including:

* location of a corner pixel
* pixel size in each dimension
* number of pixels in each dimension
* number of bands
* types of bands
* srid

These are recorded in "raster_columns", along with the column extent. Each 
row must conform to all of these criteria, and the extent of the "row" 
must be contained within the extent of the column. Failure to comply means 
the row rasters cannot be aggregated in a simple way to form a larger 
"raster coverage": they must be resampled onto a common grid before 
display or processing.
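
As a sketch of the kind of check this implies (Python, with a made-up Grid 
record; the field names are my assumptions and are not the raster_columns 
schema), a row raster fits the coverage without resampling only if srid, 
band types and pixel size match, its corner falls on the coverage grid, 
and it lies inside the coverage extent. I treat the coverage pixel counts 
as its extent here, which is one possible reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    """Hypothetical grid definition for a coverage or for one row raster."""
    origin_x: float    # x of the upper-left corner
    origin_y: float    # y of the upper-left corner
    pixel_w: float     # pixel size in x
    pixel_h: float     # pixel size in y (positive; rows run downward)
    width: int         # number of pixels in x
    height: int        # number of pixels in y
    band_types: tuple  # one pixel type per band
    srid: int


def _on_grid(offset, step, tol=1e-9):
    """True when offset is a whole number of steps."""
    n = offset / step
    return abs(n - round(n)) < tol


def conforms(tile, coverage):
    """Can `tile` be placed into `coverage` without resampling?"""
    if tile.srid != coverage.srid or tile.band_types != coverage.band_types:
        return False
    if (tile.pixel_w, tile.pixel_h) != (coverage.pixel_w, coverage.pixel_h):
        return False
    dx = tile.origin_x - coverage.origin_x
    dy = coverage.origin_y - tile.origin_y
    if not (_on_grid(dx, coverage.pixel_w) and _on_grid(dy, coverage.pixel_h)):
        return False
    # Corner is on the grid; now check the tile stays inside the extent.
    col, row = round(dx / coverage.pixel_w), round(dy / coverage.pixel_h)
    return (0 <= col and col + tile.width <= coverage.width
            and 0 <= row and row + tile.height <= coverage.height)


coverage = Grid(0.0, 100.0, 10.0, 10.0, 10, 10, ("8BUI",), 32612)
aligned = Grid(20.0, 80.0, 10.0, 10.0, 4, 4, ("8BUI",), 32612)
shifted = Grid(25.0, 80.0, 10.0, 10.0, 4, 4, ("8BUI",), 32612)
coarse = Grid(20.0, 80.0, 30.0, 30.0, 2, 2, ("8BUI",), 32612)
outside = Grid(80.0, 100.0, 10.0, 10.0, 4, 4, ("8BUI",), 32612)
```

Only "aligned" passes: "shifted" is half a pixel off the grid, "coarse" 
has a different pixel size, and "outside" runs past the coverage extent.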

Clearly, if you don't consider the rows to be part of a larger coverage, 
then this does not apply.

> 
> "Clearly, a table with a raster column could be used as an index 
> into many unrelated rasters. This is a legitimate use of the table 
> facility. However, it is an example of a table which could not be 
> used to provide information to a "raster coverage." I totally agree 
> with your conclusion. It is possible to use PostGIS raster this but 
> it is also very possible to use it in a clever way. 

Again, the point is that it's perfectly valid to use it this way. It's not 
wrong, and it's not un-clever. If you need an index, take advantage of the 
flexibility offered by #3, #4, and #5. The advantage of doing so is that 
you do not have to create a new table for each "type" of raster. For 
anything other than the most trivial of cases, that will become 
unmanageable.

In addition to the "index" case, it is perfectly acceptable to consider a 
table to be a "raster type definition". The table may then contain many 
raster coverage instances belonging to the same type. However, you still 
cannot aggregate rows belonging to different raster coverages into some 
"hybrid raster coverage". 

For example, define a table to contain "Landsat Band 2" images. Columns 
are "raster, path, row, date". (Path and row form a coarse global grid 
used by Landsat to describe the location of a scene. Two scenes with the 
same path and row cover the same area.) Insert one scene over a given 
path/row; divide it into 1000 rows just for kicks. 16 days later, you get 
another scene with the same path/row and load it into the database, again 
as 1000 rows. 

You must guard against combining rows from different scenes. All queries 
have to select for path, row, date. This is also legitimate if this is 
what you need to do. But it puts the onus on the user to be aware of the 
subtleties to ensure they achieve the expected results.
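
A small sketch of that guard, in Python with the table modeled as a list 
of tuples (the tile ids, the path/row numbers and both function names are 
invented for illustration): every lookup keys on path, row and date 
together, so tiles from the two scenes never end up in the same result.

```python
from collections import defaultdict
from datetime import date

# (path, row, acquired, tile_id); tile_id stands in for the raster column.
# The second scene is acquired 16 days after the first.
rows = [
    (41, 27, date(2011, 7, 6), "a0"),
    (41, 27, date(2011, 7, 6), "a1"),
    (41, 27, date(2011, 7, 22), "b0"),
    (41, 27, date(2011, 7, 22), "b1"),
]


def tiles_for_scene(rows, path, row, acquired):
    """Select only the tiles of one scene; safe to aggregate afterwards."""
    return [t for p, r, d, t in rows if (p, r, d) == (path, row, acquired)]


def tiles_by_scene(rows):
    """Group every tile under its (path, row, date) scene key."""
    scenes = defaultdict(list)
    for path, row, acquired, tile in rows:
        scenes[(path, row, acquired)].append(tile)
    return dict(scenes)


first_scene = tiles_for_scene(rows, 41, 27, date(2011, 7, 6))
scenes = tiles_by_scene(rows)
```

Selecting on path and row alone would return all four tiles, which is 
exactly the mix of coverages to avoid.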

So we have two ways of using a table, and the user needs to be aware of 
the subtleties of both:

* The table stores more than one raster coverage. (e.g., "raster type 
definition", "index of unrelated images") Here the user must ensure row 
data from different coverages are never mixed.
* The table stores only one raster coverage. Here the user does not have 
to worry about mixing coverages.

Vector coverage tables are rarely used to store multiple coverage 
instances, but if they are, there is no technical obstacle to combining 
elements from different coverages. There is an obstacle with inhomogeneous 
rasters. Vector coverage tables do not sport a grid definition, so there 
are no per-row grid definitions that need to be compatible. Raster rows 
which are part of the same coverage must have compatible grid 
definitions.

Bryce

_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
