[prev in list] [next in list] [prev in thread] [next in thread] 

List:       mapserver-dev
Subject:    Re: [mapserver-dev] Query Sandbox: PostGIS
From:       "Jim Klassen" <Jim.Klassen () ci ! stpaul ! mn ! us>
Date:       2009-03-17 14:28:03
Message-ID: 49BF6D220200004F0000F966 () jeckle
[Download RAW message or body]

Sorry for the delay, I've been swamped with email.

The patches are separate. Only the last patch needs to be applied.

For me the 8x is a great improvement over where we were. At the moment, the delay \
seems to be pulling the data from the DB (OCIStmtFetch2). 

I've been thinking along the lines of a feature cache and it would improve \
performance even more so long as it doesn't eat up too much memory.

In the mean time, I have been playing with creating a separate thread that handles \
getting the data from Oracle and pre-fetches it ARRAY_SIZE (1024) features at a time \
into one of two buffers. NextShape and GetShape then just copy from these buffers. \
The benefit is on the query modes the DB connection is busy nearly 100% of the time \
which gives about another 10% speedup. There are still downsides which I haven't had \
time to sort out. I was expecting more of an overall speedup. It appeared to slow \
down NextShape because mode=map is only half as fast with the patch. I would've \
expected a gain across the board. So... there is something different in what \
NextShape vs. GetShape are doing yet (in the single-pass patch) where NextShape is \
still more efficient by a notable amount. Also, the database doesn't seem to like to \
be hit that hard and throws ora-600 errors, although the results on the client side \
appear to be correct.

One thing I discovered is the first pass (NextShapes) ends up waiting for the \
fetching thread to finish getting the data, while in the second pass (GetShape) the \
fetching thread ends up waiting for GetShape to be done with the buffer

I suspect this version is going to become more of a learning tool than production \
code, but if anyone is interested, I'd be willing to post the threaded version, but I \
don't want people confusing it with something that is anywhere near production ready. \
Also, it uses pthreads so I'm guessing this won't work on Windows.

I need to investigate further if there is a difference in the timings of NextShape in \
mode=map vs. the WFS results.

> > > "Steve Lime" <Steve.Lime@dnr.state.mn.us> 03/12/09 8:19 AM >>>
I guess the question is if 8x is fast enough. I was thinking of adding a feature \
cache  that could be used in addition to the new code (and would be tunable). Small \
result sets would only hit the db once.

Jim, do your patches need to be applied in any order?

Steve

> > > "Jim Klassen" <Jim.Klassen@ci.stpaul.mn.us> 03/12/09 12:21 AM >>>
I made some updates to my previous patch and attached a new patch to #2933.

The new results for the same dataset are:
* trunk takes 6m8s (1m19s in mapserver code)
* prepare takes 1m35s (23.5s in mapserver code)
* single-pass takes 46.5s (26.5s in mapserver code)
* both function=none and function=filter function, although function=none returns one \
more row than function=filter. function=filter agrees with postgis and shape. I \
haven't had time to track this down.

Conclusion:
* Single-pass is about 8x faster than trunk.
* In single-pass PostGIS is 2.4x faster than Oracle.

BTW: the bus error in trunk with filter=none appears to be related to a failed \
malloc. I haven't looked into it enough yet to know for sure what is going on.

> > > "Jim Klassen" <Jim.Klassen@ci.stpaul.mn.us> 03/11/09 6:47 PM >>>
Ok, I feel kind of stupid here in that I should have noticed something was wrong with \
a file that small. (The DBF of the shape was 300MB). My test dataset is the address \
points for the City of St. Paul (2D Point with 32 attributes).

Note: single-pass is in ticket #2933, prepare is in ticket #2927, my trunk is at \
r8765.

It turns out:
* The dataset is 139,000 rows (excluding the ones with null geometry, which mapserver \
                skips)
* The final GML size is 364MB
* The single-pass time to generate that file is 3m17s (1m6s in mapserver the rest \
                waiting for oracle over the network)
* The trunk time to generate the file is 3m21s (1m2s in mapserver) with a bus error \
                at the 100MB mark.
* The trunk+prepare time to generate the file is 1m3s (20s in mapserver) with a bus \
                error at the 100MB mark.
* The PostGIS (from trunk) time to generate the file is 1m21s (17s in mapserver) and \
                finished successfully.
* THe PostGIS (from single-pass) time to generate the file is 19s (14s in mapserver) \
                and finished successfully.
* There is a bug somewhere on using function = none (vs. function = filter which \
works fine) where the record id seems to be getting mangled causing single-pass to \
                exit early.  Also there appears to still be a bug in trunk given the \
                bus errors.
* MapServer appears to use about 25% cpu and 200MB ram on single-pass vs. 33% cpu and \
13MB ram in trunk (according to activity monitor). Shark says most of this is spent \
in OCIStmtFetch2() in single-pass and OCIStmtExecute() in trunk and vs. 38% cpu and \
13MB ram for trunk+prepare.

Conclusion:
* Just looking at the rates, single-pass is about 3.58x faster than trunk, and \
                prepare is about 3.2x faster than trunk.
* For PostGIS, I get single-pass is about 4.3x faster than trunk.
* In single-pass PostGIS is 10.9x faster than Oracle. In trunk PostGIS is 8.9x faster \
                then Oracle.
* There is something different with the oracle at the office vs. the sprint because I \
wasn't seeing bus errors at the sprint.

> > > Jim Klassen 03/11/09 11:39 AM >>>
I added the OracleSpatial patch to track as #2933.
 
On our 220k point address database the query time went from 6 minutes down to 32 \
seconds and produced about 360K of GML.

> > > On 3/10/2009 at 10:33 AM, Paul Ramsey <pramsey@opengeo.org> wrote:
I have updated the PostGIS driver in the sandbox to use the new logic.
My timing on outputting 5Mb of GML via WFS query has gone from 12s to
0.5s, a 24x improvement.

Great work Steve!

P.
_______________________________________________
mapserver-dev mailing list
mapserver-dev@lists.osgeo.org 
http://lists.osgeo.org/mailman/listinfo/mapserver-dev

_______________________________________________
mapserver-dev mailing list
mapserver-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-dev

_______________________________________________
mapserver-dev mailing list
mapserver-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-dev


_______________________________________________
mapserver-dev mailing list
mapserver-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-dev


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic