[prev in list] [next in list] [prev in thread] [next in thread]
List: spacewalk-devel
Subject: Re: [Spacewalk-devel] Spacewalk-devel Digest, Vol 45, Issue 4
From: "Parsons, Aron" <parsonsa () bit-sys ! com>
Date: 2012-02-03 17:20:21
Message-ID: 38BF0DD94618A6449F9DC97ECCB360A92AD1AFEC () suhn-mbx2 ! SIX3 ! local
[Download RAW message or body]
The left join on rhnChecksumView is horrendously slow, exactly the same thing that \
was happening with doing config file diffs. An export of a single channel with 3000 \
packages takes hours compared to 10 minutes with this patch. I can post the 'explain \
analyze' output if you'd like.
I'll brush on my SQL and learn why this is so bad. In the meantime, can someone \
explain why this is bad in this case? Obviously it's not an ideal query, but \
PostgreSQL deciding to do a sequence scan on rhnChecksumView with 2,000,000+ rows \
just doesn't work. Joining the subquery with the same table on a single column seems \
straightforward enough; I don't see how it's error prone as Michael explained in the \
previously referenced commit.
/aron
-----Original Message-----
Message: 4
Date: Fri, 3 Feb 2012 17:02:31 +0100
From: Jan Pazdziora <jpazdziora@redhat.com>
To: spacewalk-devel@redhat.com
Subject: Re: [Spacewalk-devel] Patch review for exportLib.py
Message-ID: <20120203160231.GB6087@redhat.com>
Content-Type: text/plain; charset=us-ascii
On Fri, Feb 03, 2012 at 03:55:53PM +0000, Parsons, Aron wrote:
> Can someone verify that this is a sound change? rhn-satellite-exporter is \
> extremely slow on PostgreSQL when getting the file checksums. This speeds it up \
> dramatically.
> It's the same issue we ran into with PostgreSQL doing a left join on \
> rhnChecksumView in 'query_client_get_file'. PostgreSQL does a sequence scan on a \
> table with millions of rows. I patched it in \
> 1e64dc4546d96c58dd84a88c3c634b2d2ce164a4 but Michael had to fix it in \
> 79e57d3b215340a54e911a073408f0eb7c6afc1f. I just want to be sure my SQL is proper \
> on this one before committing this time.
> diff --git a/backend/satellite_tools/exporter/exportLib.py b/backend/satellite_t
> index 1646c58..1f265a2 100644
> --- a/backend/satellite_tools/exporter/exportLib.py
> +++ b/backend/satellite_tools/exporter/exportLib.py
> @@ -753,7 +753,12 @@ class _PackageDumper(BaseRowDumper):
> c.checksum_type as "checksum-type",
> c.checksum, pf.linkto, pf.flags, pf.verifyflags, pf.lang
> from rhnPackageFile pf
> - left join rhnChecksumView c
> + left join
> + (select c.*
> + from rhnPackageFile pf
> + inner join rhnChecksumView c
> + on pf.checksum_id = c.id
> + where pf.package_id = :package_id) c
> on pf.checksum_id = c.id,
> rhnPackageCapability pc
> where pf.capability_id = pc.id
I'm affraid you'd get the same
having a table twice in select is mostly a bug
response on this one. What are you trying to achieve? This query
doesn't even have any lookup_* function.
--
Jan Pazdziora
Principal Software Engineer, Satellite Engineering, Red Hat
------------------------------
_______________________________________________
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel
End of Spacewalk-devel Digest, Vol 45, Issue 4
**********************************************
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic