[prev in list] [next in list] [prev in thread] [next in thread] 

List:       spacewalk-devel
Subject:    Re: [Spacewalk-devel] Spacewalk-devel Digest, Vol 45, Issue 4
From:       "Parsons, Aron" <parsonsa () bit-sys ! com>
Date:       2012-02-03 17:20:21
Message-ID: 38BF0DD94618A6449F9DC97ECCB360A92AD1AFEC () suhn-mbx2 ! SIX3 ! local
[Download RAW message or body]

The left join on rhnChecksumView is horrendously slow, exactly the same thing that \
was happening with doing config file diffs.  An export of a single channel with 3000 \
packages takes hours compared to 10 minutes with this patch.  I can post the 'explain \
analyze' output if you'd like.

I'll brush on my SQL and learn why this is so bad.  In the meantime, can someone \
explain why this is bad in this case?  Obviously it's not an ideal query, but \
PostgreSQL deciding to do a sequence scan on rhnChecksumView with 2,000,000+ rows \
just doesn't work.  Joining the subquery with the same table on a single column seems \
straightforward enough; I don't see how it's error prone as Michael explained in the \
previously referenced commit.

/aron


-----Original Message-----

Message: 4
Date: Fri, 3 Feb 2012 17:02:31 +0100
From: Jan Pazdziora <jpazdziora@redhat.com>
To: spacewalk-devel@redhat.com
Subject: Re: [Spacewalk-devel] Patch review for exportLib.py
Message-ID: <20120203160231.GB6087@redhat.com>
Content-Type: text/plain; charset=us-ascii

On Fri, Feb 03, 2012 at 03:55:53PM +0000, Parsons, Aron wrote:
> Can someone verify that this is a sound change?  rhn-satellite-exporter is \
> extremely slow on PostgreSQL when getting the file checksums.  This speeds it up \
> dramatically. 
> It's the same issue we ran into with PostgreSQL doing a left join on \
> rhnChecksumView in 'query_client_get_file'.  PostgreSQL does a sequence scan on a \
> table with millions of rows.  I patched it in \
> 1e64dc4546d96c58dd84a88c3c634b2d2ce164a4 but Michael had to fix it in \
> 79e57d3b215340a54e911a073408f0eb7c6afc1f.  I just want to be sure my SQL is proper \
> on this one before committing this time. 
> diff --git a/backend/satellite_tools/exporter/exportLib.py b/backend/satellite_t
> index 1646c58..1f265a2 100644
> --- a/backend/satellite_tools/exporter/exportLib.py
> +++ b/backend/satellite_tools/exporter/exportLib.py
> @@ -753,7 +753,12 @@ class _PackageDumper(BaseRowDumper):
> c.checksum_type as "checksum-type",
> c.checksum, pf.linkto, pf.flags, pf.verifyflags, pf.lang
> from rhnPackageFile pf
> -            left join rhnChecksumView c
> +            left join
> +              (select c.*
> +               from rhnPackageFile pf
> +                 inner join rhnChecksumView c
> +                   on pf.checksum_id = c.id
> +               where pf.package_id = :package_id) c
> on pf.checksum_id = c.id,
> rhnPackageCapability pc
> where pf.capability_id = pc.id

I'm affraid you'd get the same

	having a table twice in select is mostly a bug

response on this one. What are you trying to achieve? This query
doesn't even have any lookup_* function.

-- 
Jan Pazdziora
Principal Software Engineer, Satellite Engineering, Red Hat



------------------------------

_______________________________________________
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel

End of Spacewalk-devel Digest, Vol 45, Issue 4
**********************************************


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic