
List:       gentoo-performance
Subject:    Re: [gentoo-performance] Re: portage performance
From:       Paul de Vrieze <pauldv () gentoo ! org>
Date:       2004-08-03 8:49:17
Message-ID: 200408031049.26281.pauldv () gentoo ! org


On Tuesday 27 July 2004 00:57, Brian Harring wrote:
> > The basic problem in searching is actually that it isn't implemented
> > smartly in current portage. I have working (emerge -s like) code that is
> > blazingly fast as it does not actually open all ebuilds.
>
> Searching works off of the cache for the most part; if a cache entry is
> stale, it's updated (e.g. the ebuild is opened and sourced).
> Unless you're not checking the cache and updating it as you proceed,
> your implementation ought to suffer the same limitation.

Basically it does a directory glob to select candidates. Each candidate is 
then checked to see whether it is a real package; those that are count as 
valid results and are returned.
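
To make the idea concrete, here is a minimal sketch of that glob-and-verify 
approach. It is not my actual code: the function name search_names, the 
PORTDIR/<category>/<package> layout, and the "contains at least one .ebuild 
file" test for a real package are all assumptions made for illustration.

```python
import glob
import os
import re


def search_names(portdir, pattern):
    """Return sorted 'category/package' names whose package name matches pattern.

    Hypothetical helper: only directory names are inspected, no ebuild is opened.
    """
    regex = re.compile(pattern)
    results = []
    # Candidates: every PORTDIR/<category>/<package> entry (assumed tree layout).
    for pkg_dir in glob.glob(os.path.join(glob.escape(portdir), "*", "*")):
        if not os.path.isdir(pkg_dir):
            continue
        category = os.path.basename(os.path.dirname(pkg_dir))
        package = os.path.basename(pkg_dir)
        if not regex.search(package):
            continue
        # "Real package" check (assumption): the directory holds an .ebuild file.
        if glob.glob(os.path.join(glob.escape(pkg_dir), "*.ebuild")):
            results.append(f"{category}/{package}")
    return sorted(results)
```

Something like search_names("/usr/portage", "vim") would then only list 
directories and never source an ebuild, which is where the speed comes from.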

> There are 2 things that need to be done (in my books at least) to step
> up the speed of a description search-
> A) sql based cache backend, whether sqlite or mysql.  Either that, or
> extend the flat cache to store the descriptions in a central index.
> B) alter the search description alg so that instead of stepping through
> each entry getting the description, we just state "give me all packages
> that have a description matching blar", and leave it up to the backend
> to decide what is the most efficient way to search.  With flat cache,
> we'd still have to go file by file; w/ a sql variant, it could take
> advantage of the appropriate syntax.

Probably some kind of caching or indexing tool (like makewhatis) is the way 
to go. One option would be to run grep first to limit the number of candidate 
packages that actually get examined (grep is a lot cheaper than parsing).
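
A rough sketch of that grep-first idea follows. It is an illustration under 
assumptions, not portage code: search_descriptions is a hypothetical name, 
the search term is assumed to be ASCII, and the "real" examination is reduced 
to a regex that reads a single-line quoted DESCRIPTION= assignment instead of 
sourcing the ebuild.

```python
import re

# Assumption: DESCRIPTION is a single-line, quoted assignment in the ebuild.
_DESC_RE = re.compile(r"^DESCRIPTION=([\"'])(.*?)\1", re.MULTILINE)


def search_descriptions(ebuild_paths, needle):
    """Map ebuild path -> description for descriptions containing needle.

    Hypothetical helper; matching is case-insensitive for ASCII needles.
    """
    hits = {}
    needle_lower = needle.lower()
    needle_bytes = needle_lower.encode("utf-8")
    for path in ebuild_paths:
        with open(path, "rb") as fh:
            raw = fh.read()
        # Cheap grep-like prefilter: skip files that cannot possibly match.
        if needle_bytes not in raw.lower():
            continue
        # Only the survivors get the more expensive examination.
        text = raw.decode("utf-8", errors="replace")
        match = _DESC_RE.search(text)
        if match and needle_lower in match.group(2).lower():
            hits[path] = match.group(2)
    return hits
```

The prefilter may let through files that mention the term only in a comment; 
the second step discards those, so the result is the same as examining every 
file, just with far fewer files examined.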

> Since there is code for a sql based cache backend, B has been bounced
> around in #gentoo-portage a bit.  Prior to it actually happening I
> would think the sql db code would need to be cleaned up/QA'd/etc.
>
> Course, there still is the issue of verifying that the cache entry
> isn't stale... :)

For now I don't have any persistent caching in my working code (except 
where it uses old code to access current ebuilds), to keep it simple. It 
is actually already quite fast.

> Err, eh?  If the tree is corrupted, and sync'd against a
> good/non-corrupted tree, it ought to be reverted to a sane state.

Exactly

Paul

-- 
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net

