List: kde-usability
Subject: Re: Easier Searching in KDE
From: Jonathan Gardner <jgardner () jonathangardner ! net>
Date: 2004-06-01 22:05:42
Message-ID: 200406011505.42598.jgardner () jonathangardner ! net
On Monday 24 May 2004 07:47 am, Gustavo Sverzut Barbieri wrote:
> The only point I see against this is the slowness... google is fully
> indexed and does this all within mseconds. Searching the web or even
> search for files takes a lot of time in computer these days... slocate is
> good, but it generally is desync. Searching for contents is even more
> slow.
> Maybe we could come with a way to solve this:
> - the match list should first search local files in indexed mode, then
> check if they exists (avoid show non-existent), then proceed with
> something like "find", then with web, ... This may present first results
> really quick, but the other will still be delayed
I really don't like the way Windows search works: it generates results
slowly, and you have no real idea of how much is left to search. I want
results right away, like Google. I want the results catalogued and listed
by relevance and user preferences.
> - cache previous results, maybe they're used again soon since users
> often refine their search
Unfortunately, this doesn't help the first search. The first search is
always more important than subsequent searches, in my opinion.
> - change kio_slaves to update a db everytime a file is modified...
> with that we can have something fast for user and better sync'ed than
> slocate. The problem is other apps, like gnome or openoffice.
>
This may slow down file updates.
> All of them have problems. The real problem I see is with home dir...
> it's the part of the system that changes most and in short periods of
> time, probably the user changes and then search... that common case makes
> life difficult and should be optimized.
>
I have been doing some work with Materialized Views in PostgreSQL. Here is
what I think will work with KDE.
Google-type search. Everything on disk is indexed. With an 80GB hard drive,
it's not a problem to have everything indexed in multiple ways. The search
should take a couple of seconds, max. Let the OS worry about in-memory
caching and the like. If there isn't enough room for the index, then we
should provide a weaker system, like the one in Windows, where the search
is done in real time, only the most important files are indexed, and recent
search results are stored.
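To make the "everything is indexed" idea concrete, here is a minimal sketch of an inverted index, assuming file contents are already available as plain text. The names `tokenize`, `build_index`, and `search`, and the `{path: content}` input, are hypothetical and only for illustration; a real indexer would need per-format text extraction and on-disk storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of paths whose content contains it.

    `documents` is a hypothetical {path: content} dict standing in for
    files read from disk.
    """
    index = defaultdict(set)
    for path, content in documents.items():
        for word in tokenize(content):
            index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

A query then costs a few set lookups and intersections instead of a walk over the disk, which is why the search itself can stay fast even when the index is large.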
File Importance. Files in the home directory, files modified or viewed
frequently by the user, files in the favorites list, etc, are more
important than system files or log files or cache files. They should be
listed first and indexed first.
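One way to turn these rules into something sortable is a single numeric score per file. The sketch below is only an illustration: the function name `importance`, its parameters, and all of the weights are assumptions, and it assumes Unix-style paths.

```python
def importance(path, home, access_count=0, is_favorite=False):
    """Return a rough importance score for a file; higher means more important.

    All weights are illustrative guesses, not tuned values.
    """
    score = 0.0
    # Files under the user's home directory matter most.
    if path.startswith(home.rstrip("/") + "/"):
        score += 10.0
    # Frequently viewed or modified files get a capped bonus.
    score += 0.5 * min(access_count, 20)
    # Files in the favorites list get a fixed bonus.
    if is_favorite:
        score += 5.0
    # Log and cache files are pushed toward the bottom.
    if path.endswith(".log") or "/cache/" in path or "/.cache/" in path:
        score -= 8.0
    return score
```

Sorting with `sorted(paths, key=lambda p: importance(p, home), reverse=True)` would then give both the listing order and the indexing order.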
Creating the index is the problem. A backend process should be constantly
running at a low priority. Initially, it indexes all the files. Then, it
begins to index files as they change. It always keeps a fresh index of the
most important files, and gets around to less important files when it has
the time.
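The scheduling part of such a process could be sketched as a priority queue of changed files, drained a small batch at a time whenever the indexer gets to run. The class `IndexQueue` and its methods are hypothetical names, the importance value is assumed to come from elsewhere, and this sketch does not deduplicate a path that is queued twice.

```python
import heapq
import itertools


class IndexQueue:
    """Files waiting to be (re)indexed, most important first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so equally important files keep their arrival order.
        self._counter = itertools.count()

    def file_changed(self, path, importance):
        """Queue a changed file with its importance score."""
        # heapq is a min-heap, so negate importance to pop the largest first.
        heapq.heappush(self._heap, (-importance, next(self._counter), path))

    def next_batch(self, limit):
        """Pop up to `limit` paths, most important first."""
        batch = []
        while self._heap and len(batch) < limit:
            _, _, path = heapq.heappop(self._heap)
            batch.append(path)
        return batch

    def __len__(self):
        return len(self._heap)
```

Because less important files sink to the bottom of the queue, they are only reached when nothing more urgent is pending, which matches the "gets around to it when it has the time" behaviour described above.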
When a search is made, it is run completely against the index. Sure, the
index may be out-of-date, but the most important files will be indexed
almost as soon as they are modified, and the least important files may
never get indexed, so the results are usually relevant.
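Since the index can be stale, the hits it returns may need a cheap clean-up pass before display. The sketch below drops paths that no longer exist (the check suggested earlier in this thread) and sorts the rest by importance. `rank_results` and its `importance_of` callback are hypothetical names; the `exists` parameter defaults to `os.path.exists` and is injectable only to make the function easy to try out.

```python
import os


def rank_results(hits, importance_of, exists=os.path.exists):
    """Drop paths that no longer exist, then sort by descending importance.

    `importance_of` is a hypothetical callable mapping a path to a score.
    Ties are broken by path so the order is stable between runs.
    """
    live = [path for path in hits if exists(path)]
    return sorted(live, key=lambda path: (-importance_of(path), path))
```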
Metadata is critical. How do we accumulate the metadata? How is it stored?
I don't have the slightest idea in this department. Already, we have
permissions, location, access/modification times, and MIME type, which give
us good metadata. But I would like additional metadata specified by the
user, like "project" or "author" or "comments". Importance is another piece
of metadata that is useful.
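As a starting point for discussion, a per-file metadata record might look like the sketch below. It does not answer the storage question; the class `FileMetadata`, its field names, and the helper `matches` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Metadata the system already has, plus free-form user tags."""
    path: str            # location
    mime_type: str       # e.g. "text/plain"
    mode: int            # permission bits, e.g. 0o644
    mtime: float         # last modification time (seconds since the epoch)
    atime: float         # last access time (seconds since the epoch)
    importance: float = 0.0
    # User-specified entries such as "project", "author" or "comments".
    user_tags: dict = field(default_factory=dict)


def matches(meta, **wanted):
    """Return True if every requested user tag equals the stored value."""
    return all(meta.user_tags.get(key) == value for key, value in wanted.items())
```

A search front end could then filter on user metadata with a call such as `matches(meta, project="kde-search")`.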
I don't think a real database (PostgreSQL or MySQL) will be appropriate for
the database part of the tool. The requirements for the index cache are
very different from what PostgreSQL provides. I do believe that we can
steal a lot of ideas from the database community on how to search vast
indexes efficiently. Perhaps early implementations will rely on a database,
but that should be temporary.
--
Jonathan Gardner
jgardner@jonathangardner.net
_______________________________________________
kde-usability mailing list
kde-usability@kde.org
https://mail.kde.org/mailman/listinfo/kde-usability