List:       kde-usability
Subject:    Re: Easier Searching in KDE
From:       Manuel Amador <rudd-o () amautacorp ! com>
Date:       2004-06-03 19:09:51
Message-ID: 1086289791.28358.19.camel () localhost ! localdomain

On Tue, 2004-06-01 at 20:46, Jamethiel Knorth wrote:

> 
> An index of the entire filesystem cannot be maintained without wasting a ton 
> of computer power.

That is NOT true.  I have Kazaa, and it does NOT waste a ton of computer
power.  I sure wish Kazaa's metadata and index were pervasively
available in the user interface.

>  Google does it by having thousands of computers working 
> constantly. Do not compare to Google.

That is because Google indexes data from billions of computers, not just one.

> 
> You're right that it needs, absolutely needs, to be able to fall back to the 
> basic search.

This is true.

> It is hard to do a good job tracking what changes without significantly 
> slowing the system down.

No, not at all.  Real-time tracking may be hard with dnotify, but there
are alternative solutions being worked on.

Fact is, we *need* a robust index to be able to compete at all.  Next
year, our strongest competition (Microsoft) will have something on that
front, and we won't if we keep telling ourselves "it's impossible".

> 
> >When a search is made, it is run completely against the index. Sure, the
> >index may be out-of-date, but the most important files will be indexed
> >almost as soon as they are modified, and the least important files may
> >never get indexed, so the results are usually relevant.
> >
> >Metadata is critical. How do we accumulate the meta data? How is it stored?
> >I don't have the slightest in this department. Already, we have
> >permissions, location, access/modification times, and MIME type that give
> >us good meta data. But I would like additional meta-data specified by the
> >user, like "project" or "author" or "comments". Importance is another piece
> >of meta-data that is useful.
> >
> >I don't think a real database (PostgreSQL or MySQL) will be appropriate for
> >the database part of the tool. The requirements for the index cache are
> >very different than what PostgreSQL provides. I do believe that we can
> >steal a lot of ideas from the database community on how to search vast
> >indexes efficiently. Perhaps early implementations will rely on a database,
> >but that should be temporary.
> 
> Although my criticism was harsh, you probably know more about this 
> than I do. This is what I have recommended on the page [1], shortened 
> for this e-mail.
> 
> - Indexing -
> 
> Have an index which tracks file locations, but also tracks how up-to-date it 
> is and what its rate of change is. It doesn't need to all be at the same 
> point. Update times are recorded for a directory in a non-recursive manner.
> 
> Whenever a directory is searched, it is added to the index again. Programs 
> would be expected to also update the index whenever they list a directory. 
> To optimize for speed, such updates wouldn't always need to be inserted into 
> the directory immediately. So, if Konqueror opens a directory, it sends the 
> list of contents off to the list of things to add into the index.

This could actually work very well!
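
A rough sketch of how that deferred queueing might look, in Python. This is only an illustration of the idea quoted above, not code from any existing KDE component: the class DirectoryIndex and its methods report_listing, flush and search are hypothetical names. Programs hand over a directory listing cheaply, and the index merges the queued listings later, when there is time.

```python
import time
from collections import deque


class DirectoryIndex:
    """Toy in-memory index (hypothetical): directory path -> entry names."""

    def __init__(self):
        self.entries = {}       # path -> set of entry names
        self.updated_at = {}    # path -> timestamp, per directory, non-recursive
        self.pending = deque()  # listings reported by programs, not yet merged

    def report_listing(self, path, names, now=None):
        """Called by a program (e.g. a file manager) that just listed `path`.

        Cheap on purpose: the listing is only queued, not merged.
        """
        stamp = time.time() if now is None else now
        self.pending.append((path, frozenset(names), stamp))

    def flush(self, limit=None):
        """Merge up to `limit` queued listings; return how many were processed."""
        processed = 0
        while self.pending and (limit is None or processed < limit):
            path, names, stamp = self.pending.popleft()
            # Skip a queued listing that is older than what is already indexed.
            if stamp >= self.updated_at.get(path, float("-inf")):
                self.entries[path] = set(names)
                self.updated_at[path] = stamp
            processed += 1
        return processed

    def search(self, fragment):
        """Return sorted full paths whose entry name contains `fragment`."""
        return sorted(
            f"{path.rstrip('/')}/{name}"
            for path, names in self.entries.items()
            for name in names
            if fragment in name
        )
```

A file manager would call report_listing() every time it opens a directory, and the daemon would call flush(limit=...) whenever the machine is idle. Searches only see what has already been merged, which matches the "index may be out-of-date" trade-off discussed earlier.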

> 
> The actual indexer is a daemon (hopefully run with higher privileges so that 
> all users can share). When the daemon indexes a directory, it notes how much 
> it changed since the last time, and how long between updates, and calculates 
> the rate of change.
> 
> - Lazy Checking -
> When updating, do updates according to what needs it most. This is 
> determined according to rate-of-change and time-of-last-update.
> 
> This would be done slowly and lazily. Whenever the system has extra 
> resources, another directory would be checked, then the indexer would pause 
> a moment, then check another.
> 
> - User Accessibility -
> 
> To help people know what is causing thrashing, there would be a systray 
> icon showing if it was doing updates, and allowing a window to be displayed 
> showing how up-to-date various portions of the index are. This also would 
> allow users to force updates and to set some directories to specific 
> priorities, if they really wanted to. (Root privileges would be needed for 
> some of those activities, if the indexer ran as root.)
> 
> - Usage Enhancements -
> 
> Search speed can be improved by optimizing where plugins search and all. I 
> won't go into that here.
> 
> - Metadata -
> 
> Metadata could be checked by having programs throw information to the 
> indexer when they get it. The indexer would get metadata about all music if 
> JuK was used. It would get information whenever an image was previewed. It 
> would get information from the details and information views in Konqueror. 
> RPM databases could keep it up-to-date.
> 
> - Removable/Remote Filesystems - (this isn't on the page yet)
> 
> The index would keep track of removed media and remote filesystems for a 
> while, but would keep them in separate indexes. This way, they wouldn't need 
> to sit in memory. Depending on user requests, it could give more longevity 
> to the indexes for some media. This could potentially be integrated with 
> user tools to be very powerful (I would want it hooked into a CD cataloging 
> program).

Yes, this is exactly what is needed =)
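
To make the "Lazy Checking" part of the quoted proposal concrete, here is a minimal Python sketch of picking the next directory to re-index from its rate of change and the time since its last update. The names DirState, observe, staleness and next_to_check are made up for illustration, and the equal-weight blending of old and new rates is my own assumption; the proposal does not give a formula.

```python
from dataclasses import dataclass


@dataclass
class DirState:
    path: str
    last_checked: float      # when the daemon last indexed this directory
    rate_of_change: float    # estimated changed entries per second


def observe(state, changed_entries, now):
    """Record a re-index: update the rate of change and the check time."""
    elapsed = max(now - state.last_checked, 1e-9)  # avoid division by zero
    observed = changed_entries / elapsed
    # Assumption: blend old and new estimates equally, so that one unusually
    # quiet or busy interval does not dominate the estimate.
    state.rate_of_change = 0.5 * state.rate_of_change + 0.5 * observed
    state.last_checked = now


def staleness(state, now):
    """Expected unseen changes: rate of change times time since last check."""
    return state.rate_of_change * (now - state.last_checked)


def next_to_check(states, now):
    """Pick the directory that most needs a re-index, or None if there are none."""
    return max(states, key=lambda s: staleness(s, now), default=None)
```

The daemon would call next_to_check(), index that one directory, call observe() with the number of changes it found, pause a moment, and repeat. Note that a directory whose estimated rate is zero is never selected by this scoring; a small minimum rate would be one way to make sure everything is revisited eventually.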

> 
> 
> 
> [1] http://localhost/designs/ubiquitous_searching/index.html#implementation
> 
> _______________________________________________
> kde-usability mailing list
> kde-usability@kde.org
> https://mail.kde.org/mailman/listinfo/kde-usability
-- 
	Manuel Amador
	Head of R&D                         +593 (9) 847-7372
	Amauta                     http://www.amautacorp.com/
	GNU Privacy Guard key ID: 0xC1033CAD at keyserver.net
