List: kde-devel
Subject: Re: A search service for KDE
From: mweilguni () kde ! org
Date: 2005-02-24 7:01:10
Message-ID: 200502240801.10560.mweilguni () kde ! org
Did you take a look at the Lucene search engine? There are a lot of ports to
different languages, its performance is exceptionally good, and it has
sophisticated search facilities.
Regards,
Mario
On Thursday, 24 February 2005 00:28, Manuel Amador wrote:
> On Tue, 22-02-2005 at 19:03 +0000, Daniel Roe wrote:
> > But I think given that Manuel's engine seems quite close to completion,
> > we should at the very least check it out before rejecting its language. I
> > speak for others, I'm sure, when I say that I'm eager to look over the
> > preview release he speaks of.
>
> Thanks, but do not get your hopes up so much right now. Here's what is
> done:
>
> - file plugin interface (with KDE's KFile abstraction, for plugin
> writers)
> - index database interface, and two incomplete but well-advanced plugins:
> PostgreSQL and ZODB (I'm leaning more towards ZODB since I can
> transparently persist everything instead of manually pickling and
> unpickling, and I can store references, which I cannot with PG; I also
> found - I think - the solution to the scalability problems)
> - Filesystem crawler with companion per-volume crawlers
> - indexer process, which consumes files produced by the FS crawler and
> adds them to the database
> - XML-RPC-based interfaces (socket and TCP) exposing search and metadata
> querying methods, plus various administrative functions for the
> superuser
> - a rudimentary command-line search tool (more of a testing tool)
> - a command-line tool to test file plugins
> - automatic throttling based on last minute loadavg (check your uptime
> command's output for more information)
> - an event logging interface which can output to syslog, stderr or a
> file
>
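To make the throttling item above concrete, here is a minimal sketch of how pacing on the one-minute load average could look in Python. It is only an illustration, not Manuel's code: the names throttle_delay and throttled, the threshold and the delay values are all assumptions. os.getloadavg() is the standard call that returns the same three numbers uptime prints.

```python
import os
import time

def throttle_delay(load1, threshold=1.0, base_delay=0.5, max_delay=10.0):
    """Seconds to pause before the next file, given the 1-minute loadavg.

    threshold, base_delay and max_delay are illustrative values, not
    figures taken from the original mail.
    """
    if load1 <= threshold:
        return 0.0
    # Pause longer the further the load is above the threshold, with a cap.
    return min(max_delay, base_delay * (load1 - threshold))

def throttled(items, get_load=lambda: os.getloadavg()[0], sleep=time.sleep):
    """Yield items one by one, sleeping first whenever the machine is busy."""
    for item in items:
        delay = throttle_delay(get_load())
        if delay > 0:
            sleep(delay)
        yield item
```

An indexer loop would then just iterate over throttled(files) instead of files. Passing get_load and sleep as parameters keeps the pacing logic testable without touching the real system load.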
> Currently, using either of the two database backends, indexing takes 0.02
> seconds per file on average, and extracting contents and all metadata
> takes 0.5 seconds on average (MP3s take 0.2 seconds, text files 0.5 and
> some large HTML files up to 45 seconds, due to the unoptimized regexps
> I'm using in some places). Searching for "The Beatles" (13000 songs
> indexed) takes 0.5 seconds on average, according to the time(1) command.
>
> I'm solving scalability issues. With PostgreSQL, I finally have memory
> usage pinned down by using finite-sized queues between processes. Next
> I'll delve into stabilizing memory usage with ZODB (which can enable so
> much more functionality in the future!), because once I unleashed the
> indexer on my entire hard disk, the memory usage of the metadata service
> process shot to 400 MB (I only have half a gig of RAM, and recovering
> from that situation took me a couple of minutes).
>
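The finite-sized queue idea mentioned above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual engine: it uses threads and queue.Queue so it stays short, whereas the mail talks about separate processes (multiprocessing.Queue accepts the same maxsize argument). The names crawler, indexer and run_pipeline are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks the end of the crawl

def crawler(paths, q):
    """Producer: put() blocks while the queue is full, so the crawler
    can never run far ahead of the indexer and pile up work in memory."""
    for path in paths:
        q.put(path)
    q.put(SENTINEL)

def indexer(q, index):
    """Consumer: take paths off the queue and 'index' them (here: append)."""
    while True:
        path = q.get()
        if path is SENTINEL:
            break
        index.append(path)

def run_pipeline(paths, maxsize=100):
    """Run crawler and indexer with at most maxsize pending paths."""
    q = queue.Queue(maxsize=maxsize)
    index = []
    consumer = threading.Thread(target=indexer, args=(q, index))
    consumer.start()
    crawler(paths, q)
    consumer.join()
    return index
```

The point of the bound is that memory held by pending work is limited to maxsize entries regardless of how large the disk being crawled is.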
> Next on my list:
> - incremental indexing (with inotify). I first need to reboot into my
> inotify-enabled patched kernel. I don't wanna!
>
> Sorry, no GUI search tool yet =( I tried with kdevelop but all I got
> was a headache (I still cannot wrap my head around C++, or the other way
> around - maybe I'm spoiled because of python). Anyway, any app that
> wants to query the database can perform a simple, already "standardized"
> XML-RPC call and the server replies with results. But ideally, that
> tool should expose a standard DCOP interface for all KDE apps to use and
> rely on, right?
>
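For readers unfamiliar with XML-RPC, here is a small sketch of what such a search call and its reply look like when encoded, using Python's standard xmlrpc.client module. The method name "search" and the result fields are assumptions for illustration; the mail does not name the actual methods.

```python
import xmlrpc.client

def encode_search_call(query, method="search"):
    """Build the XML body a client would POST; 'search' is a hypothetical
    method name, not one confirmed by the original mail."""
    return xmlrpc.client.dumps((query,), methodname=method)

def decode_call(payload):
    """Server side: recover the method name and parameters from a request."""
    params, method = xmlrpc.client.loads(payload)
    return method, params

def encode_reply(results):
    """Server side: wrap a result list in an XML-RPC method response."""
    return xmlrpc.client.dumps((results,), methodresponse=True)

def decode_reply(payload):
    """Client side: unwrap the single return value of a method response."""
    params, _ = xmlrpc.client.loads(payload)
    return params[0]

# Against a live server the client side collapses to something like
#   xmlrpc.client.ServerProxy(url).search("The Beatles")
# where both the URL and the method name are assumptions.
```

Since the payload is plain XML over HTTP or a socket, any language with an XML-RPC library can issue the same call, which is what would make a thin DCOP wrapper on top of it feasible.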
> > Regards,
> > Daniel
> >
> > >> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to
> > >> unsubscribe <<
>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<