List:       kde-devel
Subject:    Re: A search service for KDE
From:       Manuel Amador <rudd-o () amautacorp ! com>
Date:       2005-02-25 15:16:22
Message-ID: 1109344582.10715.2.camel () master ! amauta

I'll keep that in mind.  The text indexer in ZODB is subpar; perhaps it
could benefit from plugging in Lucene.

On Thu, 24-02-2005 at 08:01 +0100, mweilguni@kde.org wrote:
> Did you take a look at the Lucene search engine? There are a lot of ports to
> different languages, performance is exceptionally good, and it has
> sophisticated search facilities.
> 
> Regards,
> Mario
> 
> 
> On Thursday, 24 February 2005 00:28, Manuel Amador wrote:
> > On Tue, 22-02-2005 at 19:03 +0000, Daniel Roe wrote:
> > > But I think given that Manuel's engine seems quite close to completion,
> > > we should at the very least check it out before rejecting its language. I
> > > speak for others, I'm sure, when I say that I'm eager to look over the
> > > preview release he speaks of.
> >
> > Thanks, but do not get your hopes up so much right now.  Here's what is
> > done:
> >
> > - file plugin interface (with KDE's KFile abstraction, for plugin
> > writers)
> > - index database interface, and two incomplete but well-advanced plugins:
> > PostgreSQL and ZODB (I'm leaning more towards ZODB, since it lets me
> > transparently persist everything instead of manually pickling and
> > unpickling, and store references, which I cannot do with PG; I also
> > found, I think, the solution to the scalability problems)
> > - Filesystem crawler with companion per-volume crawlers
> > - indexer process, which consumes files produced by the FS crawler and
> > adds them to the database
> > - XML-RPC-based interfaces (socket and TCP) exposing search and metadata
> > querying methods, plus various administrative functions for the
> > superuser
> > - a rudimentary command-line search tool (more of a testing tool)
> > - a command-line tool to test file plugins
> > - automatic throttling based on the one-minute loadavg (see the output of
> > the uptime command for more information)
> > - an event logging interface which can output to syslog, stderr or a
> > file
> >
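[Editor's sketch] The load-based throttling item in the list above could look roughly like the following. This is only a guess at the idea, not the code from Manuel's engine: the function names, the threshold of 2.0 and the delay formula are all assumptions made for illustration. Only os.getloadavg(), which returns the 1-, 5- and 15-minute load averages on Unix, is a real standard-library call.

```python
import os
import time


def throttle_delay(load, threshold=2.0, base_delay=0.5, max_delay=10.0):
    """Return how many seconds to pause for a given one-minute load average.

    The threshold and the linear back-off are illustrative assumptions.
    """
    if load <= threshold:
        return 0.0
    return min(max_delay, base_delay * (load - threshold))


def throttled(items, get_load=lambda: os.getloadavg()[0], sleep=time.sleep):
    """Yield items one by one, pausing first whenever the machine is busy."""
    for item in items:
        delay = throttle_delay(get_load())
        if delay > 0:
            sleep(delay)
        yield item
```

An indexer loop would then iterate over throttled(files) instead of files, so that it backs off automatically while the load is high.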
> > Currently, using either of the two database backends, indexing takes 0.02
> > seconds per file on average, and extracting contents and all metadata
> > takes 0.5 seconds on average (MP3s take 0.2 seconds, text files 0.5, and
> > some large HTML files up to 45 seconds, due to the unoptimized regexps
> > I'm using in some places).  Searching for "The Beatles" (13000 songs
> > indexed) takes 0.5 seconds on average, according to the time(1) command.
> >
> > I'm solving scalability issues.  With PostgreSQL, I finally have memory
> > usage pinned down by using finite-sized queues between processes.  I'll
> > delve into stabilizing memory usage with ZODB (which can enable so much
> > more functionality in the future!), because once I unleashed the indexer
> > on my entire hard disk, the memory usage of the metadata service process
> > shot up to 400 MB (I only have half a gig of RAM, and recovering from
> > that situation took me a couple of minutes).
> >
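[Editor's sketch] The finite-sized queue idea mentioned above can be illustrated as follows. This is a minimal sketch, not the engine's actual code: it uses threads and queue.Queue to stay short, whereas the message describes separate processes (multiprocessing.Queue accepts the same maxsize argument). The names crawler, indexer and run_pipeline are hypothetical. The point is that put() blocks once the queue is full, so a fast crawler cannot pile up unbounded work in memory ahead of a slow indexer.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream of paths


def crawler(paths, q):
    """Producer: put() blocks while the queue is full, capping memory use."""
    for path in paths:
        q.put(path)
    q.put(SENTINEL)


def indexer(q, index):
    """Consumer: drain the queue until the sentinel arrives."""
    while True:
        path = q.get()
        if path is SENTINEL:
            break
        index.append(path)  # stand-in for the real indexing work


def run_pipeline(paths, maxsize=4):
    """Run crawler and indexer with at most `maxsize` paths in flight."""
    q = queue.Queue(maxsize=maxsize)
    index = []
    consumer = threading.Thread(target=indexer, args=(q, index))
    consumer.start()
    crawler(paths, q)
    consumer.join()
    return index
```

With maxsize=4, at most four pending paths are ever held between the two stages, no matter how large the disk being crawled is.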
> > Next on my list:
> > - incremental indexing (with inotify).  I first need to reboot into my
> > inotify-enabled patched kernel.  I don't wanna!
> >
> > Sorry, no GUI search tool yet =(  I tried with KDevelop but all I got
> > was a headache (I still cannot wrap my head around C++, or the other way
> > around - maybe I'm spoiled because of Python).  Anyway, any app that
> > wants to query the database can perform a simple, already "standardized"
> > XML-RPC call and the server replies with results.  But ideally, that
> > tool should expose a standard DCOP interface for all KDE apps to use and
> > rely on, right?
> >
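[Editor's sketch] To show what "a simple XML-RPC call" could look like from a client application, here is a self-contained example using Python's standard xmlrpc modules. The message does not give the method names, URL or result format of the real service, so the method name "search", the FAKE_INDEX data and the list-of-paths reply are all assumptions. A tiny stand-in server is included only so the client call has something to talk to.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical stand-in for the metadata database.
FAKE_INDEX = {
    "/music/yesterday.mp3": "The Beatles - Yesterday",
    "/music/help.mp3": "The Beatles - Help!",
    "/docs/notes.txt": "shopping list",
}


def search(query):
    """Return the sorted paths whose metadata contains the query text."""
    q = query.lower()
    return sorted(p for p, text in FAKE_INDEX.items() if q in text.lower())


def start_server():
    """Start a throwaway XML-RPC server on a free local TCP port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(search, "search")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(port, text):
    """Client side: one remote call, results come back as a plain list."""
    with ServerProxy("http://127.0.0.1:%d/" % port) as proxy:
        return proxy.search(text)
```

A caller would use start_server().server_address[1] to obtain the port and then call query(port, "beatles"). Any language with an XML-RPC library could issue the same call, which is what makes the interface usable from non-Python applications.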
> > > Regards,
> > > Daniel
> > >
> > >  >> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to
> > >  >> unsubscribe <<
-- 
Manuel Amador <rudd-o@amautacorp.com>
Amauta
 