List:       kmail-devel
Subject:    Re: Full Text Indexing for kmail
From:       Daniel Naber <daniel.naber () t-online ! de>
Date:       2005-03-19 20:02:29
Message-ID: 200503192102.29878 () danielnaber ! de


(sorry for not correctly replying to the thread, but I just re-subscribed)

> Hopefully if Ingo and the other developers are reading they will now 
> be aware of this limitation of full text indices. *term* searches are 
> intractable.

A *term* search has to iterate over all distinct terms in the index, 
which is still much better than iterating over all terms in all documents.
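
To illustrate the difference, here is a minimal Python sketch (not 
KMail or Lucene code; the names build_index and infix_search and the 
sample documents are made up for the example). The wildcard match runs 
once per distinct term in the dictionary, and the posting sets then give 
the documents without any document text being scanned:

```python
from collections import defaultdict


def build_index(docs):
    """Map each distinct term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def infix_search(index, fragment):
    """Return ids of documents that contain a term matching *fragment*."""
    hits = set()
    # One pass over the distinct terms only, not over the documents.
    for term, doc_ids in index.items():
        if fragment in term:
            hits |= doc_ids
    return hits


# Hypothetical sample data: 14 tokens in total, but only 11 distinct terms.
docs = {
    1: "the mailbox index is rebuilt",
    2: "indexing mail takes time",
    3: "the the the search box",
}
index = build_index(docs)
result = infix_search(index, "box")
```

The cost of the scan grows with the size of the term dictionary, which 
for real mail grows far more slowly than the total amount of text.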

Anyway, *term searches are not considered important, at least not for 
English (in German and other languages that make heavy use of compounds 
it's a different situation, but even there you can live without them).

BTW, there's no need to keep the index terms in a tree. An in-memory 
index that holds every 128th term and points into the complete sorted 
term list on disk is fast enough.
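
A rough Python sketch of that idea, under some assumptions: the sorted 
Python list stands in for the term list on disk, and build_sparse_index 
and lookup are hypothetical names, not Lucene's API. A binary search 
over the sampled terms picks a block, and at most 128 consecutive terms 
are then read to find the exact one:

```python
import bisect

INTERVAL = 128  # keep every 128th term in memory


def build_sparse_index(sorted_terms, interval=INTERVAL):
    """Return (sampled_terms, positions) for every interval-th term."""
    positions = list(range(0, len(sorted_terms), interval))
    sampled = [sorted_terms[p] for p in positions]
    return sampled, positions


def lookup(term, sampled, positions, sorted_terms, interval=INTERVAL):
    """Return the position of *term* in sorted_terms, or -1 if absent."""
    # Binary search in memory: last sampled term that is <= term.
    slot = bisect.bisect_right(sampled, term) - 1
    if slot < 0:
        return -1  # term sorts before the first term in the list
    start = positions[slot]
    end = min(start + interval, len(sorted_terms))
    # Sequential read of one block; on disk this would be a single seek.
    for pos in range(start, end):
        if sorted_terms[pos] == term:
            return pos
    return -1


# Hypothetical term list: 1000 sorted terms, so 8 sampled entries.
terms = [f"term{i:05d}" for i in range(1000)]
sampled, positions = build_sparse_index(terms)
found = lookup("term00300", sampled, positions, terms)
```

With this layout the memory cost is 1/128 of the term list, at the 
price of one short sequential read per lookup.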

> (BTW just to reiterate I'm much more concerned with the amount of 
> main/core/RAM memory used than how much disk space is used). 

It's very difficult to have fast indexing with low memory requirements. 
It's also not trivial to have fast searches and fast re-indexing of single 
documents (i.e. updating the index without re-indexing everything). The 
good news is that all of this -- including phrase searches, fuzzy 
searches, wildcard searches etc.  -- and much more is already solved by 
Apache Lucene (http://lucene.apache.org/). The C++ version of Lucene is at 
http://sourceforge.net/projects/clucene/. I'm a committer for the (Java) 
Lucene project, so let me know if you have any questions.

Regards
 Daniel

-- 
http://www.danielnaber.de
_______________________________________________
KMail developers mailing list
KMail-devel@kde.org
https://mail.kde.org/mailman/listinfo/kmail-devel