
List:       lucene-user
Subject:    Indexing source files
From:       Axel <axelclk () gmail ! com>
Date:       2005-08-31 19:20:39
Message-ID: 7a49c63505083112204f02a672 () mail ! gmail ! com

Hi

I'm quite new to Lucene, and I'm looking for information on how to
start implementing a search engine for code completion, phpdoc
hovering, etc.
Currently I'm using an approach similar to ctags (http://ctags.sourceforge.net/),
which is slow to start up for large projects.
For the PHP scripting language I've already created a parser that
extracts function, class, and method names, phpdoc sections, etc.
for a given file.
I've also created a reduced binary AST of the PHP file, which I would
like to store with the Lucene document.

What's the best way to store all this information in a Lucene search index?
Something like:
Field.Text("content", <string of all parsed identifiers>);

or is it better to write something like this for every identifier:
Field.Keyword("function","myfunction");
Field.Keyword("class","myclass");
...

Is it possible to store the complete AST (in a serializable format)
in the document, or should the extra information be stored for every
(function) identifier (for example number of arguments, source
start, source end, javadoc start, javadoc end, ...)?
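
To make the per-identifier option concrete, here is a minimal sketch of
what one such record could look like. It deliberately uses no Lucene
calls; the class name IdentifierEntry, its field names, and the
toFields() helper are all hypothetical and only illustrate how the extra
information might be flattened into name/value pairs, one pair per index
field. The doc-comment offsets would follow the same pattern as the
source offsets.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-identifier record; the names here are illustrative
// and are not part of Lucene or of the existing PHP parser.
final class IdentifierEntry {
    final String kind;      // e.g. "function", "class", "method"
    final String name;      // e.g. "myfunction"
    final int argCount;     // number of arguments
    final int sourceStart;  // offset where the declaration starts
    final int sourceEnd;    // offset where the declaration ends

    IdentifierEntry(String kind, String name, int argCount,
                    int sourceStart, int sourceEnd) {
        this.kind = kind;
        this.name = name;
        this.argCount = argCount;
        this.sourceStart = sourceStart;
        this.sourceEnd = sourceEnd;
    }

    // Flattens the record into ordered name/value pairs. Each pair
    // could then become one field of an index document.
    Map<String, String> toFields() {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("kind", kind);
        fields.put("name", name);
        fields.put("argCount", Integer.toString(argCount));
        fields.put("sourceStart", Integer.toString(sourceStart));
        fields.put("sourceEnd", Integer.toString(sourceEnd));
        return fields;
    }
}
```

With this layout every identifier would become its own small document,
so a completion lookup would only need the "kind" and "name" pairs and
could read the offsets back from the matching entry.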

Has anyone already implemented something similar?
Is Lucene the right tool for such a task?

-- 
Axel Kramer
http://www.phpeclipse.de - PHP Eclipse Plugin

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

