List:       slide-dev
Subject:    Re: Lucene reindexer?
From:       Stefan_Lützkendorf <luetzkendorf () apache ! org>
Date:       2005-03-30 16:48:02
Message-ID: 424AD842.1010408 () apache ! org

Hi Eirikur,

I recently checked in a first attempt at initializing an index for an existing store
(tested with the txfile store).

The indexer scans all docs in the store if there is no index on startup.
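
As a rough illustration of that startup check (this is not the code that was checked in; the class and method names are made up for this sketch, a plain map stands in for both the store and the Lucene index, and an empty index is treated as "no index"):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StartupIndexSketch {

    /** Hypothetical stand-in for a search index such as a Lucene index. */
    static class Index {
        final Map<String, String> entries = new LinkedHashMap<>();

        boolean isEmpty() {
            return entries.isEmpty();
        }

        void add(String uri, String content) {
            entries.put(uri, content);
        }
    }

    /**
     * If the index has no entries at startup, walk every document in the
     * store and index it. Returns the number of documents indexed, or 0
     * when an index already existed and the scan was skipped.
     */
    static int initializeIfMissing(Map<String, String> store, Index index) {
        if (!index.isEmpty()) {
            return 0; // index already present: nothing to rebuild
        }
        int count = 0;
        for (Map.Entry<String, String> doc : store.entrySet()) {
            index.add(doc.getKey(), doc.getValue());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed example data: URI -> document content.
        Map<String, String> store = new LinkedHashMap<>();
        store.put("/files/a.txt", "hello");
        store.put("/files/b.txt", "world");

        Index index = new Index();
        System.out.println(initializeIfMissing(store, index)); // first start: full scan
        System.out.println(initializeIfMissing(store, index)); // later start: skipped
    }
}
```

The point of the design is that the full scan is paid only once: on later startups the existing index is reused as-is.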

Give it a try if you want.

Regards, Stefan

Eirikur Hrafnsson wrote:
> Hi James,
> 
> do you have time now to update/integrate the batch indexer for Slide? I
> really need it badly : /
> 
> best regards
> Eirikur, Idega.
> 
> 
> On 12.3.2005, at 06:15, James Mason wrote:
> 
>> Sorry I didn't post this earlier. I was hoping I'd have time to clean it
>> up and actually integrate it into Slide, but I've been completely
>> swamped lately.
>>
>> I've uploaded a code dump from my first working version to
>> http://cvs.apache.org/~masonjm/batchindexer/
>>
>> I know that it contains bugs, since I've fixed a few since I made the dump.
>> Also, keep in mind that the code won't work as posted. The only
>> implementation I've made works with Autonomy for the search engine, and
>> I didn't post the piece that actually talks to Autonomy. There's nothing
>> in there that would be useful for Lucene anyway.
>>
>> To make this generally useful there will need to be an implementation of
>> QueueProcessor that supports Lucene. I've included an example
>> implementation (for Autonomy) that should be a good starting point.
>>
>> There also needs to be a way to start/stop the batch indexer. I've
>> implemented a Spring-based MVC webapp for controlling it on my server,
>> but I'm not sure if this is the best approach for a more general
>> solution. Also, this is one area I know for sure contains bugs. Someone
>> who actually knows what they're doing should take a look at the run()
>> logic for BatchIndexer to make it properly resumable. My latest version
>> seems to work alright, but this is an earlier snapshot so the logic
>> still has errors.
>>
>> Also, since this whole thing uses Spring to glue everything together
>> you'll need to get the Spring jars for it to work. I *think* I patched
>> the code in CVS to expose the ApplicationContext to the lower levels. I
>> think a servlet filter would be a better approach, but be aware that if
>> you want to do this with Slide 2.1 you'll need to go through some extra
>> steps.
>>
>> Holler if there are any questions.
>>
>> -James
>>
>> On Wed, 2005-03-09 at 11:15 +0000, Eirikur Hrafnsson wrote:
>>
>>> Hi Stefan,
>>>
>>> On 9.3.2005, at 08:52, Stefan Lützkendorf wrote:
>>>
>>>> Hi Eirikur,
>>>>
>>>> the reindex problem is still unresolved :-(.
>>>> I'm currently thinking about this, because I think it's crucial too.
>>>
>>> Yup, especially when you want to use Lucene on an existing store.
>>> Somebody mentioned he was working on a batch indexer when we last
>>> discussed this and he was going to commit it, was it Christophe or
>>> Daniel perhaps...I can't find the email....
>>>
>>> cheers
>>> Eiki, Idega.
>>>
>>>>
>>>> Stefan
>>>>
>>>> Eirikur Hrafnsson wrote:
>>>>
>>>>> Hi all (long time no bugging you... ; )
>>>>> a while ago I asked if there was a way to re-index the lucene index
>>>>> for slide. This is pretty crucial feature in my opinion since the
>>>>> Slide index is always stored on the file system regardless of what
>>>>> kind of store you have thus making it harder to move a website from
>>>>> development to production, backing it up and especially when you want
>>>>> to enable the lucene indexing on an existing Slide store...
>>>>> Is this possible today?
>>>>> Best Regards
>>>>> Eirikur S. Hrafnsson, eiki@idega.is
>>>>> Chief Software Engineer
>>>>> Idega Software
>>>>> http://www.idega.com
>>>>> p.s.
>>>>> the SimpleXMLExtractor XPath stuff still doesn't work if you specify
>>>>> a namespace other than "DAV:"  : (
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: slide-dev-unsubscribe@jakarta.apache.org
>>>>> For additional commands, e-mail: slide-dev-help@jakarta.apache.org
>>>>
>>>>
>>>>
>>>> -- 
>>>> Stefan Lützkendorf  --  luetzkendorf@apache.org
>>>>
>>> Best Regards
>>>
>>> Eirikur S. Hrafnsson, eiki@idega.is
>>> Chief Software Engineer
>>> Idega Software
>>> http://www.idega.com
>>>
>>>
>>
>>
> Best Regards
> 
> Eirikur S. Hrafnsson, eiki@idega.is
> Chief Software Engineer
> Idega Software
> http://www.idega.com
> 
> 


-- 
Stefan Lützkendorf  --  luetzkendorf@apache.org

