
List:       kinosearch
Subject:    [KinoSearch] only update certain document fields
From:       marvin () rectangular ! com (Marvin Humphrey)
Date:       2006-11-24 5:54:32
Message-ID: 25A2966C-9C4E-4B5B-BF88-10F581F92E50 () rectangular ! com

On Nov 22, 2006, at 12:54 AM, Marc Elser wrote:
> Would it be possible to only select the SQL-Fields 'Address, ZIP,  
> City, Country' and update only the 'Address' field in KinoSearch?

It's not possible to update anything in KS/Lucene.  You can only  
delete/add.  Aside from the deletions files, which are per-segment  
bitmaps with one bit per document, no segment files are ever modified  
once written.
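
To make the deletions bitmap concrete, here is a minimal Python sketch of one bit per document. The class and method names are made up for illustration and are not KinoSearch or Lucene API; the bit ordering within each byte is also an assumption.

```python
class DeletionBitmap:
    """Toy per-segment deletions bitmap: one bit per document.

    Hypothetical illustration only, not the real on-disk format.
    """

    def __init__(self, doc_count):
        self.doc_count = doc_count
        # Round up to whole bytes; every bit starts at 0 (not deleted).
        self.bits = bytearray((doc_count + 7) // 8)

    def delete(self, doc_num):
        if not 0 <= doc_num < self.doc_count:
            raise IndexError(doc_num)
        # Set the bit for this document; nothing else in the segment changes.
        self.bits[doc_num >> 3] |= 1 << (doc_num & 7)

    def is_deleted(self, doc_num):
        return bool(self.bits[doc_num >> 3] & (1 << (doc_num & 7)))


deletions = DeletionBitmap(10000)
deletions.delete(1274)
```

Marking a document as deleted flips a single bit, which is why this is the one structure that can cheaply be rewritten after the segment exists.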

The serialized, stored documents are housed in a single file,  
the .fdt file.  File pointers to individual documents are housed in  
the .fdx file, which is a pile of 64-bit integers.  You can't go into  
the .fdt file and modify it.  It's just not designed for anything  
other than recovery.
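
A rough Python sketch of the pointer-file idea follows. It only assumes what is said above (one 64-bit integer per document in the .fdx data, pointing into the .fdt data). The big-endian byte order, the length-prefixed record layout, and the function names are assumptions for illustration, not the real file format.

```python
import io
import struct


def write_docs(docs):
    """Return (fdt_bytes, fdx_bytes) for a list of serialized documents."""
    fdt = io.BytesIO()
    fdx = io.BytesIO()
    for doc in docs:
        # One 64-bit integer per document: where its record starts in .fdt.
        fdx.write(struct.pack(">q", fdt.tell()))
        # Hypothetical record layout: 4-byte length, then the payload.
        fdt.write(struct.pack(">I", len(doc)))
        fdt.write(doc)
    return fdt.getvalue(), fdx.getvalue()


def read_doc(fdt_bytes, fdx_bytes, doc_num):
    """Fetch one stored document by following its pointer."""
    (offset,) = struct.unpack_from(">q", fdx_bytes, doc_num * 8)
    (length,) = struct.unpack_from(">I", fdt_bytes, offset)
    start = offset + 4
    return fdt_bytes[start:start + length]


fdt_bytes, fdx_bytes = write_docs(
    [b"first doc", b"second, longer doc", b"third"]
)
```

Because the records are packed end to end, making one document longer in place would shift every later record and invalidate every later pointer.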

But let's say you could.  There's a more serious problem.  The index  
files in each segment are created by tearing each document in the  
collection apart, then pooling the fragments and reassembling them in  
a sorted order that can be searched.  Say you modified the content of  
document 1274 in a 10000-document segment.  Now you have little bits  
and pieces that are wrong scattered throughout these index  
structures, with no good way to go in and modify them.

Think of the index at the end of a book.  If you change the contents  
of a chapter, then a few of the entries on every index page now need  
to be modified.  There's no good way to handle that other than to  
regenerate the index.
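
Here is a toy Python sketch of the "tear apart, pool, and sort" idea, using a plain whitespace tokenizer and made-up sample documents. It is a simplified inverted index, not the actual KS/Lucene data structures, and the function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of doc numbers containing it."""
    postings = defaultdict(list)
    for doc_num, text in enumerate(docs):
        # Tear the document apart into terms and pool them with the rest.
        for term in set(text.lower().split()):
            postings[term].append(doc_num)
    # Reassemble in sorted term order, as a searchable structure would be.
    return {term: postings[term] for term in sorted(postings)}


def stale_terms(index, doc_num, new_text):
    """Terms whose posting lists would have to change if doc_num's
    content were replaced by new_text."""
    old = {term for term, doc_nums in index.items() if doc_num in doc_nums}
    new = set(new_text.lower().split())
    return sorted(old ^ new)


docs = ["red apple pie", "green apple tart", "red wine"]
index = build_inverted_index(docs)
touched = stale_terms(index, 1, "yellow banana tart")
```

Even in this tiny example, changing one document means editing entries spread across several separate posting lists; in a 10000-document segment those entries are scattered throughout the sorted files.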

> If so, how can I do this? Can I update the document somehow, or is  
> it possible to read the already indexed parts, update them with the  
> modified data and then delete and re-add the document (but how do I  
> preserve vector data, boost and things like this then?)

Deleting and re-adding is the only solution.
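
As a sketch of the pattern (not KinoSearch API: the ToyIndex class, its methods, the "id" key field, and the sample values are all invented for illustration), delete-then-re-add looks roughly like this in Python. Note that the whole document, with every field, is supplied again.

```python
class ToyIndex:
    """Hypothetical append-only index used only to show the pattern."""

    def __init__(self):
        self.docs = []        # stored documents, never modified in place
        self.deleted = set()  # doc numbers marked as deleted

    def add_doc(self, fields):
        self.docs.append(dict(fields))
        return len(self.docs) - 1

    def delete_by_term(self, field, value):
        # Mark every live document whose field matches; nothing is rewritten.
        for doc_num, doc in enumerate(self.docs):
            if doc_num not in self.deleted and doc.get(field) == value:
                self.deleted.add(doc_num)

    def live_docs(self):
        return [d for n, d in enumerate(self.docs) if n not in self.deleted]


def replace_doc(index, key_field, fields):
    """Delete the old version by its unique key, then add the new one."""
    index.delete_by_term(key_field, fields[key_field])
    return index.add_doc(fields)


index = ToyIndex()
index.add_doc({"id": "42", "address": "1 Old Road", "zip": "8000",
               "city": "Zurich", "country": "CH"})
new_num = replace_doc(index, "id", {"id": "42", "address": "9 New Street",
                                    "zip": "8000", "city": "Zurich",
                                    "country": "CH"})
```

In practice that means selecting all the indexed columns for the changed row from SQL, not just the one that changed, and re-adding the complete document.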

Best,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


