
List:       solr-user
Subject:    Re: automatic index time field?
From:       "ryan mckinley" <ryantxu () gmail ! com>
Date:       2006-12-14 2:13:46
Message-ID: 176776ee0612131813x2169f5d5ha73ea7aaa56438d7 () mail ! gmail ! com


Thanks for the advice.  I implemented option #2, following the directions at:
 http://wiki.apache.org/solr/HowToContribute

and made:
  http://issues.apache.org/jira/browse/SOLR-82

The only change I might make is to have the schema record whether any of its
fields have default values, so that DocumentBuilder.getDoc() does not cycle
through all fields when none do.
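
A rough sketch of that idea, for illustration only. FieldDef, Schema and
applyDefaults are hypothetical stand-ins, not the real SchemaField,
IndexSchema or DocumentBuilder classes, and "NOW" is copied as a literal
string here rather than resolved to a date:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; these are NOT Solr's real schema classes.
final class DefaultFieldSketch {

    /** A field declaration: a name plus an optional default (null = no default). */
    static final class FieldDef {
        final String name;
        final String defaultValue;

        FieldDef(String name, String defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    /** A schema that works out once, at load time, which fields carry a default. */
    static final class Schema {
        private final List<FieldDef> fieldsWithDefaults = new ArrayList<>();

        Schema(List<FieldDef> fields) {
            for (FieldDef f : fields) {
                if (f.defaultValue != null) {
                    fieldsWithDefaults.add(f);
                }
            }
        }

        List<FieldDef> getFieldsWithDefaults() {
            return fieldsWithDefaults;
        }
    }

    /** Adds defaults for fields the document lacks; loops over nothing if there are none. */
    static Map<String, String> applyDefaults(Schema schema, Map<String, String> doc) {
        Map<String, String> out = new LinkedHashMap<>(doc);
        for (FieldDef f : schema.getFieldsWithDefaults()) {
            out.putIfAbsent(f.name, f.defaultValue);
        }
        return out;
    }

    public static void main(String[] args) {
        Schema schema = new Schema(Arrays.asList(
                new FieldDef("id", null),
                new FieldDef("timestamp", "NOW"),
                new FieldDef("forSale", "false"),
                new FieldDef("type", "unknown")));

        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");

        // prints {id=1, timestamp=NOW, forSale=false, type=unknown}
        System.out.println(applyDefaults(schema, doc));
    }
}
```

The per-document cost then depends only on the number of fields that have
defaults, not on the total number of fields in the schema.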

Thanks
ryan



On 12/13/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
>
> : Is there a way to automatically set a field when a document is indexed?
> : Specifically, I'd like to have a date field updated to the current
> : time when a document is indexed.
>
> Your message reminded me that i never announced the new "Date Math"
> parsing code, which does let you say something like...
>
>   <field name="timestamp">NOW</field>
>
> ...in your <add><doc> calls, but there is currently no way to have
> "default" values for fields in your schema ... it's on the wishlist, but
> no one is currently pursuing it as far as i know.
>
> : I have a bunch of stuff stored in SQL, my plan is to:
> :  * note the current time
>
> ...the gist of your plan is sound, but to eliminate possible headaches
> from clock sync issues, instead of getting the "current time" from
> somewhere, i would query your index for all docs (of the type
> you are interested in) sorted by date desc, and then note the date of the
> newest doc and later delete all docs with dates up to and including that
> one.
>
> : My options are:
> : 1) Send the index time along with the document.
> : 2) extend UpdateHandler (DirectUpdateHandler2) to do this automatically
> :
> : 1) is the easiest but requires that everyone sending data sends a valid
> : "index_time" field.
> : 2) more complicated, but then we know everything has a valid
> : "index_time" field.
>
> As i said, you could just put "NOW" in all of your docs, but if you are
> interested in pursuing option#2, the most general purpose and reusable
> approach might be to add an optional default="value" attribute to the
> <field> declarations in the schema.xml (relevant classes are SchemaField
> and IndexSchema) and then modify the DocumentBuilder.getDoc method to
> check for any default values of fields the Document doesn't already have
> values for and add them .. then your timestamp field becomes...
>
> <field name="timestamp" type="date" indexed="true" stored="true"
> default="NOW" />
>
> ..but you can also have other default fields...
>
> <field name="forSale" type="boolean" indexed="true" stored="true"
> default="false" />
> <field name="type" type="string" indexed="true" stored="true"
> default="unknown" />
>
> ...etc.
>
>
> -Hoss
>
>
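
For reference, the cleanup step Hoss describes (note the date of the newest
doc, then later delete everything up to and including it) might look
roughly like the sketch below. It is only an illustration under
assumptions: the class and method names are made up, the timestamps are
assumed to be UTC strings in one fixed ISO-8601 format so that they compare
correctly as plain strings, and in practice the newest date would come from
a sorted query against the index rather than from an in-memory list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the names here are illustrative, not part of Solr.
final class DeleteStaleSketch {

    /**
     * Returns the newest timestamp. Assumes every value uses the same
     * ISO-8601 UTC format, so lexicographic order equals time order.
     */
    static String newestTimestamp(List<String> timestamps) {
        return Collections.max(timestamps);
    }

    /**
     * Builds a delete-by-query message covering all docs whose date field
     * is at or before the cutoff (the range with square brackets is inclusive).
     */
    static String buildDeleteXml(String dateField, String cutoff) {
        return "<delete><query>" + dateField + ":[* TO " + cutoff + "]</query></delete>";
    }

    public static void main(String[] args) {
        // Dates noted before re-adding everything from SQL.
        List<String> indexed = Arrays.asList(
                "2006-12-12T09:30:00Z",
                "2006-12-13T18:00:00Z",
                "2006-12-01T00:00:00Z");

        String cutoff = newestTimestamp(indexed);

        // prints <delete><query>timestamp:[* TO 2006-12-13T18:00:00Z]</query></delete>
        System.out.println(buildDeleteXml("timestamp", cutoff));
    }
}
```

Because the cutoff is taken from the index itself, it does not matter
whether the clocks on the client and the server agree.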

