List:       lucene-dev
Subject:    RE: corrupted index
From:       Otis Gospodnetic <otis_gospodnetic () yahoo ! com>
Date:       2002-04-02 5:05:47

I changed the recipient from -user to -dev list, as that seems more
appropriate.
I think this would not be a bad idea, if we do it right.
Things like IndexLockedException, etc. sound alright to me.
I think Doug once welcomed such a change on one of the lists, too.

Perhaps a list of suggested exceptions, new exception classes and
appropriate patches would be the best contribution.
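
As a rough starting point for discussion, here is a minimal sketch of
what such a hierarchy could look like. The three class names come from
Matt's list below; the common IndexException base class, the choice to
extend IOException (so that existing catch blocks keep compiling), and
the open()/recover() helpers are assumptions made for illustration, not
existing Lucene API.

```java
import java.io.IOException;

// Hypothetical sketch: none of these classes are part of Lucene today.
// Extending IOException is an assumption; it keeps old callers compiling.
class IndexException extends IOException {
    IndexException(String message) { super(message); }
}

class IndexNotFoundException extends IndexException {
    IndexNotFoundException(String message) { super(message); }
}

class IndexLockedException extends IndexException {
    IndexLockedException(String message) { super(message); }
}

class IndexCorruptException extends IndexException {
    IndexCorruptException(String message) { super(message); }
}

class IndexExceptionDemo {
    // Stand-in for opening an index; the flags simulate the error cases.
    static void open(boolean exists, boolean locked) throws IndexException {
        if (!exists) throw new IndexNotFoundException("no index at this path");
        if (locked) throw new IndexLockedException("lock file present");
    }

    // Shows how a caller could pick a recovery strategy per exception type.
    static String recover(boolean exists, boolean locked) {
        try {
            open(exists, locked);
            return "opened";
        } catch (IndexNotFoundException e) {
            return "create";   // build a fresh index
        } catch (IndexLockedException e) {
            return "retry";    // wait, or remove a stale lock
        } catch (IndexException e) {
            return "reindex";  // anything else, e.g. corruption
        }
    }
}
```

The point of the sketch is only that distinct exception types let the
caller branch on the cause instead of parsing IOException messages.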

Thanks,
Otis

--- Matt Tucker <matt@jivesoftware.com> wrote:
> Hey all,
> 
> Actually, using shutdown hooks might not be the best idea since Lucene
> is very often used in server-side Java environments. Many app-servers
> throw security errors when trying to add shutdown hooks, and I've seen
> Weblogic crash before when having them in a webapp. Has anyone else run
> into this?
> 
> This all brings up a key issue with Lucene, which is that there is
> little way to recover from errors gracefully. I'd love to see a number
> of checked exceptions added. For example:
>
>  IndexNotFoundException -- when trying to open an index that doesn't
>    exist
>  IndexLockedException -- when a lock file prevents you from getting an
>    index
>  IndexCorruptException -- maybe this would be thrown when an index
>    appears to be broken?
> 
> At the moment, Lucene throws many undocumented IOExceptions and even
> NullPointerExceptions when an error case comes up. I catch these in my
> app, but there's really not an intelligent way to recover from them.
> Adding checked exceptions would be a change of the API, but it seems
> worth it. I'd be happy to make a more specific proposal if other people
> feel like this would be a worthwhile direction to go in.
> 
> Regards,
> Matt
> 
> Quoting "Spencer, Dave" <dave@lumos.com>:
> 
> > Runtime.addShutdownHook:
> > 
> > http://java.sun.com/j2se/1.3/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
> > 
> > -----Original Message-----
> > From: Otis Gospodnetic [ mailto:otis_gospodnetic@yahoo.com]
> > Sent: Sunday, March 17, 2002 12:06 AM
> > To: Lucene Users List
> > Subject: Re: corrupted index
> > 
> > 
> > Oh, I just thought of something (wine does body good).
> > Perhaps one could use Runtime (the class) to catch the JVM shutdown
> > and do whatever is needed to prevent index corruption.  I believe
> > there are some shutdown hook methods in there that may let you do
> > that.  I'm too lazy to look up the API docs now, but I remember
> > reading about that once, and perhaps it was even mentioned on one of
> > the 2 Lucene mailing lists.
> > 
> > On the other hand, it would be great to have a tool that can verify
> > an existing index.  I don't know enough about the actual file
> > structure yet to write something like that, but maybe somebody else
> > has done that already or would like to contribute.
> > 
> > Otis
> > 
> > 
> > --- "Steven J. Owens" <puffmail@darksleep.com> wrote:
> > > Otis,
> > >
> > > > You can remove the .lock file and try re-indexing or continuing
> > > > indexing where you left off.
> > > > I am not sure about the corrupt index.  I have never seen it
> > > > happen, and I believe I recall reading some messages from Doug
> > > > Cutting saying that index should never be left in an inconsistent
> > > > state.
> > >
> > >      Obviously never "should" be, but if something's pulling the
> > > rug out from under his JRE, changes could be only partially
> > > written, right?
> > >
> > >      Or is the writing format in some sense transactionally safe?
> > > I've never worked directly on something like this, but I worked at
> > > a database software company where they used transaction semantics
> > > and a journaling scheme to fake a "bulletproof" file system.  Is
> > > this how the index-writing code is implemented?
> > >
> > >      In general, I can guess Doug's response - just torch the old
> > > index directory and rebuild it; Lucene's indexing is fast enough
> > > that you don't need to get clever.  This seems to be Doug's stance
> > > in general (i.e. "don't get fancy, I already put all the fanciness
> > > you'll need into extremely fast indexing and searching").  So far,
> > > it seems to work :-).
> > >
> > > > I could be making this up, though, so I suggest you search
> > > > through lucene-user and lucene-dev archives on
> > > > www.mail-archive.com.
> > > > A search for "corrupt" should do it.
> > > > Once you figure things out maybe you can post a summary here.
> > >
> > >      I got a little curious, so I went and did the searches.  There
> > > is exactly one message in each list archive (dev and users) with
> > > the keyword "corrupt" in it.  The lucene-users instance is
> > > irrelevant:
> > >
> > > http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg00557.html
> > >
> > >      The lucene-dev instance is more useful:
> > >
> > > http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00157.html
> > >
> > >      It's a post from Doug, dated Sept 27, 2001, about adding not
> > > just thread-safety but process-safety:
> > >
> > >   It should be impossible to corrupt an index through the Lucene
> > >   API.  However if a Lucene process exits unexpectedly it can leave
> > >   the index locked.  The remedy is simply to, at a time when it is
> > >   certain that no processes are accessing the index, remove all
> > >   lock files.
> > >  
> > >      So it sounds like it's worth trying just removing the lock
> > > files.  Hm, is there a way to come up with a "sanity check" you can
> > > run on an index to make sure it's not corrupted?  This might be an
> > > excellent thing to reassure yourself with: something went wrong?
> > > Run a sanity check, if it fails just reindex.
> > >
> > > Steven J. Owens
> > > puff@darksleep.com
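
For reference, the shutdown-hook idea from the quoted thread, combined
with Matt's concern about app-servers rejecting hooks, could be sketched
roughly as follows. Runtime.addShutdownHook is the real JDK call; the
ShutdownHookDemo class and tryRegister helper are hypothetical names,
and what the cleanup task should actually do for a Lucene index is left
open here.

```java
class ShutdownHookDemo {
    // Tries to register a cleanup task to run at JVM shutdown.
    // Returns false instead of failing if the environment refuses.
    static boolean tryRegister(Runnable cleanup) {
        try {
            Runtime.getRuntime().addShutdownHook(new Thread(cleanup));
            return true;
        } catch (SecurityException e) {
            // A security manager (e.g. in an app-server) denied the hook.
            return false;
        } catch (IllegalStateException e) {
            // The JVM is already shutting down; too late to register.
            return false;
        }
    }
}
```

A caller would treat a false return as "no hook available" and fall back
to some other strategy, such as checking for stale lock files at startup.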




