
List:       lucene-dev
Subject:    Re: Term vectors: .tvf format question
From:       Erik Hatcher <erik () ehatchersolutions ! com>
Date:       2004-06-16 0:21:20
Message-ID: 17830E9E-BF2B-11D8-A59F-000393A564E6 () ehatchersolutions ! com

On Jun 12, 2004, at 11:46 PM, Doug Cutting wrote:
>> Just out of curiosity - are there any other known inconsistencies 
>> with the file formats documentation?
>
> Good question.  Let me think...
>
> The segments file has also changed format, and this is not yet 
> reflected in the file format documentation.

I've just updated this.  Again, let me know if anything is wrong and 
I'll correct it.

> We should probably also somewhere make clear what's changed.  We 
> promise to do so at the top of the file, but don't.  So perhaps 
> sections which have changed should get "since 1.4" or "changed in 1.4" 
> notices or somesuch.  This will make life much easier for ports.

Sometime in the future I would like to formalize the file format 
structure.  Perhaps an XML file that describes each file, down to its 
bits and bytes, in the same detail that fileformats.html does now.  If 
we did this rigorously enough, it shouldn't be too big a leap to add a 
verification test that ensures the files written match the specified 
structure.
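
As a rough sketch of what such a descriptor and check might look like 
(in Python for brevity; the XML element names, the type names, and the 
three-field "toy" layout are all made up for illustration and do not 
describe any real Lucene file):

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical descriptor. The <file>/<field> vocabulary is invented for
# this sketch; it is not an existing Lucene schema.
DESCRIPTOR = """
<file name="toy">
  <field name="version" type="int32"/>
  <field name="count" type="int32"/>
  <field name="flags" type="uint8"/>
</file>
"""

# Map the made-up type names to struct format codes.
TYPE_CODES = {"int32": "i", "int64": "q", "uint8": "B"}


def load_layout(xml_text):
    """Return the ordered list of (field name, type name) pairs."""
    root = ET.fromstring(xml_text.strip())
    return [(f.get("name"), f.get("type")) for f in root.findall("field")]


def verify(data, layout):
    """Check that data has exactly the described size, then decode it.

    Assumes fixed-width, big-endian fields only; variable-length
    encodings would need a real parser rather than a struct format.
    """
    fmt = ">" + "".join(TYPE_CODES[t] for _, t in layout)
    expected = struct.calcsize(fmt)
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    values = struct.unpack(fmt, data)
    return {name: value for (name, _), value in zip(layout, values)}
```

A test could then write a file with the real writer, read the bytes 
back, and call verify() on them with the layout loaded from the 
descriptor; a size mismatch raises ValueError.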

I'm not sure whether any code could be generated from such a 
descriptor, but that may also be an option for keeping things tightly 
in sync.

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
