
List:       lucene-dev
Subject:    [jira] [Comment Edited] (LUCENE-6153) randomize stored fields/vectors index blocksize
From:       "Mike Drob (JIRA)" <jira () apache ! org>
Date:       2014-12-31 16:53:13
Message-ID: JIRA.12764393.1420041556000.122371.1420044793267 () Atlassian ! JIRA


    [ https://issues.apache.org/jira/browse/LUCENE-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262297#comment-14262297 ]

Mike Drob edited comment on LUCENE-6153 at 12/31/14 4:52 PM:
-------------------------------------------------------------

{code:title=CompressingStoredFieldsFormat.java}
+    if (blockSize < 1) {
{code}
{code:title=CompressingStoredFieldsIndexWriter.java}
+    if (blockSize <= 0) {
{code}
{code:title=CompressingTermVectorsFormat.java}
+    if (blockSize < 1) {
{code}

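It would be nice for these checks to be consistent. Assuming blockSize is an int, {{blockSize < 1}} and {{blockSize <= 0}} reject the same values, so one form could be used everywhere.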
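
One way to keep the three checks in step would be a single shared helper. The sketch below is only an illustration: {{BlockSizeCheck}} and {{checkBlockSize}} are hypothetical names, not part of the patch or of Lucene, and the exception type and message are my assumption.

```java
/** Hypothetical helper, not part of the patch: one place for the blockSize check. */
final class BlockSizeCheck {
  private BlockSizeCheck() {}

  /** Returns blockSize unchanged if it is at least 1, otherwise throws. */
  static int checkBlockSize(int blockSize) {
    if (blockSize < 1) {
      throw new IllegalArgumentException("blockSize must be >= 1, got: " + blockSize);
    }
    return blockSize;
  }
}
```

Each of the three classes could then call {{BlockSizeCheck.checkBlockSize(blockSize)}} instead of spelling out its own comparison.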

{code:title=Lucene50StoredFieldsFormat.java}
-        return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST, 1 << 14, 128);
+        return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST, 1 << 14, 128, 1024);
{code}
Can we have a constant for the default block size (1024)? We might as well also have constants for whatever 1 << 14 and 128 are, but that can be a follow-on issue.
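
A minimal sketch of what such constants might look like. All names here are hypothetical. I am also assuming, from the order of the constructor arguments, that 1 << 14 is the chunk size and 128 is the maximum number of documents per chunk; the diff itself does not say so.

```java
/** Hypothetical holder for the literals in the diff; names are illustrative only. */
final class StoredFieldsDefaults {
  private StoredFieldsDefaults() {}

  /** Assumed meaning of 1 << 14: chunk size in bytes (16384). */
  static final int FAST_CHUNK_SIZE = 1 << 14;

  /** Assumed meaning of 128: maximum number of documents per chunk. */
  static final int FAST_MAX_DOCS_PER_CHUNK = 128;

  /** Default index block size, matching the new last argument in the diff. */
  static final int DEFAULT_BLOCK_SIZE = 1024;
}
```

The call in Lucene50StoredFieldsFormat would then pass the three named values in place of the bare literals.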


was (Author: mdrob):
{code|title:CompressingStoredFieldsFormat.java}
+    if (blockSize < 1) {
{code}
{code|title:CompressingStoredFieldsIndexWriter.java}
+    if (blockSize <= 0) {
{code}
{code|title:CompressingTermVectorsFormat.java}
+    if (blockSize < 1) {
{code}

It would be nice for these to be consistent.

{code|title:Lucene50StoredFieldsFormat.java}
-        return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST, 1 << 14, 128);
+        return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST, 1 << 14, 128, 1024);
{code}
Can we have a constant for the default block size (1024)? We might as well also have constants for whatever 1 << 14 and 128 are, but that can be a follow-on issue.

> randomize stored fields/vectors index blocksize
> -----------------------------------------------
> 
> Key: LUCENE-6153
> URL: https://issues.apache.org/jira/browse/LUCENE-6153
> Project: Lucene - Core
> Issue Type: Test
> Reporter: Robert Muir
> Attachments: LUCENE-6153.patch
> 
> 
> The Compressing impls compress documents into chunks. We then record index data for
> every N chunks, which is binary searched to find the start of the chunk. Today N
> is always 1024. This means that to test the stored fields index well, we need to index
> thousands and thousands of documents. But if we randomize the parameter, we can
> test it more effectively by setting it to very low values (e.g. 5) in tests.
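
To illustrate why a small block size exercises the index with few documents, here is a toy model. It is not Lucene's implementation: {{ToyChunkIndex}} and its methods are hypothetical, and it only mimics the idea of recording one index entry per N chunks and binary searching those entries.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only, not Lucene's implementation: one index entry per blockSize chunks. */
final class ToyChunkIndex {
  private final int blockSize;
  // First doc ID of each block, in increasing order.
  private final List<Integer> blockStartDocs = new ArrayList<>();
  private int chunkCount = 0;

  ToyChunkIndex(int blockSize) {
    if (blockSize < 1) {
      throw new IllegalArgumentException("blockSize must be >= 1, got: " + blockSize);
    }
    this.blockSize = blockSize;
  }

  /** Records a chunk starting at firstDoc; chunks must be added in increasing doc order. */
  void addChunk(int firstDoc) {
    if (chunkCount % blockSize == 0) {
      blockStartDocs.add(firstDoc); // this chunk opens a new block
    }
    chunkCount++;
  }

  int blockCount() {
    return blockStartDocs.size();
  }

  /** Binary search for the last block whose first doc ID is <= docId; -1 if there is none. */
  int findBlock(int docId) {
    int lo = 0;
    int hi = blockStartDocs.size() - 1;
    int result = -1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (blockStartDocs.get(mid) <= docId) {
        result = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return result;
  }
}
```

In this model, 20 chunks with a block size of 1024 produce a single index entry, so the binary search has nothing to choose between. The same 20 chunks with a block size of 5 produce 4 entries, which is enough to hit both branches of the search.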



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

