
List:       lucene-dev
Subject:    [jira] [Commented] (LUCENE-7351) BKDWriter should compress doc ids when all values in a block are th
From:       "Robert Muir (JIRA)" <jira () apache ! org>
Date:       2016-06-30 16:21:10
Message-ID: JIRA.12981802.1466580817000.5473.1467303670160 () Atlassian ! JIRA


    [ https://issues.apache.org/jira/browse/LUCENE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357388#comment-15357388 ]

Robert Muir commented on LUCENE-7351:
-------------------------------------

I like this better than the last patch; I think the optimization is more general.

I think in the base test class, {{tesMostEqual()}} is a mistake?

> BKDWriter should compress doc ids when all values in a block are the same
> -------------------------------------------------------------------------
> 
> Key: LUCENE-7351
> URL: https://issues.apache.org/jira/browse/LUCENE-7351
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7351.patch, LUCENE-7351.patch
> 
> 
> BKDWriter writes doc ids using 4 bytes per document. I think it should compress
> them similarly to postings when all docs in a block have the same packed value.
> This can happen either when a field has a default value which is common across
> documents or when quantization makes the number of unique values so small that a
> large index will necessarily have blocks that all contain the same value (e.g.
> there are only 63490 unique half-float values).
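
To make the idea in the description concrete, here is a minimal sketch of one way such a block could be compressed. It is not the code from the attached patches. It assumes that when every doc in a block shares the same packed value, the doc ids may be reordered freely, so they can be sorted and written as variable-length deltas, much as postings are. The class and method names ({{DocIdBlockSketch}}, {{encodePlain}}, {{encodeSortedDeltas}}, {{decodeSortedDeltas}}) are hypothetical and are not Lucene APIs.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Lucene.
public class DocIdBlockSketch {

    // Baseline described in the issue: a fixed 4 bytes per doc id.
    static byte[] encodePlain(int[] docIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int id : docIds) {
            out.write(id >>> 24);
            out.write(id >>> 16);
            out.write(id >>> 8);
            out.write(id);
        }
        return out.toByteArray();
    }

    // Assumption: all values in the block are equal, so doc id order carries
    // no information and the ids can be sorted before delta encoding.
    static byte[] encodeSortedDeltas(int[] docIds) {
        int[] sorted = docIds.clone();
        Arrays.sort(sorted);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int id : sorted) {
            int delta = id - previous; // non-negative because ids are sorted
            // Variable-length int: 7 payload bits per byte, high bit = "more".
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            previous = id;
        }
        return out.toByteArray();
    }

    // Reverses encodeSortedDeltas; returns the doc ids in sorted order.
    static int[] decodeSortedDeltas(byte[] bytes, int count) {
        int[] docIds = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            docIds[i] = previous;
        }
        return docIds;
    }
}
```

With the four doc ids 1000, 1003, 1001 and 1010, the plain encoding takes 16 bytes while the sorted-delta encoding takes 5. The saving shrinks as the gaps between doc ids grow, so a real writer would presumably fall back to the fixed-width form when the block does not qualify.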



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

