
List:       cassandra-commits
Subject:    [jira] [Commented] (CASSANDRA-13442) Support a means of strongly consistent highly available replication with tunable storage requirements
From:       "DOAN DuyHai (JIRA)" <jira@apache.org>
Date:       2017-09-30 9:22:01
Message-ID: JIRA.13063582.1492021729000.244639.1506763321336@Atlassian.JIRA


    [ https://issues.apache.org/jira/browse/CASSANDRA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186999#comment-16186999 ]

DOAN DuyHai commented on CASSANDRA-13442:
-----------------------------------------

> After a repair we know that some subset of the data set is fully replicated. At
> that point we don't have to read from a quorum of nodes for the repaired data. It
> is sufficient to read from a single node for the repaired data and a quorum of
> nodes for the unrepaired data.

Also, how do we define *repaired data*? Is it defined at the token range level? After a
round of repair we can segregate SSTables into repaired and unrepaired buckets, fine.
But does that segregation carry over to the token range level?

Suppose, for simplification, the range [0 - 10[ (out of a [0 - 100[ total range). Even
if a round of repair has just finished for this range, a single subsequent update
falling into this range means the whole range can no longer be considered
repaired ...
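
To make the concern concrete, here is a minimal sketch (plain Python with
illustrative names, not Cassandra internals): repaired/unrepaired status lives on
SSTables, and a single write after a repair round puts unrepaired data back into an
already-repaired token range.

{code}
from dataclasses import dataclass, field

@dataclass
class SSTable:
    rows: dict              # token -> value
    repaired: bool = False  # flagged after a successful repair round

@dataclass
class Replica:
    sstables: list = field(default_factory=list)
    memtable: dict = field(default_factory=dict)

    def write(self, token, value):
        self.memtable[token] = value

    def flush(self):
        # Freshly flushed SSTables always start out unrepaired.
        if self.memtable:
            self.sstables.append(SSTable(rows=dict(self.memtable)))
            self.memtable = {}

    def mark_range_repaired(self, lo, hi):
        # Model a finished repair round: flag SSTables wholly inside the range.
        self.flush()
        for s in self.sstables:
            if all(lo <= t < hi for t in s.rows):
                s.repaired = True

    def range_fully_repaired(self, lo, hi):
        # The range only counts as repaired if no unrepaired SSTable intersects it.
        return all(s.repaired for s in self.sstables
                   if any(lo <= t < hi for t in s.rows))

r = Replica()
r.write(3, "a"); r.write(7, "b")
r.mark_range_repaired(0, 10)
print(r.range_fully_repaired(0, 10))  # True: the whole range was just repaired

r.write(5, "c"); r.flush()            # a single update after the repair round
print(r.range_fully_repaired(0, 10))  # False: the range holds unrepaired data again
{code}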

> Support a means of strongly consistent highly available replication with tunable storage requirements
> -------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13442
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13442
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction, Coordination, Distributed Metadata, Local Write-Read Paths
> Reporter: Ariel Weisberg
> 
> Replication factors like RF=2 can't provide strong consistency and availability
> because if a single node is lost it's impossible to reach a quorum of replicas.
> Stepping up to RF=3 will allow you to lose a node and still achieve quorum for
> reads and writes, but requires committing additional storage. The requirement of a
> quorum for writes/reads doesn't seem to be something that can be relaxed without
> additional constraints on queries, but it seems like it should be possible to relax
> the requirement that 3 full copies of the entire data set are kept. What is
> actually required is a covering data set for the range, and we should be able to
> achieve a covering data set and high availability without having three full copies.
> After a repair we know that some subset of the data set is fully replicated. At
> that point we don't have to read from a quorum of nodes for the repaired data. It
> is sufficient to read from a single node for the repaired data and a quorum of
> nodes for the unrepaired data. One way to exploit this would be to have N replicas,
> say the last N replicas (where N varies with RF) in the preference list, delete all
> repaired data after a repair completes. Subsequent quorum reads will be able to
> retrieve the repaired data from either of the two full replicas and the unrepaired
> data from a quorum read of any replica, including the "transient" replicas.
> Configuration for something like this in NTS might be something similar to
> { DC1="3-1", DC2="3-2" }, where the first value is the replication factor used for
> consistency and the second value is the number of transient replicas. If you
> specify { DC1=3, DC2=3 } then the number of transient replicas defaults to 0 and
> you get the same behavior you have today.
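
For illustration, a minimal sketch (plain Python, hypothetical helper names, not the
proposed patch) of the "3-1" style configuration described above and the read path it
would enable: repaired data can be served by a single full replica, while unrepaired
data still needs a quorum computed over the full replication factor.

{code}
def parse_rf(spec: str):
    """Parse an RF spec like "3-1" (rf 3, 1 transient) or "3" (no transients)."""
    if "-" in spec:
        rf, transient = (int(x) for x in spec.split("-"))
    else:
        rf, transient = int(spec), 0
    return rf, transient, rf - transient  # (total, transient, full replicas)

def replicas_to_read(spec: str, data_repaired: bool) -> int:
    rf, transient, full = parse_rf(spec)
    if data_repaired:
        return 1          # any one full replica holds a complete repaired copy
    return rf // 2 + 1    # quorum over all rf replicas, transient ones included

for spec in ("3-1", "3"):
    rf, transient, full = parse_rf(spec)
    print(f"{spec}: total={rf} full={full} transient={transient} "
          f"reads(repaired)={replicas_to_read(spec, True)} "
          f"reads(unrepaired)={replicas_to_read(spec, False)}")
{code}

Under "3-1" a quorum is still 2 of 3, but only two of the three nodes keep full
copies, which is where the storage saving would come from.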




