
List:       ceph-users
Subject:    [ceph-users] Scrubbing question
From:       lionel-subscription () bouton ! name (Lionel Bouton)
Date:       2015-11-26 15:37:24
Message-ID: 56572734.2030802 () bouton ! name

On 26/11/2015 15:53, Tomasz Kuzemko wrote:
> ECC will not be able to recover the data, but it will always be able to
> detect that data is corrupted.

No. That's a theoretical impossibility: detection relies on some kind of
hash computed over the memory content, and any hash shorter than the
data it covers allows collisions. Cryptographic hashes make collisions
practically impossible to trigger, but memory controllers obviously
can't use them to protect the memory content: the verification would be
prohibitive (both in hardware cost and in latency). Most ECC
implementations use Hamming codes, which correct all single-bit errors
and detect all 2-bit errors but can produce false negatives for errors
of 3 or more bits. There's even speculation that modern hardware makes
this more likely, because individual chips now use buses wider than
1 bit, so a defective chip no longer holds just 1 bit of a byte returned
by a read, but several.
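
To make the false-negative case concrete, here is a minimal sketch in
Python. It assumes a toy extended Hamming(8,4) code (4 data bits, 3
Hamming parity bits, 1 overall parity bit) purely for illustration; real
ECC memory typically uses wider codes such as (72,64), and the function
names (encode, decode, flip, outcomes) are hypothetical, not taken from
any memory controller or library.

```python
from itertools import combinations


def encode(data):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7
    overall = sum(word) % 2               # position 8: parity of the rest
    return word + [overall]


def decode(word):
    """Return (status, data) the way a simple SECDED decoder would."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos - 1]:
            syndrome ^= pos    # XOR of the positions of all set bits
    odd = sum(word) % 2        # 1 if an odd number of bits flipped
    if syndrome == 0 and not odd:
        status = "ok"
    elif odd:
        # Assumed to be a single-bit error: flip the indicated bit
        # (syndrome 0 means the overall parity bit itself).
        word[(syndrome or 8) - 1] ^= 1
        status = "corrected"
    else:
        status = "uncorrectable"   # even flip count, nonzero syndrome
    return status, [word[2], word[4], word[5], word[6]]


def flip(word, positions):
    """Return a copy of word with the bits at 0-based positions flipped."""
    out = list(word)
    for p in positions:
        out[p] ^= 1
    return out


def outcomes(data, nflips):
    """Count (status, data_intact) over every pattern of nflips bit flips."""
    counts = {}
    code = encode(data)
    for positions in combinations(range(8), nflips):
        status, decoded = decode(flip(code, positions))
        key = (status, decoded == data)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    for n in range(1, 5):
        print(n, "flipped bits:", outcomes(data, n))
```

With this toy code, every 1-bit error is corrected and every 2-bit error
is flagged as uncorrectable. Every 3-bit error, however, is reported as
"corrected" while the returned data is wrong, and 14 of the 70 possible
4-bit errors are reported as "ok".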

>  AFAIK under Linux this results in
> immediate halt of system, so it would not be able to report bad checksum
> data during deep-scrub.

It can, it's just less likely.

Lionel
