List:       linux-btrfs
Subject:    Re: [PATCH URGENT v1.1 0/2] btrfs-progs: Fix the nobarrier behavior of write
From:       Qu Wenruo <quwenruo.btrfs () gmx ! com>
Date:       2019-03-31 14:42:54
Message-ID: b4f5c3f5-3f8b-08b2-5552-e7f9b6b670e7 () gmx ! com

Not so gentle ping.

IMHO this fix alone is worth a minor release.

Thanks,
Qu

On 2019/3/27 10:48 PM, Qu Wenruo wrote:
> 
> 
> On 2019/3/27 10:07 PM, Adam Borowski wrote:
>> On Wed, Mar 27, 2019 at 05:46:50PM +0800, Qu Wenruo wrote:
>>> This urgent patchset can be fetched from github:
>>> https://github.com/adam900710/btrfs-progs/tree/flush_super
>>> Which is based on v4.20.2.
>>>
>>> Before this patch, btrfs-progs writes to the fs have no barrier at all.
>>> All metadata and superblock writes are just buffered writes, with no
>>> barrier between the superblock and metadata writes.
>>>
>>> No wonder even clearing the space cache can cause serious transid
>>> corruption on an originally good fs.
>>>
>>> Please merge this fix as soon as possible as I really don't want to see
>>> btrfs-progs corrupting any fs any more.
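
To make the missing barrier concrete, here is a minimal sketch of the
ordering a commit needs: metadata first, a flush, then the superblock, then
another flush. It uses only POSIX pwrite() and fsync(). The helper names
write_full() and commit_with_barriers() are hypothetical and are not taken
from btrfs-progs or from the patchset; this only illustrates the idea.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer at offset `off`. */
static int write_full(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n <= 0)
            return -1;          /* treat short/failed writes as errors */
        p += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/*
 * Hypothetical commit sequence:
 *   1. write the new metadata blocks
 *   2. fsync() so they are on stable storage
 *   3. write the superblock that points at them
 *   4. fsync() again so the superblock itself is durable
 * Without step 2, the superblock may reach the disk before the metadata
 * it references.
 */
int commit_with_barriers(int fd,
                         const void *meta, size_t meta_len, off_t meta_off,
                         const void *sb, size_t sb_len, off_t sb_off)
{
    if (write_full(fd, meta, meta_len, meta_off) < 0)
        return -1;
    if (fsync(fd) < 0)
        return -1;
    if (write_full(fd, sb, sb_len, sb_off) < 0)
        return -1;
    if (fsync(fd) < 0)
        return -1;
    return 0;
}
```

The point of the sketch is only the order of the calls; how the real fix
places its flushes is defined by the patches themselves.
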
>>
>> How often does this happen in practice?  I'm slightly incredulous about
>> btrfs-progs crashing often.  Especially since pwrite() is buffered on the
>> kernel side, so we'd need a _kernel_ crash (usually a power loss) to break
>> consistency.  Obviously, a potential data loss bug is always something that
>> needs fixing, I'm just wondering about severity.
> 
> Here is a valid case where a crash could cause a transid error:
> 
> - transaction 1
>   new eb at 16K (fs root, gen = 1)
>   new eb at 32K (extent root, gen = 1)
>   new eb at 48K (tree root, gen = 1)
>   sb->fs root = gen 1
>   sb->extent root = gen 1
>   sb->tree root = gen 1
> 
> - transaction 2
>   new eb at 64K (extent root, gen = 2)
>   new eb at 80K (tree root, gen = 2)
>   sb->fs root = gen 1 at 16K
>   sb->extent root = gen 2
>   sb->tree root = gen 2
> 
> - transaction 3, half-baked due to an error during commit transaction
>   new eb at 16K (tree root, gen = 3) submitted
> 
> In the above case, we will write the newest eb at 16K to disk, but with the
> sb from transaction 2.
> 
> Then the sb expects to read out a tree with gen 1, but gets a tree with gen 3.
> Furthermore, even if we ignore the generation mismatch, the content of the
> eb at 16K is completely wrong: the super block of gen 2 expects fs root
> content from the eb at 16K, but its content is tree root.
> 
> This should explain the severity much better.
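
The scenario above can be replayed on a toy model. This is only a sketch:
the types and the function fs_root_valid() are made up for illustration and
do not correspond to btrfs-progs structures. Each 16K block records which
tree owns it and at which generation; the superblock records where it
expects each root and at which generation.

```c
#include <stddef.h>

/* Which tree a block belongs to (illustrative names only). */
enum tree { NONE, FS_ROOT, EXTENT_ROOT, TREE_ROOT };

/* One 16K block on the toy disk. */
struct block { enum tree owner; unsigned gen; };

/* What the superblock records for one root. */
struct root_ref { size_t slot; unsigned gen; };

struct super { struct root_ref fs, extent, tree; };

/* Map a byte offset given in KiB to a 16K slot index. */
#define SLOT(kib) ((size_t)(kib) / 16)

static int ref_matches(const struct block *disk, enum tree want,
                       struct root_ref ref)
{
    return disk[ref.slot].owner == want && disk[ref.slot].gen == ref.gen;
}

/*
 * Replay transactions 1 and 2, then optionally let the half-committed
 * transaction 3 overwrite the block at 16K without updating the
 * superblock. Returns 1 if the superblock's fs root reference still
 * matches what is on disk, 0 otherwise.
 */
int fs_root_valid(int txn3_block_reaches_disk)
{
    struct block disk[8] = { { NONE, 0 } };
    struct super sb;

    /* transaction 1 */
    disk[SLOT(16)] = (struct block){ FS_ROOT, 1 };
    disk[SLOT(32)] = (struct block){ EXTENT_ROOT, 1 };
    disk[SLOT(48)] = (struct block){ TREE_ROOT, 1 };
    sb.fs     = (struct root_ref){ SLOT(16), 1 };
    sb.extent = (struct root_ref){ SLOT(32), 1 };
    sb.tree   = (struct root_ref){ SLOT(48), 1 };

    /* transaction 2: fs root is untouched and stays at 16K, gen 1 */
    disk[SLOT(64)] = (struct block){ EXTENT_ROOT, 2 };
    disk[SLOT(80)] = (struct block){ TREE_ROOT, 2 };
    sb.extent = (struct root_ref){ SLOT(64), 2 };
    sb.tree   = (struct root_ref){ SLOT(80), 2 };

    /* transaction 3: new block lands at 16K, superblock never updated */
    if (txn3_block_reaches_disk)
        disk[SLOT(16)] = (struct block){ TREE_ROOT, 3 };

    return ref_matches(disk, FS_ROOT, sb.fs);
}
```

With this model, fs_root_valid(0) reports a consistent fs root, while
fs_root_valid(1) reports a mismatch in both owner and generation, which is
the transid error described above.
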
> 
> Thanks,
> Qu
> 
>>
>> Or do I understand this wrong?
>>
>> Asking because Dimitri John Ledkov stepped down as Debian's maintainer of
>> this package, and I'm taking up the mantle (with Nicholas D Steeves being
>> around) -- modulo any updates other than important bug fixes being on hold
>> because of Debian's freeze.  Thus, I wonder if this is important enough to
>> ask for a freeze exception.
>>
>>
>> Meow!
>>
> 

