List:       ocfs2-devel
Subject:    Re: [Ocfs2-devel] The last part of the file is zeroed out when write N random bytes
From:       Gang He <ghe () suse ! com>
Date:       2021-09-29 0:57:07
Message-ID: 9b1a45f7-ff5d-8c0e-bb72-5b7edb4d7573 () suse ! com

Hi Guys,

Just an update.
Based on our testing, the problem was caused by the following commit in fs/buffer.c:

commit 6dbf7bb555981fb5faf7b691e8f6169fc2b2e63b
Author: Jan Kara <jack@suse.cz>
Date:   Fri Sep 4 10:58:51 2020 +0200

     fs: Don't invalidate page buffers in block_write_full_page()
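
For anyone who wants to check a file for the symptom described in the original report below (the last part of the file reads back as zeros on another node), here is a minimal sketch. It is not the pastebin test script; the function name md5_and_zero_tail and the chunk size are only illustrative assumptions. It computes the md5 of a file and counts how many zero bytes sit at its end.

```python
import hashlib


def md5_and_zero_tail(path, chunk_size=1 << 20):
    """Return (md5 hex digest, number of trailing zero bytes) for a file.

    Hypothetical helper for illustration; not part of the original test scripts.
    """
    digest = hashlib.md5()
    size = 0               # total bytes read so far
    last_nonzero_end = 0   # offset just past the last non-zero byte seen
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            stripped = chunk.rstrip(b"\x00")
            if stripped:
                last_nonzero_end = size + len(stripped)
            size += len(chunk)
    return digest.hexdigest(), size - last_nonzero_end
```

Running it on the same file from each node and comparing the two results should show whether the digests differ and whether one node sees a much longer run of zeros at the end. Random data can legitimately end in a few zero bytes, so a small count on its own does not prove anything.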


Thanks
Gang

On 2021/9/27 15:57, Gang He wrote:
> 
> 
> On 2021/9/27 15:49, Joseph Qi wrote:
> > 
> > Last week, Andrey Markov reported a similar issue, but unfortunately not
> > on the mailing list.
> > 
> > And Junxiao has resolved a similar issue recently. So can you reproduce
> > the bug on the latest kernel?
> Yes, I can reproduce this issue with the latest code.
> The cluster size must be greater than 4K (e.g. 8K, 1M); this is the key
> to the problem.
> 
> Thanks
> Gang
> 
> > 
> > Thanks,
> > Joseph
> > 
> > On 9/27/21 3:16 PM, Gang He wrote:
> > > Hi List,
> > > 
> > > I'd like to report a data loss bug that occurs when writing N random bytes,
> > > since I saw some related commits in the past weeks. I can reproduce this bug
> > > reliably with the latest ocfs2 kernel module code as follows:
> > > 1) Create a three-node (e.g. ghe-tw-nd1, ghe-tw-nd2, ghe-tw-nd3) ocfs2
> > >    cluster and attach a shared disk (e.g. /dev/vdb).
> > > 2) Format the disk with the command
> > >    "mkfs.ocfs2 -N 4 -b 4096 -C 1048576 /dev/vdb", and mount the disk at
> > >    /mnt/shared on each node. The cluster size must be greater than 4K; this
> > >    is the key to the problem.
> > > 3) Copy the file write/test scripts to the /mnt/shared directory, then run
> > >    the test script on node1 to reproduce this bug.
> > >    file write script ocfs2_fallocate_bug_plain_write.py: https://pastebin.com/QsXcD8rq
> > >    file test script ocfs2_loop.sh: https://pastebin.com/eTUe2hkW
> > > 4) You will then hit this bug: the file's md5sum differs between node1 and
> > >    node2. In fact, the last part of the file is zeroed out when read from
> > >    node2, e.g.
> > > file dump from node1: https://pastebin.com/HB92TVS0
> > > file dump from node2: https://pastebin.com/jBG7HdSz
> > > 
> > > More information:
> > > This bug does not exist on some old kernels (e.g. linux-4.12.14-120), but it
> > > happens on some newer kernels. I feel this bug is probably NOT caused by
> > > ocfs2 commits, since the problem also happened when I used old ocfs2 kernel
> > > module code on the new kernels. Anyway, if you have any comments, please
> > > reply to this mail.
> > > Thanks
> > > Gang
> > > 
> > > _______________________________________________
> > > Ocfs2-devel mailing list
> > > Ocfs2-devel@oss.oracle.com
> > > https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> > > 
> > 
> 
> 
> 



