[prev in list] [next in list] [prev in thread] [next in thread] 

List:       fuse-devel
Subject:    Re: [fuse-devel] [PATCH] fuse: fix stale data issue of virtiofs with dax
From:       Miklos Szeredi via fuse-devel <fuse-devel () lists ! sourceforge ! net>
Date:       2023-11-13 14:45:11
Message-ID: CAJfpegsqwWKcpcJCpOQ_EmfuX5k9RXgPSz0q6d6+9DVfEAsLVw () mail ! gmail ! com
[Download RAW message or body]

[adding Vivek Goyal and virtio-fs list]

On Sun, 5 Nov 2023 at 07:36, <obuil.liubo@gmail.com> wrote:
> 
> From: Liu Bo <bo.liu@linux.alibaba.com>
> 
> In open(O_TRUNC), it doesn't cleanup the corresponding dax mapping of
> an inode, which could expose stale data to users, which can be easily
> reproduced by accessing overlayfs through virtiofs with dax.
> 
> The reproducer is,
> 0. mkdir /home/test/lower /home/test/upper /home/test/work
> 1. echo "aaaaaa" > /home/test/lower/foobar
> 2. sudo mount -t overlay overlay \
> -olowerdir=/home/test/lower,upperdir=/home/test/upper,workdir=/home/test/work \
> /mnt/ovl 3. start a guest VM configured with virtiofs passthrough(_dax_ enabled and \
> cache policy is always) and use /mnt/ovl as the source directory.
> 4. ssh into the guest VM
> 5. mount virtiofs onto /mnt/test
> 5. cat /mnt/test/foobar
> aaaaaa
> 6. echo "xxxx" > /mnt/test/foobar
> 7. cat /mnt/test/foobar
> 
> w/o this patch, step 7 shows "aaaa" rather than "xxxx".

This is generally unsupported.  See Documentation/filesystems/overlayfs.rst:

"Changes to underlying filesystems
---------------------------------

Changes to the underlying filesystems while part of a mounted overlay
filesystem are not allowed.  If the underlying filesystem is changed,
the behavior of the overlay is undefined, though it will not result in
a crash or deadlock.

..."

Thanks,
Miklos


> 
> Step 5 reads the content of file 'foobar' by setting up a dax mapping, and the
> mapping eventually maps /home/test/lower/foobar because there is no copy up
> at this moment.
> 
> In step 6, 'foobar' is opened with O_TRUNC and the viriofs daemon on host side
> triggers a copy-up, "xxxx" is eventually written to /home/test/upper/foobar.
> Since 'foobar' is truncated to size 0 when writting, writes go with non-dax io
> path and update the file size in guest VM accordingly.
> 
> However, dax mapping still refers to the /home/test/lower/foobar, so what step
> 7 reads is /home/test/lower/foobar but with the new size, which shows "aaaa"
> rather "xxxx".
> 
> Reported-by: Cameron <cameron@northflank.com>
> Tested-by: Cameron <cameron@northflank.com>
> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
> ---
> fs/fuse/dax.c    | 9 +++++++++
> fs/fuse/dir.c    | 5 +++++
> fs/fuse/fuse_i.h | 1 +
> 3 files changed, 15 insertions(+)
> 
> diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
> index 8e74f278a3f6..d7f9ec7f4597 100644
> --- a/fs/fuse/dax.c
> +++ b/fs/fuse/dax.c
> @@ -385,6 +385,15 @@ void fuse_dax_inode_cleanup(struct inode *inode)
> WARN_ON(fi->dax->nr);
> }
> 
> +/* Callers need to make sure fi->i_mmap_sem is held. */
> +void fuse_dax_inode_cleanup_range(struct inode *inode, loff_t start)
> +{
> +       struct fuse_conn *fc = get_fuse_conn(inode);
> +       struct fuse_inode *fi = get_fuse_inode(inode);
> +
> +       inode_reclaim_dmap_range(fc->dax, inode, start, -1);
> +}
> +
> static void fuse_fill_iomap_hole(struct iomap *iomap, loff_t length)
> {
> iomap->addr = IOMAP_NULL_ADDR;
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index f67bef9d83c4..be7892e052b5 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -1788,6 +1788,9 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr \
>                 *attr,
> */
> i_size_write(inode, 0);
> truncate_pagecache(inode, 0);
> +                       if (fault_blocked && FUSE_IS_DAX(inode))
> +                               fuse_dax_inode_cleanup(inode);
> +
> goto out;
> }
> file = NULL;
> @@ -1883,6 +1886,8 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr \
> *attr, S_ISREG(inode->i_mode) && oldsize != outarg.attr.size) {
> truncate_pagecache(inode, outarg.attr.size);
> invalidate_inode_pages2(mapping);
> +               if (fault_blocked && FUSE_IS_DAX(inode))
> +                       fuse_dax_inode_cleanup_range(inode, outarg.attr.size);
> }
> 
> clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 9b7fc7d3c7f1..02fa7aa2bd56 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -1305,6 +1305,7 @@ void fuse_dax_conn_free(struct fuse_conn *fc);
> bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi);
> void fuse_dax_inode_init(struct inode *inode, unsigned int flags);
> void fuse_dax_inode_cleanup(struct inode *inode);
> +void fuse_dax_inode_cleanup_range(struct inode *inode, loff_t start);
> void fuse_dax_dontcache(struct inode *inode, unsigned int flags);
> bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment);
> void fuse_dax_cancel_work(struct fuse_conn *fc);
> --
> 2.32.1 (Apple Git-133)
> 


-- 
fuse-devel mailing list
To unsubscribe or subscribe, visit \
https://lists.sourceforge.net/lists/listinfo/fuse-devel


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic