
List:       linux-ext4
Subject:    Re: [PATCH v2] ext4: unconditionally enable the i_version counter
From:       Jeff Layton <jlayton () kernel ! org>
Date:       2022-07-27 15:58:22
Message-ID: 262e94a64185df7cf9eb1d5c537095242a33b746.camel () kernel ! org

On Wed, 2022-07-27 at 11:48 -0400, Benjamin Coddington wrote:
> On 27 Jul 2022, at 10:37, Jeff Layton wrote:
> 
> > The original i_version implementation was pretty expensive, requiring a
> > log flush on every change. Because of this, it was gated behind a mount
> > option (implemented via the MS_I_VERSION mount option flag).
> > 
> > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the
> > i_version counter much less expensive, so there is no longer a
> > performance penalty from enabling it. xfs and btrfs already enable it
> > unconditionally when the on-disk format can support it.
> > 
> > Have ext4 ignore the SB_I_VERSION flag, and just enable it
> > unconditionally. While we're in here, remove the handling of
> > Opt_i_version as well, since we're almost to 5.20 anyway.
> > 
> > Ideally, we'd couple this change with a way to disable the i_version
> > counter (just in case), but the way the iversion mount option was
> > implemented makes that difficult to do. We'd need to add a new mount
> > option altogether or do something with tune2fs. That's probably best
> > left to later patches if it turns out to be needed.
> > 
> > Cc: Dave Chinner <david@fromorbit.com>
> > Cc: Lukas Czerner <lczerner@redhat.com>
> > Cc: Benjamin Coddington <bcodding@redhat.com>
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/ext4/inode.c |  5 ++---
> >  fs/ext4/super.c | 13 ++++---------
> >  2 files changed, 6 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 84c0eb55071d..c785c0b72116 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
> >  			return -EINVAL;
> >  		}
> > 
> > -		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
> > +		if (attr->ia_size != inode->i_size)
> >  			inode_inc_iversion(inode);
> > 
> >  		if (shrink) {
> > @@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
> >  	}
> >  	ext4_fc_track_inode(handle, inode);
> > 
> > -	if (IS_I_VERSION(inode))
> > -		inode_inc_iversion(inode);
> > +	inode_inc_iversion(inode);
> > 
> >  	/* the do_update_inode consumes one bh->b_count */
> >  	get_bh(iloc->bh);
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 845f2f8aee5f..4b06f394d7d1 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1585,7 +1585,7 @@ enum {
> >  	Opt_inlinecrypt,
> >  	Opt_usrjquota, Opt_grpjquota, Opt_quota,
> >  	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> > -	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> > +	Opt_usrquota, Opt_grpquota, Opt_prjquota,
> >  	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
> >  	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> >  	Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
> > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
> >  	fsparam_flag	("barrier",		Opt_barrier),
> >  	fsparam_u32	("barrier",		Opt_barrier),
> >  	fsparam_flag	("nobarrier",		Opt_nobarrier),
> > -	fsparam_flag	("i_version",		Opt_i_version),
> 
> We've got to keep the parameter, I think, else we'll break existing
> setups with the i_version mount option.
> 

As Darrick pointed out, it had already been announced that the above
mount option would be removed by v5.20. We might as well drop it here,
since this likely wouldn't be merged before then anyway.

The "iversion" mount option is a different thing: it is parsed in the
userland mount program and gets turned into the MS_I_VERSION flag for
the mount syscall. That will still be done, though with this change the
kernel should now just ignore it.
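
For illustration, here is a rough userspace sketch of that split. It is
not the actual mount(8)/libmount code: split_mount_opts() is a
hypothetical helper, and it only handles the "iversion"/"noiversion"
pair, passing every other option through untouched as
filesystem-specific data. It shows why "iversion" reaches the kernel as
a flag bit, while ext4's own "i_version" string would land in the option
data that the filesystem parses itself.

```c
#include <string.h>
#include <sys/mount.h>

/* Fall back to the kernel's value if the libc header lacks the flag. */
#ifndef MS_I_VERSION
#define MS_I_VERSION (1UL << 23)
#endif

/*
 * Hypothetical helper (illustration only): walk a comma-separated option
 * string, turn "iversion"/"noiversion" into the MS_I_VERSION flag bit,
 * and copy every other option into fs_opts, which stands in for the
 * data argument of mount(2). Options that do not fit are dropped.
 */
static unsigned long split_mount_opts(const char *opts, char *fs_opts,
				      size_t fs_opts_len)
{
	unsigned long flags = 0;
	size_t used = 0;

	if (fs_opts_len == 0)
		return 0;
	fs_opts[0] = '\0';

	while (*opts != '\0') {
		size_t len = strcspn(opts, ",");

		if (len == 8 && strncmp(opts, "iversion", 8) == 0) {
			flags |= MS_I_VERSION;
		} else if (len == 10 && strncmp(opts, "noiversion", 10) == 0) {
			flags &= ~(unsigned long)MS_I_VERSION;
		} else if (len > 0 && used + len + 2 <= fs_opts_len) {
			/* Not a generic flag here: pass it on as fs data. */
			if (used > 0)
				fs_opts[used++] = ',';
			memcpy(fs_opts + used, opts, len);
			used += len;
			fs_opts[used] = '\0';
		}

		opts += len;
		if (*opts == ',')
			opts++;
	}

	return flags;
}
```

Given "data=ordered,iversion,nodelalloc", the helper returns
MS_I_VERSION and leaves "data=ordered,nodelalloc" as the data string.
Given "i_version", it returns 0 and hands the string to the filesystem
unchanged, which is the case the removed fsparam_flag entry used to
handle.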
-- 
Jeff Layton <jlayton@kernel.org>