List:       linux-fsdevel
Subject:    Re: Proposal to improve filesystem/block snapshot interaction
From:       David Chinner <dgc () sgi ! com>
Date:       2007-10-31 7:04:02
Message-ID: 20071031070402.GZ995458 () sgi ! com

On Wed, Oct 31, 2007 at 03:01:58PM +1100, Greg Banks wrote:
> On Wed, Oct 31, 2007 at 10:56:52AM +1100, David Chinner wrote:
> > On Tue, Oct 30, 2007 at 03:16:06PM +1100, Neil Brown wrote:
> > > On Tuesday October 30, gnb@sgi.com wrote:
> > > > BIO_HINT_RELEASE
> > > >     The bio's block extent is no longer in use by the filesystem
> > > >     and will not be read in the future.  Any storage used to back
> > > >     the extent may be released without any threat to filesystem
> > > >     or data integrity.
> > > 
> > > If the allocation unit of the storage device (e.g. a few MB) does not
> > > match the allocation unit of the filesystem (e.g. a few KB) then for
> > > this to be useful either the storage device must start recording tiny
> > > allocations, or the filesystem should re-release areas as they grow.
> > > i.e. when releasing a range of a device, look in the filesystem's usage
> > > records for the largest surrounding free space, and release all of that.
> > 
> > I figured that the easiest way around this is reporting free space
> > extents, not the amount actually freed. e.g.
> > 
> > 	4k in file A @ block 10
> > 	4k in file B @ block 11
> > 	4k free space @ block 12
> > 	4k in file C @ block 13
> > 	1008k free space @ block 14
> > 
> > If we free file A, we report that we've released an extent of 4k @ block 10.
> > If we then free file B, we report we've released an extent of 12k @ block 10.
> > If we then free file C, we report a release of 1024k @ block 10.
> > 
> > Then the underlying device knows what the aggregated free space regions
> > are and can easily release large regions without needing to track tiny
> > allocations and frees done by the filesystem.
> 
> If you could do that in the filesystem, it would certainly solve the
> problem.  In which case I'll explicitly allow for the hint's extent to
> overlap previous extents thus hinted, and define the semantics
> for overlaps.  I think I'll rename the hint to BIO_HINT_RELEASED,
> which should make the semantics a little clearer.

I think that can be done - I wouldn't have mentioned it if I didn't
think it was possible to implement ;).

It will require a further btree lookup once the free transaction
hits the disk, but I think that's pretty easy to do. I'd probably
hook xfs_alloc_clear_busy() to do this.
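
To make the quoted file A/B/C example concrete, here is a minimal sketch
of reporting the merged free extent rather than just the range freed.
It is a toy model, not XFS code: the FreeSpaceMap class, its free()
method and the BLOCK_KB constant are hypothetical names, and a sorted
Python list stands in for the free space btree lookup mentioned above.

```python
import bisect

BLOCK_KB = 4  # assumed block size, matching the 4k blocks in the example


class FreeSpaceMap:
    """Toy free-space index (hypothetical, not XFS code).

    Holds sorted, non-overlapping (start_block, length_in_blocks) extents.
    """

    def __init__(self, free_extents=()):
        self.extents = sorted(free_extents)

    def free(self, start, length):
        """Free [start, start + length) and return the merged free extent.

        The returned (start, length) is what would be reported in the
        release hint: the whole surrounding free region, not only the
        blocks just freed. Double frees are not detected in this sketch.
        """
        end = start + length
        i = bisect.bisect_left(self.extents, (start, 0))
        # Merge with the free extent immediately to the left, if adjacent.
        if i > 0:
            prev_start, prev_len = self.extents[i - 1]
            if prev_start + prev_len == start:
                start = prev_start
                i -= 1
                del self.extents[i]
        # Merge with the free extent immediately to the right, if adjacent.
        if i < len(self.extents) and self.extents[i][0] == end:
            end += self.extents[i][1]
            del self.extents[i]
        self.extents.insert(i, (start, end - start))
        return start, end - start


# Layout from the example: 4k free @ block 12, 1008k (252 blocks) free @ block 14.
fsmap = FreeSpaceMap([(12, 1), (14, 1008 // BLOCK_KB)])
report_a = fsmap.free(10, 1)  # free file A -> (10, 1),   i.e. 4k @ block 10
report_b = fsmap.free(11, 1)  # free file B -> (10, 3),   i.e. 12k @ block 10
report_c = fsmap.free(13, 1)  # free file C -> (10, 256), i.e. 1024k @ block 10
```

Because each report covers the full merged region, later reports overlap
earlier ones, which is why the hint semantics need to allow overlapping
extents.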

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group