List: linux-fsdevel
Subject: Re: Proposal to improve filesystem/block snapshot interaction
From: David Chinner <dgc () sgi ! com>
Date: 2007-10-31 7:04:02
Message-ID: 20071031070402.GZ995458 () sgi ! com
On Wed, Oct 31, 2007 at 03:01:58PM +1100, Greg Banks wrote:
> On Wed, Oct 31, 2007 at 10:56:52AM +1100, David Chinner wrote:
> > On Tue, Oct 30, 2007 at 03:16:06PM +1100, Neil Brown wrote:
> > > On Tuesday October 30, gnb@sgi.com wrote:
> > > > BIO_HINT_RELEASE
> > > > The bio's block extent is no longer in use by the filesystem
> > > > and will not be read in the future. Any storage used to back
> > > > the extent may be released without any threat to filesystem
> > > > or data integrity.
> > >
> > > If the allocation unit of the storage device (e.g. a few MB) does not
> > > match the allocation unit of the filesystem (e.g. a few KB) then for
> > > this to be useful either the storage device must start recording tiny
> > > allocations, or the filesystem should re-release areas as they grow.
> > > i.e. when releasing a range of a device, look in the filesystem's usage
> > > records for the largest surrounding free space, and release all of that.
> >
> > I figured that the easiest way around this is reporting free space
> > extents, not the amount actually freed, e.g.:
> >
> > 4k in file A @ block 10
> > 4k in file B @ block 11
> > 4k free space @ block 12
> > 4k in file C @ block 13
> > 1008k free space @ block 14
> >
> > If we free file A, we report that we've released an extent of 4k @ block 10.
> > If we then free file B, we report we've released an extent of 12k @ block 10.
> > If we then free file C, we report a release of 1024k @ block 10.
> >
> > Then the underlying device knows what the aggregated free space regions
> > are and can easily release large regions without needing to track tiny
> > allocations and frees done by the filesystem.
>
> If you could do that in the filesystem, it would certainly solve the
> problem. In which case I'll explicitly allow for the hint's extent to
> overlap previously hinted extents, and define the semantics for
> overlaps. I think I'll rename the hint to BIO_HINT_RELEASED; that
> should make the semantics a little clearer.

I think that can be done - I wouldn't have mentioned it if I didn't
think it was possible to implement ;).

It will require a further btree lookup once the free transaction
hits the disk, but I think that's pretty easy to do. I'd probably
hook xfs_alloc_clear_busy() to do this.
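
To make the reporting rule concrete, here is a minimal sketch of the
block 10..14 example quoted above. It is illustrative Python, not XFS
code: release_extent() and the sorted list of (start, length) tuples are
hypothetical stand-ins for the extra free space btree lookup, and the
4k block size is taken from the example.

```python
def release_extent(free_extents, start, length):
    """Mark blocks [start, start + length) free and return the merged extent.

    free_extents is a sorted list of (start, length) tuples, in blocks, that
    is already fully coalesced. Returns (new_free_extents, reported_extent).
    """
    new_start, new_end = start, start + length
    kept = []
    for ext_start, ext_len in free_extents:
        ext_end = ext_start + ext_len
        if ext_end < new_start or ext_start > new_end:
            kept.append((ext_start, ext_len))   # not touching: keep as is
        else:
            # Adjacent or overlapping: absorb into the extent being reported.
            new_start = min(new_start, ext_start)
            new_end = max(new_end, ext_end)
    reported = (new_start, new_end - new_start)
    return sorted(kept + [reported]), reported


BLOCK_KB = 4
# Layout from the example: free space at block 12 (4k) and block 14 (1008k).
free = [(12, 1), (14, 1008 // BLOCK_KB)]
for name, block in (("A", 10), ("B", 11), ("C", 13)):
    free, (rstart, rlen) = release_extent(free, block, 1)
    print(f"free file {name}: release {rlen * BLOCK_KB}k @ block {rstart}")
```

Each report is the whole surrounding free extent, so later reports
overlap earlier ones (4k, then 12k, then 1024k, all @ block 10). That is
why the hint needs defined semantics for overlapping extents.
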
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group