List: netbsd-tech-kern
Subject: Re: RFC (reassign)buf and carvinf up buffers (was Re: SCSI MMC device abstraction and UDF patch for
From: Bill Studenmund <wrstuden () NetBSD ! org>
Date: 2005-12-29 21:45:52
Message-ID: 20051229214552.GD14308 () netbsd ! org
On Thu, Dec 29, 2005 at 09:47:25PM +0100, Reinoud Zandijk wrote:
> On Thu, Dec 29, 2005 at 09:58:51AM -0800, Bill Studenmund wrote:
> > > That implies having a VOP_BMAP() figuring this out. Since UDF can't use a
> > > VOP_BMAP this way (due to write shuffling) it would mean that VOP_BMAP
> > > needs to distinguish between read and write requests and for read-request
> > > try to figure out how much it can read in one go... quite expensive and
> > > locking trouble prone.
> >
> > This does not imply VOP_BMAP() figuring this out.
> >
> > The file system decides what data goes into what buffers. The file system
> > knows what blocks are where. Thus you don't have to figure all of this out
> > in the middle of your strategy routine, you can figure it out when you
> > make the buffers in the first place.
> >
> > More directly, you SHOULD figure it out before your strategy routine.
>
> Since UDF uses genfs, genfs decides the number of blocks to request by the
> `runp' variable set by its VOP_BMAP() call to the file system. Since UDF's
> bmap is a 1:1 translation it always returns the maximum run length with
>
> *runp = MAXPHYS / lb_size;
>
> to make full use of long extents and reduce the number of read
> transactions as much as possible. Note that this isn't happening yet, but
> that's the idea behind it. If I instead return 0 or 1, I get lb_size or
> 2*lb_size. So I'll probably have to subtract 1 from the *runp assignment :)

You should return the number of blocks that are contiguous. If the next
MAXPHYS bytes are all contiguous, return the runp value above. If not,
return less.

> > No, a VOP_STRATEGY() call does NOT represent a read/write that has nothing
> > to do with disk mapping, it represents a read or write of a buffer. Said
> > buffer represents an extent on disk. One extent. If you have multiple
> > extents in your transfer, you are dealing with multiple buffers.
>
> True, it is the read or write of a buffer that is created by genfs. So do I
> always have to return a run length of one then, losing all hope of
> multi-sector reads?

Actually look at your metadata layout. You (the fs) know where the blocks
are. You should know how many blocks are contiguous for the passed-in
offset. If you don't have MAXPHYS / lb_size worth of data in a row, return
the amount you have. If you do have (MAXPHYS / lb_size) worth, return
MAXPHYS / lb_size.

The test should be a simple conditional. It shouldn't be that hard. :-)

For sane files, you're always going to find a lot of blocks in a row. So
most of the time you are going to do (MAXPHYS / lb_size) blocks.

If you really have a file that is significantly fragmented, then you have
a severe performance issue. The time it takes to figure all of this out in
VOP_BMAP() will be next to nothing compared to disk access time. So while
you need to handle it, you don't need to worry about it performing well.
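
To make the "simple conditional" concrete, here is a rough sketch of the
run-length computation. Everything named sketch_* or SKETCH_* is made up for
illustration: the extent structure is a hypothetical in-core table, not UDF's
real allocation descriptors, and SKETCH_MAXPHYS is only a stand-in for the
kernel's MAXPHYS. The function returns the number of contiguous blocks
counting the starting block itself; whether the value stored in *runp should
be that count or one less is the open question raised above, and this sketch
does not settle it. It also only looks inside a single extent and does not
merge neighbouring extents that happen to be adjacent on disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXPHYS (64 * 1024)	/* assumption: stand-in for MAXPHYS */

/* Hypothetical extent: nblks logical blocks mapped contiguously on disk. */
struct sketch_extent {
	uint64_t lblk_start;	/* first logical block covered */
	uint64_t pblk_start;	/* physical block it maps to */
	uint64_t nblks;		/* length of the extent in blocks */
};

/*
 * Number of contiguous blocks available starting at lblk (including lblk),
 * capped at SKETCH_MAXPHYS / lb_size.  Returns 0 if lblk is not mapped.
 */
static uint64_t
sketch_contig_blocks(const struct sketch_extent *ext, size_t next,
    uint64_t lblk, uint32_t lb_size)
{
	uint64_t max_run, left;
	size_t i;

	assert(lb_size > 0 && lb_size <= SKETCH_MAXPHYS);
	max_run = SKETCH_MAXPHYS / lb_size;

	for (i = 0; i < next; i++) {
		if (lblk >= ext[i].lblk_start &&
		    lblk - ext[i].lblk_start < ext[i].nblks) {
			/* Blocks left in this extent from lblk onward. */
			left = ext[i].nblks - (lblk - ext[i].lblk_start);
			/* The simple conditional: a full run, or what we have. */
			return left < max_run ? left : max_run;
		}
	}
	return 0;
}
```

With 2048-byte logical blocks and a 64 KB cap, a lookup in the middle of a
long extent yields the full 32 blocks, while a lookup near the end of an
extent yields only the blocks that remain in it.
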
Take care,
Bill