List: opensolaris-ufs-discuss
Subject: Re: [ufs-discuss] Locking in directio codepath
From: Frank Batschulat <Frank.Batschulat () Sun ! COM>
Date: 2006-08-23 12:51:11
Message-ID: op.teqlzlrz176b0c () blade
On Thu, 17 Aug 2006 20:13:35 +0200, Samarjeet Tomar <sst117@gmail.com>
wrote:
> ufs_write() seems to be getting all the usual locks in the
> directio codepath too, which was introduced specifically
> for databases (Oracle?). My understanding was that the
> database expects the file system to not do any (or minimal)
> synchronization, as it itself takes care of the synchronization
> for access to database blocks. Is my understanding correct?
not entirely correct - you're mixing up two things here:
Direct I/O and Concurrent Direct I/O.
What is UFS Direct I/O?
Direct I/O is a UFS option to advise that file system
reads and writes should bypass the Solaris file system
page cache and go straight to the disk device. Direct
I/O uses a code path very similar to the one taken when
the raw disk device is used, but rather than bypassing
the file system completely, we simply bypass the file
system cache. This retains the regular file system
administration model while still providing an efficient
code path similar to that of raw disk.
What is Concurrent Direct I/O?
Concurrent Direct I/O is an enhanced implementation of
Direct I/O available from Solaris 8 Update 3 onwards. It
provides significant performance improvements for
databases over the standard Direct I/O implementation.
Concurrent Direct I/O allows multiple overlapping
reads/writes to the same file. Without it, only one
synchronous read or write can occur to a file at any
time.
However, there are some quirks concerning Concurrent Direct I/O.
The approach taken modifies the current directio data path.
No new interfaces were introduced and no existing interfaces
altered. All POSIX semantics have been preserved. To take advantage
of these changes, the file must be marked directio using the
directio() library call, or the "forcedirectio" mount option must be
applied to the entire file system.
In particular, this is a write-only, advisory interface: the
application cannot reasonably determine whether a directio (or
concurrent directio) operation has occurred. Also, certain
optimizations could not be done while maintaining strict POSIX
compliance.
However, it is possible to allow concurrent writers to a single file
in one restricted situation, the so-called "re-write" case:
concurrent writers to pre-allocated ufs files using an unbuffered
data path.
The current ufs directio data path was altered to special-case
"re-write" operations; that is what ufs_write() checks for via
ufs_check_rewrite() before calling ufs_directio_write():
http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/fs/ufs/ufs_vnops.c#392
which has the following comment:
402 /*
403 * Filter to determine if this request is suitable as a
404 * concurrent rewrite. This write must not allocate blocks
405 * by extending the file or filling in holes. No use trying
406 * through FSYNC descriptors as the inode will be synchronously
407 * updated after the write. The uio structure has not yet been
408 * checked for sanity, so assume nothing.
409 */
If those criteria are met, the inode's i_rwlock is held as RW_READER,
allowing concurrent write access, instead of being held as RW_WRITER
as usual. In addition, the inode's i_contents lock is also held as
RW_READER. Otherwise the "regular" directio code path is taken
via ufs_write()->wrip()->ufs_directio_write().
hth
--
frankB
(I'd rather be a forest than a street....)
_______________________________________________
ufs-discuss mailing list
ufs-discuss@opensolaris.org