
List:       linux-aio
Subject:    Re: [PATCH linux-2.6.1-rc1-mm1] filemap_fdatawait.patch
From:       Andrew Morton <akpm () osdl ! org>
Date:       2004-01-02 4:36:12
Message-ID: 20040101203612.15debff2.akpm () osdl ! org

Suparna Bhattacharya <suparna@in.ibm.com> wrote:
>
> On Wed, Dec 31, 2003 at 03:42:39PM -0800, Andrew Morton wrote:
> > Daniel McNeil <daniel@osdl.org> wrote:
> > >
> > > The other potential race in filemap_fdatawait() was that it
> > > removed the page from the locked list while waiting for a writeback
> > > and if there was a 2nd filemap_fdatawait() running on another cpu,
> > > it would not wait for the page being written since it would never see
> > > it on the list.
> > 
> > That would only happen if one thread or the other was not running under
> > i_sem.   The only path I see doing that is in generic_file_direct_IO()?
> 
> Yes, and we should simply fix generic_file_direct_IO to avoid doing so.
> We anyway issue filemap_fdatawait later with i_sem held.
> 
> The race that we need to worry about is between background writeouts
> (which don't take i_sem) and filemap_fdatawrite/filemap_fdatawait - i.e
> the first one discussed.

Well, Daniel has raised a second race here, between filemap_fdatawait() and
filemap_fdatawait().  The background writeback code does not execute
filemap_fdatawait() anyway, so no prob.

Yes, extending i_sem coverage in the case of O_DIRECT reads should suit.  In
the (vastly) common case mapping->nrpages is zero anyway, so we shouldn't
even enter that code.
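
For illustration only, here is a minimal userspace sketch of that idea.  It
is not kernel code: struct model_inode, struct model_mapping and every
model_* function are hypothetical stand-ins, and a plain boolean models
i_sem.  It shows the flush-and-wait step running under the lock, and being
skipped entirely when the mapping has no cached pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an address_space; counters record calls. */
struct model_mapping {
	unsigned long nrpages;	/* number of pages cached for the file */
	int flush_calls;	/* times the "fdatawrite" step ran */
	int wait_calls;		/* times the "fdatawait" step ran */
};

/* Hypothetical stand-in for an inode; a boolean models i_sem. */
struct model_inode {
	struct model_mapping mapping;
	bool i_sem_held;
};

static void model_down(struct model_inode *inode)
{
	assert(!inode->i_sem_held);	/* single-threaded model: no nesting */
	inode->i_sem_held = true;
}

static void model_up(struct model_inode *inode)
{
	assert(inode->i_sem_held);
	inode->i_sem_held = false;
}

/* Models write-then-wait; refuses to run without the lock held. */
static int model_flush_and_wait(struct model_inode *inode)
{
	assert(inode->i_sem_held);
	inode->mapping.flush_calls++;
	inode->mapping.wait_calls++;
	return 0;
}

/* Models the preparation an O_DIRECT read does before issuing I/O. */
int model_direct_read_prepare(struct model_inode *inode)
{
	int ret;

	/* Common case: nothing cached, so skip the flush path entirely. */
	if (inode->mapping.nrpages == 0)
		return 0;

	model_down(inode);
	ret = model_flush_and_wait(inode);
	model_up(inode);
	return ret;
}
```

With nrpages at zero the function returns without touching the lock; with
cached pages present, the flush and the wait both happen while the modelled
i_sem is held, so two waiters cannot interleave.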

> > > +		/*
> > > +		 * If the page is locked, it might be in process of being 
> > > +		 * setup for writeback but without PG_writeback set 
> > > +		 * and with PG_dirty cleared.
> > > +		 * (PG_dirty is cleared BEFORE PG_writeback is set)
> > > +		 * So, wait for the PG_locked to clear, then start over.
> > > +		 */
> > > +		if (PageLocked(page)) {
> > > +			page_cache_get(page);
> > > +			spin_unlock(&mapping->page_lock);
> > > +			wait_on_page_locked(page);
> > > +			page_cache_release(page);
> > > +			goto restart;
> > > +		}
> > 
> > Why is this a problem which needs addressing here?  If some other thread is
> > in the process of starting I/O against this page then the page must have
> > been clean when this thread ran filemap_fdatawrite()?
> 
> This is the same race that we have been discussing (background writer
> pulled this page off io_pages, put it on locked pages but hasn't set 
> PG_writeback as yet). To me it seemed that Daniel's solution was just an 
> alternative to what you proposed - i.e. adding lock_page() to filemap_fdatawait.
> I have to think a little about the fix -- AFAICS but we are all talking
> about the same (real) problem here.

Yup.  Realish, anyway.  Unless we can demonstrate that this is the cause of
the O_DIRECT data-exposure problems, this race isn't really very
interesting.  It should be plugged though I guess.
