List: freebsd-smp
Subject: Re: VM Commits / GIANT_ macros
From: Matt Dillon <dillon () earth ! backplane ! com>
Date: 2001-07-06 2:48:09
:On Wed, Jul 04, 2001 at 09:38:17AM -0700, Matt Dillon wrote:
:> Hello everyone! Ok, after talking with John and others at USENIX
:> and doing a couple of back and forths with Alfred, I am officially
:> taking over the main-line machine-independent VM system in -current.
:>
:> I will also be working on i386 pmap, vm_object, vm_map, and the buffer
:> cache (in regards to mutexes & Giant).
:
:Could you keep me posted wrt what locking is needed in pmap? This should
:allow me to keep PowerPC in sync. Don't have any plans to hit SMP on
:powerpc any time soon, but it'd be nice to have the infrastructure in
:place. =)
:
:--
:Benno Rice
:benno@FreeBSD.org
Sure, I'll post updates to freebsd-smp. Here's the first update:
I spent a good deal of Wednesday cleaning up the VM source files,
breaking them up into manageable pieces and moving vm_page_zero_idle()
from MD files to a new MI file.
I spent about four hours experimenting with various fine-grained VM
mutex models, e.g. simply by starting to code one and noting where I
would bog down. I believe I have come up with one that is usable
for vm_page_t manipulation.
The issue we have with vm_page_t is that various entities currently
depend on the atomic_* ops or Giant to do things like lookup a page
and then busy it. This previously occurred under splvm() in order
to guarantee that nobody else would be able to busy the page while
we were trying to. Now it occurs under Giant. The goal is to be able
to do these sorts of operations without Giant.
This same dependence is used to do things like add or remove a vm_page_t
from its page queue, and add or remove a vm_page_t from the (object, index)
hash table, and move vm_page_t's between page queues.
This is the solution as I envision it. It is a considerable amount of
work, which I will be doing in stages.
* We will have a mutex for each (PQ_XXX) page queue. The appropriate
page queue mutex will be obtained to add or remove a vm_page_t to
that page queue (happens a lot), and to scan the queue
(contigmalloc and the pageout daemon scan the page queues).
* We will have a small shared array of mutexes to lock the
(object, index) hash chains. For example, let's say you are in
vm_fault and do a vm_page_lookup() to lookup a page, and not
finding it you decide to vm_page_alloc() a new page. In order
to protect this sequence of events vm_page_lookup() will obtain
the appropriate hash chain mutex and leave it held on return
(whether or not the page is found). The caller will do whatever
it needs to do (non-blocking), and then release the hash chain
mutex.
This allows callers to safely add or remove pages from hash
chains.
* Many routines now lookup a page, then busy it, then release it
back onto a page queue (e.g. deactivate it, free it, activate it,
cache it). e.g. vm_fault, pageout daemon, and many other
interactions with the system. These interactions currently operate
under Giant (used to operate under spl) and do not bother to
'own' the page to execute the action. These interactions do, however,
check that the page is not owned by someone else (i.e. that neither
PG_BUSY nor vm_page->busy is set).
To allow callers to safely lookup and then manipulate pages, for
example to manipulate vm_page->flags, I intend to change the API
such that when you nominally get a page, it will be BUSY'd for you.
For example, when selecting a free or cache page from the page
queues, the page would be returned already BUSY'd, allowing you to
manipulate the page and then release it back to a queue (or
initiate I/O, or whatever). In many cases the caller intends to
busy the page anyway, so this is not much of a leap.
This only works if the page is not already busy, of course, but
nearly all users of the existing API skip or sleep/loop if the
returned page is busy, so we can fail gracefully and allow the
caller to do whatever needs to be done there.
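The hash-chain locking and busy-on-return behavior described above can be
sketched in a small userland model. This is purely illustrative: it uses
pthreads instead of kernel mutexes, and all names, sizes, and the hash
function (vm_page_lookup_locked, VM_HASH_SLOTS, vm_hash) are made up, not
the actual kernel API:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define VM_HASH_SLOTS 32               /* small shared array, power of 2 */

struct vm_page {
    struct vm_page *hnext;             /* hash chain link */
    uintptr_t       object;            /* (object, pindex) identity */
    uintptr_t       pindex;
};

static pthread_mutex_t vm_hash_mtx[VM_HASH_SLOTS];
static struct vm_page *vm_hash_chain[VM_HASH_SLOTS];

static void
vm_hash_init(void)
{
    for (int i = 0; i < VM_HASH_SLOTS; i++)
        pthread_mutex_init(&vm_hash_mtx[i], NULL);
}

static unsigned
vm_hash(uintptr_t object, uintptr_t pindex)
{
    /* hypothetical hash; a real one would mix better */
    return ((unsigned)(object ^ (pindex << 4)) & (VM_HASH_SLOTS - 1));
}

/*
 * Look up (object, pindex).  The chain mutex is taken here and
 * deliberately left held on return, found or not, so the caller can
 * allocate and insert a page without a lookup/insert race.
 */
static struct vm_page *
vm_page_lookup_locked(uintptr_t object, uintptr_t pindex,
    pthread_mutex_t **mtxp)
{
    unsigned h = vm_hash(object, pindex);
    struct vm_page *m;

    *mtxp = &vm_hash_mtx[h];
    pthread_mutex_lock(*mtxp);
    for (m = vm_hash_chain[h]; m != NULL; m = m->hnext)
        if (m->object == object && m->pindex == pindex)
            break;
    return (m);                        /* chain mutex still held */
}

/* Insert under the already-held chain mutex. */
static void
vm_page_insert_locked(struct vm_page *m)
{
    unsigned h = vm_hash(m->object, m->pindex);

    m->hnext = vm_hash_chain[h];
    vm_hash_chain[h] = m;
}
```

The point of returning with the mutex held is exactly the vm_fault case
above: a failed lookup and the subsequent alloc/insert happen under one
lock hold, so no other thread can insert the same (object, index) in
between.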
Finally we have issues with how to set PG_BUSY in the first place.
Currently setting PG_BUSY uses atomic_*. It turns out that the solution
is easy and does not require the use of any additional mutex operations.
* When we are looking up a page that is on the free queue, aka
in vm_page_alloc(), simply holding the appropriate page queue mutex
(which we *ALREADY* hold in most cases) is sufficient to allow
us to manipulate the free pages in that queue without worrying about
other threads messing with those pages. Thus we can set PG_BUSY,
remove the page from the free queue, and then release the page
queue mutex before returning the newly allocated page.
* When we are looking up a page that is on the cache queue, or
is not associated with a queue, we simply acquire (or already hold
in most cases) the appropriate hash chain mutex. Then if the page
is not already PG_BUSY, we know we can safely manipulate its flags
(set PG_BUSY). If the page is already PG_BUSY
we need to sleep/loop anyway, so we can fail gracefully and let
the parent sleep/loop/do-whatever.
* We will need to find a better way to sleep/wait for a busy page
to become available. The current mechanism sets a PG_WANTED
flag in vm_page->flags, which doesn't work under the new scheme.
I expect I will transfer this sleep/wakeup mechanism to an array
of wanted flags in parallel with the VM hash chain mutex array.
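Pulling the last two points together, the free-queue allocation path and
the sleep/wakeup replacement might look like this userland model. Again a
sketch under assumptions: pthreads stand in for kernel mutexes, a condition
variable stands in for the proposed array of wanted flags, and the helper
names are invented:

```c
#include <pthread.h>
#include <stddef.h>

#define PG_BUSY 0x0001

struct vm_page {
    struct vm_page *next;
    int flags;
};

struct page_queue {                    /* one per PQ_XXX queue */
    pthread_mutex_t pq_mtx;
    pthread_cond_t  pq_cv;             /* stand-in for the wanted array */
    int             pq_wanted;
    struct vm_page *pq_head;
};

/*
 * Allocate from the free queue.  Holding the queue mutex is enough to
 * set PG_BUSY and unlink the page; the page is returned already
 * BUSY'd, so the caller owns it with no further locking.
 */
static struct vm_page *
vm_page_alloc_from(struct page_queue *pq)
{
    struct vm_page *m;

    pthread_mutex_lock(&pq->pq_mtx);
    m = pq->pq_head;
    if (m != NULL) {
        m->flags |= PG_BUSY;           /* busy before anyone can see it */
        pq->pq_head = m->next;
    }
    pthread_mutex_unlock(&pq->pq_mtx);
    return (m);
}

/*
 * Unbusy and requeue the page, waking any thread recorded as waiting
 * for a busy page (the role PG_WANTED plays today).
 */
static void
vm_page_release_to(struct page_queue *pq, struct vm_page *m)
{
    pthread_mutex_lock(&pq->pq_mtx);
    m->flags &= ~PG_BUSY;
    m->next = pq->pq_head;
    pq->pq_head = m;
    if (pq->pq_wanted) {
        pq->pq_wanted = 0;
        pthread_cond_broadcast(&pq->pq_cv);
    }
    pthread_mutex_unlock(&pq->pq_mtx);
}
```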
Ok, so what am I going to start with? Well, I'm actually going to
start with the third item above ... changing the VM API to return pages
that are PG_BUSY'd rather than making the caller busy them, and changing the various
page queue ops (e.g. vm_page_cache(), vm_page_deactivate(), etc...) to
unbusy the page automatically (some like vm_page_free() already work this
way). In most cases this allows existing code to operate as it used
to with only minimal changes... for example, where existing code
assumed protection by splvm() and now assumes Giant in order to
retrieve, manipulate, and put back a page, the new code will
be able to assume protection by the fact that it will be given a PG_BUSY'd
page, which it can manipulate and put back.
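The shape of that migration can be shown with a before/after caller
sketch. The helper names here are hypothetical stand-ins, not the real
vm_page functions:

```c
#include <stdbool.h>
#include <stddef.h>

#define PG_BUSY 0x0001

struct vm_page { int flags; };

/* Old style: the caller looks a page up and busies it itself,
 * relying on splvm()/Giant to make the check-and-set atomic. */
static bool
old_style_touch(struct vm_page *m)
{
    if (m == NULL || (m->flags & PG_BUSY))
        return (false);                /* skip or sleep/loop */
    m->flags |= PG_BUSY;               /* caller busies explicitly */
    /* ... manipulate the page ... */
    m->flags &= ~PG_BUSY;
    return (true);
}

/* New style: the API hands back a page already PG_BUSY'd, so the
 * caller owns it on arrival and the queue ops unbusy it on release. */
static bool
new_style_touch(struct vm_page *m)
{
    if (m == NULL)
        return (false);                /* lookup failed gracefully */
    /* page arrives BUSY'd: safe to manipulate flags immediately */
    /* ... manipulate the page ... */
    m->flags &= ~PG_BUSY;              /* done by vm_page_cache() etc. */
    return (true);
}
```

The caller bodies are nearly identical, which is why most existing code
should need only minimal changes.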
This preliminary work can be done without introducing VM mutexes just yet
(i.e. I will do this work under Giant). But once complete, this
preliminary work will allow me to then add the VM mutexes described above
with very little effort and take a good chunk of the VM interface out from
under Giant.
--
I am not going to start work on the other major interfaces... pmap,
vm_object's, buffer cache, and so forth, until I complete the work on
the vm_page interface. These other interfaces operate at a coarser
granularity, which will allow us to, for example, give each vm_object
its own mutex (something we cannot and do not want to do for each vm_page).
And that is where I stand at the moment.
-Matt