'Re: MOSIX and GFS'


List:       gfs-devel
Subject:    Re: MOSIX and GFS
From:       Amnon Shiloh <amnons () cs ! huji ! ac ! il>
Date:       2000-07-25 17:30:47

Hello Andrew, everyone...

Andrew Barry <barry@borg.umn.edu> wrote:

> DMEP devices AND dlock devices do not send interrupts to clients that want
> a particular lock. Polling will work well enough for many instances, and if
> quicker response is necessary, clients freeing a lock can send a courtesy
> callback to waiting nodes. If you want, a full queue can be implemented.
> GFS doesn't do this, but could. There is a lot of complexity in asking the
> dmep device to send interrupts, and complexity is not something we want to
> add to the protocol. 

I obviously do not suggest that the DMEP device recognize the internals
of the memory-block and send interrupts accordingly - the onus should
indeed be on the nodes to interrupt each other, which you mentioned as
a "courtesy callback".

The question is how those "courtesy callbacks" are made.
Over the external network?  I had naively assumed that, since a
disk controller is able to send interrupts anyway, such as at the end of
a DMA operation, and since all nodes are already connected to the same
controller(s), a controller could also be used to send an interrupt to
another node.  Why not?

This is not really a MOSIX-specific issue (as I mentioned earlier, the
DMEP clock(s) can be implemented without locks at all), but I am curious
about the new dlock implementation: what does a process do when it finds that
a file it wishes to use is locked?  Busy-polling is not a real option, is it?
Trying 2-3 more times is OK, but probably no more, because other processes also
need the CPU.  So how often should the process retry?  And what guarantee is
there against an effective deadlock where, due for example to the SCSI
priorities, a particular node ends up never getting the lock?

While a full queue is probably overkill,
how about the following primitive queue:

1) Add a field to the dlock-DMEP (say WT for Waiting-Time), indicating that
   someone has already been waiting for at least N milliseconds.
2) When someone fails to get the dlock, and finds that they have already been
   waiting longer than WT, they save WT and replace it with their own
   waiting time.
3) If someone sees that WT is greater than their own waiting time, they wait
   patiently.
4) When someone obtains the dlock, they immediately restore their saved WT.
5) For crash-recovery, add another field containing a random version-number
   (that is very unlikely to be repeated on different nodes).
   It is the responsibility of every waiting node to check the DMEP at least
   once a second (not an extra requirement, since it polls more often anyway)
   and if the version-number is theirs, change it.  The previous version-number
   is also saved when modifying WT and restored when obtaining the dlock.
   If someone finds that the version number has not changed for, say,
   5 seconds, they zero WT and everyone is free again to try using the dlock.
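
To make the five steps above concrete, here is a rough single-process model in
Python.  It is only a sketch under assumptions: the names DmepBuffer, Node,
try_lock and release are hypothetical, time is passed in explicitly in
milliseconds, and the model ignores atomicity entirely (on a real DMEP device
every read-modify-write below would have to be one conditional operation).

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple

STALE_AFTER_MS = 5000  # step 5: "say, 5 seconds" without a version change


@dataclass
class DmepBuffer:
    """Hypothetical in-memory stand-in for the shared dlock-DMEP block."""
    holder: Optional[str] = None  # name of the node holding the dlock
    wt: int = 0                   # WT: longest advertised wait, in ms
    version: int = 0              # random tag written along with WT


@dataclass
class Node:
    name: str
    wait_start: Optional[int] = None         # when this node began waiting
    saved: Optional[Tuple[int, int]] = None  # (WT, version) saved in step 2
    seen_version: Optional[int] = None       # last version observed
    seen_at: int = 0                         # when it was first observed

    def try_lock(self, buf: DmepBuffer, now: int) -> bool:
        """One poll at time `now` (ms). Returns True if the dlock was obtained."""
        if self.wait_start is None:
            self.wait_start = now
        waited = now - self.wait_start

        # Step 5: if the version has not changed for too long, assume the
        # advertised waiter crashed and zero WT.
        if buf.version != self.seen_version:
            self.seen_version, self.seen_at = buf.version, now
        elif buf.wt > 0 and now - self.seen_at >= STALE_AFTER_MS:
            buf.wt = 0

        # Step 3: someone has waited longer than we have, so wait patiently.
        if buf.wt > waited:
            return False

        if buf.holder is None:
            buf.holder = self.name
            # Step 4: restore the WT (and version) saved in step 2, if any.
            if self.saved is not None:
                buf.wt, buf.version = self.saved
            self.saved = None
            self.wait_start = None
            return True

        # Step 2: the lock is held and we have waited longer than WT, so
        # advertise our own waiting time. Writing a fresh version on each
        # update also covers "if the version-number is theirs, change it".
        if waited > buf.wt:
            if self.saved is None:
                self.saved = (buf.wt, buf.version)
            buf.wt = waited
            buf.version = random.getrandbits(32)
            self.seen_version, self.seen_at = buf.version, now
        return False

    def release(self, buf: DmepBuffer) -> None:
        """Free the dlock if this node holds it."""
        if buf.holder == self.name:
            buf.holder = None
```

In this model, a node that arrives late keeps getting False from try_lock even
after the holder calls release, as long as WT shows that another node has been
waiting longer; that is the whole point of the primitive queue.  How well this
survives real concurrent access depends on what the DMEP conditional operations
can express, which the sketch does not address.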

Amnon Shiloh -- the HUJI MOSIX group.

