List: openmosix-general
Subject: [openMosix-general] openMosix-Dojo: little intro on how openMosix do shutdown
From: Mulyadi Santosa <a_mulyadi () telkom ! net>
Date: 2004-03-29 6:03:37
Message-ID: 200403291303.37528.a_mulyadi () telkom ! net
hello all
After my first adventure in openMosix bug hunting, I want to write it up as a
free newsletter called "openMosix Dojo". I took the name "Dojo" to symbolize
"train hard and practice your skill with others to know your weaknesses" :)
OK, let's take a look at how openMosix does a shutdown.
1. Usually, you run "setpe -off", either directly or via a
startup/shutdown script.
2. setpe checks whether it is running on top of openMosix. How does setpe
determine this? Simple: it checks for the existence of
"/proc/hpc/admin/mospe" and "/proc/hpc/admin/config". If either of them is
missing, then obviously you are not inside openMosix. Note: this detection is
not perfect, you can fake it if you create similar /proc entries :)
3. If step 2 succeeds, setpe opens both /proc entries in read-write mode
and accesses them via file descriptors.
4. On the mospe entry, setpe writes the value "0". This value is passed to
ctl_admin_mospe (in hpc/hpcproc.c). Because the operation is a write, the
"write" section (inside if (write)) is executed.
5. Because this is a shutdown, PE (the node number) is changed to 0 (zero).
(OK I admit, it is very hard to decode what get_intarr and do_intarr mean.)
6. We call the function mosix_config_set_pe with 0 (zero) as its parameter.
Many checks are done inside config_set_pe, like verifying that the new pe
value (which is zero) is inside the allowed range (between zero and
MOSIXMAX=65535). Then it calls mosix_config_set_table(). Before that, it
grabs the spinlock to avoid a race condition (while modifying the mosix conf
table).
7. Inside mosix_config_set_table, it executes the statement inside
if (!newpe || !nents). In human words, the shutdown stage runs if newpe = 0
or there are no entries in the mosix table. Eventually it calls
config_shutdown.
8. Inside config_shutdown, there are many tasks to do:
a. Calling config_validate. Its task is to "command" every process under
openMosix control to do a "self check" against the new table. (This is
unconfirmed because I have trouble understanding the code.) Each process that
gets the command realizes that oM is shutting down, so every process
originating from the going-to-shutdown node is called back, and every process
currently running on the going-to-shutdown node but originating from another
node is kicked out.
b. In info-reconfig, it fills all the load info in the GTS (going to
shutdown) node with zeros.
c. It wakes up mig_daemon and info_daemon by sending them SIGALRM. This
makes sure that they read the new configuration.
d. Then there is a tight loop that waits until mig_daemon and info_daemon
become inactive. Something that I don't understand: it checks for a pending
signal. Because it does set_current_state(TASK_INTERRUPTIBLE) and
schedule_timeout, it will certainly come back after a certain interval. So I
guess this check is intended to fail the shutdown if it is woken up by a
signal rather than by the expiration of the timer.
e. It unregisters the reboot notifier. This notifier is executed in case you
do the three finger salute / kick the power button / power failure :)
f. If you use MFS, it closes all open MFS file descriptors.
g. If something goes wrong during a to f, config_shutdown undoes the steps
by restoring the valid node number, the network communication for migration,
etc.
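To make steps 5 to 8 easier to follow, here is a toy model of the call chain
in plain Python. It is not the kernel code: the function names follow the
ones above, but the Node class, the SHUTDOWN_STEPS list, the fail_at switch
and the return strings are made up by me for illustration, and the lock only
stands in for the spinlock.

```python
import threading

MOSIX_MAX = 65535  # upper bound for a node number, as quoted in step 6

# Labels for steps 8a-8f. The names are illustrative, not real symbols.
SHUTDOWN_STEPS = (
    "config_validate",
    "info_reconfig",
    "wake_daemons",
    "wait_for_daemons",
    "unregister_reboot_notifier",
    "close_mfs_files",
)


class Node:
    """Hypothetical stand-in for the per-node openMosix configuration."""

    def __init__(self, pe, table):
        self.pe = pe                  # node number, 0 means "not configured"
        self.table = list(table)      # stand-in for the mosix conf table
        self.lock = threading.Lock()  # models the spinlock from step 6
        self.fail_at = None           # set to a step label to simulate a failure
        self.log = []                 # records the steps that actually ran

    def config_set_pe(self, newpe):
        # Step 6: range check, then take the lock before touching the table.
        if not 0 <= newpe <= MOSIX_MAX:
            raise ValueError("pe out of range")
        with self.lock:
            return self.config_set_table(newpe, self.table)

    def config_set_table(self, newpe, entries):
        # Step 7: the if (!newpe || !nents) branch leads to the shutdown.
        if not newpe or not entries:
            return self.config_shutdown()
        self.pe = newpe
        self.table = list(entries)
        return "configured"

    def config_shutdown(self):
        # Step 8: run a..f in order. Step 8g: restore the old pe on failure.
        saved_pe = self.pe
        self.pe = 0
        for step in SHUTDOWN_STEPS:
            if step == self.fail_at:
                self.pe = saved_pe
                self.log.append("rollback")
                return "aborted"
            self.log.append(step)
        return "down"
```

With this model, Node(pe=3, table=[1, 2, 3]).config_set_pe(0) walks through
all six steps and leaves pe at 0. If you set fail_at to one of the labels
first, the node keeps its old pe, which is how I read step g.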
I am still working on figuring out the "config" proc handler. So if anyone
has an idea about it, just add it to this Dojo newsletter :)
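For anyone who wants to experiment with the /proc side from user space in
the meantime, here is a rough Python sketch of what steps 2 to 4 describe. It
is not the setpe source: the function names and the proc_root parameter are
my own, and writing the text "0" is an assumption, because I have not checked
the exact byte format setpe puts into mospe. The proc_root parameter is only
there so you can try it against a fake directory tree instead of a live node.

```python
import os


def inside_openmosix(proc_root="/proc"):
    """Step 2: both admin entries must exist (easy to fake, as noted above)."""
    admin = os.path.join(proc_root, "hpc", "admin")
    return all(
        os.path.exists(os.path.join(admin, name)) for name in ("mospe", "config")
    )


def request_shutdown(proc_root="/proc"):
    """Steps 3 and 4: open both entries read-write, then write 0 to mospe."""
    if not inside_openmosix(proc_root):
        return False
    admin = os.path.join(proc_root, "hpc", "admin")
    mospe_fd = os.open(os.path.join(admin, "mospe"), os.O_RDWR)
    try:
        config_fd = os.open(os.path.join(admin, "config"), os.O_RDWR)
        try:
            # Assumption: the value is written as the text "0".
            os.write(mospe_fd, b"0")
        finally:
            os.close(config_fd)
    finally:
        os.close(mospe_fd)
    return True
```

On a machine without openMosix, request_shutdown() simply returns False
because the two entries are missing.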
NB: I am a kernel newbie, so feel free to correct me if you find any
mistakes. This is just my little contribution to shed some light on the oM
code.
regards
Mulyadi
a.k.a "jack25"--> meet me in the #openmosix freenode channel :-))