
List:       openmosix-general
Subject:    [openMosix-general] openMosix-Dojo: little intro on how openMosix do shutdown
From:       Mulyadi Santosa <a_mulyadi () telkom ! net>
Date:       2004-03-29 6:03:37
Message-ID: 200403291303.37528.a_mulyadi () telkom ! net

Hello all,

After my first adventure in openMosix bug hunting, I want to write it up as a
free newsletter called "openMosix Dojo". I chose the name "Dojo" to symbolize
"train hard and practice your skills with others to learn your weaknesses" :)

OK, let's take a look at how openMosix does a shutdown.
1. Usually, you run "setpe -off", either directly or via a
startup/shutdown script.

2. setpe checks whether it is running on top of openMosix. How does setpe
determine this? Simply by checking for the existence of
"/proc/hpc/admin/mospe" and "/proc/hpc/admin/config". If either of them is
missing, then obviously you are not running openMosix. Note: this detection
is not perfect; you can fake it by creating similar /proc entries :)
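
A minimal user-space sketch of the check in step 2, assuming it amounts to an
existence test on the two paths. The helper name om_present and its directory
parameter are hypothetical (they are not taken from the setpe source); the
directory is a parameter only so the check can be tried against any path.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the setpe source): returns 1 if both
 * admin entries exist under admin_dir, 0 otherwise. */
int om_present(const char *admin_dir)
{
    char mospe[4096];
    char config[4096];

    snprintf(mospe, sizeof(mospe), "%s/mospe", admin_dir);
    snprintf(config, sizeof(config), "%s/config", admin_dir);

    /* F_OK tests existence only, so a faked entry would pass too. */
    return access(mospe, F_OK) == 0 && access(config, F_OK) == 0;
}
```

On a real node the call would be om_present("/proc/hpc/admin"). Because only
existence is tested, this has the same weakness mentioned in the note above.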

3. If step 2 succeeds, setpe opens both /proc entries in read-write mode
and accesses them via file descriptors.

4. setpe writes the value "0" to the mospe entry. This value is passed to
ctl_admin_mospe (in hpc/hpcproc.c). Because the operation is a write, the
"write" branch (inside if (write)) is executed.
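
Steps 3 and 4 can be sketched from the user-space side as follows. This is
only an illustration: the helper name write_zero_pe is hypothetical, and it
assumes the value is written as the ASCII character "0", which the
description above suggests but does not confirm.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-write and write "0" to it.
 * Returns 0 on success, -1 on any failure. */
int write_zero_pe(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Assumption: the node number is passed as ASCII text. */
    ssize_t n = write(fd, "0", 1);
    int rc = (n == 1) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

On an openMosix node the path would be "/proc/hpc/admin/mospe"; the write is
what ends up in the "write" branch of the proc handler.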

5. Because this is a shutdown, PE (the node number) is changed to 0 (zero).
(OK, I admit it is very hard to decode what get_intarr and do_intarr mean.)

6. The function mosix_config_set_pe is called with 0 (zero) as its
parameter. Many checks are done inside config_set_pe, such as verifying that
the new PE value (which is zero) is within the allowed range (from zero to
MOSIXMAX=65535). Then it calls mosix_config_set_table(). Before doing so, it
grabs the spinlock to avoid a race condition while modifying the mosix
configuration table.

7. Inside mosix_config_set_table, the body of if (!newpe || !nents) is
executed. In human words, the shutdown stage runs if newpe = 0 or there are
no entries in the mosix table. Eventually it calls config_shutdown.
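
The control flow of steps 6 and 7 can be modelled in user space like this.
It is a toy model, not the kernel code: the function names model_set_pe and
model_set_table, the enum, and the C11 atomic_flag standing in for the kernel
spinlock are all assumptions made for illustration. Only the range limit and
the (!newpe || !nents) condition come from the description above.

```c
#include <stdatomic.h>

#define MOSIXMAX 65535  /* upper bound quoted in step 6 */

enum action { ACT_REJECT, ACT_SHUTDOWN, ACT_RECONFIG };

/* Stand-in for the kernel spinlock that protects the table. */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;

/* Models the branch in step 7: shut down when the new PE is 0
 * or the table has no entries. */
static enum action model_set_table(int newpe, int nents)
{
    if (!newpe || !nents)
        return ACT_SHUTDOWN;
    return ACT_RECONFIG;
}

/* Models step 6: range check first, then call the table update
 * with the lock held. */
enum action model_set_pe(int newpe, int nents)
{
    enum action act;

    if (newpe < 0 || newpe > MOSIXMAX)
        return ACT_REJECT;

    while (atomic_flag_test_and_set(&table_lock))
        ;  /* spin until the lock is free */
    act = model_set_table(newpe, nents);
    atomic_flag_clear(&table_lock);

    return act;
}
```

With this model, "setpe -off" corresponds to model_set_pe(0, nents), which
takes the shutdown branch whatever nents is.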

8. Inside config_shutdown, there are many tasks to do:
	a. Calling config_validate. Its task is to "command" every process under
openMosix control to do a "self check" against the new table. (This is
unconfirmed, because I have trouble understanding the code.) Each process
that gets the command realizes that oM is shutting down, so every process
originating from the going-to-shutdown node is called back, and every process
currently running on the going-to-shutdown node but originating from another
node is kicked out.

	b. In info-reconfig, all load info on the GTS (going-to-shutdown) node is
filled with zeros.

	c. Wake up mig_daemon and info_daemon by sending them SIGALRM. This makes
sure that they read the new configuration.

	d. Then there is a tight loop that waits until mig_daemon and info_daemon
become inactive. Something I don't understand: it checks for a pending
signal. Since it calls set_current_state(TASK_INTERRUPTIBLE) and
schedule_timeout, it will certainly come back after a certain interval. So I
guess this check is intended to fail the shutdown if it is woken up by a
signal rather than by the expiration of the timer.
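
A user-space analogue of the loop in (d) may make the guess easier to follow.
This is not the kernel code: nanosleep() stands in for
set_current_state(TASK_INTERRUPTIBLE) plus schedule_timeout, and an EINTR
result stands in for the pending-signal check. The names wait_for_daemons,
active_fn and countdown_active are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

/* Hypothetical callback: returns nonzero while the daemons are active. */
typedef int (*active_fn)(void *ctx);

/* Stand-in for "are the daemons still active?": reports active for as
 * many polls as the counter behind ctx says, then inactive. */
int countdown_active(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}

/* Poll until still_active() reports 0. Returns 0 in that case, or -1
 * if a sleep was cut short by a signal instead of running to timeout. */
int wait_for_daemons(active_fn still_active, void *ctx, long interval_ms)
{
    struct timespec ts;

    ts.tv_sec = interval_ms / 1000;
    ts.tv_nsec = (interval_ms % 1000) * 1000000L;

    while (still_active(ctx)) {
        /* nanosleep() fails with EINTR when a signal handler ran. */
        if (nanosleep(&ts, NULL) == -1 && errno == EINTR)
            return -1;
    }
    return 0;
}
```

The timer expiring just leads to another poll, while a signal arriving during
the sleep aborts the wait, which matches the behaviour guessed at in (d).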

	e. It unregisters the reboot notifier. This notifier would be executed in
case you do the three-finger salute / kick the power button / lose power :)

	f. If you use MFS, it closes all open MFS file descriptors.

	g. If something goes wrong during a to f, config_shutdown undoes the
steps by restoring the valid node number, the network communication for
migration, etc.
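
The undo behaviour in (g) follows a common "snapshot, try, restore on
failure" pattern. Below is a small sketch of that pattern only; the struct
fields, the step functions and shutdown_with_undo are all hypothetical names
and do not come from the openMosix source.

```c
struct node_state {
    int pe;       /* node number; 0 means "not configured" */
    int comm_up;  /* 1 while migration communication is available */
};

typedef int (*step_fn)(struct node_state *st);

/* Two stand-in steps so the sketch can be exercised. */
int step_stop_comm(struct node_state *st) { st->comm_up = 0; return 0; }
int step_always_fails(struct node_state *st) { (void)st; return -1; }

/* Run the shutdown steps in order. If any step fails, restore the
 * state saved at entry and report failure. */
int shutdown_with_undo(struct node_state *st, step_fn *steps, int nsteps)
{
    struct node_state saved = *st;  /* snapshot taken before any change */
    int i;

    st->pe = 0;
    for (i = 0; i < nsteps; i++) {
        if (steps[i](st) != 0) {
            *st = saved;  /* undo: node number, communication, ... */
            return -1;
        }
    }
    return 0;
}
```

If every step succeeds the node ends up with pe 0; if any step fails the
caller sees the original node number and communication state again.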

I am still working on figuring out the "config" proc handler. So if anyone
has any ideas about it, please add them to this Dojo newsletter :)

NB: I am a kernel newbie, so feel free to correct me if you find any
mistakes. This is just my little contribution to shed some light on the oM
code.

regards

Mulyadi
a.k.a "jack25"--> meet me in the #openmosix freenode channel :-))




_______________________________________________
openMosix-general mailing list
openMosix-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openmosix-general
