List: linux-ha-dev
Subject: RE: [Linux-ha-dev] Patch: Fast node fail detection (part 2)
From: "Zou, Yixiong" <yixiong.zou () intel ! com>
Date: 2005-02-11 20:23:20
Message-ID: 012676D607FCF54E986746512C22CE7D02E299F0 () orsmsx407
I ran two sets of tests with the "faildetection" utility; the results are below.
The first set of 10 tests used "nodefail"; the second did not.
------------------------------------------------------------------------------------------------
These are the results of the first test. I ran "faildetection 10", which tells it
to run the test 10 times.
Here are the settings in /etc/ha.d/ha.cf:
keepalive 50ms
deadtime 500ms
warntime 250ms
[root@coldplay nodefail]# ./faildetection 10
lt-faildetection[2534]: 2005/02/11_11:49:04 info: current pid = 2534, test count = 10
lt-faildetection[2534]: 2005/02/11_11:49:04 debug: Signing in with heartbeat
lt-faildetection[2534]: 2005/02/11_11:49:04 info: myid = coldplay, ntfid = caliber
lt-faildetection[2534]: 2005/02/11_11:49:04 info: wait two seconds before we start the test.
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:49:46 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:49:47 info: failure detection time for test No. 1: 80ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:50:47 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:50:48 info: failure detection time for test No. 2: 70ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:51:48 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:51:49 info: failure detection time for test No. 3: 90ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:52:49 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:52:50 info: failure detection time for test No. 4: 70ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:53:50 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:53:50 info: failure detection time for test No. 5: 70ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:54:51 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:54:51 info: failure detection time for test No. 6: 100ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:55:52 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:55:52 info: failure detection time for test No. 7: 70ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:56:53 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:56:53 info: failure detection time for test No. 8: 90ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:57:54 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:57:54 info: failure detection time for test No. 9: 90ms
Starting High-Availability services:
[ OK ]
lt-faildetection[2534]: 2005/02/11_11:58:55 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat; snmptrap -v 2c -c public coldplay '' .1.3.6.1.4.1.4682.900.1"
lt-faildetection[2534]: 2005/02/11_11:58:55 info: failure detection time for test No. 10: 100ms
lt-faildetection[2534]: 2005/02/11_11:58:55 info: average failover detection time = 83, min = 70, max = 100, number of tests fall within range: 10, within requirment: 10
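
For anyone who wants to re-check the summary line against the per-test lines, here is a minimal Python sketch that pulls the times out of the log text and recomputes the figures. The regular expression and the helper names (unwrap, detection_times, summarize) are illustrative and not part of faildetection; unwrap assumes the " \" line continuations that the list archive adds to long lines.

```python
import re

# Pattern for the per-test result lines printed by faildetection.
RESULT_RE = re.compile(r"failure detection time for test No\. (\d+): (\d+)ms")


def unwrap(text):
    """Join lines that the archive wrapped with a trailing ' \\'."""
    return text.replace(" \\\n", " ")


def detection_times(log_text):
    """Return the detection times in milliseconds, in test order."""
    return [int(ms) for _, ms in RESULT_RE.findall(unwrap(log_text))]


def summarize(times):
    """Recompute the average, minimum and maximum of the given times."""
    return {"average": sum(times) / len(times), "min": min(times), "max": max(times)}


# Result lines from the first run; the first one is left wrapped on purpose.
RUN1_LOG = r"""info: failure \
detection time for test No. 1: 80ms
info: failure detection time for test No. 2: 70ms
info: failure detection time for test No. 3: 90ms
info: failure detection time for test No. 4: 70ms
info: failure detection time for test No. 5: 70ms
info: failure detection time for test No. 6: 100ms
info: failure detection time for test No. 7: 70ms
info: failure detection time for test No. 8: 90ms
info: failure detection time for test No. 9: 90ms
info: failure detection time for test No. 10: 100ms
"""

if __name__ == "__main__":
    print(summarize(detection_times(RUN1_LOG)))
```

The ten values sum to 830 ms, which agrees with the reported average of 83.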
------------------------------------------------------------------------------------------------
Below are the results of the second test. I ran "faildetection -nt 10". This tells
faildetection not to send the trap command, so the numbers we get are the original
heartbeat detection times. I also changed the ha.cf settings to the following:
keepalive 25ms
deadtime 250ms
warntime 125ms
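
Both sets of ha.cf settings keep the same proportions between the three timers. The Python sketch below makes that explicit. It assumes heartbeat's usual time syntax (a bare number is seconds, an "ms" suffix is milliseconds); to_ms and parse_timings are hypothetical helper names, not heartbeat code.

```python
def to_ms(value):
    """Convert an ha.cf time value to milliseconds.

    Assumption: a bare number means seconds, an "ms" suffix means milliseconds.
    """
    value = value.strip().lower()
    if value.endswith("ms"):
        return float(value[:-2])
    return float(value) * 1000.0


def parse_timings(text):
    """Collect keepalive, deadtime and warntime (in ms) from ha.cf-style lines."""
    timings = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("keepalive", "deadtime", "warntime"):
            timings[parts[0]] = to_ms(parts[1])
    return timings


# The two configurations used in the tests.
first = parse_timings("keepalive 50ms\ndeadtime 500ms\nwarntime 250ms")
second = parse_timings("keepalive 25ms\ndeadtime 250ms\nwarntime 125ms")

if __name__ == "__main__":
    for name, t in (("first", first), ("second", second)):
        # deadtime and warntime expressed as multiples of keepalive
        print(name, t["deadtime"] / t["keepalive"], t["warntime"] / t["keepalive"])
```

In both cases deadtime is 10 keepalive intervals and warntime is 5.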
[root@coldplay nodefail]# ./faildetection -nt 10
lt-faildetection[12433]: 2005/02/11_12:04:04 info: current pid = 12433, test count = 10
lt-faildetection[12433]: 2005/02/11_12:04:04 debug: Signing in with heartbeat
lt-faildetection[12433]: 2005/02/11_12:04:04 info: myid = coldplay, ntfid = caliber
lt-faildetection[12433]: 2005/02/11_12:04:04 info: wait two seconds before we start the test.
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:04:47 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:04:47 info: failure detection time for test No. 1: 230ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:05:48 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:05:48 info: failure detection time for test No. 2: 420ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:06:49 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:06:50 info: failure detection time for test No. 3: 450ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:07:50 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:07:51 info: failure detection time for test No. 4: 350ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:08:51 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:08:52 info: failure detection time for test No. 5: 350ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:09:52 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:09:53 info: failure detection time for test No. 6: 370ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:10:53 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:10:54 info: failure detection time for test No. 7: 450ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:11:54 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:11:55 info: failure detection time for test No. 8: 420ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:12:55 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:12:56 info: failure detection time for test No. 9: 360ms
Starting High-Availability services:
[ OK ]
lt-faildetection[12433]: 2005/02/11_12:13:57 info: exec: ssh -q -x -n -l root "caliber" "killall -9 heartbeat"
lt-faildetection[12433]: 2005/02/11_12:13:57 info: failure detection time for test No. 10: 470ms
lt-faildetection[12433]: 2005/02/11_12:13:57 info: average failover detection time = 387, min = 230, max = 470, number of tests fall within range: 1, within requirment: 1
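
To put the two runs side by side, here is a short Python sketch using the per-test times copied from the logs above. The 250 ms limit is an assumption: the log does not say what "within requirment" is measured against, and any fixed limit from 230 ms up to just under 350 ms would give the same counts. The names WITH_TRAP, WITHOUT_TRAP and within are illustrative only.

```python
# Per-test detection times in milliseconds, copied from the two logs.
WITH_TRAP = [80, 70, 90, 70, 70, 100, 70, 90, 90, 100]              # "faildetection 10"
WITHOUT_TRAP = [230, 420, 450, 350, 350, 370, 450, 420, 360, 470]   # "faildetection -nt 10"

# Assumed limit; the real requirement is not stated in the log.
ASSUMED_LIMIT_MS = 250


def average(times):
    """Arithmetic mean of the detection times."""
    return sum(times) / len(times)


def within(times, limit_ms):
    """Count how many tests detected the failure within limit_ms."""
    return sum(1 for t in times if t <= limit_ms)


if __name__ == "__main__":
    for name, times in (("with trap", WITH_TRAP), ("without trap", WITHOUT_TRAP)):
        print(name, "avg", average(times), "ms,",
              within(times, ASSUMED_LIMIT_MS), "of", len(times), "within limit")
    # How many times longer detection takes without the trap, on average.
    print("ratio", average(WITHOUT_TRAP) / average(WITH_TRAP))
```

Note that the run without the trap used the shorter keepalive and deadtime settings and still took several times longer on average (387 ms against 83 ms).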
-----------------------------------
Yixiong Zou (yixiong.zou@intel.com)
Open Source Technology Center
Intel Corp.
All views expressed in this email are those of the individual sender.
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/