
List:       linux-ha
Subject:    [Linux-HA] heartbeat strange behavior
From:       Douglas Pasqua <douglas.pasqua () gmail ! com>
Date:       2012-04-30 16:52:05
Message-ID: CAJ8AT4QYDMgyqSDYA620OmbN=CzqOqN2Jf49nBOcnwcWMKUDeg () mail ! gmail ! com

Hi friends,

I created a Linux-HA solution using two nodes: node-a and node-b.

My /etc/ha.d/ha.cf:

use_logd yes
keepalive 1
deadtime 90
warntime 5
initdead 120
bcast eth6
node node-a
node node-b
crm off
auto_failback off

My /etc/ha.d/haresources:
node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 service1 service2 service3
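
To make the layout of that line explicit, here is a minimal Python sketch of
how I understand a haresources entry is read: the first field is the
preferred node and the remaining fields are resources, started left to right
and stopped in reverse order (the reverse stop order matches the "Releasing
resource group" sequence in the log further down). The helper names are
hypothetical and not part of heartbeat, and the sketch ignores comments and
backslash line continuations.

```python
def parse_haresources_line(line):
    """Split one haresources line into (preferred_node, resources)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected a node name followed by at least one resource")
    return fields[0], fields[1:]


def start_order(resources):
    """Resources are started in the order they are listed."""
    return list(resources)


def stop_order(resources):
    """Resources are stopped in the opposite order."""
    return list(reversed(resources))


# Shortened version of the line above (one IP instead of three).
node, resources = parse_haresources_line(
    "node-a x.x.x.x/24 service1 service2 service3"
)
print("preferred node:", node)
print("start order:", start_order(resources))
print("stop order:", stop_order(resources))
```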

I booted the two nodes together. node-a became master and node-b became
slave. Afterwards I rebooted node-a, and node-b became master. When node-a
came back from the reboot it became slave, because *auto_failback is off*,
I think. Everything was as expected up to here.

With node-a running as slave, I decided to halt it (using the halt command).
Heartbeat on node-b then went to standby and my cluster went down. The
virtual IPs went down too. I expected node-b to stay up. Why did this happen?

Some log output from node-b:

Apr 30 00:02:57 node-b heartbeat: [3082]: info: Received shutdown notice
from 'node-a'.
Apr 30 00:02:57 node-b heartbeat: [3082]: info: Resources being acquired
from node-a.
Apr 30 00:02:57 node-b heartbeat: [4414]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Apr 30 00:02:57 node-b harc[4414]: [4428]: info: Running
/etc/ha.d/rc.d/status status
Apr 30 00:02:57 node-b heartbeat: [4416]: info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys node-b] to acquire.
Apr 30 00:02:57 node-b heartbeat: [3082]: debug: StartNextRemoteRscReq():
child count 1

Apr 30 00:02:58 node-b ResourceManager[4462]: [4657]: debug:
/etc/init.d/asterisk  start done. RC=1
Apr 30 00:02:58 node-b ResourceManager[4462]: [4658]: ERROR: Return code 1
from /etc/init.d/asterisk
Apr 30 00:02:58 node-b ResourceManager[4462]: [4659]: CRIT: Giving up
resources due to failure of asterisk
Apr 30 00:02:58 node-b ResourceManager[4462]: [4660]: info: Releasing
resource group: node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 asterisk
sincronismo notificacao
Apr 30 00:02:58 node-b ResourceManager[4462]: [4670]: info: Running
/etc/init.d/notificacao  stop
Apr 30 00:02:58 node-b ResourceManager[4462]: [4671]: debug: Starting
/etc/init.d/notificacao  stop

Apr 30 00:02:58 node-b ResourceManager[4462]: [4694]: debug:
/etc/init.d/notificacao  stop done. RC=0
Apr 30 00:02:58 node-b ResourceManager[4462]: [4704]: info: Running
/etc/init.d/sincronismo  stop
Apr 30 00:02:58 node-b ResourceManager[4462]: [4705]: debug: Starting
/etc/init.d/sincronismo  stop
Apr 30 00:02:58 node-b ResourceManager[4462]: [4711]: debug:
/etc/init.d/sincronismo  stop done. RC=0
Apr 30 00:02:58 node-b ResourceManager[4462]: [4720]: info: Running
/etc/init.d/asterisk  stop
Apr 30 00:02:58 node-b ResourceManager[4462]: [4721]: debug: Starting
/etc/init.d/asterisk  stop
Apr 30 00:02:58 node-b ResourceManager[4462]: [4725]: debug:
/etc/init.d/asterisk  stop done. RC=0
Apr 30 00:02:58 node-b ResourceManager[4462]: [4741]: info: Running
/etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
Apr 30 00:02:58 node-b ResourceManager[4462]: [4742]: debug: Starting
/etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop

Apr 30 00:03:29 node-b heartbeat: [3082]: info: node-b wants to go standby
[foreign]
Apr 30 00:03:39 node-b heartbeat: [3082]: WARN: No reply to standby
request.  Standby request cancelled.
Apr 30 00:04:29 node-b heartbeat: [3082]: WARN: node node-a: is dead
Apr 30 00:04:29 node-b heartbeat: [3082]: info: Dead node node-a gave up
resources.
Apr 30 00:04:29 node-b heartbeat: [3082]: info: Link node-a:eth6 dead.
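
The entry that stands out to me is the non-zero return code from
/etc/init.d/asterisk start, followed by "Giving up resources due to failure
of asterisk". A rough Python sketch for pulling such entries out of a longer
log is below. It assumes the syslog timestamp format and the "done. RC="
wording seen in the debug lines above, and it rejoins lines that were
wrapped in this mail. The function names are hypothetical.

```python
import re

# A new log entry starts with a syslog-style timestamp, e.g. "Apr 30 00:02:58 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")
# ResourceManager debug line, e.g. "/etc/init.d/asterisk  start done. RC=1".
# For scripts called with arguments, group 1 is the last argument.
DONE = re.compile(r"(\S+)\s+(start|stop|status) done\. RC=(\d+)")


def unwrap(lines):
    """Rejoin log entries that were wrapped onto a following line."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries


def failed_actions(lines):
    """Return (script, action, rc) for every resource action with RC != 0."""
    failures = []
    for entry in unwrap(lines):
        m = DONE.search(entry)
        if m and int(m.group(3)) != 0:
            failures.append((m.group(1), m.group(2), int(m.group(3))))
    return failures


sample = [
    "Apr 30 00:02:58 node-b ResourceManager[4462]: [4657]: debug:",
    "/etc/init.d/asterisk  start done. RC=1",
    "Apr 30 00:02:58 node-b ResourceManager[4462]: [4694]: debug:",
    "/etc/init.d/notificacao  stop done. RC=0",
]
print(failed_actions(sample))
```

To scan a real file, the same function can be given an open file object,
for example failed_actions(open(path)), where path points at wherever
logd writes on the node.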


Best Regards,
Douglas V. Pasqua
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems