List:       linux-ha
Subject:    [Linux-HA] not all resources started
From:       Tom Brown <brown () esteem ! com>
Date:       2006-12-21 23:22:17
Message-ID: 200612211522.17910.brown () esteem ! com

Hi,

I'm running heartbeat 2.0.2. After a power failure on our mail server, I 
started only the primary node. The ha-debug log is given below. The resources 
heartbeat is configured to start are:

drbddisk::drbd0 Filesystem::/dev/drbd0::/mirror::ext3 postgresql apache2 
courier

Heartbeat started postgresql and then went no further. It looks like it made 
no attempt to start apache2 or courier. When I initially tested heartbeat, 
everything worked fine. The only changes since then were to courier's 
configuration. Any ideas on why this might occur?
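
For reference, here is a minimal sketch of how each entry in the resource list 
above breaks down into a script name and its arguments. It assumes the "::" 
separators mean what the ResourceManager lines in the log below suggest (for 
example, Filesystem::/dev/drbd0::/mirror::ext3 is run as "Filesystem 
/dev/drbd0 /mirror ext3 start"). The function name split_resource is 
hypothetical and not part of heartbeat.

```python
def split_resource(entry):
    """Split 'name::arg1::arg2' into (name, [arg1, arg2])."""
    name, *args = entry.split("::")
    return name, args


# The resource list exactly as given in this message.
resources = (
    "drbddisk::drbd0 Filesystem::/dev/drbd0::/mirror::ext3 "
    "postgresql apache2 courier"
).split()

parsed = [split_resource(r) for r in resources]
for name, args in parsed:
    print(name, args)
```

Entries without "::" (postgresql, apache2, courier) come out with an empty 
argument list, so five scripts in total should be run with "start".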

Thank you,
Tom 

ha-debug:
heartbeat[2213]: 2006/12/21_14:17:23 WARN: Core dumps could be lost if 
multiple dumps occur
heartbeat[2213]: 2006/12/21_14:17:23 WARN: Consider 
setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum 
supportability
heartbeat[2213]: 2006/12/21_14:17:23 WARN: Logging daemon is disabled 
--enabling logging daemon is recommended
heartbeat[2213]: 2006/12/21_14:17:23 info: **************************
heartbeat[2213]: 2006/12/21_14:17:23 info: Configuration validated. Starting 
heartbeat 2.0.2
heartbeat[2214]: 2006/12/21_14:17:23 info: heartbeat: version 2.0.2
heartbeat[2214]: 2006/12/21_14:17:24 info: Heartbeat generation: 33
heartbeat[2214]: 2006/12/21_14:17:24 info: Removing /var/run/heartbeat/rsctmp 
failed, recreating.
heartbeat[2214]: 2006/12/21_14:17:24 info: glib: Starting serial heartbeat on 
tty /dev/ttyS0 (19200 baud)
heartbeat[2214]: 2006/12/21_14:17:24 info: G_main_add_SignalHandler: Added 
signal handler for signal 17
heartbeat[2214]: 2006/12/21_14:17:24 info: pid 2214 locked in memory.
heartbeat[2214]: 2006/12/21_14:17:24 info: Local status now set to: 'up'
heartbeat[2237]: 2006/12/21_14:17:25 info: pid 2237 locked in memory.
heartbeat[2238]: 2006/12/21_14:17:25 info: pid 2238 locked in memory.
heartbeat[2239]: 2006/12/21_14:17:25 info: pid 2239 locked in memory.
heartbeat[2214]: 2006/12/21_14:17:34 WARN: node mail02.esteem.com: is dead
heartbeat[2214]: 2006/12/21_14:17:34 info: Local status now set to: 'active'
heartbeat[2214]: 2006/12/21_14:17:34 WARN: No STONITH device configured.
heartbeat[2214]: 2006/12/21_14:17:34 WARN: Shared disks are not protected.
heartbeat[2214]: 2006/12/21_14:17:34 info: Resources being acquired from 
mail02.esteem.com.
heartbeat[2338]: 2006/12/21_14:17:34 debug: notify_world: setting SIGCHLD 
Handler to SIG_DFL
harc[2338]:     2006/12/21_14:17:34 info: Running /etc/ha.d/rc.d/status status
mach_down[2359]:        2006/12/21_14:17:35 
info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
heartbeat[2214]: 2006/12/21_14:17:35 info: Initial resource acquisition 
complete (T_RESOURCES(us))
heartbeat[2214]: 2006/12/21_14:17:35 info: mach_down takeover complete.
mach_down[2359]:        2006/12/21_14:17:35 info: mach_down takeover complete 
for node mail02.esteem.com.
heartbeat[2214]: 2006/12/21_14:17:35 debug: StartNextRemoteRscReq(): child 
count 1
heartbeat[2214]: 2006/12/21_14:17:35 debug: StartNextRemoteRscReq(): child 
count 1
heartbeat[2339]: 2006/12/21_14:17:35 info: Local Resource acquisition 
completed.
heartbeat[2415]: 2006/12/21_14:17:35 debug: notify_world: setting SIGCHLD 
Handler to SIG_DFL
harc[2415]:     2006/12/21_14:17:35 info: 
Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[2415]:  2006/12/21_14:17:35 received ip-request-resp 
172.16.1.10 OK yes
ResourceManager[2430]:  2006/12/21_14:17:35 info: Acquiring resource group: 
mail01.esteem.com 172.16.1.10 drbddisk::drbd0 
Filesystem::/dev/drbd0::/mirror::ext3
 postgresql apache2 courier
ResourceManager[2430]:  2006/12/21_14:17:35 info: 
Running /etc/ha.d/resource.d/IPaddr 172.16.1.10 start
ResourceManager[2430]:  2006/12/21_14:17:35 debug: 
Starting /etc/ha.d/resource.d/IPaddr 172.16.1.10 start
heartbeat[2238]: 2006/12/21_14:17:35 WARN: glib: TTY write timeout on 
[/dev/ttyS0] (no connection or bad cable? [see documentation])
ls: /var/run/heartbeat/rsctmp/IPaddr/eth0:*: No such file or directory
IPaddr[2488]:   2006/12/21_14:17:35 info: /sbin/ifconfig eth0:0 172.16.1.10 
netmask 255.255.0.0 broadcast 172.16.255.255
IPaddr[2488]:   2006/12/21_14:17:35 info: Sending Gratuitous Arp for 
172.16.1.10 on eth0:0 [eth0]
IPaddr[2488]:   2006/12/21_14:17:35 /usr/lib/heartbeat/send_arp -i 500 -r 10 
-p /var/run/heartbeat/rsctmp/send_arp/send_arp-172.16.1.10 eth0 172.16.1.10 
auto 172.16.1.10 ffffffffffff
ResourceManager[2430]:  2006/12/21_14:17:35 debug: /etc/ha.d/resource.d/IPaddr 
172.16.1.10 start done. RC=0
send_arp[2558]: 2006/12/21_14:17:35 info: Disable using logging daemon
ResourceManager[2430]:  2006/12/21_14:17:35 info: 
Running /etc/ha.d/resource.d/drbddisk drbd0 start
ResourceManager[2430]:  2006/12/21_14:17:35 debug: 
Starting /etc/ha.d/resource.d/drbddisk drbd0 start
ResourceManager[2430]:  2006/12/21_14:17:35 
debug: /etc/ha.d/resource.d/drbddisk drbd0 start done. RC=0
ResourceManager[2430]:  2006/12/21_14:17:35 info: 
Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mirror ext3 start
ResourceManager[2430]:  2006/12/21_14:17:35 debug: 
Starting /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mirror ext3 start
ResourceManager[2430]:  2006/12/21_14:17:37 
debug: /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mirror ext3 start done. 
RC=0
ResourceManager[2430]:  2006/12/21_14:17:38 info: 
Running /etc/init.d/postgresql  start
ResourceManager[2430]:  2006/12/21_14:17:38 debug: 
Starting /etc/init.d/postgresql  start
Starting PostgreSQL: ok
ResourceManager[2430]:  2006/12/21_14:17:38 debug: /etc/init.d/postgresql  
start done. RC=0
heartbeat[2214]: 2006/12/21_14:17:45 info: Local Resource acquisition 
completed. (none)
heartbeat[2214]: 2006/12/21_14:17:45 info: local resource transition 
completed.
^^^^^^^^^
Why is it completed? It still has apache2 and courier to start up.
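
A rough way to confirm this from the log is to compare the resource group with 
the scripts ResourceManager actually reports running. The sketch below uses an 
abridged excerpt of the "Running" lines above. The regular expression, the 
variable names, and the assumption that the bare IP address 172.16.1.10 
corresponds to the IPaddr script are mine, inferred from these log lines only.

```python
import re

# Abridged excerpt of the ha-debug log above (only the "Running" entries).
LOG = """\
ResourceManager[2430]:  2006/12/21_14:17:35 info:
Running /etc/ha.d/resource.d/IPaddr 172.16.1.10 start
ResourceManager[2430]:  2006/12/21_14:17:35 info:
Running /etc/ha.d/resource.d/drbddisk drbd0 start
ResourceManager[2430]:  2006/12/21_14:17:35 info:
Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mirror ext3 start
ResourceManager[2430]:  2006/12/21_14:17:38 info:
Running /etc/init.d/postgresql  start
"""

# Script names expected from the "Acquiring resource group" line.
expected = ["IPaddr", "drbddisk", "Filesystem", "postgresql", "apache2", "courier"]

# Capture the script name after the resource.d or init.d directory.
pattern = r"info:\s+Running\s+/etc/(?:ha\.d/resource\.d|init\.d)/(\S+)"
started = re.findall(pattern, LOG)

missing = [name for name in expected if name not in started]
print(missing)  # -> ['apache2', 'courier']
```

On this log the check reports apache2 and courier as never attempted, which 
matches what I see on the machine.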
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems