
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Anybody notice this ccm library segfault
From:       James Pan <jmltc () cn ! ibm ! com>
Date:       2006-03-07 6:04:17
Message-ID: 440D2261.5080903 () cn ! ibm ! com

This looks like a ccm bug; the ccm in heartbeat 1.2.4 is quite out of date.
In heartbeat 2.0, __ccm_data is initialized in the function saClmInitialize,
so I suggest you do not use the ccm from heartbeat 1.2.4.
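
For illustration only, here is a minimal sketch of the failure and of a
defensive check. It is not the ccmlib_clm.c code: membership_t, ccm_data,
on_membership_event and get_member_count are hypothetical stand-ins, and
only the m_n_member field mentioned in this thread is modelled.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for oc_ev_membership_t; only the field named in
 * this thread is modelled. */
typedef struct {
    int m_n_member;
} membership_t;

/* Stand-in for __ccm_data: stays NULL until a membership event arrives. */
static const membership_t *ccm_data = NULL;

/* Stand-in for the ccm_events callback, which is what sets the pointer. */
void on_membership_event(const membership_t *m)
{
    ccm_data = m;
}

/* Guarded version of the "itemnum = oc->m_n_member" step.
 * Returns 0 and stores the count on success, or -1 if no membership
 * event has been delivered yet (the pointer is still NULL). */
int get_member_count(int *itemnum)
{
    const membership_t *oc = ccm_data;

    if (oc == NULL) {
        fprintf(stderr, "no membership data yet; callback has not run\n");
        return -1;
    }
    *itemnum = oc->m_n_member;
    return 0;
}
```

With a check like this, a caller gets an error return when the callback
has never run, instead of dereferencing a NULL pointer.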

However, since ccm is an independent component of heartbeat,
I guess it should work if you replace the whole ccm code of heartbeat 1.2.4
with the heartbeat 2 version and rebuild it.

An alternative is to use heartbeat 2.0 directly and run it in legacy
mode, that is, set crm to false in ha.cf. IMHO.
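
As a sketch, assuming the boolean spelling of the crm directive (please
check the ha.cf documentation for your heartbeat 2.0 build), the relevant
line would look something like this:

```
crm no
```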


Tony Scott wrote:

> Hi,
>
> I'm using heartbeat 1.2.4.
> When I try to start hbagent, the following happens:
>
> init_membership() in hbagent.c calls saClmInitialize in the ccm library
> (ccmlib_clm.c) ...
> saClmInitialize uses oc_ev_set_callback to register its function
> ccm_events as a callback function.
>
> The ccm_events callback function is the function which sets up the
> __ccm_data pointer with a non NULL value.
>
> Next, in init_membership(), the ccmlib_clm.c function
> saClmClusterTrackStart is called.
> It does the following:
> const oc_ev_membership_t *oc;
> ...
> ...
> oc = __ccm_data;
> itemnum = oc->m_n_member;
>
> However, the problem is that this callback function "ccm_events" is never
> called, so __ccm_data remains a NULL pointer,
> and the "itemnum = oc->m_n_member;" causes a segfault (normal for
> dereferencing a NULL pointer :o))
>
> --------------------------------------
> send_message in ccmclient.c prints the following:
> "ipc channel blocked"
>
> -------------------------------------
> In /var/log/messages, I get the following:
>
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: node 1: cluster2, type: normal, status: active
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: node 2: cluster1, type: normal, status: active
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster2, interface: /dev/ttyS1, status: dead
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster2, interface: bond0, status: up
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster1, interface: /dev/ttyS1, status: dead
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster1, interface: bond0, status: up
> Mar  2 16:21:02 cluster1 lha-snmpagent[29940]: info: g_hash_table_insert hd = [0x84c58c8]
> Mar  2 16:21:02 cluster1 ccm[29853]: WARN: ipc channel blocked
> Mar  2 16:21:02 cluster1 last message repeated 2 times
> Mar  2 16:21:02 cluster1 ccm[29853]: info: dispatch:received HUP
> Mar  2 16:21:02 cluster1 ccm[29853]: info: clntCh_input_destroy:received HUP
> -------------------
>
> Does anybody know of a reason why ccm would not call the callback function
> "ccm_events"?
>
> Is this to do with socket permissions or something?
>
> Thanks in advance,
>
> Tony
>
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
>
