List: linux-ha-dev
Subject: Re: [Linux-ha-dev] Anybody notice this ccm library segfault
From: James Pan <jmltc () cn ! ibm ! com>
Date: 2006-03-07 6:04:17
Message-ID: 440D2261.5080903 () cn ! ibm ! com
This looks like a ccm bug; the ccm in heartbeat 1.2.4 is quite out of date.
In heartbeat 2.0, __ccm_data is initialized in the function saClmInitialize.
So I suggest not using the ccm from heartbeat 1.2.4.
However, since ccm is an independent component of heartbeat, I guess it
should work if you replace the whole ccm code of heartbeat 1.2.4 with the
heartbeat 2 version and rebuild.
An alternative is to use heartbeat 2.0 directly and run it in legacy mode,
that is, set crm to false in ha.cf. IMHO.
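
To illustrate the failure mode in the report quoted below, here is a minimal,
self-contained C sketch. It is not the actual ccmlib_clm.c code: membership_t,
ccm_data, on_ccm_event and member_count are hypothetical stand-ins for
oc_ev_membership_t, __ccm_data, ccm_events and the lookup done in
saClmClusterTrackStart. It only shows how a NULL check would avoid
dereferencing the pointer before the callback has run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for oc_ev_membership_t; only the field used here. */
typedef struct {
    int m_n_member;
} membership_t;

/* Hypothetical stand-in for __ccm_data: NULL until the callback has run. */
static const membership_t *ccm_data = NULL;

/* Hypothetical stand-in for the ccm_events callback: stores the membership. */
void on_ccm_event(const membership_t *m)
{
    assert(m != NULL);
    ccm_data = m;
}

/* Returns the member count, or -1 if no membership event has arrived yet. */
int member_count(void)
{
    const membership_t *oc = ccm_data;

    if (oc == NULL) {
        /* Without this check, oc->m_n_member would dereference NULL. */
        fprintf(stderr, "no membership data yet; callback has not run\n");
        return -1;
    }
    return oc->m_n_member;
}
```

With a guard like this, a caller gets an error value instead of a segfault
when the callback was never delivered. It does not explain why the callback
is missing in the first place.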
Tony Scott wrote:
> Hi,
>
> I'm using heartbeat 1.2.4.
> When I try to start hbagent, the following happens:
>
> init_membership() in hbagent.c calls saClmInitialize in the ccm library
> (ccmlib_clm.c) ...
> saClmInitialize uses oc_ev_set_callback to register its function ccm_events
> as a callback function.
>
> The ccm_events callback function is the function which sets up the
> __ccm_data pointer with a non NULL value.
>
> Next, in init_membership(), the ccmlib_clm.c function
> saClmClusterTrackStart is called.
> It does the following:
> const oc_ev_membership_t *oc;
> ...
> ...
> oc = __ccm_data;
> itemnum = oc->m_n_member;
>
> However, the problem is that this callback function "ccm_events" is never
> called, so __ccm_data remains a NULL pointer,
> and the "itemnum = oc->m_n_member;" causes a segfault (normal for
> dereferencing a NULL pointer :o))
>
> --------------------------------------
> send_message in ccmclient.c prints the following:
> "ipc channel blocked"
>
> -------------------------------------
> In /var/log/messages, I get the following:
>
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node 1: cluster2, type: normal, status: active
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node 2: cluster1, type: normal, status: active
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster2, interface: /dev/ttyS1, status: dead
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster2, interface: bond0, status: up
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster1, interface: /dev/ttyS1, status: dead
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster1, interface: bond0, status: up
> Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: g_hash_table_insert hd = [0x84c58c8]
> Mar 2 16:21:02 cluster1 ccm[29853]: WARN: ipc channel blocked
> Mar 2 16:21:02 cluster1 last message repeated 2 times
> Mar 2 16:21:02 cluster1 ccm[29853]: info: dispatch:received HUP
> Mar 2 16:21:02 cluster1 ccm[29853]: info: clntCh_input_destroy:received HUP
> -------------------
>
> Does anybody know of a reason why ccm would not call the callback function
> "ccm_events"?
>
> Is this to do with socket permissions or something?
>
> Thanks in advance,
>
> Tony
>
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
>