
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Tickle Ack function in portblock resource
From:       Sam Tran <stlist () gmail ! com>
Date:       2010-04-30 15:54:48
Message-ID: h2tcee681b01004300854qa07c013brc2a585d737b684da () mail ! gmail ! com

On Tue, Apr 27, 2010 at 6:02 AM, Dejan Muhamedagic <dejanmm@fastmail.fm> wrote:
> Hi,
>
> On Mon, Apr 26, 2010 at 04:35:31PM -0400, Sam Tran wrote:
>> Hi All,
>>
>> I am running pacemaker 1.0.8 + corosync 1.2.1 + resource-agents 1.0.3
>> on a pair of OpenLDAP master servers (CentOS Linux 5.4).
>>
>> The active OpenLDAP master holds the failover IP resource. An OpenLDAP
>> replica server connects to that failover IP address on the master
>> server for updates; the connection is then maintained while the replica
>> waits for subsequent updates from the master server. The
>> connection state is successfully synchronized using csync2. If the
>> active master server fails, the other master takes over the resources,
>> and I was expecting the Tickle Ack function in the portblock resource
>> to break the established connection between the replica and the
>> failover IP. But that didn't happen. I am not sure what I am
>> doing wrong. Here is my crm configuration:
>>
>> node info-ldap-015.internal.example.com
>> node info-ldap-016.internal.example.com
>> primitive email-notify ocf:heartbeat:MailTo \
>>         params email="stlist@example.com" subject="TEST_LDAP_PROVIDER_CLUSTER"
>> primitive failover-ip1 ocf:heartbeat:IPaddr2 \
>>         params ip="192.168.8.171" \
>>         op monitor interval="5s"
>> primitive portblock_block ocf:heartbeat:portblock \
>>         params protocol="tcp" ip="192.168.8.171" portno="636" action="block" \
>>         op monitor interval="10" timeout="10" depth="0"
>> primitive portblock_unblock ocf:heartbeat:portblock \
>>         params protocol="tcp" ip="192.168.8.171" portno="636" action="unblock" \
>>         op monitor interval="10" timeout="10" depth="0"
>> tickle_dir="/tmp/tickle" sync_script="/usr/sbin/csync2 -xvr"
>
> This line can't stand on its own like that. A cut-and-paste error?
> Anyway, it should look like this:
>
> primitive portblock_unblock ocf:heartbeat:portblock \
>        params protocol="tcp" ip="192.168.8.171" portno="636" action="unblock" \
>                tickle_dir="/tmp/tickle" sync_script="/usr/sbin/csync2 -xvr" \
>        op monitor interval="10" timeout="10" depth="0"
>
> BTW, it is better to keep the tickle_dir somewhere only root can write.
>
> Otherwise, you can try watching the wire with tcpdump to see whether
> the RA sends TCP reset packets to the clients.
>
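The advice above about a root-only tickle_dir could be followed roughly
as sketched here. The helper name make_private_dir is hypothetical and
not part of the portblock agent; it only creates a directory and
restricts it to its owner, so it has to be run as root for the result
to be root-only.

```shell
#!/bin/sh
# make_private_dir is a hypothetical helper, not part of resource-agents.
# It creates the directory given as $1 (if missing) and restricts access
# to the owning user. Run as root so that the owner is root.
make_private_dir() {
    mkdir -p "$1" && chmod 0700 "$1"
}
```

For example, as root on both nodes: make_private_dir /var/lib/heartbeat/tickle
(that path is an assumption; any location outside /tmp would do). The
tickle_dir parameter and the csync2 configuration would then both have
to point at the same directory.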

I did a packet capture on the node that is taking over the resources:
it doesn't send any TCP packets to the LDAP replica. Here are the
relevant lines in the message log that include the string 'portblock':

Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: do_lrm_rsc_op:
Performing key=14:170:0:71879926-a8a3-4c77-89f5-0cfb84e836b7
op=portblock_block_start_0 )
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info: rsc:portblock_block:9: start
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info: Managed
portblock_block:start process 11572 exited with return code 0.
Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: process_lrm_event:
LRM operation portblock_block_start_0 (call=9, rc=0, cib-update=15,
confirmed=true) ok
Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: do_lrm_rsc_op:
Performing key=15:170:0:71879926-a8a3-4c77-89f5-0cfb84e836b7
op=portblock_block_monitor_10000 )
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info:
rsc:portblock_block:10: monitor
Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: do_lrm_rsc_op:
Performing key=17:170:0:71879926-a8a3-4c77-89f5-0cfb84e836b7
op=portblock_unblock_start_0 )
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info:
rsc:portblock_unblock:11: start
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info: Managed
portblock_block:monitor process 11589 exited with return code 0.
Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: process_lrm_event:
LRM operation portblock_block_monitor_10000 (call=10, rc=0,
cib-update=16, confirmed=false) ok
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info: Managed
portblock_unblock:start process 11599 exited with return code 0.
Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: process_lrm_event:
LRM operation portblock_unblock_start_0 (call=11, rc=0, cib-update=17,
confirmed=true) ok
Apr 30 10:06:13 info-ldap-015 crmd: [11246]: info: do_lrm_rsc_op:
Performing key=18:170:0:71879926-a8a3-4c77-89f5-0cfb84e836b7
op=portblock_unblock_monitor_10000 )
Apr 30 10:06:13 info-ldap-015 lrmd: [11243]: info:
rsc:portblock_unblock:12: monitor
Apr 30 10:06:14 info-ldap-015 lrmd: [11243]: info: Managed
portblock_unblock:monitor process 11609 exited with return code 0.
Apr 30 10:06:14 info-ldap-015 crmd: [11246]: info: process_lrm_event:
LRM operation portblock_unblock_monitor_10000 (call=12, rc=0,
cib-update=18, confirmed=false) ok
Apr 30 10:06:23 info-ldap-015 pengine: [11245]: notice: native_print:
    portblock_block      (ocf::heartbeat:portblock):     Started
info-ldap-015.internal.example.com
Apr 30 10:06:23 info-ldap-015 pengine: [11245]: notice: native_print:
    portblock_unblock    (ocf::heartbeat:portblock):     Started
info-ldap-015.internal.example.com
Apr 30 10:06:23 info-ldap-015 pengine: [11245]: info:
native_merge_weights: failover-ip1: Rolling back scores from
portblock_block
Apr 30 10:06:23 info-ldap-015 pengine: [11245]: info:
native_merge_weights: email-notify: Rolling back scores from
portblock_block
Apr 30 10:06:23 info-ldap-015 pengine: [11245]: notice: LogActions:
Leave resource portblock_block      (Started
info-ldap-015.internal.example.com)
Apr 30 10:06:23 info-ldap-015 pengine: [11245]: notice: LogActions:
Leave resource portblock_unblock    (Started
info-ldap-015.internal.example.com)
Apr 30 10:06:24 info-ldap-015 pengine: [11245]: notice: native_print:
    portblock_block      (ocf::heartbeat:portblock):     Started
info-ldap-015.internal.example.com
Apr 30 10:06:24 info-ldap-015 pengine: [11245]: notice: native_print:
    portblock_unblock    (ocf::heartbeat:portblock):     Started
info-ldap-015.internal.example.com
Apr 30 10:06:24 info-ldap-015 pengine: [11245]: info:
native_merge_weights: failover-ip1: Rolling back scores from
portblock_block
Apr 30 10:06:24 info-ldap-015 pengine: [11245]: info:
native_merge_weights: email-notify: Rolling back scores from
portblock_block
Apr 30 10:06:24 info-ldap-015 pengine: [11245]: notice: LogActions:
Leave resource portblock_block      (Started
info-ldap-015.internal.example.com)
Apr 30 10:06:24 info-ldap-015 pengine: [11245]: notice: LogActions:
Leave resource portblock_unblock    (Started
info-ldap-015.internal.example.com)
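
For reference, a capture like the one described above can be narrowed to
the failover IP and port from the configuration. This is only a sketch:
the helper name tickle_filter is made up for illustration, and the
script prints the tcpdump command instead of running it, since capturing
needs root. The filter matches TCP segments with ACK or RST set, which
is what a tickle ACK and any resulting reset would look like on the wire.

```shell
#!/bin/sh
# Failover IP and LDAPS port taken from the crm configuration above.
VIP="192.168.8.171"
PORT="636"

# tickle_filter is a hypothetical helper: it builds a pcap filter for
# TCP segments to or from IP $1 on port $2 that carry ACK or RST.
tickle_filter() {
    printf 'host %s and tcp port %s and (tcp[tcpflags] & (tcp-ack|tcp-rst) != 0)' "$1" "$2"
}

FILTER=$(tickle_filter "$VIP" "$PORT")

# Print the command to run as root on the node taking over the resources.
echo "tcpdump -n -i any '$FILTER'"
```

If nothing at all shows up during a failover, as in my capture, the
agent is probably not sending tickles in the first place, so the problem
would be on the cluster side and not with the replica ignoring them.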

Thanks,
Sam
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
