
List:       keepalived-devel
Subject:    [Keepalived-devel] notification scripts executed in wrong order
From:       <Martin_Zielinski () McAfee ! com>
Date:       2014-01-17 12:28:44
Message-ID: FD608EF7DD0BD64B94678BD0AC25AD18160F44 () MIVEXEMEA1N2 ! corp ! nai ! org

I am using keepalived 1.2.10 on CentOS 6.

I have a backup instance with notification scripts:

 notify_master "/root/test.sh master"
 notify_backup "/root/test.sh backup"

The script is at the bottom of this mail.
Now I simulate a missed VRRP packet, as regularly seen at a site in the field:

# iptables -I INPUT -d 224.0.0.0/8 -j DROP

... wait until I see something happen in /var/log/messages

# iptables -I INPUT -d 224.0.0.0/8 -j ACCEPT

I see the following output from keepalived (as expected):

VRRP_Instance({) Transition to MASTER STATE
VRRP_Instance({) Entering MASTER STATE
VRRP_Instance({) setting protocol VIPs.
VRRP_Instance({) Sending gratuitous ARPs on eth1 for 192.168.3.155
Opening script file /root/test.sh
Netlink reflector reports IP 192.168.3.155 added
VRRP_Instance({) Received higher prio advert
VRRP_Instance({) Entering BACKUP STATE
VRRP_Instance({) removing protocol VIPs.
Opening script file /root/test.sh
Netlink reflector reports IP 192.168.3.155 removed

But the scripts are executed in the wrong order (1 out of 3 attempts):
[12:01:00.776829933] [13695 -> 13696] (./keepalived-l-D-d-n-f/root/keepalived.conf) backup to backup
[12:01:00.779875164] [13694 -> 13697] (./keepalived-l-D-d-n-f/root/keepalived.conf) backup to master

The first number is the PPID of the script, the second its PID. So the
process that calls the script to go into backup state overtakes the
process that calls the script to go into master state. This can also be
seen clearly when attaching strace.

Because the real script controls an application that relies on this state
information, the application ends up in master state when it should be in
backup state.

I believe the notification mechanism needs some rework to ensure the
scripts are executed in the correct order. What is the correct process?
I'd be happy to contribute to that (though I have no easy fix in mind yet).

Cheers,
Martin

[root@cluster2 ~]# cat test.sh
#!/bin/bash

BASEDIR=`dirname $0`
LOGFILE="$BASEDIR/test.log"
LOCKFILE="$BASEDIR/test.lock"
STATE="$BASEDIR/state"

do_log()
{
        echo -n "[`date +%T.%N`] " >> $LOGFILE
        if [ -n "$PPID" ] ; then
                echo -n "[$PPID -> $$] " >> $LOGFILE
                echo -n "(`cat /proc/$PPID/cmdline`) " >> $LOGFILE
        fi
        echo "$*" >> $LOGFILE
}

(
        flock -x 200

        if [ -f $STATE ] ; then
                state=`cat $STATE`
        else
                state="backup"
        fi

        case "$1" in
                master)
                        do_log "$state to master"
                        echo "master" > $STATE
                        ;;
                backup)
                        do_log "$state to backup"
                        echo "backup" > $STATE
                        ;;
                fault)
                        do_log "fault"
                        ;;
        esac
) 200> $LOCKFILE
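
One possible workaround for the ordering problem described above, as a
minimal sketch only: instead of trusting the script argument, the handler
could derive the state from whether the VIP is currently configured on the
interface. The function names vip_state and reconcile and the state file
argument are hypothetical and not part of keepalived. The sketch assumes
keepalived has already added or removed the VIP by the time the script
reads the interface, which the log ordering above suggests but which is
not verified here. It also assumes the VIP appears in "ip" output as
"inet <address>/<prefix>".

```bash
#!/bin/bash
# Sketch of an order-independent notify handler. This is an assumption
# about a possible workaround, not keepalived's own mechanism.

# vip_state VIP
# Reads "ip -o -4 addr show" output on stdin and prints "master" if VIP
# is configured, "backup" otherwise.
vip_state()
{
        local vip="$1"
        # -F matches literally; the leading space and trailing "/" keep
        # 192.168.3.15 from matching 192.168.3.155 and vice versa.
        if grep -qF " ${vip}/" ; then
                echo "master"
        else
                echo "backup"
        fi
}

# reconcile IFACE VIP STATEFILE (hypothetical helper)
# Under an exclusive lock, records the state implied by the interface
# rather than by the argument keepalived passed to the script.
reconcile()
{
        local iface="$1" vip="$2" statefile="$3"
        (
                flock -x 200
                ip -o -4 addr show dev "$iface" | vip_state "$vip" > "$statefile"
        ) 200> "${statefile}.lock"
}
```

With this approach both notify_master and notify_backup would call
something like "reconcile eth1 192.168.3.155 /root/state", so a script
that starts late still records the state the interface is in at that
moment rather than the stale transition it was launched for.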


