List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Patch to ocf:heartbeat:IPaddr2 to check if
From:       Dejan Muhamedagic <dejan () suse ! de>
Date:       2011-06-15 13:40:21
Message-ID: 20110615134020.GA3495 () squib

On Tue, Jun 14, 2011 at 07:15:21PM +0200, alexander.krauth@basf.com wrote:
> Dejan Muhamedagic schrieb am 08.06.2011 18:32:16:
> > Hi Alexander,
> > On Mon, Jun 06, 2011 at 05:42:30PM +0200, alexander.krauth@basf.com wrote:
> > > Dejan Muhamedagic schrieb am 04.04.2011 14:35:34:
> > > > On Fri, Mar 18, 2011 at 04:15:16PM +0100, alexander.krauth@basf.com wrote:
> > > > > Hi,
> > > > > 
> > > > > Dejan Muhamedagic schrieb am 18.03.2011 14:31:08:
> > > > > > Hi,
> > > > > > 
> > > > > > On Wed, Mar 16, 2011 at 04:58:25PM +0100, Corvus Corax wrote:
> > > > > > > 
> > > > > > > IPAddr2 puts the interface up on start and down on stop.
> > > > > > > But it's not able to detect an UP or DOWN change in status or
> > > > > > > monitor.
> > > > > > > 
> > > > > > > Therefore an "ifconfig <interface> down" from a third program
> > > > > > > or a careless administrator would drop the link without
> > > > > > > pacemaker noticing!
> > > > > > 
> > > > > > Hmm, careless administrator is somewhat of a paradox, right?
> > > > > > 
> > > > > > Really, what was your motivation for this? It makes me wonder,
> > > > > > since this RA has existed for many years and so far nobody
> > > > > > bothered to test this.
> > > > > 
> > > > > Hm, maybe the idea behind it is not totally new. Remember this
> > > > > thread:
> > > > > 
> > > > > http://lists.community.tummy.com/pipermail/linux-ha-dev/2011-February/018184.html
> > > > > 
> > > > > I would go with the remarks of LMB, that this is something closer
> > > > > to pingd than to IPaddr2. Isn't the real intention of both posts
> > > > > that you want to know whether your network interface is vital?
> > > > 
> > > > Yes.
> > > > 
> > > > > You may use pingd for that, but someone may be concerned about
> > > > > pinging the right remote device (also, a default gateway might not
> > > > > be a very static thing in a modern network).
> > > > > 
> > > > > What I imagine is an agent (let's call it ethmonitor) that
> > > > > monitors a network interface with a combination of the fine
> > > > > methods that Robert Euhus has posted in his patch. Then you could
> > > > > define some rules in the CIB for how to react to a failed network
> > > > > interface. Of course, this assumes that you do your heartbeats
> > > > > over more than one interface.
> > > > > 
> > > > > It would check:
> > > > >  1. interface link up?
> > > > >  2. does the RX counter of the interface increase during a certain
> > > > >     amount of time?
> > > > >  3. do I have some other nodes in my ARP cache which I could arping?
> > > > >  4. maybe retry all checks to overcome short outages
> > > > > If all questions are answered with NO, the interface is dead.
> > > > > 
> > > > > I would add my vote for such a feature.
> > > > 
> > > > Just took a look at the thread you referenced above.
> > > > Unfortunately, the author didn't get back with the new code
> > > > after review and short discussion.
> > > > 
> > > 
> > > Now I took the code from Robert in the above referenced thread and
> > > put it into a completely new RA.
> > > It is based very much on the existing pingd agent, but implements the
> > > monitoring as discussed above.
> > 
> > Great!
> > 
> > > Please let me know, what you think about it.
> > 
> > Does it work? :)
> 
> Yes, it does. For me in my test environment. :-)
> I did review your comments and attached a new version of the agent (as it
> is not in the repository for diffs).
> Some comments on your comments below.
> 
> Regards
> Alex
> 
> > 
> > See below for a few comments.
> > 
> > Cheers,
> > 
> > Dejan
> > 
> > > 
> > > Cheers,
> > > Alex
> > 
> > > #!/bin/sh
> > > #
> > > #       OCF Resource Agent compliant script.
> > > #       Monitor the vitality of a local network interface.
> > > #
> > > #    Based on the work by Robert Euhus and Lars Marowsky-Brée.
> > > #
> > > #   Transferred from IPaddr2 into ethmonitor by Alexander Krauth
> > > #
> > > # Copyright (c) 2011 Robert Euhus, Alexander Krauth, Lars Marowsky-Brée
> > > #                    All Rights Reserved.
> > > #
> > > # This program is free software; you can redistribute it and/or modify
> > > # it under the terms of version 2 of the GNU General Public License as
> > > # published by the Free Software Foundation.
> > > #
> > > # This program is distributed in the hope that it would be useful, but
> > > # WITHOUT ANY WARRANTY; without even the implied warranty of
> > > # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> > > #
> > > # Further, this software is distributed without any warranty that it is
> > > # free of the rightful claim of any third person regarding infringement
> > > # or the like.  Any license provided herein, whether implied or
> > > # otherwise, applies only to this software file.  Patent licenses, if
> > > # any, provided herein do not apply to combinations of this program with
> > > # other software, or any other product whatsoever.
> > > #
> > > # You should have received a copy of the GNU General Public License
> > > # along with this program; if not, write the Free Software Foundation,
> > > # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> > > #
> > > #     OCF parameters are as below
> > > #
> > > #   OCF_RESKEY_interface
> > > #   OCF_RESKEY_multiplier
> > > #   OCF_RESKEY_name
> > > #   OCF_RESKEY_repeat_count
> > > #   OCF_RESKEY_repeat_interval
> > > #   OCF_RESKEY_pktcnt_timeout
> > > #   OCF_RESKEY_arping_count
> > > #   OCF_RESKEY_arping_timeout
> > > #   OCF_RESKEY_arping_cache_entries
> > > #
> > > #   TODO: Check against IPv6
> > > #
> > > 
> > > #######################################################################
> > > # Initialization:
> > > 
> > > : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
> > > . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
> > > 
> > > 
> > > #######################################################################
> > > 
> > > meta_data() {
> > >    cat <<END
> > > <?xml version="1.0"?>
> > > <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> > > <resource-agent name="ethmonitor">
> > > <version>1.2</version>
> > > 
> > > <longdesc lang="en">
> > > Monitor the vitality of a local network interface.
> > > 
> > > You may set up this RA as a clone resource to monitor the network interfaces on different nodes, with the same interface name.
> > > This is not related to the IP address or the network on which an interface is configured.
> > > You may use this RA to move resources away from a node which has a faulty interface, or to prevent moving resources to such a node.
> > > This gives you independent control of the resources, without involving cluster intercommunication. But it requires your nodes to have more than one network interface.
> > > 
> > > The resource configuration requires a monitor operation, because the monitor does the main part of the work.
> > > In addition to the resource configuration, you need to configure some location constraints, based on a CIB attribute value.
> > > The name of the attribute value is configured in the 'name' option of this RA.
> > > 
> > > Example constraint configuration:
> > > location loc_connected_node my_resource_grp \
> > >         rule $id="rule_loc_connected_node" -INF: ethmonitor eq 0
> > > 
> > > The ethmonitor runs up to three tests, in order, to check the interface vitality.
> > > 1. call ip to see if the link status is up (if link is down -> error)
> > > 2. call ip and watch the RX counter (if packets arrive within a certain time -> success)
> > > 3. call arping to check whether any of the IPs found in the local ARP cache answers an ARP REQUEST (one answer -> success)
> > > 4. return error
> > 
> > I think that some parts of this long description should go to
> > www.linux-ha.org/wiki/ethmonitor_(resource_agent)
> > 
> > > </longdesc>
> > > <shortdesc lang="en">Monitors network interfaces</shortdesc>
> > > 
> > > <parameters>
> > > <parameter name="interface" unique="0" required="1">
> > 
> > shouldn't this be unique?
> 
> Hm, I never really understood the unique flag (see also the current
> mailing list discussion).
> If I clone this resource, because I want to monitor eth0 on two nodes,
> may it then be set to unique?

Yes.

unique:

Two resources of the same kind in the cluster may not have the
same value for a "unique" parameter.
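
To illustrate the rule only: a minimal sh sketch that flags two
resources sharing a value. The helper name has_duplicate_values is
made up for this example and is not part of any RA or of the crm shell.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit 0) if any argument occurs more than once.
# Each argument stands for the value one resource gives to a "unique" parameter.
has_duplicate_values() {
    [ -n "$(printf '%s\n' "$@" | sort | uniq -d)" ]
}

# Two resources monitoring different interfaces: fine.
has_duplicate_values eth0 eth1 || echo "unique: ok"
# Two resources both claiming eth0: violates the rule.
has_duplicate_values eth0 eth0 && echo "unique: violated"
```

A cloned resource is still a single resource definition, so it
contributes only one value to such a comparison.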

> > > <longdesc lang="en">
> > > The name of the network interface which should be monitored (e.g. eth0).
> > > </longdesc>
> > > <shortdesc lang="en">Network interface name</shortdesc>
> > > <content type="string" default=""/>
> > > </parameter>
> > > 
> > > <parameter name="name" unique="0">
> > 
> > and this too?
> 
> Doesn't "unique=1" also require "required=1"? So then it is not, because
> it has a default.

Well, not entirely true. The user still can set the parameter.
However, if the parameter is unset in two resources, then that
would effectively make the parameter non-unique. Currently, the
crm shell won't notice that. But ultimately it is up to the user
to keep the configuration sane. Imagine what would happen if the
two resources used the same attribute to write the interface
status.

One option would be to name the attribute after the interface by
default, say eth0mon, br0mon or perhaps mon-eth0, mon-br0. That
way just by setting the interface we make sure that the
attribute is unique as well.
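
A minimal sketch of that default, assuming the mon-<interface> naming.
The helper default_attr_name is hypothetical and not taken from the
posted agent:

```shell
#!/bin/sh
# Hypothetical helper: pick the attribute name for the interface status.
#   $1: interface name (e.g. eth0)
#   $2: attribute name set by the user, may be empty
default_attr_name() {
    if [ -n "$2" ]; then
        # an explicit 'name' parameter always wins
        printf '%s\n' "$2"
    else
        # otherwise derive the name from the interface
        printf 'mon-%s\n' "$1"
    fi
}

default_attr_name eth0 ""        # prints mon-eth0
default_attr_name br0 ""         # prints mon-br0
default_attr_name eth0 my_attr   # prints my_attr
```

In if_init this could replace the fixed default, for instance
ATTRNAME=`default_attr_name "$NIC" "$OCF_RESKEY_name"`, and the
default in the meta-data would need to describe the same naming.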

OK. No more comments here.

Cheers,

Dejan

> > > <longdesc lang="en">
> > > The name of the CIB attribute to set.  This is the name to be used in the constraints.
> > > </longdesc>
> > > <shortdesc lang="en">Attribute name</shortdesc>
> > > <content type="string" default="ethmonitor"/>
> > > </parameter>
> > > 
> > > <parameter name="multiplier" unique="0" >
> > > <longdesc lang="en">
> > > Multiplier for the value of the CIB attribute specified in parameter name.
> > > </longdesc>
> > > <shortdesc lang="en">Multiplier for result variable</shortdesc>
> > > <content type="integer" default="1"/>
> > > </parameter>
> > > 
> > > <parameter name="repeat_count">
> > > <longdesc lang="en">
> > > Specify how often the interface will be monitored, before the status is set to failed. You need to set the timeout of the monitoring operation to at least repeat_count * repeat_interval
> > > </longdesc>
> > > <shortdesc lang="en">Monitor repeat count</shortdesc>
> > > <content type="integer" default="5"/>
> > > </parameter>
> > > 
> > > <parameter name="repeat_interval">
> > > <longdesc lang="en">
> > > Specify how long to wait in seconds between the repeat_counts.
> > > </longdesc>
> > > <shortdesc lang="en">Monitor repeat interval in seconds</shortdesc>
> > > <content type="integer" default="10"/>
> > > </parameter>
> > > 
> > > <parameter name="pktcnt_timeout">
> > > <longdesc lang="en">
> > > Timeout for the RX packet counter. Stop listening for packet counter changes after the given number of seconds.
> > > </longdesc>
> > > <shortdesc lang="en">packet counter timeout</shortdesc>
> > > <content type="integer" default="5"/>
> > > </parameter>
> > > 
> > > <parameter name="arping_count">
> > > <longdesc lang="en">
> > > Number of ARP REQUEST packets to send for every IP.
> > > Usually one ARP REQUEST (arping) is sent.
> > > </longdesc>
> > > <shortdesc lang="en">Number of arpings per IP</shortdesc>
> > > <content type="integer" default="1"/>
> > > </parameter>
> > > 
> > > <parameter name="arping_timeout">
> > > <longdesc lang="en">
> > > Time in seconds to wait for ARP REQUESTs (all packets of arping_count).
> > > This is to limit the time for ARP requests, to be able to send requests to more than one node without running into the monitor operation timeout.
> > > </longdesc>
> > > <shortdesc lang="en">Timeout for arpings per IP</shortdesc>
> > > <content type="integer" default="1"/>
> > > </parameter>
> > > 
> > > <parameter name="arping_cache_entries">
> > > <longdesc lang="en">
> > > Maximum number of IPs from ARP cache list to check for ARP REQUEST (arping) answers. Newest entries are tried first.
> > > </longdesc>
> > > <shortdesc lang="en">Number of ARP cache entries to try</shortdesc>
> > > <content type="integer" default="5"/>
> > > </parameter>
> > > 
> > > </parameters>
> > > <actions>
> > > <action name="start"   timeout="20s" />
> > > <action name="stop"    timeout="20s" />
> > > <action name="status" depth="0"  timeout="20s" interval="10s" />
> > > <action name="monitor" depth="0"  timeout="20s" interval="10s" />
> > > <action name="meta-data"  timeout="5s" />
> > > <action name="validate-all"  timeout="20s" />
> > > </actions>
> > > </resource-agent>
> > > END
> > > 
> > >    exit $OCF_SUCCESS
> > > }
> > > 
> > > #
> > > #   Return true, if the interface exists
> > > #
> > > is_interface() {
> > >    #
> > >    # List interfaces but exclude FreeS/WAN ipsecN virtual interfaces
> > >    #
> > >    local iface=`$IP2UTIL -o -f inet addr show | grep " $1 " \
> > >       | cut -d ' ' -f2 | sort -u | grep -v '^ipsec[0-9][0-9]*$'`
> > >    if [ "$iface" != "" ]; then return 0; fi
> > >    return 1
> > 
> > [ "$iface" != "" ] is enough instead of the previous two lines
> done.
> 
> > 
> > > }
> > > 
> > > if_init() {
> > >    local rc
> > > 
> > >    if [ X"$OCF_RESKEY_interface" = "X" ]; then
> > >       ocf_log err "Interface name (the interface parameter) is mandatory"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > > 
> > >    NIC="$OCF_RESKEY_interface"
> > > 
> > >    if is_interface $NIC
> > >    then
> > >      case "$NIC" in
> > >        *:*) ocf_log err "Do not specify a virtual interface : $OCF_RESKEY_interface"
> > >             exit $OCF_ERR_CONFIGURED;;
> > >        *)  ;;
> > >      esac
> > >    else
> > >      case $__OCF_ACTION in
> > >        validate-all) ocf_log err "Interface $OCF_RESKEY_interface does not exist"
> > >                             exit $OCF_ERR_CONFIGURED;;
> > >        *)          ocf_log warn "Interface $OCF_RESKEY_interface does not exist"
> > >                             ## It might be a bond interface which is temporarily not available, therefore we want to continue here
> > >                        ;;
> > 
> > Why not use NIC instead of OCF_RESKEY_interface when you already set that?
> done.
> 
> > 
> > >      esac
> > >    fi
> > > 
> > >    : ${OCF_RESKEY_multiplier:="1"}
> > >    if ! ocf_is_decimal "$OCF_RESKEY_multiplier"; then
> > >       ocf_log err "Invalid OCF_RESKEY_multiplier [$OCF_RESKEY_multiplier]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > > 
> > >    ATTRNAME=${OCF_RESKEY_name:-ethmonitor}
> > > 
> > >    REP_COUNT=${OCF_RESKEY_repeat_count:-5}
> > >    # the two conditions must be separate commands; "-o [ ... ]" would only
> > >    # be passed to ocf_is_decimal as extra arguments
> > >    if ! ocf_is_decimal "$REP_COUNT" || [ "$REP_COUNT" -lt 1 ]; then
> > >       ocf_log err "Invalid OCF_RESKEY_repeat_count [$REP_COUNT]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > >    REP_INTERVAL_S=${OCF_RESKEY_repeat_interval:-10}
> > >    if ! ocf_is_decimal "$REP_INTERVAL_S"; then
> > >       ocf_log err "Invalid OCF_RESKEY_repeat_interval [$REP_INTERVAL_S]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > >    : ${OCF_RESKEY_pktcnt_timeout:="5"}
> > >    if ! ocf_is_decimal "$OCF_RESKEY_pktcnt_timeout"; then
> > >       ocf_log err "Invalid OCF_RESKEY_pktcnt_timeout [$OCF_RESKEY_pktcnt_timeout]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > >    : ${OCF_RESKEY_arping_count:="1"}
> > >    if ! ocf_is_decimal "$OCF_RESKEY_arping_count"; then
> > >       ocf_log err "Invalid OCF_RESKEY_arping_count [$OCF_RESKEY_arping_count]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > >    : ${OCF_RESKEY_arping_timeout:="1"}
> > >    if ! ocf_is_decimal "$OCF_RESKEY_arping_timeout"; then
> > >       ocf_log err "Invalid OCF_RESKEY_arping_timeout [$OCF_RESKEY_arping_timeout]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > >    : ${OCF_RESKEY_arping_cache_entries:="5"}
> > >    if ! ocf_is_decimal "$OCF_RESKEY_arping_cache_entries"; then
> > >       ocf_log err "Invalid OCF_RESKEY_arping_cache_entries [$OCF_RESKEY_arping_cache_entries]"
> > >       exit $OCF_ERR_CONFIGURED
> > >    fi
> > >   return $OCF_SUCCESS
> > > }
> > > 
> > > # get the link status on $NIC
> > > # returns UP or DOWN or whatever ip reports (UNKNOWN?)
> > > get_link_status () {
> > >    $IP2UTIL -o link show dev "$NIC" \
> > >       | sed 's/.* state \([^ ]*\) .*/\1/'
> > 
> > This prints "UNKNOWN" for my bridge (brn) interfaces which are up:
> > [0]hex-12:~ > ip -o link show dev br0
> > 6: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN \
> >     link/ether 00:23:7d:a7:29:96 brd ff:ff:ff:ff:ff:ff
> 
> Hm, yes. I never felt comfortable with this function. Changed it to
> let ip decide what is up or down.
> I think this is much more upward and downward compatible.
> 
> get_link_status () {
>         $IP2UTIL -o link show up dev "$NIC" | grep -c "$NIC"
> }
> 
> > 
> > > }
> > > 
> > > # returns the number of received rx packets on $NIC
> > > get_rx_packets () {
> > >    ocf_log debug "$IP2UTIL -o -s link show dev $NIC"
> > >    $IP2UTIL -o -s link show dev "$NIC" \
> > >       | sed 's/.* RX: [^0-9]*[0-9]* *\([0-9]*\) .*/\1/'
> > >       # the first number after RX: is the # of bytes,
> > >       # the second is the # of packets received
> > > }
> > > 
> > > # watch for packet counter changes for max. OCF_RESKEY_pktcnt_timeout seconds
> > > # returns immediately with return code 0 if any packets were received
> > > # otherwise 1 is returned
> > > watch_pkt_counter () {
> > >    local RX_PACKETS_NEW
> > >    local RX_PACKETS_OLD
> > >    RX_PACKETS_OLD="`get_rx_packets`"
> > >    for n in `seq $(( $OCF_RESKEY_pktcnt_timeout * 10 ))`; do
> > >       sleep 0.1
> > >       RX_PACKETS_NEW="`get_rx_packets`"
> > >       ocf_log debug "RX_PACKETS_OLD: $RX_PACKETS_OLD RX_PACKETS_NEW: $RX_PACKETS_NEW"
> > >       if [ "$RX_PACKETS_OLD" -ne "$RX_PACKETS_NEW" ]; then
> > >          ocf_log debug "we received some packets."
> > >          return 0
> > >       fi
> > >    done
> > >    return 1
> > > }
> > > 
> > > # returns list of cached ARP entries for $NIC
> > > # sorted by age ("last confirmed")
> > > # max. OCF_RESKEY_arping_cache_entries entries
> > > get_arp_list () {
> > >    $IP2UTIL -s neighbour show dev $NIC \
> > >       | sort -t/ -k2,2n | cut -d' ' -f1 \
> > >       | head -n $OCF_RESKEY_arping_cache_entries
> > >       # the "used" entries in `ip -s neighbour show` are:
> > >       # "last used"/"last confirmed"/"last updated"
> > > }
> > > 
> > > # arping the IP given as argument $1 on $NIC
> > > # until OCF_RESKEY_arping_count answers are received
> > > do_arping () {
> > >    # TODO: add the source IP
> > >    # TODO: check for different arping versions out there
> > >    arping -q -c $OCF_RESKEY_arping_count -w $OCF_RESKEY_arping_timeout -I $NIC $1
> > >    # return with the exit code of the arping command 
> > >    return $?
> > > }
> > > 
> > > #
> > > #    Check the interface depending on the level given as parameter: $OCF_RESKEY_check_level
> > > #
> > > # 09: check for nonempty ARP cache
> > > # 10: watch for packet counter changes
> > > #
> > > # 19: check arping_ip_list
> > > # 20: check arping ARP cache entries
> > > # 
> > > # 30:  watch for packet counter changes in promiscuous mode
> > > # 
> > > # If unsuccessful in levels 18 and above,
> > > # the tests for higher check levels are run.
> > > #
> > > if_check () {
> > >    # always check link status first
> > >    link_status="`get_link_status`"
> > >    ocf_log debug "link_status: $link_status"
> > >    case $link_status in
> > >       UP)
> > >          ;;
> > >       DOWN)
> > >          # remove address from NIC
> > >          return $OCF_NOT_RUNNING
> > >          ;;
> > >       *) # this should not happen.
> > >          return $OCF_ERR_GENERIC
> > >          ;;
> > >    esac
> > > 
> > >    # watch for packet counter changes
> > >    ocf_log debug "watch for packet counter changes" 
> > >    watch_pkt_counter && return $OCF_SUCCESS
> > > 
> > >    # check arping ARP cache entries
> > >    ocf_log debug "check arping ARP cache entries" 
> > >    for ip in `get_arp_list`; do
> > >       do_arping $ip && return $OCF_SUCCESS
> > >    done
> > > 
> > >    # watch for packet counter changes in promiscuous mode
> > > #   ocf_log debug "watch for packet counter changes in promiscuous mode"
> > >    # be sure to switch off promiscuous mode in any case
> > >    # TODO: check first whether promisc is already on and leave it untouched.
> > > #   trap "$IP2UTIL link set dev $NIC promisc off; exit" INT TERM EXIT
> > > #      $IP2UTIL link set dev $NIC promisc on
> > > #      watch_pkt_counter && return $OCF_SUCCESS
> > > #      $IP2UTIL link set dev $NIC promisc off
> > > #   trap - INT TERM EXIT
> > > 
> > >    # looks like it's not working (for whatever reason)
> > >    return $OCF_NOT_RUNNING
> > > }
> > > 
> > > 
> > > #######################################################################
> > > 
> > > if_usage() {
> > >    cat <<END
> > > usage: $0 {start|stop|status|monitor|validate-all|meta-data}
> > > 
> > > Expects to have a fully populated OCF RA-compliant environment set.
> > > END
> > > }
> > > 
> > > set_cib_value() {
> > >     local score=`expr $1 \* $OCF_RESKEY_multiplier`
> > >     attrd_updater -n $ATTRNAME -v $score -q
> > >     local rc=$?
> > >     case $rc in
> > >         0) ocf_log debug "attrd_updater: Updated $ATTRNAME = $score" ;;
> > >         *) ocf_log warn "attrd_updater: Could not update $ATTRNAME = $score: rc=$rc";;
> > >     esac
> > >     return $rc
> > > }
> > > 
> > > if_monitor() {
> > >     ha_pseudo_resource $OCF_RESOURCE_INSTANCE monitor
> > >     local pseudo_status=$?
> > >     if [ $pseudo_status -ne $OCF_SUCCESS ]; then
> > >       exit $pseudo_status
> > >     fi
> > > 
> > >     local mon_rc=$OCF_NOT_RUNNING
> > >     local attr_rc=$OCF_NOT_RUNNING
> > >     local runs=0
> > >     local start_time
> > >     local end_time
> > >     local sleep_time
> > >     while [ $mon_rc -ne $OCF_SUCCESS -a $REP_COUNT -gt 0 ]
> > >     do
> > >       start_time=`date +%s%N`
> > >       if_check
> > >       mon_rc=$?
> > >       REP_COUNT=$(( $REP_COUNT - 1 ))
> > >       if [ $mon_rc -ne $OCF_SUCCESS -a $REP_COUNT -gt 0 ]; then
> > >         ocf_log warn "Monitoring of $OCF_RESOURCE_INSTANCE failed, $REP_COUNT retries left."
> > >         end_time=`date +%s%N`
> > >         sleep_time=`echo "scale=9; ( $start_time + ( $REP_INTERVAL_S * 1000000000 ) - $end_time ) / 1000000000" | bc -q 2> /dev/null`
> > >         sleep $sleep_time 2> /dev/null
> > >         runs=$(($runs + 1))
> > >       fi
> > > 
> > >       if [ $mon_rc -eq $OCF_SUCCESS -a $runs -ne 0 ]; then
> > >         ocf_log info "Monitoring of $OCF_RESOURCE_INSTANCE recovered from error"
> > >       fi
> > >     done
> > > 
> > >     ocf_log debug "Monitoring return code: $mon_rc"
> > >     if [ $mon_rc -eq $OCF_SUCCESS ]; then
> > >       set_cib_value 1
> > >       attr_rc=$?
> > >     else
> > >       ocf_log err "Monitoring of $OCF_RESOURCE_INSTANCE failed."
> > >       set_cib_value 0
> > >       attr_rc=$?
> > >     fi
> > > 
> > >     ## The resource should not fail, if the interface is down. It should fail, if the update of the CIB variable has errors.
> > >     ## To react on the interface failure you must use constraints based on the CIB variable value, not on the recourse itself.
> > 
> > recourse -> resource
> done.
> 
> > 
> > >     exit $attr_rc
> > > }
> > > 
> > > if_validate() {
> > >     check_binary $IP2UTIL
> > 
> > check_binary arping ?
> done.
> 
> > 
> > >     if_init
> > >     return $?
> > 
> > this line is superfluous
> done.
> 
> > 
> > > }
> > > 
> > > case $__OCF_ACTION in
> > > meta-data)   meta_data
> > >       ;;
> > > usage|help)   if_usage
> > >       exit $OCF_SUCCESS
> > >       ;;
> > > esac
> > > 
> > > if_validate
> > > 
> > > case $__OCF_ACTION in
> > > start)      ha_pseudo_resource $OCF_RESOURCE_INSTANCE start
> > >       exit $?
> > >       ;;
> > > stop)      attrd_updater -D -n $ATTRNAME
> > >                 ha_pseudo_resource $OCF_RESOURCE_INSTANCE stop
> > >       exit $?
> > >       ;;
> > > monitor|status)   if_monitor
> > >       exit $?
> > >       ;;
> > > validate-all)   exit $?
> > >                 ;;
> > > *)      if_usage
> > >       exit $OCF_ERR_UNIMPLEMENTED
> > >       ;;
> > > esac
> > _______________________________________________________
> > Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/
> 
> 
> 

