
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] VirtualDomain issue
From:       Dejan Muhamedagic <dejan () suse ! de>
Date:       2012-01-31 15:57:10
Message-ID: 20120131155708.GD29800 () walrus ! homenet

Hi,

On Mon, Nov 14, 2011 at 11:58:06AM +0100, Dejan Muhamedagic wrote:
> Hi,
> 
> On Thu, Jun 23, 2011 at 07:51:48AM +0200, Dominik Klein wrote:
> > Hi
> > 
> > code snippet from
> > http://hg.linux-ha.org/agents/raw-file/7a11934b142d/heartbeat/VirtualDomain
> > (which I believe is the current version)
> > 
> > VirtualDomain_Validate_All() {
> > <snip>
> >      if [ ! -r $OCF_RESKEY_config ]; then
> > 	if ocf_is_probe; then
> > 	    ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
> > 	else
> > 	    ocf_log error "Configuration file $OCF_RESKEY_config does not exist or is not readable."
> > 	    return $OCF_ERR_INSTALLED
> > 	fi
> >      fi
> > }
> > <snip>
> > VirtualDomain_Validate_All || exit $?
> > <snip>
> > if ocf_is_probe && [ ! -r $OCF_RESKEY_config ]; then
> >      exit $OCF_NOT_RUNNING
> > fi
> > 
> > So, say one node does not have the config, but the cluster decides to
> > run the VM on that node. The probe returns NOT_RUNNING, so the cluster
> > tries to start the VM. That start returns ERR_INSTALLED, and the cluster
> > has to recover from the start failure by stopping the resource. But the
> > stop op returns ERR_INSTALLED as well, so the node has to be stonith'd.
> > 
> > I think this is wrong behaviour.
> 
> On stop, it should return OCF_SUCCESS. I wonder if it would be
> safe for the CRM to interpret ERR_INSTALLED on stop as "resource
> stopped."
> 
> Opinions?

Florian, can you please ack/nack the attached patch?

Cheers,

Dejan

> Cheers,
> 
> Dejan
> 
> P.S. Very sorry for such a delay!
> 
> > I read the comments about
> > configurations being on shared storage which might not be available at
> > certain points in time and I see the point. But the way this is
> > implemented clearly does not work for everybody. I vote for making this
> > configurable. Unfortunately, for several reasons, I am not able to
> > contribute this patch myself at the moment.
> > 
> > Regards
> > Dominik
> > _______________________________________________________
> > Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/

["VirtualDomain.noconf-stop.patch" (text/x-patch)]

commit 2d825cd0269022434f16357eb024aa43211c74dd
Author: Dejan <dejan@suse.de>
Date:   Tue Jan 31 16:52:53 2012 +0100

    Medium: VirtualDomain: if the configuration file is missing on stop exit with success

diff --git a/heartbeat/VirtualDomain b/heartbeat/VirtualDomain
index 00916c0..dd85657 100755
--- a/heartbeat/VirtualDomain
+++ b/heartbeat/VirtualDomain
@@ -472,6 +472,8 @@ VirtualDomain_Validate_All() {
     if [ ! -r $OCF_RESKEY_config ]; then
 	if ocf_is_probe; then
 	    ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
+	elif [ "$__OCF_ACTION" = "stop" ]; then
+	    ocf_log info "Configuration file $OCF_RESKEY_config not readable, resource considered stopped."
 	else
 	    ocf_log error "Configuration file $OCF_RESKEY_config does not exist or is not readable."
 	    return $OCF_ERR_INSTALLED
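
For illustration, here is a minimal sketch of how the patched branch order
in the validation step behaves when the config file is missing. It is not
the agent itself: ocf_is_probe and ocf_log are simplified stand-ins for the
functions from ocf-shellfuncs, and validate_all, validate_rc and the
/nonexistent/vm.xml path are hypothetical names made up for this example.
It covers the validation step only, not the stop action proper.

```shell
#!/bin/sh
# Sketch only: stand-ins for what ocf-shellfuncs normally provides.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5
OCF_NOT_RUNNING=7

# Simplified stub (assumption): a monitor with interval 0 counts as a probe.
ocf_is_probe() {
    [ "$__OCF_ACTION" = "monitor" ] && [ "${OCF_RESKEY_CRM_meta_interval:-0}" = "0" ]
}

# Simplified stub (assumption): log to stderr instead of the cluster log.
ocf_log() {
    level=$1; shift
    echo "$level: $*" >&2
}

# Validation with the patched branch order: probe, then stop, then the rest.
validate_all() {
    if [ ! -r "$OCF_RESKEY_config" ]; then
        if ocf_is_probe; then
            ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
        elif [ "$__OCF_ACTION" = "stop" ]; then
            ocf_log info "Configuration file $OCF_RESKEY_config not readable, resource considered stopped."
        else
            ocf_log error "Configuration file $OCF_RESKEY_config does not exist or is not readable."
            return $OCF_ERR_INSTALLED
        fi
    fi
    return $OCF_SUCCESS
}

# Hypothetical helper: run validation for one action and print its return code.
validate_rc() {
    __OCF_ACTION=$1
    validate_all 2>/dev/null
    echo $?
}

OCF_RESKEY_config=/nonexistent/vm.xml   # hypothetical path, assumed missing
for action in monitor stop start; do
    echo "$action -> $(validate_rc "$action")"
done
```

With the config missing, the probe and the stop pass validation (0) and
only the start still fails with ERR_INSTALLED (5), so a failed start no
longer has to escalate to fencing just because the stop cannot validate.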



