
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Re: Standby complains and does nothing
From:       Alan Robertson <alanr () unix ! sh>
Date:       2002-02-21 13:38:12

Maurice Volaski wrote:
> 
> >I assume you mean 0.4.9a.
> 
> yes.
> 
> >This should be a transient condition which goes away in a few seconds.
> >
> >Basically you can't ask things to go into standby while it's figuring out
> >which machine has which resources.
> 
> I'm not sure I follow. I told the machine to go into the standby and
> it gave that message right away. There was no other error message at
> the moment I told it to go into standby.

I don't think there should be any.  When the cluster first starts up, it
takes a little while to put resources on the right machine.  Curiously
enough, it takes a little longer when both machines start simultaneously.
It logs messages about starting the resources, and then they move.  After
they've been completely started (for which there is no message), and both
sides know it, *then* you can go into standby.

> >If it lasts a few minutes, that's a problem that I didn't see in testing.
> 
> I did let it go for several minutes and nothing happened! I was
> unable to test it further at this moment because of the situation
> described in the following message...

Here's what you should do to exercise standby...

1) start the cluster (configured with nice_failback)...

2) wait for things to settle a little bit (probably a minute or two)

3) Go to the machine which has the resources, and issue the standby command
   on that machine.

It should just work.  I've run the beta version through probably a thousand
iterations without any occurrence of this (except for one initial occurrence,
because the testing software doesn't wait for things to stabilize before the
first test).

Some notes:
Both machines MUST be configured with nice_failback.

If it occurs once, try again in a few seconds.  If it still doesn't work a
minute or two after the last message from the cluster following a
transition, then something is wrong.  Please send both copies of ha.cf and
complete logs from both machines, from at least 5 minutes before the
occurrence to a few minutes after it.
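
The "try again in a few seconds" advice can be scripted.  What follows is
only a rough sketch, not part of heartbeat: the function name, the attempt
count and the delay are made up for illustration, and it assumes the standby
command reports a refusal through a non-zero exit status, which may not be
true of the real command.  The command itself is passed in by the caller.

```python
import subprocess
import time


def retry_standby(command, attempts=6, delay=10.0,
                  run=subprocess.call, sleep=time.sleep):
    """Run `command` until it exits with status 0 or attempts run out.

    `command` is the argument list for whatever standby command your
    installation provides (hypothetical here; supply your own).
    Returns the 1-based number of the attempt that succeeded, or 0 if
    every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        # ASSUMPTION: exit status 0 means the standby request was accepted.
        if run(command) == 0:
            return attempt
        # Give the cluster a few seconds to settle before asking again.
        if attempt < attempts:
            sleep(delay)
    return 0
```

With the defaults this gives up after about a minute of retries, which is
roughly the point at which the logs and ha.cf files are worth collecting.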

	-- Alan Robertson
	   alanr@unix.sh
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.community.tummy.com
http://lists.community.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/