List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] RFC: "status" operation in stonith modules needs
From:       Alan Robertson <alanr () unix ! sh>
Date:       2002-01-17 16:38:58

Lars Marowsky-Bree wrote:
> 
> Good morning,
> 
> the issue:
> 
> - The status operation often connects to the actual hardware and checks
>   whether it is responding and has the node configured.
> 
>   This is nice; however, many kinds of STONITH hardware support only a
>   single session. As long as one node is running the status operation, all
>   connection attempts from other nodes fail (or the first connected node gets
>   disconnected).
> 
> This means that - for example - FailSafe, which queries the status information
> to permanently monitor whether it can talk to the reset devices, will receive
> frequent errors.
> 
> For now, I think I will fix it by disabling the monitoring of this part in
> FailSafe.
> 
> However I would suggest that the status operation should detect these
> conflicts and not return a hard error but only a soft error, or alternatively,
> not do a monitoring operation which can fail this easily.


For the Baytech, I put a retry layer on top because of things which seem
like bugs to me.
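
A retry layer of this kind could look roughly like the Python sketch below.
It is not the Baytech module's code. The names Status, DeviceBusy,
status_with_retry and probe are all hypothetical, and it assumes the device
driver can tell "session held by another node" apart from other failures.
A busy device is retried and, if it stays busy, reported as a soft error
rather than a hard one.

```python
import time
from enum import Enum


class Status(Enum):
    OK = "ok"
    BUSY = "busy"      # soft error: device answered, but its session is taken
    FAILED = "failed"  # hard error: device did not respond properly


class DeviceBusy(Exception):
    """Hypothetical: raised when the device refuses a second session."""


def status_with_retry(probe, attempts=3, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries.

    probe is a hypothetical callable that returns normally on success,
    raises DeviceBusy when another node holds the session, and raises
    OSError on any other failure.
    """
    for attempt in range(attempts):
        try:
            probe()
            return Status.OK
        except DeviceBusy:
            # Wait before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay)
        except OSError:
            # Anything other than a session conflict is not retried.
            return Status.FAILED
    # Every attempt hit a session conflict: report a soft error.
    return Status.BUSY
```

A monitor such as the one described above could then treat Status.BUSY as
"try again later" and only raise an alarm on Status.FAILED.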

On the other hand, it is the job of the Stonith software to reflect the
capabilities of the hardware.  If the hardware is broken, the software may
well report it as broken.


	-- Alan Robertson
	   alanr@unix.sh
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.community.tummy.com
http://lists.community.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/