List: veritas-ha
Subject: RE: [Veritas-ha] system state FAULTED
From: Bhavin Thaker <bhavin () veritas ! com>
Date: 2005-04-27 16:04:27
Message-ID: Pine.GSO.4.62.0504270847160.7687 () suraj ! engba ! veritas ! com
And ... to avoid this scenario, on Solaris,
instead of using the "reboot" command,
you could use the "shutdown -i6" command
to do a graceful shutdown.
The "shutdown -i6" command invokes the
VCS shutdown script, /etc/rc0.d/K*vcs,
which runs the required steps before
rebooting or shutting down the machine;
the "reboot" command does not.
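
A minimal sketch of the graceful path on Solaris, assuming the standard shutdown options -y (no confirmation prompt) and -g0 (no grace period). The graceful_reboot wrapper and the DRY_RUN variable are hypothetical names used only for illustration; by default the function just prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: reboot through init state 6 so the rc kill scripts
# (including the VCS one under /etc/rc0.d) get a chance to run first.
# DRY_RUN defaults to 1, so nothing is executed unless you opt in.
graceful_reboot() {
    cmd="shutdown -y -g0 -i6"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $cmd"
    else
        $cmd
    fi
}
```

Calling graceful_reboot prints the command; running DRY_RUN=0 graceful_reboot as root would actually reboot the node.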
... Bhavin Thaker.
On Tue, 26 Apr 2005, Jim Senicka wrote:
> The sequence is as follows:
>
> 1. Issue reboot.
> 2. HAD exits.
> 3. Other nodes see GAB up, HAD down for a node: port A open, port H
> closed.
> 4. All nodes place the odd node in "special jeopardy", which means we
> know it is alive, but since HAD is down, we have no idea what the node
> is doing.
> 5. This shifts the departing node to FAULTED.
> 6. Since we do not know the state of the applications on the node,
> we autodisable any group that could run on that node.
> 7. GAB exits, closing port A. This means the node is truly dead.
> 8. Node is gone, so autodisabled is cleared.
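
The sequence above can be read as a small decision table keyed on the two GAB ports. The following is only a toy model of that description, not VCS code; the peer_view function name and the "open"/"closed" argument values are assumptions made for illustration.

```shell
#!/bin/sh
# Toy model only (not VCS code): how the surviving nodes might classify a
# peer from the state of its GAB port (a) and HAD port (h), per the steps
# quoted above. peer_view is a hypothetical helper name.
peer_view() {
    port_a=$1   # "open" or "closed"
    port_h=$2   # "open" or "closed"
    case "$port_a:$port_h" in
        open:open)   echo "RUNNING" ;;
        open:closed) echo "FAULTED, groups autodisabled" ;;
        closed:*)    echo "DOWN, autodisabled cleared" ;;
        *)           echo "UNKNOWN" ;;
    esac
}

peer_view open open      # before the reboot
peer_view open closed    # steps 2-6: HAD has exited, GAB still up
peer_view closed closed  # steps 7-8: GAB has exited
```

The point of the middle case is that the peers know the node is alive but cannot tell what its applications are doing, which is why groups are autodisabled only during that window.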
>
>
> _____
>
> From: veritas-ha-admin@mailman.eng.auburn.edu
> [mailto:veritas-ha-admin@mailman.eng.auburn.edu] On Behalf Of Rajeev
> Verma
> Sent: Tuesday, April 26, 2005 10:53 PM
> To: Symantec Veritas
> Cc: veritas-ha@mailman.eng.auburn.edu
> Subject: RE: [Veritas-ha] system state FAULTED
>
>
> It looks like you didn't stop the cluster (hastop -all) before doing the
> reboot.
> So "had" on the node(s) not yet rebooted will report the
> rebooting nodes as FAULTED.
>
> thanks,
> -Rajeev Verma.
>
>
> _____
>
> From: veritas-ha-admin@mailman.eng.auburn.edu
> [mailto:veritas-ha-admin@mailman.eng.auburn.edu] On Behalf Of Symantec
> Veritas
> Sent: Tuesday, April 26, 2005 7:06 PM
> To: veritas-ha@mailman.eng.auburn.edu
> Subject: [Veritas-ha] system state FAULTED
>
>
> Hi,
>
> I am running VCS 3.5 on Solaris 8 on a 4-node cluster. All the
> nodes were rebooted as part of a scheduled job, during which I got a
> number of messages in the engine log that I am trying to understand.
> The sequence of events in the log is:
>
> 1) All nodes showed a jeopardy state for a moment after boot. Why? Was
> one of the links down for a moment just after booting?
> 2) After that, the log says the system changed state from RUNNING to
> FAULTED. I have never seen a system go to the FAULTED state. Why did
> it?
> 3) After this, the log shows that service groups became autodisabled on
> all these nodes.
> 4) After this: System (hostname) is in Down State - Membership: 0x4a
> 5) VCS:10451:Cleared attribute-'autodisabled' for Group on node. Does
> the autodisabled flag get cleared on its own? Many times I have had to
> clear the autodisabled flag for a service group manually.
>
> Does anybody know about these messages and what could be the reason
> behind them, specifically the system going to the FAULTED state and
> service groups getting autodisabled?
>
> Thanks
>
>
>