List: veritas-ha
Subject: Re: [Veritas-ha] FW: I/O Fencing non-CFS
From: "Hudes, Dana" <hudesd () hra ! nyc ! gov>
Date: 2010-10-13 20:21:49
Message-ID: 0CC36EED613AED418A80EE6F44A659DB0DB7B1FC1A () XCH2 ! windows ! nyc ! hra ! nycnet
I had a situation once where we used direct-attached storage (JBOD) that was dual-attached to the two hosts in a cluster.
Normally, all was well: one host would go down and the other would take over, with no one trying to import the disk group on both hosts at once. An operator found a way to make it fail: the application client wasn't responding, so he blamed the server. He then went into the data center, where we had conveniently left the keys in the machines, and turned the keys for both to off. Then he turned them both back on in very quick succession (same rack, mounted one above the other at the time). Both hosts attempted to import the disk group and run the application. This did not end well for the shared disk. It took another system admin and myself two days (and I mean day and night) to recover the volumes, etc.
The operator in question not only wasn't (formally) disciplined (he was just told never, ever to put his hands on a machine without specific direction from a system admin), he later that year got an 'attaboy' award from management.
________________________________
From: veritas-ha-bounces@mailman.eng.auburn.edu [mailto:veritas-ha-bounces@mailman.eng.auburn.edu] On Behalf Of John Cronin
Sent: Wednesday, October 13, 2010 4:08 PM
To: Everett Henson
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] FW: I/O Fencing non-CFS
As others have stated, it is indeed possible to forcibly import a disk group on more than one server simultaneously.
One other point - I/O fencing should protect you even in situations where outside servers are not using VCS or Volume Manager.
Example: A server is being decommissioned and the disks are being wiped using a destructive analyze program. Unknown to anybody, one of the disks was also zoned to an active VCS server, which was using it (obviously a mistake, perhaps made many years before). An application on the VCS server goes down and has to be restored from backup because the data was corrupted; the resulting outage lasted several hours, and some data was lost and not recoverable.
If the VCS server had been using I/O fencing, the SCSI-3 reservation should have prevented the other server from accessing the disk.
This is based on a situation I was directly involved in (and asked to find the root cause for).
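As an illustrative sketch (not part of the incident above, and the device path is hypothetical), the SCSI-3 persistent reservations that fencing places on a disk can be inspected from any attached host using the standard sg3_utils package:

```shell
# Hypothetical device path; requires the sg3_utils package.
# List the keys registered on the disk by cluster nodes:
sg_persist --in --read-keys --device=/dev/sdc

# Show the active reservation (type and holding key):
sg_persist --in --read-reservation --device=/dev/sdc
```

With a Write Exclusive Registrants Only reservation held by the cluster, writes from an initiator that never registered a key (such as the decommissioning host in the example) are rejected at the disk, regardless of what software that host runs.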
On Wed, Oct 13, 2010 at 2:44 PM, Everett Henson <EHenson@nyx.com> wrote:
Meant for this to go to the group...
-----Original Message-----
From: Everett Henson
Sent: Wednesday, October 13, 2010 2:29 PM
To: 'A Darren Dunham'
Subject: RE: [Veritas-ha] I/O Fencing non-CFS
Hmm. The consensus seems to be for fencing in all cases where possible. I got my first taste of VCS on version 3.5, before fencing was an option, and I'd never had to deal with it until a few months ago.
My thanks to everyone for your replies.
-----Original Message-----
From: veritas-ha-bounces@mailman.eng.auburn.edu [mailto:veritas-ha-bounces@mailman.eng.auburn.edu] On Behalf Of A Darren Dunham
Sent: Wednesday, October 13, 2010 12:27 PM
To: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] I/O Fencing non-CFS
On Wed, Oct 13, 2010 at 11:39:30AM -0400, Everett Henson wrote:
> Thanks Gene. I understand it's possible to import the group manually,
> but the discussion here was around how VCS would behave normally,
> without manual intervention.
You're talking about a system designed to prevent problems in failure
situations. If everything is happy and healthy, it wouldn't be needed.
Older versions of VCS didn't have fencing and things worked okay most
of the time.
> Wouldn't a vxdg -C to clear the private region give the server
> with the group already imported heartburn? Would it allow both
> servers simultaneous access to the storage?
Both servers will assume they have exclusive control over the data and
will begin writing information. Yes, eventually the first server will
run into problems, but possibly not before corrupting the filesystem
and/or the disk group. I did some tests like this (*years* ago) and
managed to end up with a disk group that I couldn't import.
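A minimal sketch of the distinction (disk group name hypothetical): a normal import refuses a disk group whose private region says another host owns it, while the clear/force flags bypass that safety check.

```shell
# Normal import: fails if the hostid recorded in the dg's private
# region belongs to another (possibly still live) host.
vxdg import appdg

# The dangerous variant under discussion: -C clears the recorded
# hostid and -f forces the import past the remaining checks.
# Run against a dg that is live on the other node, this produces
# exactly the two-writer corruption scenario described above.
vxdg -Cf import appdg
```

The flags exist for the legitimate case of a stale hostid left by a crashed host; the problem is that VxVM alone cannot tell a stale owner from a live one, which is what fencing adds.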
> BTW, we are using two private interconnects as well as a link-lowpri already.
As long as the cluster is healthy and you don't have split-brain, then
it won't try to import on two. The fencing is more robust in those
situations where things are unhealthy.
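As a quick sanity check (assuming a VCS release that ships the fencing utilities; exact output varies by version), a node's fencing configuration can be confirmed with:

```shell
# Show the I/O fencing state as the fencing driver sees it;
# a mode of SCSI3 indicates disk-based fencing is active,
# and the membership list shows which nodes hold registrations.
vxfenadm -d
```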
--
Darren
_______________________________________________
Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha