
List:       veritas-ha
Subject:    Re: [Veritas-ha] FW: I/O Fencing non-CFS
From:       "Hudes, Dana" <hudesd () hra ! nyc ! gov>
Date:       2010-10-13 20:21:49
Message-ID: 0CC36EED613AED418A80EE6F44A659DB0DB7B1FC1A () XCH2 ! windows ! nyc ! hra ! nycnet

I had a situation once where we used direct-attached storage (JBOD) that was dual-attached to the two hosts in a cluster.
Normally, all was well: one host would go down and the other would take over, with no one trying to import the disk group on both hosts at once. An operator found a way to make it fail: the application client wasn't responding, so he blamed the server. He then went into the data center, where we had conveniently left the keys in the machines, and turned the keys for both to off. Then he turned them both back on in very quick succession (same rack, mounted one above the other at the time). Both hosts attempted to import the DG and run the application. This did not end well for the shared disk. It took another system admin and myself two days (and I mean day and night) to recover the volumes, etc.

The operator in question not only wasn't (formally) disciplined (he was just told never, ever to put his hands on a machine without specific direction from a system admin), he later that year got an 'attaboy' award from management.


________________________________
From: veritas-ha-bounces@mailman.eng.auburn.edu [mailto:veritas-ha-bounces@mailman.eng.auburn.edu] On Behalf Of John Cronin
Sent: Wednesday, October 13, 2010 4:08 PM
To: Everett Henson
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] FW: I/O Fencing non-CFS

As others have stated, it is indeed possible to forcibly import a disk group on more than one server simultaneously.
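
For anyone who has not watched it happen, a minimal sketch of the override, with "appdg" as a placeholder disk group name (exact option behavior varies a little between VxVM releases):

    vxdg import appdg       # normal import; refuses if another host holds the group
    vxdg -C import appdg    # -C clears the import locks (host ID) left by the other host
    vxdg -fC import appdg   # with -f as well, the group imports even while a live node is using it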

One other point - I/O fencing should protect you even in situations where outside servers are not using VCS or Volume Manager.

Example: A server is being decommissioned and its disks are being wiped using a destructive analyze program.  Unknown to anybody, one of the disks was also zoned to an active VCS server, which was using it (obviously a mistake, perhaps made many years before).  An application on the VCS server went down and had to be restored from backup because the data was corrupted; the resulting outage lasted several hours, and some data was lost and not recoverable.

If the VCS server had been using I/O fencing, the SCSI-3 reservation should have prevented the other server from accessing the disk.

This is based on a situation I was directly involved in (and asked to find the root cause for).
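
If you want to confirm that protection is actually in place on a live cluster, a couple of read-only checks are enough (the device path is a placeholder; sg_persist comes from the sg3_utils package on Linux, and vxfenadm ships with VCS):

    vxfenadm -d                                            # fencing mode and current cluster membership
    sg_persist --in --read-keys --device=/dev/sdc          # SCSI-3 registration keys on a shared disk
    sg_persist --in --read-reservation --device=/dev/sdc   # the active reservation, if any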

On Wed, Oct 13, 2010 at 2:44 PM, Everett Henson <EHenson@nyx.com> wrote:
Meant for this to go to the group...

-----Original Message-----
From: Everett Henson
Sent: Wednesday, October 13, 2010 2:29 PM
To: 'A Darren Dunham'
Subject: RE: [Veritas-ha] I/O Fencing non-CFS

Hmm. The consensus seems to be for Fencing in all cases where possible. I got my first taste of VCS on version 3.5, before Fencing was an option, and I hadn't had to deal with it until a few months ago.

My thanks to everyone for your replies.

-----Original Message-----
From: veritas-ha-bounces@mailman.eng.auburn.edu [mailto:veritas-ha-bounces@mailman.eng.auburn.edu] On Behalf Of A Darren Dunham
Sent: Wednesday, October 13, 2010 12:27 PM
To: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] I/O Fencing non-CFS

On Wed, Oct 13, 2010 at 11:39:30AM -0400, Everett Henson wrote:

> Thanks Gene. I understand it's possible to import the group manually,
> but the discussion here was around how VCS would behave normally,
> without manual intervention.

You're talking about a system designed to prevent problems in failure
situations.  If everything is happy and healthy, it wouldn't be needed.
Older versions of VCS didn't have fencing and things worked okay most
of the time.

> Wouldn't a vxdg -C to clear the private region give the server
> with the group already imported heartburn? Would it allow both
> servers simultaneous access to the storage?

Both servers will assume they have exclusive control over the data and
will begin writing information.  Yes, eventually the first server will
run into problems, but possibly not before corrupting the filesystem
and/or the disk group.  I did some tests like this (*years* ago) and
managed to end up with a disk group that I couldn't import.
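
A rough outline of that kind of test, strictly for a lab with disposable data (disk group and volume names are placeholders; the mount syntax shown is Solaris, use "mount -t vxfs" on Linux):

    # node1: the group is imported and an application is writing to it
    vxdg list                        # shows testdg imported here
    # node2: bypass the protection
    vxdg -fC import testdg           # clear node1's locks and force the import
    vxvol -g testdg startall         # start the volumes
    mount -F vxfs /dev/vx/dsk/testdg/vol01 /mnt
    # both nodes are now writing to the same volume; expect filesystem and
    # private-region damage much like what is described above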

> BTW, we are using two private interconnects as well as a link-lowpri already.

As long as the cluster is healthy and you don't have split-brain, it won't
try to import the group on two nodes.  The fencing is more robust in those
situations where things are unhealthy.
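
For reference, the /etc/llttab for that kind of setup (two private links plus a low-priority link on the public LAN) looks roughly like this; the node name, cluster ID, and interface names are placeholders, and the device column format differs by platform and VCS version:

    set-node node01
    set-cluster 42
    link eth1 eth1 - ether - -          # private interconnect 1
    link eth2 eth2 - ether - -          # private interconnect 2
    link-lowpri eth0 eth0 - ether - -   # low-priority heartbeat over the public network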

--
Darren
_______________________________________________
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu<mailto:Veritas-ha=
@mailman.eng.auburn.edu>
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha



_______________________________________________
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha

