
List:       gfs-users
Subject:    Re: [gfs-users] ..more
From:       "Rado S" <radigyy@hotmail.com>
Date:       2002-03-27 21:36:33

Michael,

Makes sense. That solves the problem for application failover,
but it would be good if the callbacks could go to more than one
IP, and likewise the requests to memexpd or any other IP-based
lock mechanism. Otherwise there is a real chance of losing a GFS
node if it cannot reach the lock server because of a networking
issue. I assume the IP lock daemon runs on one of the GFS nodes
and has its own failover. One solution is to have the daemon
listen on the same network as the applications, so that if a
server fails, clients can still reach the backup lock server
through the IP alias.
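
To illustrate (addresses, port, and function name are made up, and
this is not the real memexp protocol): a client that simply tries a
list of lock-server addresses in order would reach the aliased backup
address automatically, roughly like this sketch:

import socket

# Hypothetical addresses: the lock server's fixed address on the GFS
# network, then the IP alias it would answer on from the application
# network after a failover.
LOCK_SERVER_ADDRS = [("10.0.0.10", 4000), ("192.168.1.10", 4000)]

def connect_lock_server(addrs, timeout=5.0):
    """Try each candidate address in turn, return the first live socket."""
    last_err = None
    for host, port in addrs:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as err:
            last_err = err  # unreachable on this address, try the next one
    raise ConnectionError("no lock server reachable") from last_err

# sock = connect_lock_server(LOCK_SERVER_ADDRS)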

Does that make sense?



>From: Michael Welcome <mlwelcome@lbl.gov>
>To: Rado S <radigyy@hotmail.com>
>CC: gfs-users@sistina.com, gfs-devel@sistina.com
>Subject: Re: [gfs-users] ..more
>Date: Wed, 27 Mar 2002 12:06:51 -0800
>
>From my reading of the code (GFS 4.2) it appears that the
>heartbeat is not done through IP, but rather each node
>increments a counter in its record in the cidev pool.
>That is, it heartbeats to disk and all nodes monitor each
>other by reading the cidev pool.  When another node notices
>that a heartbeat for node X has not occurred within the
>specified timeout period, it begins recovery operations
>and requests that X be stomithed.
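
If I read that right, the check each node runs against the others
would look roughly like this sketch (names, timeout, and the counter
source are made up; this is not the real cidev layout):

import time

STOMITH_TIMEOUT = 30.0   # assumed timeout in seconds, not the real GFS default
last_count = {}          # node -> last heartbeat counter value seen
last_change = {}         # node -> time that counter last changed

def check_heartbeats(read_cidev_counters, stomith, now=time.monotonic):
    """read_cidev_counters() is assumed to return {node: counter} as read
    from the shared cidev pool; stomith(node) starts fencing and recovery."""
    for node, count in read_cidev_counters().items():
        if last_count.get(node) != count:
            last_count[node] = count       # counter moved, node is alive
            last_change[node] = now()
        elif now() - last_change[node] > STOMITH_TIMEOUT:
            stomith(node)                  # no heartbeat within the timeout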
>
>The stomith methods do not use IP, they fence at the
>FC switch port, or power-cycle a plug on the
>network power switch, etc.  However, it appears that
>the configuration of each node is based on IP address.
>That is, you cannot specify two GFS nodes with the same
>IP address.
>
>In your case, where you want to do application failover
>by IP alias, you might be able to solve your problem
>through additional hardware.  If you have two NICs
>per node and use the application fail-over on one set of
>NICs (IP alias) and use the other NICs for GFS configuration and
>traffic, then GFS will properly stomith the failed node
>and your application server will fail over to the other
>node which is already a live GFS server.
>
>[Hey GFS developers... Is this correct?]
>
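Just to spell out the two-NIC idea (all addresses here are made up):
the GFS/lock traffic binds to the fixed address on the dedicated NIC,
while the exported service listens on the alias that moves between
nodes, for example:

import socket

GFS_NIC_ADDR = "10.0.0.11"       # made-up fixed address on the GFS-only NIC
SERVICE_ALIAS = "192.168.1.100"  # made-up floating alias on the application NIC

def listen_on(addr, port):
    """Bind to one specific address so traffic stays on the intended NIC."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((addr, port))
    s.listen(5)
    return s

# cluster/lock traffic keeps using the dedicated GFS network:
#   cluster_sock = listen_on(GFS_NIC_ADDR, 4001)
# the application service follows the alias when it fails over:
#   service_sock = listen_on(SERVICE_ALIAS, 8080)
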
>Rado S wrote:
>
>>  From my understanding, the stomith is based on the heartbeat
>>between the GFS nodes, which goes over IP networking.
>>
>>Now if one of the nodes dies and the other one takes over the IP
>>before the heartbeat recognizes that, how will GFS recognize the failure?
>>
>>Also, do you use a heartbeat between the servers, based on SCSI<->SCSI?
>>
>>Rado
>
>
>--
>Michael L. Welcome
>NERSC Future Technologies Group		 	      mlwelcome@lbl.gov
>National Energy Research Scientific Computing Center  PHONE 510-486-5224
>Lawrence Berkeley National Laboratory	              FAX   510-495-2998
>
>




// rado


_______________________________________________
gfs-users mailing list
gfs-users@sistina.com
http://lists.sistina.com/mailman/listinfo/gfs-users
Read the GFS Howto:  http://www.sistina.com/gfs/Pages/howto.html