List:       linux-ha
Subject:    Re: Releasing IP on slave when master comes back online...?
From:       Alan Robertson <alanr () suse ! com>
Date:       2000-04-30 2:42:33

Turbo Fredriksson wrote:
> 
> > Turbo Fredriksson wrote:
> > > 
> > > file haresources on slave:
> > > ---------------------------
> > > slave xxx.xxx.xxx.12/26
> > > master xxx.xxx.xxx.11/26
> > > 
> > > file haresources on master:
> > > ---------------------------
> > > master xxx.xxx.xxx.11/26
> > 
> > The haresource files must be identical between the two machines.
> 
> I tried that, but when I took down the slave, master took that address,
> and I DEFINITELY don't want that!

If I understand what you want, you just want the "master" line in each
haresources file.  You MUST spell the lines the same on both machines.
From both a conceptual and a practical point of view, the two files
need to be identical.
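
One way to check that the two haresources files really match is to copy
one next to the other and compare them.  The following is only a minimal
sketch: the helper names are made up for illustration, and treating "#"
as a comment marker and ignoring blank lines are assumptions, not
something heartbeat itself is known to do when it reads the file.

```python
def normalized_lines(text):
    """Return the meaningful lines of a haresources-style file.

    Assumption: '#' starts a comment and blank lines do not matter.
    Runs of whitespace are collapsed so spacing differences are ignored.
    """
    result = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            result.append(" ".join(line.split()))
    return result


def same_resources(text_a, text_b):
    """True if both files list the same resource lines in the same order."""
    return normalized_lines(text_a) == normalized_lines(text_b)


# The two files as posted earlier in this thread.
master_copy = "master xxx.xxx.xxx.11/26\n"
slave_copy = "slave xxx.xxx.xxx.12/26\nmaster xxx.xxx.xxx.11/26\n"

print(same_resources(master_copy, slave_copy))   # False
print(same_resources(master_copy, master_copy))  # True
```

With the files as posted the check fails; with only the "master" line on
both machines it passes.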

> If I on slave specify the master resource first, slave takes that IP
> when starting up, and vice versa for the master. If I specify 'slave
> and then master' on the slave node and 'master then slave' on the
> master node, it works as I intended EXCEPT that master takes over the
> slave IP when that goes off-line. The slave is intended as a failsafe
> machine just in case. Nothing should run on that if it's in 'standby'
> mode. It shouldn't even be accessible, ie totally invisible!

Heartbeat won't do that.  Also, it's probably not what you want to do
either.  If the slave has been broken for a month, you won't have any
way to know until it fails to take over the address.  And then, you
won't have any way to telnet into it to see what's wrong.  The way this
is designed to work is with at least three IP addresses.  Only one needs
to be public or routable, but the other two have to exist.  All
heartbeat does is add aliases to existing, configured ethernet
interfaces.  It won't configure the base interfaces.
 
> > If  you want  a non-moveable  address, simply  don't  tell heartbeat
> > about it.  It can only add  and delete aliases, it won't configure a
> > "base" interface.
> 
> That's what I did. I didn't tell master about slaves address (the non-
> movable address).
> 
> As I said, 'slave' should take 'master's address, but NOT the other way
> around...
> 
> > > To release the IP, I have to restart heartbeat... Is that
> > > intentional? Or did I miss something?
> > 
> > No.  This isn't intentional.  What the code does is when "master" comes
> > back online, it requests its resources, and the slave gives them up.  I
> > see no evidence of this in the pieces of the logs you sent.  What
> > version are you running?
> 
> Version: 0.4.6
> 
> The 'info: node master: status up', says that the slave is detecting that
> the master is up again, but it doesn't release the IP... Heartbeat has to
> be restarted on the slave, for it to do that...
> 
> > You've only sent the slave logs.  What do the master logs look like?
> 
> Log on the slave:
> slave:~# tail -n0 -f /var/log/ha-log
> ---- taking master off-line ----
> heartbeat: 2000/04/29_15:09:10 warn: node master: is dead
> heartbeat: 2000/04/29_15:09:10 INFO: Running /etc/ha.d/rc.d/status status
> heartbeat: 2000/04/29_15:09:10 Taking over resource group xxx.xxx.xxx.11/26
> heartbeat: 2000/04/29_15:09:11 Acquiring resource group: master xxx.xxx.xxx.11/26
> heartbeat: 2000/04/29_15:09:11 INFO: Running /etc/ha.d/resource.d/IPaddr \
>                 xxx.xxx.xxx.11/26 start
> heartbeat: 2000/04/29_15:09:11 INFO: ifconfig eth0:0 xxx.xxx.xxx.11 netmask \
>                 255.255.255.192      broadcast xxx.xxx.xxx.63
> heartbeat: 2000/04/29_15:09:11 Sending Gratuitous Arp for xxx.xxx.xxx.11 on eth0:0 \
>                 [eth0]
> ---- taking master online again ----
> heartbeat: 2000/04/29_15:10:14 notice: node master seq restart 2 vs 22241
> heartbeat: 2000/04/29_15:10:14 info: node master: status up
> heartbeat: 2000/04/29_15:10:14 INFO: Running /etc/ha.d/rc.d/status status
> 
> Log on the master:
> master:~# tail -n0 -f /var/log/ha-log
> ---- taking master off-line ----
> heartbeat: 2000/04/29_21:09:21 info: Heartbeat shutdown in progress.
> heartbeat: 2000/04/29_21:09:21 info: Giving up all HA resources.
> heartbeat: 2000/04/29_21:09:21 Releasing resource group: master xxx.xxx.xxx.11/26
> heartbeat: 2000/04/29_21:09:21 INFO: Running /etc/ha.d/resource.d/IPaddr \
>                 xxx.xxx.xxx.11/26 stop
> heartbeat: 2000/04/29_21:09:21 info: All HA resources relinquished.
> heartbeat: 2000/04/29_21:09:21 info: Heartbeat shutdown complete.
> ----- taking master online again ----
> heartbeat: 2000/04/29_21:10:27 info: ***********************
> heartbeat: 2000/04/29_21:10:27 info: Configuration validated. Starting heartbeat.
> heartbeat: 2000/04/29_21:10:28 notice: Starting serial heartbeat on tty /dev/ttyS0
> heartbeat: 2000/04/29_21:10:28 notice: Using watchdog device: /dev/watchdog
> heartbeat: 2000/04/29_21:10:28 error: Cannot open /proc/ha/.control: No such file \
>                 or directory
> heartbeat: 2000/04/29_21:10:28 info: Requesting our resources.
> heartbeat: 2000/04/29_21:10:28 INFO: Running /etc/ha.d/resource.d/IPaddr \
> xxx.xxx.xxx.11/26 status

Are the clocks on the two machines off from each other by more than 6
hours?  If not, these logs don't go together.  Why don't you try this
with the configuration advice given above, and the current version,
0.4.7?
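
The clock question can be checked directly from the timestamps in the
two logs.  This is a small sketch, assuming the "YYYY/MM/DD_HH:MM:SS"
stamp format seen in the ha-log lines above; the function name is made
up for illustration.  It compares the slave's "node master: is dead"
line with the master's "Heartbeat shutdown in progress" line, which are
related events rather than the very same instant.

```python
from datetime import datetime

# Timestamp layout as it appears in the quoted ha-log lines.
LOG_TIME_FORMAT = "%Y/%m/%d_%H:%M:%S"


def log_skew(earlier, later):
    """Return the time difference between two ha-log timestamps."""
    start = datetime.strptime(earlier, LOG_TIME_FORMAT)
    end = datetime.strptime(later, LOG_TIME_FORMAT)
    return end - start


# Slave: "warn: node master: is dead"
# Master: "info: Heartbeat shutdown in progress."
skew = log_skew("2000/04/29_15:09:10", "2000/04/29_21:09:21")
print(skew)  # 6:00:11
```

A difference of a few seconds would be consistent with normal failure
detection; six hours means either the clocks disagree or the two
excerpts come from different runs.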

My best guess is that you changed the haresources file in the middle of
the run.  Since you didn't supply the whole log, I can't tell.

Either that, or you configured this address:
	master xxx.xxx.xxx.11/26
into the system outside heartbeat before you started heartbeat, which
will cause no end of troubles.  You're the second person to have that
trouble in the last couple of days, so I wonder if the documentation
isn't very clear on this matter.

It is important that the .11 address be managed only by heartbeat, and
not by your distribution's network bringup scripts.  You need three IP
addresses: one base address on each machine's adapter, both managed by
the system bringup scripts, and a distinct third one that only heartbeat
manages.
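
The three-address rule can be written down as a quick sanity check.
This is a sketch under stated assumptions: the real addresses are masked
in this thread, so the values below are stand-ins from the 192.0.2.0/24
documentation range, the master's base address (.10) is purely
hypothetical, and the function name is invented for the example.

```python
import ipaddress

# Stand-in addresses (assumptions, not the poster's real network).
base_addrs = {"master": "192.0.2.10", "slave": "192.0.2.12"}
service = ipaddress.ip_interface("192.0.2.11/26")  # heartbeat-managed


def conflicting_nodes(base_addrs, service):
    """Return nodes whose base address is the heartbeat-managed address."""
    return [
        node
        for node, addr in base_addrs.items()
        if ipaddress.ip_address(addr) == service.ip
    ]


print(conflicting_nodes(base_addrs, service))  # []

# Netmask and broadcast for the /26, matching the ifconfig log line.
print(service.network.netmask)            # 255.255.255.192
print(service.network.broadcast_address)  # 192.0.2.63
```

An empty list means neither machine's bringup scripts claim the address
that heartbeat is supposed to move around.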

	-- Alan Robertson
	   alanr@suse.com

------------------------------------------------------------------------------
Linux HA Web Site:
  http://linux-ha.org/
Linux HA HOWTO:
  http://metalab.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html
------------------------------------------------------------------------------