List: linux-ha-dev
Subject: Re: [Linux-ha-dev] heartbeat startup on master but not on slave does bad things
From: "Luis Claudio R. Goncalves" <lclaudio () conectiva ! com ! br>
Date: 2003-08-25 19:50:30
Hi!
This is a dirty, tricky, brown-paper-bag-worthy patch. Despite its
evilness, it seems to work fine...
I know this is surely not the best way to solve the problem, but the patch
is incredibly simple and may lead to a different solution, easier than the
first one I thought about.
I'd recommend not using this patch in a production environment. But if the
problem described in this thread is bothering you, try it out. The patch
applies cleanly against heartbeat-1.0.3.
Remember, don't blame me! This patch is just a proof of concept. :)
[]'s
Luis
On Fri, Aug 22, 2003 at 06:27:02PM +0200, Lars Marowsky-Bree wrote:
| On 2003-08-15T15:49:45,
| "Luis Claudio R. Goncalves" <lclaudio@conectiva.com.br> said:
|
| > I believe it wouldn't break the other cases where takeover_from_node()
| > is called. Any ideas?
|
| Hi Luis and Alan,
|
| I have tried understanding the code in hb_resource.c and its
| dependencies within heartbeat.c, but process_resources() alone is 200
| lines for a single function, and I can't seem to understand the state
| machine within.
|
| As you both have far more experience with this code than I do,
| could you please look at this bug? I think it is kind of important.
|
| It would probably take me more than one day to understand; if you don't
| find the time, please drop me a mail and I'll get started, but I'd like
| to avoid that ;-) (And Alan, I _will_ call you and ask tons of stupid
| questions then! ;-)
|
|
| Sincerely,
| Lars Marowsky-Brée <lmb@suse.de>
|
| --
| High Availability & Clustering         ever tried. ever failed. no matter.
| SuSE Labs                              try again. fail again. fail better.
| Research & Development, SuSE Linux AG      -- Samuel Beckett
|
---end quoted text---
--
[ Luis Claudio R. Goncalves lclaudio@conectiva.com.br ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
[ Msc has come!!!! - Conectiva HA Team - Gospel User - Linuxer - !Java ]
[ Fault Tolerance - Real-Time - Distributed Systems - IECLB - IS 40:31 ]
[ LateNite Programmer -- My Utmost for His Highest -- ]
["heartbeat-1.0.3-hb_resources.patch" (text/plain)]
--- heartbeat-1.0.3/heartbeat/hb_resource.c	2003-06-16 02:51:14.000000000 -0300
+++ /tmp/hb_resource-new.c	2003-08-25 16:39:14.000000000 -0300
@@ -453,8 +453,12 @@
 	if (!nice_failback) {
 		/* Original ("normal") starting behavior */
 		if (!WeAreRestarting && !resources_requested_yet) {
-			resources_requested_yet=1;
-			req_our_resources(FALSE);
+			if (!takeover_in_progress) {
+				resources_requested_yet=1;
+				req_our_resources(FALSE);
+			} else {
+				takeover_in_progress = 0;
+			}
 		}
 		return;
 	}
@@ -866,7 +870,6 @@
 		other_holds_resources = HB_NO_RSC;
 		other_is_stable = 1;	/* Not going anywhere */
-		takeover_in_progress = 1;
 		if (ANYDEBUG) {
 			ha_log(LOG_DEBUG
 			,	"takeover_from_node: other now stable");
@@ -880,6 +883,7 @@
 		/* case 1 - part 1 */
 		/* part 2 is done by the mach_down script... */
 	}
+	takeover_in_progress = 1;
 	req_our_resources(TRUE);
 	/* req_our_resources turns on the HB_LOCAL_RSC bit */
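
The flag handshake that the patch sets up can be modelled on its own, outside
heartbeat. What follows is only a sketch under assumptions: the variable names
and req_our_resources() are taken from the diff, but the function bodies are
stubs, and startup_request_step(), takeover_step(), request_count and
run_scenario() are hypothetical names that do not exist in hb_resource.c. It
shows that a takeover which has just requested the resources makes the next
startup pass skip its own request and clear the flag instead.

```c
#include <assert.h>

/* Globals named as in the diff; values here are assumptions for the model. */
static int nice_failback;
static int WeAreRestarting;
static int resources_requested_yet;
static int takeover_in_progress;

/* Hypothetical counter: how many times resources were requested. */
static int request_count;

/* Stub standing in for the real req_our_resources(); it only counts calls. */
static void req_our_resources(int getthemanyway)
{
	(void)getthemanyway;
	request_count++;
}

/* Hypothetical wrapper around the branch changed by the first hunk. */
static void startup_request_step(void)
{
	if (!nice_failback) {
		if (!WeAreRestarting && !resources_requested_yet) {
			if (!takeover_in_progress) {
				resources_requested_yet = 1;
				req_our_resources(0);
			} else {
				/* A takeover already asked; just clear the flag. */
				takeover_in_progress = 0;
			}
		}
		return;
	}
}

/* Hypothetical wrapper around the lines changed by the third hunk. */
static void takeover_step(void)
{
	takeover_in_progress = 1;
	req_our_resources(1);
}

/* Runs takeover followed by three startup passes; returns the request count. */
static int run_scenario(void)
{
	nice_failback = WeAreRestarting = 0;
	resources_requested_yet = takeover_in_progress = 0;
	request_count = 0;

	takeover_step();         /* takeover requests the resources */
	startup_request_step();  /* suppressed: flag is set, so it is cleared */
	assert(request_count == 1 && takeover_in_progress == 0);

	startup_request_step();  /* flag is clear now, so this pass requests */
	startup_request_step();  /* resources_requested_yet blocks further requests */
	return request_count;
}
```

In this model only the first startup pass after a takeover is suppressed; since
the else branch leaves resources_requested_yet untouched, a later pass still
issues the normal request. Whether the real code behaves the same way depends
on what req_our_resources(TRUE) does internally, which the diff does not show.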
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/