List: linux-ha-dev
Subject: Re: [Linux-ha-dev] Re: Initial resource takeover problems in
From: Alan Robertson <alanr () bell-labs ! com>
Date: 1999-11-09 5:50:49
Thomas Hepper wrote:
>
> Hi,
> On Fri, Nov 05, 1999 at 11:50:09PM -0700, Alan Robertson wrote:
> > Several different people have reported problems with initial resource takeover
> > in heartbeat 0.4.5a.
> >
> > I tried to reproduce it here, and I could -- on one machine. When I recompiled
> > it from source, it seemed to go away. One of the people reporting it seemed to
> > have the same experience.
> >
> > I have added a little debug to the code, fixed a problem with logging from shell
> > scripts, and now call it 0.4.5b.
> >
> > It's now pointed to by the download page.
> >
> > Please let me know what you find. I would encourage anyone who is willing to
> > try the RPM version first.
>
> OK, I gave it a try, and no luck (Debian system). So I added some debugging
> to find the place where it fails. It seems that the command which is
> run by req_our_resources does not respond in time. I changed the fgets
> loop to retry the read more than once if the first read fails, waiting 1 second
> after every failed read, and it works .....
> I have no idea why the first read fails ...; errno is set to 4.
Errno 4 is EINTR. This process has an alarm running, and whether the read
succeeds or fails probably depends on where in the alarm cycle it occurs. A good
bit of my testing isn't on a real cluster, so I never hear from any other
machines. That keeps my test cases synchronized to the alarm code, so I almost
always have a full second before the SIGALRM goes off.
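For illustration, a retry loop along the lines Thomas describes might look
roughly like the sketch below. This is not the heartbeat source: fgets_retry is
a made-up helper name, and unlike Thomas's change it retries immediately and
without a retry limit instead of sleeping 1 second after each failed read. It
only assumes that an interrupted read shows up as fgets() returning NULL with
the stream error flag set and errno equal to EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from heartbeat): read one line with fgets(),
 * retrying when the underlying read was interrupted by a signal such as
 * SIGALRM. Returns buf on success, NULL on end-of-file or a real error.
 *
 * Caveat: if the signal arrives after part of a line has been read,
 * some stdio implementations may discard that partial data.
 */
char *fgets_retry(char *buf, int size, FILE *fp)
{
    assert(buf != NULL && size > 0 && fp != NULL);

    for (;;) {
        char *line;
        int saved_errno;

        errno = 0;
        line = fgets(buf, size, fp);
        saved_errno = errno;        /* capture before any other call */

        if (line != NULL)
            return line;            /* got a line */

        if (ferror(fp) && saved_errno == EINTR) {
            clearerr(fp);           /* reset the error flag, then retry */
            continue;
        }

        return NULL;                /* end-of-file or a non-EINTR error */
    }
}
```

A caller would use fgets_retry(buf, sizeof(buf), fp) wherever it currently
calls fgets() on the command's output. Another option, if it suits the rest of
the code, is to install the SIGALRM handler through sigaction() with SA_RESTART
so that interrupted reads are restarted automatically.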
This sounds like a great find!
Thanks Thomas!
-- Alan Robertson
alanr@bell-labs.com
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.tummy.com
http://lists.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/