
List:       linux-ha
Subject:    Re: [Linux-HA] Heartbeat 1.2.0 random segfaults on node restart
From:       Umberto Nicoletti <unicoletti () prometeo ! it>
Date:       2004-03-25 16:47:36
Message-ID: 1080233256.4814.8.camel () friedrich

I ran heartbeat 1.2.0 for almost all day yesterday and did not get any
more segfaults. While it was running I tried to put as much pressure on
the machines as possible by running heavy rsync and CPU-intensive md5
computations, but I could not reproduce the crash.

I have now switched back to 1.0.3 because I am in a hurry and must deliver
the cluster. 1.0.3 has proved to be stable in similar environments.
BTW: I have two firewalls with VPN on Compaq Proliants (also SuSE
Linux 8.0) that have been running heartbeat 1.2.0 for almost two weeks
without any issue.

If I have news I will report to this list.
For now many thanks for your support,
Umberto


On Thu, 2004-03-25 at 17:31, Alan Robertson wrote:
> Umberto Nicoletti wrote:
> > Hi Alan,
> > thanks for the prompt reply and sorry for cross-posting with Mike, but I
> > thought that maybe the two issues were related, since they both dealt
> > with tty errors and I hoped that it was only a problem with version
> > 1.2.0, something that can be quickly fixed with a downgrade.
> 
> No one else has reported this problem before...
> 
> 
> >>Is it possible you have other processes trying to use these lines for 
> >>something?
> > 
> > 
> > I checked that already and I am sure heartbeat is the only process
> > accessing the serial line:
> > 
> > what I did to check this was:
> > robin:/var/log # fuser -v /dev/ttyS0
> >  
> >                      USER        PID ACCESS COMMAND
> > /dev/ttyS0           root        736 f....  heartbeat
> >                      root        839 f....  heartbeat
> >                      root        840 f....  heartbeat
> >                      root        841 f....  heartbeat
> >                      root        842 f....  heartbeat
> >                      root        843 f....  heartbeat
> > 
> > checked that no gpm or getty was running.
> > 
> > I tried to cat /dev/ttyS0 for some time, but the only output I got was
> > related to heartbeat.
> 
> This is a perfect example of interfering with heartbeat ;-).  If you read 
> these characters, then heartbeat can't.
> 
> > It should not be a cable problem, as the cluster ran heartbeat 0.4.9 for
> > almost a month and never had problems.
> 
> Your problem is definitely not cables.  It's a bug.  But, I'm afraid I'll 
> need a core file to diagnose this...  The hbread processes are very 
> simple...  :-(
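
The quoted exchange turns on one point: listing who holds /dev/ttyS0 open
(as fuser does) is harmless, while reading from it (as cat does) takes bytes
away from heartbeat. A minimal sketch of the harmless kind of check follows,
assuming a Linux /proc filesystem. The function name pids_holding and its
proc_root parameter are made up for illustration and are not part of
heartbeat or fuser; the code only inspects file descriptor links and never
opens the device.

```python
import os


def pids_holding(path, proc_root="/proc"):
    """Return the sorted PIDs that have `path` open.

    Hypothetical helper: it walks <proc_root>/<pid>/fd and compares each
    symlink target with the resolved path. The device is never opened or
    read, so the check does not steal data from its owner.
    """
    target = os.path.realpath(path)
    holders = set()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc available (assumption: Linux only)
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.add(int(entry))
                    break
            except OSError:
                continue  # descriptor closed while scanning
    return sorted(holders)
```

Run as root, pids_holding("/dev/ttyS0") should list roughly the same
processes that fuser -v reports; without root it only sees processes whose
descriptor directories are readable by the calling user.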

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha