
List:       linux-ha-dev
Subject:    RE:  [Linux-ha-dev] Looking for help with apphbd bug in current C
From:       "Li, Adam" <adam.li () intel ! com>
Date:       2002-12-06 6:49:03

Hi Alan,

I ran apphbd on SuSE 8.1 (kernel 2.4.19-4GB). It hangs the machine almost
100% of the time.

Procedure:
1. Start apphbd in one text console;
2. Start apphbtest in another text console;
3. When apphbtest prints "client registering", the system hangs and no
longer responds to any keystroke.

I am using glib-1.2.10.

Interestingly, if I change apphbd.c slightly and set

int debug = 3;

and run it again, the system does not hang.
Also, if I use gdb to step through the execution (with debug=0), the
system does not hang either.

So I think the bug is timing-related. When debug is set to 3, the check
if(debug >= DBGDETAIL) in functions like apphb_dispatch is true and
something gets logged, so execution slows down.
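
A minimal sketch of this hypothesis in C (not the actual apphbd source).
The names debug, DBGDETAIL and apphb_dispatch come from apphbd.c; the value
of DBGDETAIL, the function dispatch_step and the log destination are
assumptions made for illustration. The idea is that a log call is I/O that
may block, so with debug = 3 each pass may briefly give up the CPU, while
with debug = 0 this path never does.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed value: the real DBGDETAIL is defined in the apphbd sources. */
#define DBGDETAIL 3

/*
 * Hypothetical stand-in for one pass through a dispatch function such as
 * apphb_dispatch. Returns 1 if the pass took the logging path, 0 otherwise.
 */
int dispatch_step(int debug, FILE *log)
{
    assert(log != NULL);
    if (debug >= DBGDETAIL) {
        /* Logging is I/O: the write may block and give up the CPU. */
        fprintf(log, "dispatch: debug=%d\n", debug);
        fflush(log);
        return 1;
    }
    /* No logging: this path does no I/O, so it never yields here. */
    return 0;
}
```

If the hang really depends on this, replacing the log call with a short
sleep at debug = 0 might also make it disappear, which would point at
scheduling rather than at the logging code itself.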

Can anyone suggest how to debug this?

- adam

-----Original Message-----
From: Alan Robertson [mailto:alanr@unix.sh]
Sent: Thursday, November 14, 2002 7:45 AM
To: Linux-HA Development List
Subject: [Linux-ha-dev] Looking for help with apphbd bug in current CVS


Hi,

There is a bug in apphbd in the current CVS version which is holding up the 
next beta (and the next stable release).

The symptoms are:
	- it hangs the whole machine - seemingly requiring a reboot

Since apphbd is a soft realtime process, this is certainly possible if it 
goes into an infinite loop.  I've had this happen with heartbeat before. 
But this code is a *very* simple client-server model, and the way it behaves
is very strange...

And it doesn't happen under all circumstances...


A) If you start it up normally from a shell with normal priority - it hangs 
the whole system.

B) If you start it up normally from a shell with normal priority, but you 
use the -l flag to keep the daemon process from running at high priority, it
works fine - and *no infinite loop is apparent*.

C) If you start the client up from a shell which has a higher realtime 
priority than the daemon - it doesn't hang.  Again, no infinite loop is 
apparent.

D) If you start a priority-higher-than-the-daemon shell up, and spawn the 
client from a normal priority shell - it hangs the whole system and the 
higher-priority shell never runs again(!).  I am doing this from one of the 
text console windows to keep X (running at normal priority) from being the 
problem.  As I understand it, this should work.  This makes it sound a bit 
like a kernel bug.

	here's the exact procedure I followed:

		start high-priority shell in alt-f2 console window
		start normal-priority client in alt-f3 console window:
			sleep 30; /usr/lib/heartbeat/apphbtest
		switch back to high-priority alt-f2 window
			{~30 seconds later system hangs}

I'm running the SuSE 2.4.18-4GB kernel.

This is all very odd.  I expect to see some kind of infinite loop manifest 
itself in B) or C).
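
For readers unfamiliar with the priority setup behind cases A) to D), here
is a rough sketch in C of how a daemon can request realtime scheduling and
how a flag like -l could skip that step. This is not the apphbd code: the
helpers choose_policy and apply_policy are hypothetical, and only the POSIX
calls sched_setscheduler and sched_get_priority_min are real. Under
SCHED_FIFO, a process that never blocks is not preempted by normal-priority
(SCHED_OTHER) processes, which is why a busy loop at realtime priority can
appear to hang a single-CPU machine.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a -l style "low priority" flag to a policy. */
int choose_policy(int low_priority)
{
    return low_priority ? SCHED_OTHER : SCHED_FIFO;
}

/*
 * Hypothetical helper: switch the calling process to the given policy.
 * Returns 0 on success, -1 on failure (SCHED_FIFO normally needs root).
 */
int apply_policy(int policy)
{
    struct sched_param sp;

    assert(policy == SCHED_OTHER || policy == SCHED_FIFO);
    memset(&sp, 0, sizeof(sp));
    if (policy == SCHED_FIFO) {
        /* Lowest realtime priority; still above every SCHED_OTHER task. */
        sp.sched_priority = sched_get_priority_min(SCHED_FIFO);
    }
    /* SCHED_OTHER requires sched_priority == 0, which memset provides. */
    if (sched_setscheduler(0, policy, &sp) < 0) {
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

A caller would do something like apply_policy(choose_policy(lflag)) early
in startup, where lflag is set when -l is given on the command line.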

	-- Alan Robertson
	   alanr@unix.sh

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.community.tummy.com
http://lists.community.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/