
List:       linux-ha
Subject:    Re: mysitic HB log message
From:       Alan Robertson <alanr () unix ! sh>
Date:       2002-09-29 21:04:27

Shallyee Shang wrote:
> Hello, Alanr
> 
> Thanks for your reply, it's very helpful. I noticed the "info" mark,
> but the word "malloc" made me nervous. I am sorry for posting a
> question that had already been asked without searching the archives first.

No cause for sorrow.  We are happy to help.  Several other people have had 
the same concern, so it obviously needs clearer messages.  The best solution 
is one where no one needs to search the archives, because it's obvious.

>>By the way, when this system is successfully in production, 
>>please consider
>>writing a few paragraphs to put into
>>http://linux-ha.org/heartbeat/users.html so others can see how you use
>>heartbeat.
>>
> 
> We built a Linux cluster using Heartbeat+drbd, for evaluation purposes
> only at this point. We need an HA system in our production line. Currently,
> some parts of this system are based on HP's solution, which depends
> on a single server and is very expensive. We believe a cluster based on
> two cheap servers is far more available and reliable than HP's single,
> often troublesome one.
> 
> Basically, HB+drbd satisfies most of our needs, but we'd like to spend a 
> little more time running the cluster continuously and trying some advanced 
> usage, which is necessary for the real system.

The main place where people have problems with heartbeat once they have it 
up and working is that they set their deadtime too short.

I would suggest setting warntime to something like 5 seconds and deadtime to 
a value of at least 10 seconds.

Then, whenever a heartbeat comes in late (over warntime), a message will 
appear in the logs.  These messages can help you tune deadtime.  I would 
recommend a deadtime at least twice the longest late-heartbeat interval 
reported in those messages.  The optimal value depends a lot on your 
workload and your kernel version.

We are trying to eliminate these late heartbeats, but the delay appears to 
be in Linux itself, so it may take a while for us to resolve.
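
As a rough illustration of the tuning rule above, here is a minimal Python 
sketch.  It assumes you have already read the late-heartbeat intervals (in 
seconds) out of your logs by hand; the function names and the sample values 
are made up for the example and are not part of heartbeat.  It only applies 
the "at least twice the longest late interval, and never below 10 seconds" 
rule and prints warntime/deadtime lines in the style used in ha.cf.

```python
import math


def suggest_deadtime(late_intervals, min_deadtime=10):
    """Suggest a deadtime in whole seconds.

    late_intervals: late-heartbeat intervals (seconds) copied from the logs.
    min_deadtime:   floor for the result (10 seconds, as suggested above).
    """
    if not late_intervals:
        # Nothing has been late yet, so stay at the floor.
        return min_deadtime
    # At least twice the longest late interval, rounded up to whole seconds.
    return max(min_deadtime, math.ceil(2 * max(late_intervals)))


def render_settings(warntime, deadtime):
    """Format the two values as ha.cf-style lines (hypothetical helper)."""
    return f"warntime {warntime}\ndeadtime {deadtime}\n"


if __name__ == "__main__":
    observed = [6.2, 7.5, 5.1]  # made-up sample values, not real log data
    print(render_settings(5, suggest_deadtime(observed)), end="")
```

Treat the result as a starting point only; keep watching the logs after a 
change, since the right value depends on workload and kernel version.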

	-- Alan Robertson
	   alanr@unix.sh
