List: linux-ha
Subject: Re: [Linux-HA] unexpected late heartbeats/failovers with 1.2.0
From: Dave Holland <dh3 () sanger ! ac ! uk>
Date: 2004-03-25 14:12:30
Message-ID: 20040325141230.GH14924 () sanger ! ac ! uk
On Thu, Mar 25, 2004 at 02:18:26PM +0100, Lars Marowsky-Bree wrote:
> Interesting. Is there any pattern at all to it?
Not that I can see. Under current load (20 requests/second) it looks
like this: (vmstat 5)
procs -----------memory----------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  0   8228   6592 592300 2582436    0    0     1     7    6     5  0  2 98  0
 0  0   8228   6768 592292 2582324    0    0     0   141  362   175  0  0 99  0
 0  0   8228   6600 592320 2582464    0    0     7   131  463   236  0  1 99  0
 0  0   8228   6724 592332 2582336    0    0     3   212  581   296  0  1 99  0
 0  0   8228   6544 592360 2582480    0    0    35   214  544   265  1  1 98  0
 0  0   8228   6624 592344 2582412    0    0     0   205  492   248  0  1 99  0
(Yes, the machines are somewhat overspecified for the job.)
> Maybe you can make it more apparent if you reduce the heartbeat interval
> to 200ms, warntime 600ms, deadtime 10s. That would help us with
> debugging, but if it's already production, you may not want to do that
> ;-)
It's in production. :-} I might try that if I don't see another
unexpected hiccup by Monday.
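For reference, a rough sketch of how those debugging timings might look in
ha.cf. This is an illustration only, not the configuration in use here. It
assumes the usual keepalive/warntime/deadtime directives and that this
Heartbeat version accepts the "ms" suffix, as the suggested values imply.

```
# Sketch only: debugging timings from the suggestion above (not production values)
keepalive 200ms   # heartbeat interval
warntime 600ms    # log a "late heartbeat" warning after this long
deadtime 10       # declare the peer dead after 10 seconds (bare numbers are seconds)
```

With warntime at three times the keepalive interval, a late heartbeat would
be logged well before deadtime is reached and a failover is triggered.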
> OTOH, if it's related to your NICs acting up (which is possible), maybe
> adding a serial heartbeat or a second network interface (I figure you
> could at least also heartbeat over eth0 in mcast / ucast mode?) would
> help.
I've added a serial heartbeat connection.
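A serial heartbeat link is normally declared in ha.cf along these lines. The
device name and baud rate below are assumptions for illustration, not the
values actually used on these machines.

```
# Sketch only: device and speed are hypothetical
baud 19200         # serial line speed
serial /dev/ttyS0  # null-modem link to the peer node
```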
Thanks,
Dave
--
** Dave Holland ** Systems Support - Special Projects Team **
** 01223 834244 ** Sanger Institute, Hinxton, Cambridge, UK **
"You can learn many things from children. How much patience you have,
for instance."