List:       linux-ha
Subject:    Re: [Linux-HA] unexpected late heartbeats/failovers with 1.2.0
From:       Dave Holland <dh3 () sanger ! ac ! uk>
Date:       2004-03-25 14:12:30
Message-ID: 20040325141230.GH14924 () sanger ! ac ! uk

On Thu, Mar 25, 2004 at 02:18:26PM +0100, Lars Marowsky-Bree wrote:
> Interesting. Is there any pattern at all to it?

Not that I can see. Under the current load (20 requests/second), the
output of "vmstat 5" looks like this:

procs -----------memory----------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  0   8228   6592 592300 2582436    0    0     1     7    6     5  0  2 98  0
 0  0   8228   6768 592292 2582324    0    0     0   141  362   175  0  0 99  0
 0  0   8228   6600 592320 2582464    0    0     7   131  463   236  0  1 99  0
 0  0   8228   6724 592332 2582336    0    0     3   212  581   296  0  1 99  0
 0  0   8228   6544 592360 2582480    0    0    35   214  544   265  1  1 98  0
 0  0   8228   6624 592344 2582412    0    0     0   205  492   248  0  1 99  0

(Yes, the machines are somewhat overspecified for the job.)

> Maybe you can make it more apparent if you reduce the heartbeat interval
> to 200ms, warntime 600ms, deadtime 10s. That would help us with
> debugging, but if it's already production, you may not want to do that
> ;-)

It's in production. :-} I might try that if I don't see another
unexpected hiccup by Monday.
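
For reference, the debugging values suggested above would probably look
something like the following in ha.cf. This is only a sketch: it assumes
the installed heartbeat version accepts the "ms" suffix on timing
directives, and it shows only the three timing lines, not a complete
configuration.

```
# Sketch of the suggested debugging timings (not a complete ha.cf).
# Assumption: this heartbeat version accepts the "ms" suffix.
keepalive 200ms   # interval between heartbeats
warntime 600ms    # log a "late heartbeat" warning after this long
deadtime 10       # declare the peer dead after 10 seconds
```

With the interval this short, late heartbeats should show up in the logs
far more often, while the 10 second deadtime keeps a failover from being
triggered any sooner than necessary.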

> OTOH, if it's related to your NICs acting up (which is possible), maybe
> adding a serial heartbeat or a second network interface (I figure you
> could at least also heartbeat over eth0 in mcast / ucast mode?) would
> help.

I've added a serial heartbeat connection.
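
A serial heartbeat link of this kind is usually declared in ha.cf along
the following lines. The device name, baud rate, peer address and
multicast group below are placeholders for illustration, not the values
used on these machines.

```
# Sketch only: device, baud rate and addresses are assumed placeholders.
serial /dev/ttyS0       # null-modem cable between the two nodes
baud 19200              # serial line speed

# Optional second network path, as suggested in the quoted text
# (use one of the two lines):
#ucast eth0 192.0.2.2            # unicast to the peer's address on eth0
#mcast eth0 225.0.0.1 694 1 0    # multicast: group, port, ttl, loop
```

Heartbeat treats each configured medium as an independent path, so a
node should only be declared dead when heartbeats stop arriving on all
of them.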

Thanks,
Dave
-- 
** Dave Holland ** Systems Support  -  Special Projects Team **
** 01223 834244 ** Sanger Institute, Hinxton, Cambridge, UK  **
"You can learn many things from children. How much patience you have,
for instance."
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha