
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Problem WARN: Gmain_timeout_dispatch Again
From:       gilmarlinux () agrovale ! com ! br
Date:       2011-05-16 21:22:41
Message-ID: 48046.201.24.133.203.1305580961.squirrel () mail ! agrovale ! com ! br

OK, thank you. I'm trying to isolate the problem as much as possible so that
I can diagnose it. I've used tools such as sar and iostat to check the system,
but so far everything looks fine.

> That's probably OK.  If you're really having a problem, it should
> ordinarily show it up before it causes a false failover.
>
> Then you can figure out if you want to raise your timeout or figure out
> what's causing the slow processing.
>
> On 05/14/2011 09:08 AM, gilmarlinux@agrovale.com.br wrote:
>> Thanks again.
>> deadtime 30 and warntime 15 this good ?
>>
>> > BUT also either make warntime smaller or deadtime larger...
>> >
>> > On 5/13/2011 7:48 PM, gilmarlinux@agrovale.com.br wrote:
>> >> Thank you for your attention.
>> >> His recommendation and wait, if only to continue the logs I get
>> >> following warning if the services do not migrate to another server
>> >> just keep watching the logs warning.
>> >>
>> >> > I typically make deadtime something like 3 times warntime. That way
>> >> > you'll get data before you get into trouble. When your heartbeats
>> >> > exceed warntime, you get information on how late it is. I would
>> >> > typically make deadtime AT LEAST twice the latest time you've ever
>> >> > seen with warntime.
>> >> >
>> >> > If the worst case you ever saw was this 60ms instead of 50ms, I'd
>> >> > look somewhere else for the problem. However, it is possible that
>> >> > you have a hardware trouble, or a kernel bug. Possible, but unlikely.
>> >> >
>> >> > More logs are always good when looking at a problem like this.
>> >> > hb_report will get you lots of logs and so on for the next time it
>> >> > happens.
>> >> >
>> >> > On 05/13/2011 11:44 AM, gilmarlinux@agrovale.com.br wrote:
>> >> >> Thanks for the help.
>> >> >>
>> >> >> I had a problem the 30 days that began with this post, and after two
>> >> >> days the heartbeat message that the accused had fallen server1 and
>> >> >> services migrated to server2
>> >> >> Now with this change to eth1 and eth2 for drbd and heartbeat to the
>> >> >> amendment of warntime deadtime 20 to 15 and do not know if this will
>> >> >> happen again.
>> >> >> Thanks
>> >> >>
>> >> >> > That's related to process dispatch time in the kernel. It might
>> >> >> > be the case that this expectation is a bit aggressive (mea culpa).
>> >> >> >
>> >> >> > In the mean time, as long as those timings remain close to the
>> >> >> > expectations (60 vs 50ms) I'd ignore them.
>> >> >> >
>> >> >> > Those messages are meant to debug real-time problems - which you
>> >> >> > don't appear to be having.
>> >> >> >
>> >> >> > -- Alan Robertson
>> >> >> > alanr@unix.sh
>> >> >> >
>> >> >> > On 05/12/2011 12:54 PM, gilmarlinux@agrovale.com.br wrote:
>> >> >> >> Hello!
>> >> >> >> I'm using heartbeat version 3.0.3-2 on debian squeeze with
>> >> >> >> dedicated gigabit ethernet interface for the heartbeat.
>> >> >> >> But even this generates the following message:
>> >> >> >> WARN: Gmain_timeout_dispatch: Dispatch function for send local
>> >> >> >> status took too long to execute: 60 ms (> 50 ms) (GSource: 0x101c350)
>> >> >> >> I'm using eth1 to eth2 and to Synchronize DRBD(eth1) HEARBEAT
>> >> >> >> (eth2).
>> >> >> >> I tried increasing the values deadtime = 20 and 15 warntime
>> >> >> >> Interface Gigabit Ethernet controller: Intel Corporation 82575GB
>> >> >> >> Serv.1 and the Ethernet controller: Broadcom Corporation
>> >> >> >> NetXtreme II BCM5709 in Serv.2
>> >> >> >> Tested using two Broadcom for the heartbeat, also without
>> >> >> >> success.
>> >> >> >>
>> >> >> >> Thanks
>
> --
>      Alan Robertson <alanr@unix.sh>
>
> "Openness is the foundation and preservative of friendship...  Let me
> claim from you at all times your undisguised opinions." - William Wilberforce
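
For anyone reading this thread later, the two rules of thumb quoted above
(deadtime about 3 times warntime, and at least twice the worst lateness ever
reported) and the 60 ms vs 50 ms comparison can be put into a small Python
sketch. This is only an illustration under stated assumptions: the function
names are made up for this example, the regular expression is modelled on the
single warning line quoted in this thread (other heartbeat versions may word
it differently), and warntime, deadtime and the observed lateness are assumed
to be in seconds.

```python
import re

# Pattern modelled on the one warning line quoted in this thread (assumption:
# other heartbeat versions may format the message differently).
WARN_RE = re.compile(
    r"Gmain_timeout_dispatch: Dispatch function for (?P<what>.+?) "
    r"took too long to execute: (?P<took>\d+) ms \(> (?P<limit>\d+) ms\)"
)


def parse_dispatch_warning(line):
    """Return (what, took_ms, limit_ms, ratio), or None if the line does not match."""
    match = WARN_RE.search(line)
    if match is None:
        return None
    took = int(match.group("took"))
    limit = int(match.group("limit"))
    return match.group("what"), took, limit, took / limit


def suggest_deadtime(warntime_s, worst_late_s=0.0):
    """Hypothetical helper: apply both rules of thumb and return the larger value.

    warntime_s   -- configured warntime, assumed to be in seconds
    worst_late_s -- largest heartbeat lateness seen in warntime warnings
    """
    if warntime_s <= 0 or worst_late_s < 0:
        raise ValueError("warntime must be positive and lateness non-negative")
    return float(max(3 * warntime_s, 2 * worst_late_s))


sample = (
    "WARN: Gmain_timeout_dispatch: Dispatch function for send local status "
    "took too long to execute: 60 ms (> 50 ms) (GSource: 0x101c350)"
)
print(parse_dispatch_warning(sample))  # the 60 ms vs 50 ms case from the thread
print(suggest_deadtime(15))            # warntime 15 with no lateness seen yet
print(suggest_deadtime(15, 25))        # a hypothetical 25 s worst-case lateness
```

With warntime 15 the first rule alone gives a deadtime of 45, which is why
deadtime 30 with warntime 15 drew the reply "make warntime smaller or deadtime
larger".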



_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

