List: linux-ha-dev
Subject: RE: [Linux-ha-dev] Bug in CVS version
From: "Zou, Yixiong" <yixiong.zou () intel ! com>
Date: 2003-05-09 20:40:29
Thanks Ram.
I am running it now with 50ms, but I'm still not seeing the "leave" msg.
Looking into it now and will let you know asap.
------------------------------------------------------------------------
Yixiong Zou (yixiong.zou@intel.com)
(503) 677-4988
All views expressed in this email are those of the individual sender.
> -----Original Message-----
> From: Ram Pai [mailto:linuxram@us.ibm.com]
> Sent: Friday, May 09, 2003 11:58 AM
> To: 'linux-ha-dev@lists.community.tummy.com'
> Subject: RE: [Linux-ha-dev] Bug in CVS version
>
>
>
>
>
> On Fri, 9 May 2003, Zou, Yixiong wrote:
>
> > I see you've already committed it into the CVS. Yes, it does fix
> > the problem. Now I can look into the CCM issue. :)
> Yixiong,
>
> Let me know how it goes.
>
> FYI:
> The CCM protocol's internal timeouts have been wired to the
> keepalive timer.
>
> I understand you are trying to run CCM with a 10ms
> keepalive timer.
>
> A 10ms keepalive timer is too short: if the round-trip
> messaging latency exceeds 10ms, the CCM protocol on some
> nodes will time out, which leads to multiple partitions.
> Again, it all depends on the messaging latency. If the messaging
> latencies are low, then CCM should work fine.
>
> So, just to see what's going wrong with CCM on your machine,
> I suggest trying something like a 100ms keepalive timer.
>
> Once you know exactly what the problem and the fix are, you
> can reduce the keepalive timer.
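
A minimal sketch of the reasoning above, assuming the protocol timeout is exactly one keepalive interval (the thread only says the timeouts are "wired to" the keepalive timer, not the actual multiple). The function name ccm_step_times_out is hypothetical and is not part of heartbeat or CCM.

```c
#include <stdbool.h>

/*
 * Hypothetical model, not heartbeat code: a protocol step is assumed to
 * time out when the round-trip messaging latency exceeds the keepalive
 * interval. Both values are in milliseconds.
 */
static bool ccm_step_times_out(unsigned keepalive_ms, unsigned rtt_ms)
{
    return rtt_ms > keepalive_ms;
}
```

Under this assumption, a made-up 25ms round trip would time out with a 10ms keepalive but not with 50ms or 100ms, which is why starting with a larger keepalive helps separate latency effects from real protocol bugs.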
>
> Ram Pai
>
> >
> >
> > ------------------------------------------------------------------------
> > Yixiong Zou (yixiong.zou@intel.com)
> > (503) 677-4988
> >
> > All views expressed in this email are those of the individual sender.
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: Alan Robertson [mailto:alanr@unix.sh]
> > > Sent: Thursday, May 08, 2003 7:39 PM
> > > To: linux-ha-dev@lists.community.tummy.com
> > > Subject: Re: [Linux-ha-dev] Bug in CVS version
> > >
> > >
> > > Zou, Yixiong wrote:
> > > > > Ok, finally made some progress regarding the ping node + api_test
> > > > > problem.
> > > > >
> > > > > When an "iflist" request is sent to the cluster, instead of
> > > > > getting one msg in the queue of the client, it gets two. The first one
> > > > > is normal and looks like the following:
> > > >
> > > > msg = >>>
> > > > t=hbapi-resp
> > > > reqtype=iflist
> > > > ifname=172.16.1.251
> > > > result=OK
> > > > <<<
> > > >
> > > > > The second one is a little odd, with two result=OK fields in the msg.
> > > >
> > > > msg = >>>
> > > > t=hbapi-resp
> > > > reqtype=iflist
> > > > ifname=172.16.1.251
> > > > result=OK
> > > > result=OK
> > > > <<<
> > >
> > >
> > > Good work!
> > >
> > > I think I understand this...
> > >
> > > > The code *both* called api_send_client_msg() AND returned I_API_RET. It
> > > > should have done one or the other, but not both.
> > >
> > > So, I think this patch will fix it...
> > >
> > >
> > > > ===================================================================
> > > RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
> > > retrieving revision 1.75
> > > diff -u -r1.75 hb_api.c
> > > --- hb_api.c 15 Apr 2003 23:06:53 -0000 1.75
> > > +++ hb_api.c 9 May 2003 02:36:30 -0000
> > > @@ -473,7 +473,6 @@
> > > "cannot mod field/2");
> > > return I_API_IGN;
> > > }
> > > - api_send_client_msg(client, resp);
> > > return I_API_RET;
> > > }
> > > }
> > > @@ -639,7 +638,7 @@
> > > case I_API_IGN:
> > > goto freeandexitresp;
> > > case I_API_RET:
> > > > - if (ha_msg_add(resp, F_APIRESULT, API_OK)
> > > > + if (ha_msg_mod(resp, F_APIRESULT, API_OK)
> > > != HA_OK) {
> > > ha_log(LOG_ERR
> > > ,
> > > "api_process_request:"
> > >
> > >
> > > > ----------------------------------------------------------------------
> > > > The real bug was the presence of the api_send_client_msg() call when the
> > > > code also returned I_API_RET itself. The second fix shouldn't be necessary,
> > > > but doesn't hurt anything. The original code unconditionally added a
> > > > "result=OK" field to the message. It will now modify the field if it's there;
> > > > otherwise it will add it.
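
A minimal sketch of the add-versus-modify difference described above, using a toy message type rather than heartbeat's real struct ha_msg. The names toy_msg, toy_msg_add, toy_msg_mod and toy_msg_count are hypothetical; toy_msg_mod only mirrors the behaviour stated in the email (modify the field if present, otherwise add it).

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FIELDS 8

/* Toy stand-in for a name=value message; not the real struct ha_msg. */
struct toy_field { const char *name; const char *value; };
struct toy_msg   { struct toy_field fields[TOY_MAX_FIELDS]; size_t nfields; };

/* Always appends a field, even if one with the same name already exists.
 * Returns 0 on success, -1 if the message is full. */
static int toy_msg_add(struct toy_msg *m, const char *name, const char *value)
{
    if (m->nfields >= TOY_MAX_FIELDS)
        return -1;
    m->fields[m->nfields].name = name;
    m->fields[m->nfields].value = value;
    m->nfields++;
    return 0;
}

/* Replaces the value of the first field called name, or appends the
 * field if no such field exists yet. */
static int toy_msg_mod(struct toy_msg *m, const char *name, const char *value)
{
    for (size_t i = 0; i < m->nfields; i++) {
        if (strcmp(m->fields[i].name, name) == 0) {
            m->fields[i].value = value;
            return 0;
        }
    }
    return toy_msg_add(m, name, value);
}

/* Counts how many fields in the message are called name. */
static size_t toy_msg_count(const struct toy_msg *m, const char *name)
{
    size_t n = 0;
    for (size_t i = 0; i < m->nfields; i++) {
        if (strcmp(m->fields[i].name, name) == 0)
            n++;
    }
    return n;
}
```

In this toy model, calling toy_msg_add twice with "result" leaves two "result" fields, like the odd second message in the report, while toy_msg_add followed by toy_msg_mod leaves exactly one.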
> > >
> > > Good detective work Yixiong!
> > >
> > > Let me know if this fixes the problem you've been seeing.
> > >
> > > --
> > > Alan Robertson <alanr@unix.sh>
> > >
> > > "Openness is the foundation and preservative of
> > > friendship.... Let me claim
> > > from you at all times your undisguised opinions." - William
> > > Wilberforce
> > >
> > > _______________________________________________________
> > > Linux-HA-Dev: Linux-HA-Dev@lists.community.tummy.com
> > > http://lists.community.tummy.com/mailman/listinfo/linux-ha-dev
> > > Home Page: http://linux-ha.org/
> > >
> >
>
> --
> Ram Pai
> linuxram@us.ibm.com
> 503-5783752
> EVMS: http://www.sf.net/projects/evms
> ----------------------------------
>
>