'[quagga-dev 5314] Re: intermittent communication between bgpd and'


List:       quagga-dev
Subject:    [quagga-dev 5314] Re: intermittent communication between bgpd and
From:       "Ray Barnes" <rrb () colo4jax ! com>
Date:       2008-04-21 13:02:20
Message-ID: afa5bb590804210602g776e89a0lf861edf583f387a9 () mail ! gmail ! com

Just a followup after poking around with the code; comments inline:

On 4/18/08, Andrew J. Schorr <aschorr@telemetry-investments.com> wrote:
>
> Hi Ray,
>
> On Fri, Apr 18, 2008 at 04:24:40PM -0400, Ray Barnes wrote:
> > Thanks Andy.  The issue is that bgpd is simply not aggressive enough in its
> > communication attempts toward zebra.  If it made an attempt once every 2-3
> > seconds, that'd surely satisfy my requirement for HA/failover using BGP.
> > But even with 'watchquagga' which I understand to simply restart quagga
> > daemons if they fail, that solution is not adequate.  Per my previous
> > message, if zebra is restarted for some reason, it will not receive updates
> > from bgpd until bgpd has something to update.  If bgpd never makes an
> > update, zebra will not receive routes.  I've seen this condition persist
> > over and over again, for several hours at a time in my environment.
>
> Frankly, I don't see in the code why bgpd would connect to zebra even
> if it had an update.  It looks to me like this connection is attempted
> only by the bgp_init() function, and bgp_init is called once in main
> before entering the main event processing loop.  So there's probably
> something in the code that I'm missing if the behavior is as you say.

You're right - bgpd itself will not try reconnecting - this is handled by
lib/zclient.c.  For example, when something invokes
lib/zclient.c:zclient_send_message, this function checks connectivity, and if
it cannot write to the zebra socket, it will 'return
zclient_failed(zclient);'.  In practice, though, the path by which this gets
called, i.e. bgpd:bgp_zebra.c:bgp_zebra_announce ->
lib/zclient.c:zapi_ipv4_route -> lib/zclient.c:zclient_send_message, does not
report the error back to bgpd.  Even if it did, zclient would still be
responsible for establishing a new connection to zebra when the existing one
is interrupted.  The problem as I see it is that if the zebra connection is
interrupted for any reason (someone breaks iptables rules on the box, zebra
dies, etc.), zclient does not notify bgpd, and thus bgpd will not resend all
of its routes.
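
To illustrate the lost error described above, here is a minimal sketch in C.
It does not use the real Quagga structures or signatures: mock_client,
mock_send_message and the two announce functions are made-up names that only
mimic the shape of the call chain.  The first announce function drops the
result the way the path above appears to; the second passes it up so the
daemon could remember that it owes zebra a resend.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the client state; NOT the real struct zclient. */
struct mock_client {
    bool connected;     /* is the socket to zebra usable? */
    int  sent;          /* messages successfully written */
    bool needs_resync;  /* set once the caller learns a send failed */
};

/* Plays the role of zclient_send_message: 0 on success, -1 on failure. */
static int mock_send_message(struct mock_client *c)
{
    if (!c->connected)
        return -1;
    c->sent++;
    return 0;
}

/* Behaviour described above: the result is dropped on the way up. */
static void announce_ignoring_errors(struct mock_client *c)
{
    (void)mock_send_message(c);
}

/* Alternative: propagate the failure and flag that routes must be resent. */
static int announce_checked(struct mock_client *c)
{
    int rc = mock_send_message(c);
    if (rc < 0)
        c->needs_resync = true;
    return rc;
}
```

With a disconnected client, announce_ignoring_errors() leaves needs_resync
untouched, while announce_checked() returns -1 and sets it.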

I attempted to resolve this as follows.  I created a new message type in
zebra.h, like '#define ZEBRA_IDLE 99', and added the type to the zebra daemon
as a throwaway case.  I then wrote new functions in bgp_zebra.c, invoked by
the bgp scanner in bgp_nexthop.c, that send the ZEBRA_IDLE to the zebra daemon
every scan interval.  If the idle fails and the idler function can detect
that it reconnected to zebra, it reruns a modified version of bgp_init().
That works fine, but it doesn't resend the prefixes out of the bgpd RIB to
zebra.  Not having touched C in the last 5 years, I'm simply not experienced
enough (and lack the time) to rectify this, even with a stop-gap solution
like the one I've already attempted.  But hopefully I've shed enough light
that eventually someone who already has a good handle on the internals of the
project can address this.  It seems to me that a major rewrite of the bgpd
interface to zebra is necessary so that bgpd remains cognizant of zebra's
status at all times - both in terms of connectivity, and in being able to
look up zebra's routes to make sure zebra has everything in bgpd's RIB.
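
For anyone picking this up, the idle-probe idea can be sketched roughly as
below, including the step my attempt was missing (replaying the RIB once the
link comes back).  This is a toy model, not a patch: mock_zebra, mock_bgpd,
mock_send, scan_tick and MOCK_ROUTE_ADD are invented for illustration, and
only the value 99 comes from the experiment described above.

```c
#include <stdbool.h>

#define MOCK_IDLE      99  /* mirrors the '#define ZEBRA_IDLE 99' experiment */
#define MOCK_ROUTE_ADD 1   /* hypothetical "install this route" message */

/* Hypothetical stand-ins; NOT the real zebra or bgpd data structures. */
struct mock_zebra {
    bool up;                /* is the daemon reachable? */
    int  routes_installed;  /* routes it currently holds */
};

struct mock_bgpd {
    int  rib_len;   /* number of prefixes in the local RIB */
    bool link_ok;   /* did the previous idle probe succeed? */
};

/* Send one message; fails with -1 while zebra is unreachable. */
static int mock_send(struct mock_zebra *z, int type)
{
    if (!z->up)
        return -1;
    if (type == MOCK_ROUTE_ADD)
        z->routes_installed++;
    return 0;                    /* MOCK_IDLE is a throwaway case */
}

/* Called once per scan interval.  Returns the number of routes replayed. */
static int scan_tick(struct mock_bgpd *b, struct mock_zebra *z)
{
    if (mock_send(z, MOCK_IDLE) < 0) {
        b->link_ok = false;      /* remember the outage */
        return 0;
    }
    if (b->link_ok)
        return 0;                /* link was fine all along */

    /* Probe works again after an outage: resend the whole RIB. */
    int replayed = 0;
    for (int i = 0; i < b->rib_len; i++)
        if (mock_send(z, MOCK_ROUTE_ADD) == 0)
            replayed++;
    b->link_ok = true;
    return replayed;
}
```

Note that a probe like this only notices an outage that spans at least one
scan interval.  If zebra restarts and comes back between two ticks, the probe
succeeds both times and nothing is replayed, so the client library would
still need to report reconnects on its own.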

> I'm not a heavy bgpd user, so I'm not 100% clear on the desirability
> of having bgpd attempt to connect to zebra every few seconds.  The patch
> to do this would not be difficult, but I don't know that this would always
> be desirable behavior for all users.

From my perspective, it's very much desirable for all users.  As things
stand, and as I've pointed out, a reload of both daemons is required if
communication fails between bgpd and zebra for any reason.  From what I
understand, this is precisely one of the motivating factors for Cisco to
gravitate toward modular IOS.  Needless to say, reloading your entire router
because one process develops a problem is far from ideal in a production
environment.

>
> > In fact, the thing that prompted all of this digging on my part, was that my
> > box lost its default route in zebra.  Although bgpd had defaults from both
> > peers, the route simply *fell out* of zebra and the kernel.  That's a bug
> > which will definitely preclude my use of quagga as currently deployed, and
> > maybe even overall.
>
> It sounds like this issue requires some debugging.  Could you please open
> a bugzilla item on this?

Unfortunately I won't be using quagga in the capacity I had originally
planned, so I won't be of much use for bug reports on this particular issue.
All it'll be doing now is route announcement, as I can't rely on it to
populate my FIB given the gaping holes I've outlined here.  I can tell you,
however, that the route didn't exactly "fall out" as I'd previously stated.
I checked the screenshot I took at the time (I'll send you a copy off-list if
you like), and it merely says "incomplete" next to the default route to my
.17 router from bgpd (in reference to the previous bgpd config I posted).
When I went into bgpd, the .17 route was valid and best.  The only
correlation I could come up with is that the time next to the route in zebra,
15:something, matched how long the peering session to my .129 router had been
established (whereas the peering session to .17 had been up much longer).  So
perhaps bgpd choked, lost its connection to zebra and to one of my peers,
then came back.  If true, this would point to the same phenomenon I
mentioned above: the lack of synchronization between the two.  Hope that
helps.

-Ray

_______________________________________________
Quagga-dev mailing list
Quagga-dev@lists.quagga.net
http://lists.quagga.net/mailman/listinfo/quagga-dev
