[prev in list] [next in list] [prev in thread] [next in thread] 

List:       quagga-dev
Subject:    [quagga-dev 8178] Re: [PATCH] bgpd: fix bgp_node locking issues
From:       Balaji G <balajig81 () gmail ! com>
Date:       2010-08-25 14:42:45
Message-ID: AANLkTimg6gpyO5cgqZS+PYmpzsD_B2KOpKxOTik6Yf=D () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


> Where is your branch located (sorry if this is a daft quesion) but seeing
I need a release with this patch applied ....

 http://github.com/balajig/quagga-next


<http://github.com/balajig/quagga-next>Some how it makes me feel whether my
efforts are going void :(

Cheers,
  - Balaji

<http://github.com/balajig/quagga-next>

On Wed, Aug 25, 2010 at 7:50 PM, Richard J Palmer <richard@merula.net>wrote:

>  Ah fun...
>
> Where is your branch located (sorry if this is a daft quesion) but seeing
> I need a release with this patch applied ....
>
>
> On 25/08/2010 15:20, Balaji G wrote:
>
> Hi Richard
>
>  I am maintaining a separate branch in which lot of patches are getting
> merged and once Paul is back he could pull it from this location and hence i
> have applied your patch to my branch which i maintain. I am surprised that a
> new version has been released without merging these fixes so it makes me
> think whether what i maintain is useless ?
>
>  Cheers,
>   - Balaji
>
> On Wed, Aug 25, 2010 at 2:24 PM, Richard J Palmer <richard@merula.net>wrote:
>
>>  Can you confirm if this is in 0.99.17 ?
>>
>> I don't see this patch listed on the changelog
>>
>> Thanks
>>
>> Richard
>>
>>
>> On 28/07/2010 04:45, Balaji G wrote:
>>
>>  Hi Chris
>>
>> Applied. Thanks
>>
>> Thanks,
>> Cheers,
>>   - Balaji
>>
>> On Tue, Jul 27, 2010 at 9:58 PM, Chris Caputo <ccaputo@alt.net> wrote:
>>
>>> Balaji, please consider this for quagga-next.
>>>
>>> Thanks,
>>> Chris
>>>
>>> ---------- Forwarded message ----------
>>> Date: Sun, 25 Apr 2010 20:58:27 +0000 (UTC)
>>> From: Chris Caputo <ccaputo@alt.net>
>>> To: quagga-dev@lists.quagga.net, Denis Ovsienko <infrastation@yandex.ru>
>>> Cc: Richard Palmer <richard@merula.net>
>>> Subject: [quagga-dev 7960] Re: [PATCH] bgpd: fix bgp_node locking issues
>>>
>>> I believe the below patch is ready for inclusion.
>>>
>>> Richard Palmer has been running it for almost 2 months now and it fixed
>>> the assert problem he was experiencing.
>>>
>>> Thanks,
>>> Chris
>>>
>>> ---------- Forwarded message ----------
>>> Date: Sat, 27 Feb 2010 04:45:29 +0000 (UTC)
>>> From: Chris Caputo <ccaputo@alt.net>
>>> To: Richard Palmer <richard@merula.net>
>>> Cc: quagga-dev@lists.quagga.net
>>> Subject: [quagga-dev 7809] [PATCH] bgpd: fix bgp_node locking issues
>>>
>>> On Wed, 16 Dec 2009, Richard Palmer wrote:
>>> > i've just had another crash in BGPd same as all previous:
>>> >
>>> > 2009/12/15 11:01:59 BGP: 195.66.224.250 [Error] bgp_read_packet error:
>>> Connection reset by peer
>>> > 2009/12/15 11:04:14 BGP: Assertion `node->lock > 0' failed in file
>>> bgp_table.c, line 252, function bgp_unlock_node
>>> > 2009/12/15 11:04:14 BGP: Backtrace for 12 stack frames:
>>> > 2009/12/15 11:04:14 BGP: [bt 0]
>>> /usr/local/quagga/lib/libzebra.so.0(zlog_backtrace+0x1f) [0x1571c5]
>>> > 2009/12/15 11:04:14 BGP: [bt 1]
>>> /usr/local/quagga/lib/libzebra.so.0(_zlog_assert_failed+0x99) [0x157322]
>>> > 2009/12/15 11:04:14 BGP: [bt 2] ./bgpd [0xc53dc7]
>>> > 2009/12/15 11:04:14 BGP: [bt 3] ./bgpd [0xc525d4]
>>> > 2009/12/15 11:04:14 BGP: [bt 4] ./bgpd [0xc41e85]
>>> > 2009/12/15 11:04:14 BGP: [bt 5] ./bgpd [0xc42937]
>>> > 2009/12/15 11:04:14 BGP: [bt 6] ./bgpd [0xc490a3]
>>> > 2009/12/15 11:04:14 BGP: [bt 7] ./bgpd(bgp_read+0xa72) [0xc4aa1d]
>>> > 2009/12/15 11:04:14 BGP: [bt 8]
>>> /usr/local/quagga/lib/libzebra.so.0(thread_call+0x67) [0x14b799]
>>> > 2009/12/15 11:04:14 BGP: [bt 9] ./bgpd(main+0x418) [0xc230e1]
>>> > 2009/12/15 11:04:14 BGP: [bt 10] /lib/libc.so.6(__libc_start_main+0xe6)
>>> [0x18b5d6]
>>> > 2009/12/15 11:04:14 BGP: [bt 11] ./bgpd [0xc22bb1]
>>> >
>>> > We have had several of these :(
>>> >
>>> > Any thoughts on how we can fix this ?
>>>
>>> Richard,
>>>
>>> I've worked up a fix the crash you have been experiencing.  Please apply
>>> this patch to your 0.99.15 install and report any problems or success.
>>> Feel free to keep using the diagnostic code I sent, so you can check your
>>> BGP log for the overflow indication we were watching for previously.  Or
>>> feel free to apply this to a virgin 0.99.15 codebase.
>>>
>>> All,
>>>
>>> I've been working with Richard on his crashing issue.  The problem turned
>>> out to be connected table locks being locked but not unlocked, such that
>>> eventually a lock would exceed 2^31 and become negative, thus triggering
>>> an assert later on.
>>>
>>> Patch below...
>>>
>>> Thanks,
>>> Chris
>>>
>>> ---
>>> [PATCH] bgpd: fix bgp_node locking issues
>>>
>>> This patch fixes bugs with respect to bgp_node locking.  Also included
>>> are
>>> improvements to alloc/free & cleanup code.
>>>
>>> Thanks to Richard Palmer for assistance in narrowing down the locking
>>> problem on his system, which was regularly experiencing a lock count
>>> exceeding 2^31 and becoming negative.
>>>
>>> bgpd/bgp_main.c:
>>>  - Improve bgp_exit() cleanup code to enable developers to better detect
>>>   leaks.
>>>
>>> bgpd/bgp_nexthop.c:
>>>  - Correct use of bgp_node_match() and bgp_node_lookup() by calling
>>>   bgp_unlock_node() as appropriate.
>>>  - Track bgp_connected_ref allocations.
>>>  - Fix cleanup code by releasing nexthop cache in bgp_scan_finish().
>>>
>>> bgpd/bgp_route.c:
>>>  - Correct use of bgp_node_match() by calling bgp_unlock_node() as
>>>   appropriate.
>>>
>>> lib/memtypes.c:
>>>  - Add MTYPE_BGP_CONN.
>>>
>>> Signed-off-by: Chris Caputo <ccaputo@alt.net>
>>> ---
>>>  bgpd/bgp_main.c    |   10 ++++++-
>>>  bgpd/bgp_nexthop.c |   22 ++++++++++++---
>>>  bgpd/bgp_route.c   |   77
>>> +++++++++++++++++++++++++++++++---------------------
>>>  lib/memtypes.c     |    1 +
>>>  4 files changed, 74 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/bgpd/bgp_main.c b/bgpd/bgp_main.c
>>> index 9d14683..1a460c6 100644
>>> --- a/bgpd/bgp_main.c
>>> +++ b/bgpd/bgp_main.c
>>> @@ -243,7 +243,15 @@ bgp_exit (int status)
>>>   if (retain_mode)
>>>     if_add_hook (IF_DELETE_HOOK, NULL);
>>>   for (ALL_LIST_ELEMENTS (iflist, node, nnode, ifp))
>>> -    if_delete (ifp);
>>> +    {
>>> +      struct listnode *c_node, *c_nnode;
>>> +      struct connected *c;
>>> +
>>> +      for (ALL_LIST_ELEMENTS (ifp->connected, c_node, c_nnode, c))
>>> +        bgp_connected_delete (c);
>>> +
>>> +      if_delete (ifp);
>>> +    }
>>>   list_free (iflist);
>>>
>>>   /* reverse bgp_attr_init */
>>> diff --git a/bgpd/bgp_nexthop.c b/bgpd/bgp_nexthop.c
>>> index 0cde665..719cb96 100644
>>> --- a/bgpd/bgp_nexthop.c
>>> +++ b/bgpd/bgp_nexthop.c
>>> @@ -276,6 +276,8 @@ bgp_nexthop_lookup_ipv6 (struct peer *peer, struct
>>> bgp_info *ri, int *changed,
>>>
>>>                  if (bnc->metric != oldbnc->metric)
>>>                    bnc->metricchanged = 1;
>>> +
>>> +                  bgp_unlock_node (oldrn);
>>>                }
>>>            }
>>>        }
>>> @@ -365,6 +367,8 @@ bgp_nexthop_lookup (afi_t afi, struct peer *peer,
>>> struct bgp_info *ri,
>>>
>>>                  if (bnc->metric != oldbnc->metric)
>>>                    bnc->metricchanged = 1;
>>> +
>>> +                  bgp_unlock_node (oldrn);
>>>                }
>>>            }
>>>        }
>>> @@ -571,7 +575,7 @@ bgp_connected_add (struct connected *ifc)
>>>        }
>>>       else
>>>        {
>>> -         bc = XCALLOC (0, sizeof (struct bgp_connected_ref));
>>> +         bc = XCALLOC (MTYPE_BGP_CONN, sizeof (struct
>>> bgp_connected_ref));
>>>          bc->refcnt = 1;
>>>          rn->info = bc;
>>>        }
>>> @@ -596,7 +600,7 @@ bgp_connected_add (struct connected *ifc)
>>>        }
>>>       else
>>>        {
>>> -         bc = XCALLOC (0, sizeof (struct bgp_connected_ref));
>>> +         bc = XCALLOC (MTYPE_BGP_CONN, sizeof (struct
>>> bgp_connected_ref));
>>>          bc->refcnt = 1;
>>>          rn->info = bc;
>>>        }
>>> @@ -636,7 +640,7 @@ bgp_connected_delete (struct connected *ifc)
>>>       bc->refcnt--;
>>>       if (bc->refcnt == 0)
>>>        {
>>> -         XFREE (0, bc);
>>> +         XFREE (MTYPE_BGP_CONN, bc);
>>>          rn->info = NULL;
>>>        }
>>>       bgp_unlock_node (rn);
>>> @@ -662,7 +666,7 @@ bgp_connected_delete (struct connected *ifc)
>>>       bc->refcnt--;
>>>       if (bc->refcnt == 0)
>>>        {
>>> -         XFREE (0, bc);
>>> +         XFREE (MTYPE_BGP_CONN, bc);
>>>          rn->info = NULL;
>>>        }
>>>       bgp_unlock_node (rn);
>>> @@ -1136,11 +1140,15 @@ bgp_multiaccess_check_v4 (struct in_addr nexthop,
>>> char *peer)
>>>   rn1 = bgp_node_match (bgp_connected_table[AFI_IP], &p1);
>>>   if (! rn1)
>>>     return 0;
>>> +  bgp_unlock_node (rn1);
>>>
>>>   rn2 = bgp_node_match (bgp_connected_table[AFI_IP], &p2);
>>>   if (! rn2)
>>>     return 0;
>>> +  bgp_unlock_node (rn2);
>>>
>>> +  /* This is safe, even with above unlocks, since we are just
>>> +     comparing pointers to the objects, not the objects themselves. */
>>>   if (rn1 == rn2)
>>>     return 1;
>>>
>>> @@ -1316,6 +1324,9 @@ bgp_scan_init (void)
>>>  void
>>>  bgp_scan_finish (void)
>>>  {
>>> +  /* Only the current one needs to be reset. */
>>> +  bgp_nexthop_cache_reset (bgp_nexthop_cache_table[AFI_IP]);
>>> +
>>>   bgp_table_unlock (cache1_table[AFI_IP]);
>>>   cache1_table[AFI_IP] = NULL;
>>>
>>> @@ -1326,6 +1337,9 @@ bgp_scan_finish (void)
>>>   bgp_connected_table[AFI_IP] = NULL;
>>>
>>>  #ifdef HAVE_IPV6
>>> +  /* Only the current one needs to be reset. */
>>> +  bgp_nexthop_cache_reset (bgp_nexthop_cache_table[AFI_IP6]);
>>> +
>>>   bgp_table_unlock (cache1_table[AFI_IP6]);
>>>   cache1_table[AFI_IP6] = NULL;
>>>
>>> diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
>>> index a92ca4e..617f150 100644
>>> --- a/bgpd/bgp_route.c
>>> +++ b/bgpd/bgp_route.c
>>> @@ -6554,7 +6554,10 @@ bgp_show_route_in_table (struct vty *vty, struct
>>> bgp *bgp,
>>>               if ((rm = bgp_node_match (table, &match)) != NULL)
>>>                 {
>>>                   if (prefix_check && rm->p.prefixlen != match.prefixlen)
>>> -                    continue;
>>> +                    {
>>> +                      bgp_unlock_node (rm);
>>> +                      continue;
>>> +                    }
>>>
>>>                   for (ri = rm->info; ri; ri = ri->next)
>>>                     {
>>> @@ -6568,6 +6571,8 @@ bgp_show_route_in_table (struct vty *vty, struct
>>> bgp *bgp,
>>>                       display++;
>>>                       route_vty_out_detail (vty, bgp, &rm->p, ri, AFI_IP,
>>> SAFI_MPLS_VPN);
>>>                     }
>>> +
>>> +                  bgp_unlock_node (rm);
>>>                 }
>>>             }
>>>         }
>>> @@ -6591,6 +6596,8 @@ bgp_show_route_in_table (struct vty *vty, struct
>>> bgp *bgp,
>>>                   route_vty_out_detail (vty, bgp, &rn->p, ri, afi, safi);
>>>                 }
>>>             }
>>> +
>>> +          bgp_unlock_node (rn);
>>>         }
>>>     }
>>>
>>> @@ -11348,41 +11355,49 @@ bgp_clear_damp_route (struct vty *vty, const
>>> char *view_name,
>>>
>>>          if ((table = rn->info) != NULL)
>>>            if ((rm = bgp_node_match (table, &match)) != NULL)
>>> -             if (! prefix_check || rm->p.prefixlen == match.prefixlen)
>>> -               {
>>> -                 ri = rm->info;
>>> -                 while (ri)
>>> -                   {
>>> -                     if (ri->extra && ri->extra->damp_info)
>>> -                       {
>>> -                         ri_temp = ri->next;
>>> -                         bgp_damp_info_free (ri->extra->damp_info, 1);
>>> -                         ri = ri_temp;
>>> -                       }
>>> -                     else
>>> -                       ri = ri->next;
>>> -                   }
>>> -               }
>>> +              {
>>> +                if (! prefix_check || rm->p.prefixlen ==
>>> match.prefixlen)
>>> +                  {
>>> +                    ri = rm->info;
>>> +                    while (ri)
>>> +                      {
>>> +                        if (ri->extra && ri->extra->damp_info)
>>> +                          {
>>> +                            ri_temp = ri->next;
>>> +                            bgp_damp_info_free (ri->extra->damp_info,
>>> 1);
>>> +                            ri = ri_temp;
>>> +                          }
>>> +                        else
>>> +                          ri = ri->next;
>>> +                      }
>>> +                  }
>>> +
>>> +                bgp_unlock_node (rm);
>>> +              }
>>>         }
>>>     }
>>>   else
>>>     {
>>>       if ((rn = bgp_node_match (bgp->rib[afi][safi], &match)) != NULL)
>>> -       if (! prefix_check || rn->p.prefixlen == match.prefixlen)
>>> -         {
>>> -           ri = rn->info;
>>> -           while (ri)
>>> -             {
>>> -               if (ri->extra && ri->extra->damp_info)
>>> -                 {
>>> -                   ri_temp = ri->next;
>>> -                   bgp_damp_info_free (ri->extra->damp_info, 1);
>>> -                   ri = ri_temp;
>>> -                 }
>>> -               else
>>> -                 ri = ri->next;
>>> -             }
>>> -         }
>>> +        {
>>> +          if (! prefix_check || rn->p.prefixlen == match.prefixlen)
>>> +            {
>>> +              ri = rn->info;
>>> +              while (ri)
>>> +                {
>>> +                  if (ri->extra && ri->extra->damp_info)
>>> +                    {
>>> +                      ri_temp = ri->next;
>>> +                      bgp_damp_info_free (ri->extra->damp_info, 1);
>>> +                      ri = ri_temp;
>>> +                    }
>>> +                  else
>>> +                    ri = ri->next;
>>> +                }
>>> +            }
>>> +
>>> +          bgp_unlock_node (rn);
>>> +        }
>>>     }
>>>
>>>   return CMD_SUCCESS;
>>> diff --git a/lib/memtypes.c b/lib/memtypes.c
>>> index 05d9322..5902067 100644
>>> --- a/lib/memtypes.c
>>> +++ b/lib/memtypes.c
>>> @@ -108,6 +108,7 @@ struct memory_list memory_list_bgp[] =
>>>   { MTYPE_BGP_NODE,            "BGP node"                      },
>>>   { MTYPE_BGP_ROUTE,           "BGP route"                     },
>>>   { MTYPE_BGP_ROUTE_EXTRA,     "BGP ancillary route info"      },
>>> +  { MTYPE_BGP_CONN,            "BGP connected"                 },
>>>   { MTYPE_BGP_STATIC,          "BGP static"                    },
>>>   { MTYPE_BGP_ADVERTISE_ATTR,  "BGP adv attr"                  },
>>>   { MTYPE_BGP_ADVERTISE,       "BGP adv"                       },
>>>
>>
>>
>> _______________________________________________
>> Quagga-dev mailing listQuagga-dev@lists.quagga.nethttp://lists.quagga.net/mailman/listinfo/quagga-dev
>>
>>
>>
>> _______________________________________________
>> Quagga-dev mailing list
>> Quagga-dev@lists.quagga.net
>> http://lists.quagga.net/mailman/listinfo/quagga-dev
>>
>>
>
>

[Attachment #5 (text/html)]

<meta http-equiv="content-type" content="text/html; charset=utf-8"><div>&gt; Where is \
your branch located (sorry if this is a daft quesion) but seeing   I need a release \
with this patch applied ....</div><div><br></div>  <a \
href="http://github.com/balajig/quagga-next">http://github.com/balajig/quagga-next</a><div>
 <br></div><div><br></div><div><a \
href="http://github.com/balajig/quagga-next"></a>Some how it makes me feel whether my \
efforts are going void :(</div><div><br></div><div>Cheers,</div><div>    - \
Balaji<br><div><br></div><div> <a \
href="http://github.com/balajig/quagga-next"></a><br><br><div class="gmail_quote">On \
Wed, Aug 25, 2010 at 7:50 PM, Richard J Palmer <span dir="ltr">&lt;<a \
href="mailto:richard@merula.net">richard@merula.net</a>&gt;</span> wrote:<br> \
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex;">

  
    
  
  <div bgcolor="#ffffff" text="#000000">
    Ah fun...<br>
    <br>
    Where is your branch located (sorry if this is a daft quesion) but
    seeing   I need a release with this patch applied ....<div><div></div><div \
class="h5"><br>  <br>
    On 25/08/2010 15:20, Balaji G wrote:
    <blockquote type="cite">Hi Richard  
      <div><br>
      </div>
      <div>I am maintaining a separate branch in which lot of patches
        are getting merged and once Paul is back he could pull it from
        this location and hence i have applied your patch to my branch
        which i maintain. I am  surprised  that a new version has been
        released without merging these fixes so it makes me think
        whether what i maintain is useless ?</div>
      <div><br>
      </div>
      <div>Cheers,</div>
      <div>    - Balaji<br>
        <br>
        <div class="gmail_quote">On Wed, Aug 25, 2010 at 2:24 PM,
          Richard J Palmer <span dir="ltr">&lt;<a href="mailto:richard@merula.net" \
target="_blank">richard@merula.net</a>&gt;</span>  wrote:<br>
          <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt \
0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">  <div \
bgcolor="#ffffff" text="#000000"> Can you confirm if  this is in 0.99.17 ?<br>
              <br>
              I don&#39;t see this patch listed on the changelog<br>
              <br>
              Thanks<br>
              <br>
              Richard
              <div>
                <div><br>
                  <br>
                  On 28/07/2010 04:45, Balaji G wrote: </div>
              </div>
              <blockquote type="cite">
                <div>
                  <div>Hi Chris <br>
                    <br>
                    Applied. Thanks <br>
                    <br>
                    Thanks,<br>
                    Cheers,<br>
                       - Balaji<br>
                    <br>
                    <div class="gmail_quote">On Tue, Jul 27, 2010 at
                      9:58 PM, Chris Caputo <span dir="ltr">&lt;<a \
href="mailto:ccaputo@alt.net" target="_blank">ccaputo@alt.net</a>&gt;</span>  \
                wrote:<br>
                      <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt \
0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">Balaji, please  \
consider this for quagga-next.<br>  <br>
                        Thanks,<br>
                        Chris<br>
                        <br>
                        ---------- Forwarded message ----------<br>
                        Date: Sun, 25 Apr 2010 20:58:27 +0000 (UTC)<br>
                        From: Chris Caputo &lt;<a href="mailto:ccaputo@alt.net" \
                target="_blank">ccaputo@alt.net</a>&gt;<br>
                        To: <a href="mailto:quagga-dev@lists.quagga.net" \
                target="_blank">quagga-dev@lists.quagga.net</a>,
                        Denis Ovsienko &lt;<a href="mailto:infrastation@yandex.ru" \
                target="_blank">infrastation@yandex.ru</a>&gt;<br>
                        Cc: Richard Palmer &lt;<a href="mailto:richard@merula.net" \
target="_blank">richard@merula.net</a>&gt;<br>  Subject: [quagga-dev 7960] Re: \
[PATCH] bgpd: fix  bgp_node locking issues<br>
                        <br>
                        I believe the below patch is ready for
                        inclusion.<br>
                        <br>
                        Richard Palmer has been running it for almost 2
                        months now and it fixed<br>
                        the assert problem he was experiencing.<br>
                        <br>
                        Thanks,<br>
                        Chris<br>
                        <br>
                        ---------- Forwarded message ----------<br>
                        Date: Sat, 27 Feb 2010 04:45:29 +0000 (UTC)<br>
                        From: Chris Caputo &lt;<a href="mailto:ccaputo@alt.net" \
                target="_blank">ccaputo@alt.net</a>&gt;<br>
                        To: Richard Palmer &lt;<a href="mailto:richard@merula.net" \
                target="_blank">richard@merula.net</a>&gt;<br>
                        Cc: <a href="mailto:quagga-dev@lists.quagga.net" \
target="_blank">quagga-dev@lists.quagga.net</a><br>  Subject: [quagga-dev 7809] \
[PATCH] bgpd: fix  bgp_node locking issues<br>
                        <br>
                        On Wed, 16 Dec 2009, Richard Palmer wrote:<br>
                        &gt; i&#39;ve just had another crash in BGPd same as
                        all previous:<br>
                        &gt;<br>
                        &gt; 2009/12/15 11:01:59 BGP: 195.66.224.250
                        [Error] bgp_read_packet error: Connection reset
                        by peer<br>
                        &gt; 2009/12/15 11:04:14 BGP: Assertion
                        `node-&gt;lock &gt; 0&#39; failed in file
                        bgp_table.c, line 252, function bgp_unlock_node<br>
                        &gt; 2009/12/15 11:04:14 BGP: Backtrace for 12
                        stack frames:<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 0]
                        /usr/local/quagga/lib/libzebra.so.0(zlog_backtrace+0x1f)
                        [0x1571c5]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 1]
                        /usr/local/quagga/lib/libzebra.so.0(_zlog_assert_failed+0x99)
                        [0x157322]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 2] ./bgpd
                        [0xc53dc7]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 3] ./bgpd
                        [0xc525d4]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 4] ./bgpd
                        [0xc41e85]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 5] ./bgpd
                        [0xc42937]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 6] ./bgpd
                        [0xc490a3]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 7]
                        ./bgpd(bgp_read+0xa72) [0xc4aa1d]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 8]
                        /usr/local/quagga/lib/libzebra.so.0(thread_call+0x67)
                        [0x14b799]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 9]
                        ./bgpd(main+0x418) [0xc230e1]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 10]
                        /lib/libc.so.6(__libc_start_main+0xe6)
                        [0x18b5d6]<br>
                        &gt; 2009/12/15 11:04:14 BGP: [bt 11] ./bgpd
                        [0xc22bb1]<br>
                        &gt;<br>
                        &gt; We have had several of these :(<br>
                        &gt;<br>
                        &gt; Any thoughts on how we can fix this ?<br>
                        <br>
                        Richard,<br>
                        <br>
                        I&#39;ve worked up a fix the crash you have been
                        experiencing.   Please apply<br>
                        this patch to your 0.99.15 install and report
                        any problems or success.<br>
                        Feel free to keep using the diagnostic code I
                        sent, so you can check your<br>
                        BGP log for the overflow indication we were
                        watching for previously.   Or<br>
                        feel free to apply this to a virgin 0.99.15
                        codebase.<br>
                        <br>
                        All,<br>
                        <br>
                        I&#39;ve been working with Richard on his crashing
                        issue.   The problem turned<br>
                        out to be connected table locks being locked but
                        not unlocked, such that<br>
                        eventually a lock would exceed 2^31 and become
                        negative, thus triggering<br>
                        an assert later on.<br>
                        <br>
                        Patch below...<br>
                        <br>
                        Thanks,<br>
                        Chris<br>
                        <br>
                        ---<br>
                        [PATCH] bgpd: fix bgp_node locking issues<br>
                        <br>
                        This patch fixes bugs with respect to bgp_node
                        locking.   Also included are<br>
                        improvements to alloc/free &amp; cleanup code.<br>
                        <br>
                        Thanks to Richard Palmer for assistance in
                        narrowing down the locking<br>
                        problem on his system, which was regularly
                        experiencing a lock count<br>
                        exceeding 2^31 and becoming negative.<br>
                        <br>
                        bgpd/bgp_main.c:<br>
                          - Improve bgp_exit() cleanup code to enable
                        developers to better detect<br>
                           leaks.<br>
                        <br>
                        bgpd/bgp_nexthop.c:<br>
                          - Correct use of bgp_node_match() and
                        bgp_node_lookup() by calling<br>
                           bgp_unlock_node() as appropriate.<br>
                          - Track bgp_connected_ref allocations.<br>
                          - Fix cleanup code by releasing nexthop cache
                        in bgp_scan_finish().<br>
                        <br>
                        bgpd/bgp_route.c:<br>
                          - Correct use of bgp_node_match() by calling
                        bgp_unlock_node() as<br>
                           appropriate.<br>
                        <br>
                        lib/memtypes.c:<br>
                          - Add MTYPE_BGP_CONN.<br>
                        <br>
                        Signed-off-by: Chris Caputo &lt;<a \
                href="mailto:ccaputo@alt.net" \
                target="_blank">ccaputo@alt.net</a>&gt;<br>
                        ---<br>
                          bgpd/bgp_main.c      |    10 ++++++-<br>
                          bgpd/bgp_nexthop.c |    22 ++++++++++++---<br>
                          bgpd/bgp_route.c    |    77
                        +++++++++++++++++++++++++++++++---------------------<br>
                          lib/memtypes.c       |      1 +<br>
                          4 files changed, 74 insertions(+), 36
                        deletions(-)<br>
                        <br>
                        diff --git a/bgpd/bgp_main.c b/bgpd/bgp_main.c<br>
                        index 9d14683..1a460c6 100644<br>
                        --- a/bgpd/bgp_main.c<br>
                        +++ b/bgpd/bgp_main.c<br>
                        @@ -243,7 +243,15 @@ bgp_exit (int status)<br>
                           if (retain_mode)<br>
                              if_add_hook (IF_DELETE_HOOK, NULL);<br>
                           for (ALL_LIST_ELEMENTS (iflist, node, nnode,
                        ifp))<br>
                        -      if_delete (ifp);<br>
                        +      {<br>
                        +         struct listnode *c_node, *c_nnode;<br>
                        +         struct connected *c;<br>
                        +<br>
                        +         for (ALL_LIST_ELEMENTS
                        (ifp-&gt;connected, c_node, c_nnode, c))<br>
                        +            bgp_connected_delete (c);<br>
                        +<br>
                        +         if_delete (ifp);<br>
                        +      }<br>
                           list_free (iflist);<br>
                        <br>
                           /* reverse bgp_attr_init */<br>
                        diff --git a/bgpd/bgp_nexthop.c
                        b/bgpd/bgp_nexthop.c<br>
                        index 0cde665..719cb96 100644<br>
                        --- a/bgpd/bgp_nexthop.c<br>
                        +++ b/bgpd/bgp_nexthop.c<br>
                        @@ -276,6 +276,8 @@ bgp_nexthop_lookup_ipv6
                        (struct peer *peer, struct bgp_info *ri, int
                        *changed,<br>
                        <br>
                                                  if (bnc-&gt;metric !=
                        oldbnc-&gt;metric)<br>
                                                     bnc-&gt;metricchanged = 1;<br>
                        +<br>
                        +                           bgp_unlock_node (oldrn);<br>
                                               }<br>
                                         }<br>
                                   }<br>
                        @@ -365,6 +367,8 @@ bgp_nexthop_lookup (afi_t
                        afi, struct peer *peer, struct bgp_info *ri,<br>
                        <br>
                                                  if (bnc-&gt;metric !=
                        oldbnc-&gt;metric)<br>
                                                     bnc-&gt;metricchanged = 1;<br>
                        +<br>
                        +                           bgp_unlock_node (oldrn);<br>
                                               }<br>
                                         }<br>
                                   }<br>
                        @@ -571,7 +575,7 @@ bgp_connected_add (struct
                        connected *ifc)<br>
                                   }<br>
                                 else<br>
                                   {<br>
                        -             bc = XCALLOC (0, sizeof (struct
                        bgp_connected_ref));<br>
                        +             bc = XCALLOC (MTYPE_BGP_CONN, sizeof
                        (struct bgp_connected_ref));<br>
                                      bc-&gt;refcnt = 1;<br>
                                      rn-&gt;info = bc;<br>
                                   }<br>
                        @@ -596,7 +600,7 @@ bgp_connected_add (struct
                        connected *ifc)<br>
                                   }<br>
                                 else<br>
                                   {<br>
                        -             bc = XCALLOC (0, sizeof (struct
                        bgp_connected_ref));<br>
                        +             bc = XCALLOC (MTYPE_BGP_CONN, sizeof
                        (struct bgp_connected_ref));<br>
                                      bc-&gt;refcnt = 1;<br>
                                      rn-&gt;info = bc;<br>
                                   }<br>
                        @@ -636,7 +640,7 @@ bgp_connected_delete (struct
                        connected *ifc)<br>
                                 bc-&gt;refcnt--;<br>
                                 if (bc-&gt;refcnt == 0)<br>
                                   {<br>
                        -             XFREE (0, bc);<br>
                        +             XFREE (MTYPE_BGP_CONN, bc);<br>
                                      rn-&gt;info = NULL;<br>
                                   }<br>
                                 bgp_unlock_node (rn);<br>
                        @@ -662,7 +666,7 @@ bgp_connected_delete (struct
                        connected *ifc)<br>
                                 bc-&gt;refcnt--;<br>
                                 if (bc-&gt;refcnt == 0)<br>
                                   {<br>
                        -             XFREE (0, bc);<br>
                        +             XFREE (MTYPE_BGP_CONN, bc);<br>
                                      rn-&gt;info = NULL;<br>
                                   }<br>
                                 bgp_unlock_node (rn);<br>
                        @@ -1136,11 +1140,15 @@ bgp_multiaccess_check_v4
                        (struct in_addr nexthop, char *peer)<br>
                           rn1 = bgp_node_match
                        (bgp_connected_table[AFI_IP], &amp;p1);<br>
                           if (! rn1)<br>
                              return 0;<br>
                        +   bgp_unlock_node (rn1);<br>
                        <br>
                           rn2 = bgp_node_match
                        (bgp_connected_table[AFI_IP], &amp;p2);<br>
                           if (! rn2)<br>
                              return 0;<br>
                        +   bgp_unlock_node (rn2);<br>
                        <br>
                        +   /* This is safe, even with above unlocks,
                        since we are just<br>
                        +       comparing pointers to the objects, not the
                        objects themselves. */<br>
                           if (rn1 == rn2)<br>
                              return 1;<br>
                        <br>
                        @@ -1316,6 +1324,9 @@ bgp_scan_init (void)<br>
                          void<br>
                          bgp_scan_finish (void)<br>
                          {<br>
                        +   /* Only the current one needs to be reset. */<br>
                        +   bgp_nexthop_cache_reset
                        (bgp_nexthop_cache_table[AFI_IP]);<br>
                        +<br>
                           bgp_table_unlock (cache1_table[AFI_IP]);<br>
                           cache1_table[AFI_IP] = NULL;<br>
                        <br>
                        @@ -1326,6 +1337,9 @@ bgp_scan_finish (void)<br>
                           bgp_connected_table[AFI_IP] = NULL;<br>
                        <br>
                          #ifdef HAVE_IPV6<br>
                        +   /* Only the current one needs to be reset. */<br>
                        +   bgp_nexthop_cache_reset
                        (bgp_nexthop_cache_table[AFI_IP6]);<br>
                        +<br>
                           bgp_table_unlock (cache1_table[AFI_IP6]);<br>
                           cache1_table[AFI_IP6] = NULL;<br>
                        <br>
                        diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c<br>
                        index a92ca4e..617f150 100644<br>
                        --- a/bgpd/bgp_route.c<br>
                        +++ b/bgpd/bgp_route.c<br>
                        @@ -6554,7 +6554,10 @@ bgp_show_route_in_table
                        (struct vty *vty, struct bgp *bgp,<br>
                                             if ((rm = bgp_node_match (table,
                        &amp;match)) != NULL)<br>
                                                {<br>
                                                   if (prefix_check &amp;&amp;
                        rm-&gt;p.prefixlen != match.prefixlen)<br>
                        -                              continue;<br>
                        +                              {<br>
                        +                                 bgp_unlock_node (rm);<br>
                        +                                 continue;<br>
                        +                              }<br>
                        <br>
                                                   for (ri = rm-&gt;info; ri; ri
                        = ri-&gt;next)<br>
                                                      {<br>
                        @@ -6568,6 +6571,8 @@ bgp_show_route_in_table
                        (struct vty *vty, struct bgp *bgp,<br>
                                                         display++;<br>
                                                         route_vty_out_detail (vty,
                        bgp, &amp;rm-&gt;p, ri, AFI_IP, SAFI_MPLS_VPN);<br>
                                                      }<br>
                        +<br>
                        +                           bgp_unlock_node (rm);<br>
                                                }<br>
                                          }<br>
                                    }<br>
                        @@ -6591,6 +6596,8 @@ bgp_show_route_in_table
                        (struct vty *vty, struct bgp *bgp,<br>
                                                   route_vty_out_detail (vty,
                        bgp, &amp;rn-&gt;p, ri, afi, safi);<br>
                                                }<br>
                                          }<br>
                        +<br>
                        +               bgp_unlock_node (rn);<br>
                                    }<br>
                              }<br>
                        <br>
                        @@ -11348,41 +11355,49 @@ bgp_clear_damp_route
                        (struct vty *vty, const char *view_name,<br>
                        <br>
                                      if ((table = rn-&gt;info) != NULL)<br>
                                         if ((rm = bgp_node_match (table,
                        &amp;match)) != NULL)<br>
                        -                   if (! prefix_check ||
                        rm-&gt;p.prefixlen == match.prefixlen)<br>
                        -                      {<br>
                        -                         ri = rm-&gt;info;<br>
                        -                         while (ri)<br>
                        -                            {<br>
                        -                               if (ri-&gt;extra
                        &amp;&amp; ri-&gt;extra-&gt;damp_info)<br>
                        -                                  {<br>
                        -                                     ri_temp = \
                ri-&gt;next;<br>
                        -                                     bgp_damp_info_free
                        (ri-&gt;extra-&gt;damp_info, 1);<br>
                        -                                     ri = ri_temp;<br>
                        -                                  }<br>
                        -                               else<br>
                        -                                  ri = ri-&gt;next;<br>
                        -                            }<br>
                        -                      }<br>
                        +                     {<br>
                        +                        if (! prefix_check ||
                        rm-&gt;p.prefixlen == match.prefixlen)<br>
                        +                           {<br>
                        +                              ri = rm-&gt;info;<br>
                        +                              while (ri)<br>
                        +                                 {<br>
                        +                                    if (ri-&gt;extra
                        &amp;&amp; ri-&gt;extra-&gt;damp_info)<br>
                        +                                       {<br>
                        +                                          ri_temp =
                        ri-&gt;next;<br>
                        +                                          bgp_damp_info_free
                        (ri-&gt;extra-&gt;damp_info, 1);<br>
                        +                                          ri = ri_temp;<br>
                        +                                       }<br>
                        +                                    else<br>
                        +                                       ri = ri-&gt;next;<br>
                        +                                 }<br>
                        +                           }<br>
                        +<br>
                        +                        bgp_unlock_node (rm);<br>
                        +                     }<br>
                                    }<br>
                              }<br>
                           else<br>
                              {<br>
                                 if ((rn = bgp_node_match
                        (bgp-&gt;rib[afi][safi], &amp;match)) != NULL)<br>
                        -          if (! prefix_check || rn-&gt;p.prefixlen
                        == match.prefixlen)<br>
                        -             {<br>
                        -                ri = rn-&gt;info;<br>
                        -                while (ri)<br>
                        -                   {<br>
                        -                      if (ri-&gt;extra &amp;&amp;
                        ri-&gt;extra-&gt;damp_info)<br>
                        -                         {<br>
                        -                            ri_temp = ri-&gt;next;<br>
                        -                            bgp_damp_info_free
                        (ri-&gt;extra-&gt;damp_info, 1);<br>
                        -                            ri = ri_temp;<br>
                        -                         }<br>
                        -                      else<br>
                        -                         ri = ri-&gt;next;<br>
                        -                   }<br>
                        -             }<br>
                        +            {<br>
                        +               if (! prefix_check ||
                        rn-&gt;p.prefixlen == match.prefixlen)<br>
                        +                  {<br>
                        +                     ri = rn-&gt;info;<br>
                        +                     while (ri)<br>
                        +                        {<br>
                        +                           if (ri-&gt;extra &amp;&amp;
                        ri-&gt;extra-&gt;damp_info)<br>
                        +                              {<br>
                        +                                 ri_temp = ri-&gt;next;<br>
                        +                                 bgp_damp_info_free
                        (ri-&gt;extra-&gt;damp_info, 1);<br>
                        +                                 ri = ri_temp;<br>
                        +                              }<br>
                        +                           else<br>
                        +                              ri = ri-&gt;next;<br>
                        +                        }<br>
                        +                  }<br>
                        +<br>
                        +               bgp_unlock_node (rn);<br>
                        +            }<br>
                              }<br>
                        <br>
                           return CMD_SUCCESS;<br>
                        diff --git a/lib/memtypes.c b/lib/memtypes.c<br>
                        index 05d9322..5902067 100644<br>
                        --- a/lib/memtypes.c<br>
                        +++ b/lib/memtypes.c<br>
                        @@ -108,6 +108,7 @@ struct memory_list
                        memory_list_bgp[] =<br>
                           { MTYPE_BGP_NODE,                  &quot;BGP node&quot;    \
  },<br>
                           { MTYPE_BGP_ROUTE,                &quot;BGP route&quot;    \
  },<br>
                           { MTYPE_BGP_ROUTE_EXTRA,       &quot;BGP ancillary
                        route info&quot;         },<br>
                        +   { MTYPE_BGP_CONN,                  &quot;BGP \
connected&quot;  },<br>
                           { MTYPE_BGP_STATIC,               &quot;BGP static&quot;   \
  },<br>
                           { MTYPE_BGP_ADVERTISE_ATTR,   &quot;BGP adv attr&quot;   
                                               },<br>
                           { MTYPE_BGP_ADVERTISE,          &quot;BGP adv&quot;        \
  },<br>
                      </blockquote>
                    </div>
                    <br>
                  </div>
                </div>
                <pre><fieldset></fieldset>
_______________________________________________
Quagga-dev mailing list
<a href="mailto:Quagga-dev@lists.quagga.net" \
target="_blank">Quagga-dev@lists.quagga.net</a> <a \
href="http://lists.quagga.net/mailman/listinfo/quagga-dev" \
target="_blank">http://lists.quagga.net/mailman/listinfo/quagga-dev</a> </pre>
              </blockquote>
              <br>
            </div>
            <br>
            _______________________________________________<br>
            Quagga-dev mailing list<br>
            <a href="mailto:Quagga-dev@lists.quagga.net" \
                target="_blank">Quagga-dev@lists.quagga.net</a><br>
            <a href="http://lists.quagga.net/mailman/listinfo/quagga-dev" \
target="_blank">http://lists.quagga.net/mailman/listinfo/quagga-dev</a><br>  <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div></div>



_______________________________________________
Quagga-dev mailing list
Quagga-dev@lists.quagga.net
http://lists.quagga.net/mailman/listinfo/quagga-dev


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic