
List:       bird-users
Subject:    Unwanted IPv6 static route flap
From:       Neil Jerram <neil () tigera ! io>
Date:       2020-02-29 13:05:31
Message-ID: CAMGh4hMxbWcDTHWzgTaN77n5CRQkT2__cbUho-+PN1nq6rBtcg () mail ! gmail ! com

I am struggling to understand and ideally eliminate an unwanted flap (i.e.
delete and re-add) of an IPv6 route on node M, when a neighbouring node R
restarts, and R is configured to advertise that IPv6 route statically.

Here is the config on R to advertise the route:

protocol static {
   # IP blocks for this host.
   route fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 blackhole;
}

BIRD6 on node R is killed (with -9) at 12:22:08, and restarts (with the -R
flag) at 12:22:14.

The flap (in the kernel routing table) is detected by running "ip -ts
monitor route" on the monitor node M.  It reports this at 12:22:17:

[2020-02-29T12:22:17.954263] Deleted fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
[2020-02-29T12:22:17.954470] fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
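
For anyone wanting to reproduce the detection, the delete/re-add pair can be picked out of the monitor output mechanically. The following is a minimal sketch, assuming the output has been captured as text with wrapped lines rejoined; the names find_flaps and MONITOR_OUTPUT are illustrative and not part of any tool. On the two lines above it reports a gap of roughly 0.2 ms between the delete and the re-add.

```python
import re
from datetime import datetime

# Sample copied from the "ip -ts monitor route" output above (wrapped lines rejoined).
MONITOR_OUTPUT = """\
[2020-02-29T12:22:17.954263] Deleted fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
[2020-02-29T12:22:17.954470] fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
"""

# "[timestamp] [Deleted ]prefix ..." as printed by ip monitor with -ts.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<deleted>Deleted\s+)?(?P<prefix>\S+)")


def find_flaps(text):
    """Return (prefix, deleted_at, readded_at, gap_seconds) for each delete followed by a re-add."""
    pending = {}  # prefix -> time of the most recent unmatched delete
    flaps = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
        prefix = m.group("prefix")
        if m.group("deleted"):
            pending[prefix] = ts
        elif prefix in pending:
            deleted_at = pending.pop(prefix)
            flaps.append((prefix, deleted_at, ts, (ts - deleted_at).total_seconds()))
    return flaps


flaps = find_flaps(MONITOR_OUTPUT)
for prefix, deleted_at, readded_at, gap in flaps:
    print(f"{prefix}: deleted {deleted_at.time()}, re-added {readded_at.time()}, gap {gap * 1e6:.0f} us")
```

The sketch pairs events by prefix only, so it would also count a re-add via a different next hop as a flap; that is an assumption of the sketch, not something observed here.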

Here is the BIRD6 log on R, from when it restarted, until it reached a
steady state.  R's peering to M is "Mesh_2001_20__1".

2020-02-29T12:22:14.214961381Z bird: device1: Initializing
2020-02-29T12:22:14.215018144Z bird: direct1: Initializing
2020-02-29T12:22:14.215035765Z bird: Mesh_2001_20__8: Initializing
2020-02-29T12:22:14.215047665Z bird: Mesh_2001_20__1: Initializing
2020-02-29T12:22:14.215057649Z bird: Mesh_2001_20__3: Initializing
2020-02-29T12:22:14.215144477Z bird: device1: Starting
2020-02-29T12:22:14.215369797Z bird: device1: Connected to table master
2020-02-29T12:22:14.215402544Z bird: device1: State changed to feed
2020-02-29T12:22:14.215431056Z bird: direct1: Starting
2020-02-29T12:22:14.215439899Z bird: direct1: Connected to table master
2020-02-29T12:22:14.215447877Z bird: direct1: State changed to feed
2020-02-29T12:22:14.215456544Z bird: Mesh_2001_20__8: Starting
2020-02-29T12:22:14.215523493Z bird: Mesh_2001_20__3: Starting
2020-02-29T12:22:14.215716006Z bird: Graceful restart started
2020-02-29T12:22:14.215742864Z bird: Started
2020-02-29T12:22:14.215757668Z bird: direct1: State changed to up
2020-02-29T12:22:14.215770554Z bird: device1: State changed to up
2020-02-29T12:22:14.948167365Z bird: Mesh_2001_20__3: Connected to table master
2020-02-29T12:22:14.948234648Z bird: Mesh_2001_20__3: State changed to wait
2020-02-29T12:22:15.951638881Z bird: Mesh_2001_20__8: Connected to table master
2020-02-29T12:22:15.951703218Z bird: Mesh_2001_20__8: State changed to wait
2020-02-29T12:22:16.953066987Z bird: Mesh_2001_20__1: Connected to table master
2020-02-29T12:22:16.953132676Z bird: Mesh_2001_20__1: State changed to wait
2020-02-29T12:22:16.953154658Z bird: Graceful restart done
2020-02-29T12:22:16.95316785Z bird: Mesh_2001_20__8: State changed to feed
2020-02-29T12:22:16.953180658Z bird: Mesh_2001_20__1: State changed to feed
2020-02-29T12:22:16.953194942Z bird: Mesh_2001_20__3: State changed to feed
2020-02-29T12:22:17.953827137Z bird: Mesh_2001_20__8: State changed to up
2020-02-29T12:22:17.953880171Z bird: Mesh_2001_20__1: State changed to up
2020-02-29T12:22:17.953894204Z bird: Mesh_2001_20__3: State changed to up

(There could be some errors here, because I've manually separated logs from
both BIRD and BIRD6 that were going into the same file, and they both log
with prefix "bird:".)
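
Lining R's log up against the monitor output on M, the delete appears to land within a fraction of a millisecond after Mesh_2001_20__1 changes from feed to up, and about one second after "Graceful restart done". The following is a minimal sketch of that comparison. It assumes the clocks on R and M are synchronized and that the monitor timestamps are in UTC, neither of which is confirmed here; parse_ts and offsets_from_flap are illustrative names.

```python
from datetime import datetime

# State changes for R's peering to M, copied from R's log above.
R_LOG = """\
2020-02-29T12:22:16.953132676Z bird: Mesh_2001_20__1: State changed to wait
2020-02-29T12:22:16.953154658Z bird: Graceful restart done
2020-02-29T12:22:16.953180658Z bird: Mesh_2001_20__1: State changed to feed
2020-02-29T12:22:17.953880171Z bird: Mesh_2001_20__1: State changed to up
"""

# Time of the "Deleted" event reported by "ip -ts monitor route" on M
# (assumed to be UTC and comparable with R's clock).
FLAP_AT = datetime(2020, 2, 29, 12, 22, 17, 954263)


def parse_ts(stamp):
    """Parse a UTC stamp with up to 9 fractional digits, truncating to microseconds."""
    whole, _, frac = stamp.rstrip("Z").partition(".")
    micros = int(frac[:6].ljust(6, "0")) if frac else 0
    return datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(microsecond=micros)


def offsets_from_flap(log_text, flap_at):
    """Return (message, seconds_before_flap) for each log line."""
    rows = []
    for line in log_text.splitlines():
        stamp, _, message = line.partition(" bird: ")
        rows.append((message, (flap_at - parse_ts(stamp)).total_seconds()))
    return rows


rows = offsets_from_flap(R_LOG, FLAP_AT)
for message, delta in rows:
    print(f"{delta:.6f}s before flap  {message}")
```

This only shows a timing correlation between the state change and the flap; it does not by itself establish the cause.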

And here is the BIRD6 log on M, from when R was killed until M reached a
steady state following R's restart.  M's peering to R is "Mesh_2001_20__2".

2020-02-29T12:22:08.943087384Z bird: Mesh_2001_20__2: State changed to start
2020-02-29T12:22:16.952861163Z bird: Mesh_2001_20__2: State changed to feed
2020-02-29T12:22:16.952950901Z bird: Mesh_2001_20__2: State changed to up

Can anyone help explain why I get the IPv6 route flap on node M, and whether
there is a way to eliminate it?  The peering config on R is:

# Template for all BGP clients
template bgp bgp_template {
  debug { states };
  description "Connection to BGP peer";
  local as 64512;
  multihop;
  gateway recursive;
  import all;
  export filter calico_export_to_bgp_peers;
  source address 2001:20::2;
  add paths on;
  graceful restart;
  connect delay time 2;
  connect retry time 5;
  error wait time 5,30;
}
protocol bgp Mesh_2001_20__1 from bgp_template {
  neighbor 2001:20::1 as 64512;
}

and the same on M but with ::1 and ::2 swapped, and __2 instead of __1.

By the way, my setup also has exactly parallel IPv4 config and routes, and
I reliably do _not_ see a similar flap for the corresponding IPv4 route.

Many thanks,
    Neil
