List:       bird-users
Subject:    Re: netlink filtering to avoid costly FNHE table dumps on Linux
From:       Tomas Hlavacek <tmshlvck () gmail ! com>
Date:       2022-01-09 0:43:36
Message-ID: CAEB7QLChwfG_S7FQzFxkxN2dtaSxu_oB3Am6Y4X9yMEOGewMwg () mail ! gmail ! com

Hi Ondrej, all,

On Sat, Jan 8, 2022 at 5:56 AM Ondrej Zajicek <santiago@crfreenet.org> wrote:
> > I believe that many different types of Linux tunnels create the PMTU
> > records for all packets transmitted over the tunnel as well. And it
> > works like that for a long time - the code that creates the route
> > cache (at that time, now it is FNHE table) records has been introduced
> > in Linux 3.10 (https://elixir.bootlin.com/linux/v3.10/source/net/ipv4/ip_tunnel.c#L591).
>
> If i understand it correctly, these PMTU records can also be a result of
> regular TCP communication from/to the router even if there are no tunnels?

Yes, but in most cases the kernel should not create that many PMTU
records. Even with the 600 s expiration I would expect several thousand,
or a few hundred thousand at most. I still do not fully understand why
I saw over 130M PMTU records received by BIRD in one scan. Either
there is some multiplication within the dump or something was very
wrong. Anyway, I am going to analyze the kernel part in more detail
and I will raise this on LKML.

> > Regardless of what may or may not happen on the kernel side I think
> > that implementing the netlink filter in BIRD to avoid the described
> > situation makes sense. I am almost certain that my experimental fix
> > breaks other things (most likely OSPF) but I would be glad to help
> > make it right.
>
> How could OSPF be affected by filters on netlink socket?

My experimental patch actually broke kif_do_scan(). It turned out that
some (all?) link records went missing once the NETLINK_GET_STRICT_CHK
sockopt was set. I guess that breaks the device protocol, which in turn
breaks OSPF. In any case, OSPF did not start on the GRE interface (it
didn't send or receive any messages) until I fixed kif_do_scan(). I
think there is an easy way out that needs no larger changes: we can
enable NETLINK_GET_STRICT_CHK only for krt_do_scan(). I'll send a new
RFC patch shortly.
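
To illustrate the idea (this is only a rough sketch, not the patch and
not BIRD code), the route scan could use its own netlink socket with
strict checking enabled, while the link/address scan keeps a socket
without it. The helper names open_rtnl_socket() and
build_route_dump_req() are made up for this example. It also assumes a
kernel that knows NETLINK_GET_STRICT_CHK (4.20 or newer, as far as I
know) and that, in strict mode, a route dump whose rtm_flags lacks
RTM_F_CLONED leaves out the cached exception entries.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Fallbacks for older userspace headers; values match the kernel UAPI. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/* Hypothetical request layout: netlink header followed by struct rtmsg. */
struct route_dump_req {
    struct nlmsghdr nh;
    struct rtmsg rtm;
};

/* Hypothetical helper: open an rtnetlink socket, optionally with strict
 * checking of dump requests. Returns the fd, or -1 on failure. */
static int open_rtnl_socket(int strict)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    if (strict) {
        int one = 1;
        if (setsockopt(fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
                       &one, sizeof(one)) < 0) {
            close(fd);          /* e.g. kernel too old for this option */
            return -1;
        }
    }
    return fd;
}

/* Hypothetical helper: fill in an RTM_GETROUTE dump request. rtm_flags
 * stays 0, so RTM_F_CLONED is not set and (by the assumption above) a
 * strict-mode socket should not return cached exception entries. */
static void build_route_dump_req(struct route_dump_req *req,
                                 unsigned char family, unsigned int seq)
{
    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    req->nh.nlmsg_type = RTM_GETROUTE;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req->nh.nlmsg_seq = seq;
    req->rtm.rtm_family = family;
}
```

The route scan would then call open_rtnl_socket(1), fill a request with
build_route_dump_req(&req, AF_INET, seq) and send() it on that socket,
while the link scan would keep using a socket from open_rtnl_socket(0).
Keeping the two sockets separate means the strict header validation
cannot affect the link dump.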

Best regards,
Tomas