List:       freebsd-net
Subject:    Re: lagg Interfaces - don't do Gratuitous ARP?
From:       Steven Hartland <killing () multiplay ! co ! uk>
Date:       2016-09-22 19:35:50
Message-ID: c8ce344f-79cf-9f0c-72a4-6b9a2dfb1a0d () multiplay ! co ! uk

On 22/09/2016 19:03, Gleb Smirnoff wrote:
> On Thu, Sep 22, 2016 at 05:50:09PM +0100, Steven Hartland wrote:
> S> > S> We could, but then what happens when it's IPv6 or $other protocol that
> S> > S> needs to know? That would require lagg to be edited with all the special
> S> > S> cases instead of allowing the protocol to handle it the way it needs.
> S> >
> S> > You just said that "without GARP devices can and do ignore", didn't you?
> S> > Let's take this as truth, although I doubt it. So, if this is the truth, that
> S> > means that if you are running IPv6 only, the switches won't reconfigure
> S> > themselves due to lack of gratuitous ARP.
> S> Not sure I follow you; gratuitous ARP is required for IPv4 to work, for
> S> IPv6 you need an unsolicited neighbour advertisement.
> S> > Other protocols, of which PPPoE is a good example, simply don't have any
> S> > analogs of ARP or ND. So what would your switches do in that case? And
> S> > what other layers are you going to hack, if you are going to run PPPoE
> S> > service with lagg failover?
> S> Good question, surely that's a good reason to have each protocol handle
> S> it and not to teach LAGG about every possible protocol?
>
> No. It is not a good reason to have each protocol handle it. It is a
> demonstration that this must be handled by a lower protocol layer - the L2,
> which is the level where problem exists.
>
> S> > In reality, a layer 2 device must forward layer 2 traffic, and must
> S> > reconfigure its forwarding table based on source addresses seen on ports.
> S> > And that's what all devices I've seen do. So what if we actually try
> S> > the approach I suggested? I can write the patch for you if you want.
> S> The main problem with LAGG in failover mode is ensuring the traffic is
> S> sent to the correct port.
> S>
> S> When you have the scenario where a switch stack believes MAC XYZ is
> S> accessible by port ABC then unless you tell it otherwise it will
> S> continue to believe that and hence send traffic to said port. I'm sure
> S> we'll agree that the standard for doing this for IPv4 is ARP and for
> S> IPv6 is NA.
>
> No, we don't agree on that. I assert that ARP is the standard for mapping an
> IPv4 address to a physical address, not to a port. Same for NA. The de-facto
> standard for a switch to believe that MAC XYZ is accessible by port ABC
> is looking at the source address of any packet on a port.
>
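The source-address learning described above can be illustrated with a minimal sketch. This is a toy model of a learning switch, not any vendor's implementation; the `LearningSwitch` class and its method names are hypothetical, and real switches add ageing timers, VLANs and stacking behaviour that this ignores.

```python
class LearningSwitch:
    """Toy model of L2 source-address learning (hypothetical, for illustration)."""

    def __init__(self):
        # Maps a source MAC address to the port it was last seen on.
        self.table = {}

    def receive(self, port, src_mac, dst_mac):
        """Learn src_mac on the ingress port, then return the egress port.

        Returns None when the destination is unknown (the frame would be flooded).
        """
        self.table[src_mac] = port
        return self.table.get(dst_mac)


sw = LearningSwitch()
sw.receive(1, "mac-host", "mac-peer")   # host first seen on port 1
sw.receive(2, "mac-peer", "mac-host")   # peer's reply is forwarded to port 1
sw.receive(3, "mac-host", "mac-peer")   # after failover the host sends from port 3
```

In this model any frame from the new port moves the table entry, whatever its payload; the disagreement in the thread is over whether real switch stacks behave this simply.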
> S> When using LAGG and we lose the master port we need to correct the
> S> connected devices' view (both direct and remote) of the world such that
> S> traffic is now sent to a different physical port.
> S>
> S> Back in the day, when switches weren't so "smart", sending a correctly
> S> addressed packet from the new port would potentially help, but with
> S> smarter switches and stacking in the mix sticking to the "standards"
> S> helps maintain compatibility and hence functionality with things like LAGG.
> S>
> S> Having tested with a number of vendors' switches (Cisco, Extreme and, more
> S> recently, Arista), only sending gratuitous ARP for IPv4 and unsolicited NA
> S> for IPv6 reliably resulted in rapid failover between LAGG ports.
> S>
> S> Other methods like sending correctly addressed output from the new port
> S> helped, we tested this with outbound pings from IPMI, but still resulted
> S> in noticeable recovery delay.
>
> This means that switches are "smart" and are violating standards. If you want
> to create a hack to deal with that, better keep this hack inside the module
> that is affected by "smart" switches, in the lagg driver. And not plow through
> all levels of network stack to satisfy demands of standard violators.
>
> So, please send a self-made gratuitous ARP packet right from lagg(4). If the
> switches work as you describe, that would work regardless of the actual
> IPv4/IPv6/whatever configuration.
>
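For reference, a gratuitous ARP frame of the kind discussed here can be sketched in a few lines. This is an illustration of the wire format only (an ARP request whose sender and target IPv4 addresses are both the host's own, sent to the Ethernet broadcast address), not the lagg(4) code; the function name `build_gratuitous_arp` and the example MAC and IP values are assumptions.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV4 = 0x0800


def build_gratuitous_arp(mac: bytes, ipv4: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request (hypothetical helper)."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip = socket.inet_aton(ipv4)  # 4-byte network-order address

    # Ethernet header: broadcast destination, our MAC as source, ARP ethertype.
    eth = b"\xff" * 6 + mac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: hardware type 1 (Ethernet), protocol IPv4,
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, ETHERTYPE_IPV4, 6, 4, 1)
    # Sender MAC/IP, zeroed target MAC, target IP equal to sender IP.
    arp += mac + ip + b"\x00" * 6 + ip
    return eth + arp


frame = build_gratuitous_arp(bytes.fromhex("020000000001"), "192.0.2.10")
```

The frame carries the host's MAC as the Ethernet source address, so it also serves as an ordinary source-address update for a switch that only learns from source addresses.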
> S> > S> Overall, while the proposed change (https://reviews.freebsd.org/D4111)
> S> > S> does involve changes to multiple layers it still feels like the right
> S> > S> approach as it has the right layer dealing with the change instead of
> S> > S> hard-coded assumptions.
> S> >
> S> > Sorry, it doesn't feel like the right approach. :(
> S> Out of interest why has your opinion changed since your post here:
> S> https://lists.freebsd.org/pipermail/freebsd-net/2012-February/031340.html ?
>
> I'm sorry, I didn't look at D4111, expecting that it is exactly the patch
> that was backed out. I will review D4111.
>
They are similar in approach, but D4111 incorporates additional feedback.

Essentially it still follows your suggestion from 2012 which was:
 > 1) Network protocols should register themselves on the ifnet_link_event
 >   EVENTHANDLER(9).
 > 2) The inet4 should send gratuitous ARP on this event.
 > 3) The inet6 should send NA.

Hence my confusion ;-)
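
The shape of that three-step suggestion can be modelled roughly as follows. This is a plain Python sketch of the registration pattern, not the kernel EVENTHANDLER(9) API or the code in D4111; `EventRegistry`, the handler names and the returned strings are all hypothetical.

```python
from collections import defaultdict


class EventRegistry:
    """Toy stand-in for an event-handler list (hypothetical, not EVENTHANDLER(9))."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event, handler):
        # Each protocol registers its own handler for the event.
        self._handlers[event].append(handler)

    def invoke(self, event, *args):
        # Call every registered handler in registration order.
        return [handler(*args) for handler in self._handlers[event]]


def inet4_link_event(ifname, link_up):
    # Step 2: IPv4 reacts to the link event by announcing itself via ARP.
    return f"gratuitous ARP on {ifname}" if link_up else None


def inet6_link_event(ifname, link_up):
    # Step 3: IPv6 reacts with an unsolicited neighbour advertisement.
    return f"unsolicited NA on {ifname}" if link_up else None


registry = EventRegistry()
registry.register("ifnet_link_event", inet4_link_event)  # step 1
registry.register("ifnet_link_event", inet6_link_event)  # step 1
actions = registry.invoke("ifnet_link_event", "lagg0", True)
```

The point of the pattern is that the code raising the link event needs no knowledge of which protocols are listening.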

     Regards
     Steve