Neutron with linuxbridge - mac address timeout

asked 2020-07-15 06:16:08 -0600

We stumbled upon some odd behaviour in the Linux bridge agent of OpenStack. Everything works according to specification, but it should be tunable somehow, and I would like to know how to set it up properly. We observed the problem with plain Linux bridges and provider networks (which should be the simplest setup).

Problem: the gateway's MAC address ages out of the bridge's MAC address table (on the Linux hypervisor, but it could be any other box) because too few frames are seen with the gateway's virtual MAC as their source. Packets arriving from other subnets carry the gateway's hardware MAC as the source, so the only frames in which the virtual MAC can be observed are ARP requests or replies from the gateway. ARP initiated by the Linux virtual machines is the only thing that refreshes the MAC entry, but the virtual machines never issue an ARP request while the gateway is in use. The gateway is set to send ARP requests only for 'existing' IPs, which lowers the number of packets seen even further, and the requests it sends at 75% of the timeout value are unicast rather than broadcast. As a result, the bridge floods frames to all interfaces, which increases the Rx packet count on the virtual machines. The problem occurs less often on hypervisors with a higher number of virtual machines and is less of a problem on hypervisors with fewer virtual machines.
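
To make the mechanism concrete, here is a minimal toy model of the ageing behaviour described above. It is only a sketch, not how the kernel bridge is implemented: the class `BridgeFdb`, the function `count_flooded`, the port name and the MAC value are all made up for illustration, and the 300-second ageing time is an assumption about the bridge configuration.

```python
from dataclasses import dataclass, field


@dataclass
class BridgeFdb:
    """Toy model of a learning bridge's MAC table (hypothetical, for illustration)."""
    ageing_time: float = 300.0  # seconds; assumed value, check your own bridge
    entries: dict = field(default_factory=dict)  # MAC -> (port, last_seen)

    def learn(self, src_mac, port, now):
        # A frame with this source MAC was seen on `port`: refresh the entry.
        self.entries[src_mac] = (port, now)

    def lookup(self, dst_mac, now):
        # Return the port for dst_mac, or None if unknown or aged out.
        entry = self.entries.get(dst_mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.ageing_time:
            del self.entries[dst_mac]
            return None
        return port


def count_flooded(fdb, gateway_mac, uplink_port, arp_times, tx_times):
    """Count VM-to-gateway frames the bridge would flood to all ports.

    arp_times: moments when a frame sourced from the gateway's virtual MAC is seen.
    tx_times:  moments when a VM sends a frame addressed to the virtual MAC.
    """
    # On a tie, process the ARP (kind 0) before the VM frame (kind 1).
    events = sorted([(t, 0) for t in arp_times] + [(t, 1) for t in tx_times])
    flooded = 0
    for t, kind in events:
        if kind == 0:
            fdb.learn(gateway_mac, uplink_port, t)
        elif fdb.lookup(gateway_mac, t) is None:
            flooded += 1
    return flooded


VIRTUAL_MAC = "00:00:5e:00:01:01"  # example value only
vm_traffic = range(60, 901, 60)    # one frame to the gateway every 60 s

# The virtual MAC is seen once and never again: frames after 300 s are flooded.
rare_arp = count_flooded(BridgeFdb(), VIRTUAL_MAC, "uplink", [0], vm_traffic)

# The virtual MAC is seen every 240 s: the entry never ages out.
frequent_arp = count_flooded(
    BridgeFdb(), VIRTUAL_MAC, "uplink", range(0, 901, 240), vm_traffic
)
print(rare_arp, frequent_arp)
```

In this model, flooding stops as soon as frames sourced from the virtual MAC arrive more often than the ageing time. That is why I suspect the fix is either a longer ageing time on the bridge or more frequent ARP traffic from the gateway.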

I would like to know whether there is a systematic way to prevent this from happening in the future, such as a configuration option for either Neutron or the hypervisor, and whether this behaviour would also affect OvS/OVN deployments.

Disclaimer: I understand that this topic is not OpenStack-specific. I think that the problem is interesting enough that we should discuss it here.
