Floating IP interface (fg) in FIP namespace duplicates packets

Hi guys, I haven't been able to get my head around this for a few days now. I am using Open vSwitch networking with the ML2 plugin; ARP responder and L2population are on, and the setup is DVR. I should probably also mention the kernel version (3.19.0-42) and the network interface kernel module (i40e), because at this point I have no clue what's going on.

The symptoms were unreachable floating IP addresses of the instances and intermittent outside connectivity from inside the instances. The first round of debugging led to an interesting result: the bridge I use to connect to the external network (br-ex) had learned the fg- interface's MAC address on the wrong side, that is, on the port connecting to the outside world instead of the phy-br-ex port connected to the integration bridge (br-int):

(dev) root@computenode:~$ ovs-ofctl show br-ex
OFPT_FEATURES_REPLY (xid=0x2): dpid:00001c40242b758a
n_tables:254, n_buffers:256
 2(vlan420): addr:1c:40:24:2b:75:8a
     config:     0
     state:      0
     current:    10GB-FD
     speed: 10000 Mbps now, 0 Mbps max
 5(phy-br-ex): addr:7a:04:1f:e5:f2:90
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-ex): addr:1c:40:24:2b:75:8a
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

(dev) root@computenode:~$ ovs-appctl fdb/show br-ex
 port  VLAN  MAC                Age
    2     0  fa:16:3e:33:9b:4b    2
    2     0  00:08:e3:ff:fd:90    2

(dev) root@computenode:~$ ip netns

(dev) root@computenode:~$ ip netns exec fip-8b87c295-c6f0-46d4-b6b1-a13b6f50a1fa ifconfig
fg-bd6cc674-ab Link encap:Ethernet  HWaddr fa:16:3e:33:9b:4b
          inet addr:  Bcast:  Mask:
          inet6 addr: fe80::f816:3eff:fe33:9b4b/64 Scope:Link
          RX packets:7885 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1604 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:524844 (524.8 KB)  TX bytes:138281 (138.2 KB)

Note: vlan420 is the interface with access to the external network.

As you can see, the fg port's MAC (fa:16:3e:33:9b:4b) is on FDB port 2 of br-ex as opposed to the expected port 5. Thus, no packet destined for the floating IP makes it past this point: its destination MAC appears to live on the port the packet came in on, so the bridge drops it.
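To make that check repeatable, here is a minimal sketch that extracts the learned port for a given MAC from the fdb/show output. The function name fdb_port_for_mac is my own (hypothetical), and it assumes the four-column layout shown above (port, VLAN, MAC, Age).

```shell
#!/bin/sh
# Print the OVS port number on which a MAC was learned, given the text
# output of `ovs-appctl fdb/show <bridge>` on stdin.
# fdb_port_for_mac is a hypothetical helper, not part of Open vSwitch.
fdb_port_for_mac() {
    mac=$1
    # Skip the header line; the columns are: port, VLAN, MAC, Age.
    awk -v mac="$mac" 'NR > 1 && tolower($3) == tolower(mac) { print $1 }'
}

# Usage on a live node (assumes the bridge is called br-ex):
#   ovs-appctl fdb/show br-ex | fdb_port_for_mac fa:16:3e:33:9b:4b
```

Fed the table above, it prints 2, where 5 (phy-br-ex) would be the healthy answer.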

So I speculated further: something must have looped back a packet going out from the FIP namespace through br-ex and vlan420 and forwarded it back to br-ex, and the poor bridge learned that fg's MAC is on the other side. I started shutting down all redundant networking and the other compute nodes until there was nothing but a switch, one patch cable and the server. The issue persisted.

Sending an ARP or ICMP packet from inside the FIP namespace to the outside world illustrates the issue. There are always two (almost) identical packets on the external network interface, but one is TX'ed and the other is RX'ed. How on Earth that is possible, I don't know.

(dev) root@computenode:~$ ip netns exec fip-8b87c295-c6f0-46d4-b6b1-a13b6f50a1fa arping -A -I fg-bd6cc674-ab -c 1 -w 1
ARPING from fg-bd6cc674-ab
Sent 1 probes (1 broadcast(s))

.. run in parallel:
(dev) root@computenode:~$ tcpdump -i vlan420 -e -n arp and host
02:59:00.603955 fa:16:3e:33:9b:4b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Reply is-at fa:16:3e:33:9b:4b, length 28
02:59:00.603976 fa:16:3e:33:9b:4b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply is-at fa:16:3e:33:9b:4b, length 46

.. the return packet immediately hits br-ex from the other side
(dev) root@computenode:~$ ovs-appctl fdb/show br-ex
 port  VLAN  MAC                Age
    2     0  00:08:e3:ff:fd:90    5
    2     0  fa:16:3e:33:9b:4b    2

See the 21-microsecond difference and the different frame sizes (42 vs. 60 bytes)? That makes me think the packet was looped by some internal mechanism locally but still passed through the link layer (since the second copy is padded to the minimum Ethernet frame size). But maybe I'm wrong, and I will be grateful for any advice.
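For anyone who wants to compare the two copies in their own capture, a rough sketch of the arithmetic follows. The helper name capture_delta is hypothetical, and it assumes `tcpdump -e -n` output in the same format as above, with the two frames on the first two lines.

```shell
#!/bin/sh
# Compare the first two frames of a `tcpdump -e -n` capture read from stdin:
# print the time gap in microseconds and the frame-length difference in bytes.
# capture_delta is a hypothetical helper name.
capture_delta() {
    awk '
    {
        # Field 1 is the timestamp in HH:MM:SS.uuuuuu form.
        split($1, t, ":")
        ts[NR] = t[1] * 3600 + t[2] * 60 + t[3]
        # The link-level frame length is the first "length N:" pair.
        for (i = 1; i < NF; i++)
            if ($i == "length" && $(i + 1) ~ /:$/) { len[NR] = $(i + 1) + 0; break }
    }
    NR == 2 {
        printf "%.0f us, %d bytes\n", (ts[2] - ts[1]) * 1000000, len[2] - len[1]
        exit
    }'
}

# Usage (capture saved to a file beforehand):
#   capture_delta < arp-capture.txt
```

For the two lines above this gives "21 us, 18 bytes". The 18 extra bytes are consistent with a 42-byte ARP frame being padded to 60 bytes, the minimum Ethernet frame size without the FCS.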

Blocking incoming packets with fg's source MAC in br-ex's flows gets the job done, but the fact that all traffic from the VMs to the outside world gets duplicated for the rest of time drives me mad. Thank you for any help.
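For reference, the workaround looks roughly like the sketch below. The helper drop_looped_src is a hypothetical name, and priority=10 is an assumption: it only works if it is higher than the forwarding flows already installed on br-ex, so check `ovs-ofctl dump-flows br-ex` first.

```shell
#!/bin/sh
# Build an OpenFlow rule that drops frames arriving on a given port
# with a given source MAC. drop_looped_src is a hypothetical helper.
drop_looped_src() {
    in_port=$1
    src_mac=$2
    # priority=10 is an assumed value; it must outrank the existing
    # forwarding flows on the bridge for the drop to take effect.
    printf 'priority=10,in_port=%s,dl_src=%s,actions=drop\n' "$in_port" "$src_mac"
}

# To apply it (port 2 = vlan420, MAC of the fg- interface):
#   ovs-ofctl add-flow br-ex "$(drop_looped_src 2 fa:16:3e:33:9b:4b)"
```

This only keeps the looped copy from poisoning the FDB on br-ex; the duplicate still shows up on vlan420.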