
ESX neutron bridging and ethernet padding issue

asked 2016-05-26 04:45:52 -0500

sherv

updated 2016-05-27 12:54:35 -0500

darragh-oreilly

I have a Mitaka lab setup. One of the nodes is the Neutron network node, running Ubuntu 14.04 LTS as a VMware ESXi guest. It has two network interfaces: eth0 for external connectivity and eth1 for internal management and VXLAN termination. Promiscuous mode is enabled on the VMware vSwitch for eth0. I'm using the classic approach with the Linux bridge agent.

There's a virtual router (r1) on this network node, provisioned with the standard neutron utility. It is connected to a project network and the external network, and has a public IP address from the external subnet. Its connections are handled by a bridge that consists of two interfaces: eth0 and a tap interface.

When I try to send traffic from r1 to any external IP, it first sends an ARP who-has for the gateway MAC. This ARP packet is switched by the bridge, and the bridge updates its MAC table so that r1's MAC address is associated with the tap interface. But this ARP packet is only 48 bytes long, so it has to be padded to 64 bytes (60 without the CRC) to meet the Ethernet minimum frame size. The issue is that the padding happens somewhere in the system in such a way that a new 60-byte packet gets reinserted into the host's network stack, and the bridge sees this new 60-byte packet arriving on its eth0 interface. The bridge then updates its MAC table so that r1's MAC address is now associated with eth0.

Everything that follows is normal operation. The ARP who-has is sent to the network, and the ARP is-at reply comes back with r1's MAC as the destination. The bridge does a lookup, finds that MAC on the incoming interface (eth0), and drops the packet according to the split-horizon rule.
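To make the sequence above concrete, here is a toy Python model of a learning bridge. It is only a sketch of the behavior described, not the Linux bridge code: the port names, the MAC addresses, and the `LearningBridge`, `pad_frame`, and `simulate` names are made up for illustration.

```python
ETH_MIN_NO_FCS = 60  # minimum Ethernet frame length without the 4-byte FCS


def pad_frame(frame: bytes) -> bytes:
    """Zero-pad a frame up to the 60-byte minimum (FCS excluded)."""
    return frame + b"\x00" * max(0, ETH_MIN_NO_FCS - len(frame))


class LearningBridge:
    """Toy learning bridge: a MAC table plus the forwarding decision."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # source MAC -> port it was last seen on

    def receive(self, port, src, dst):
        """Learn src on the ingress port, then return the egress ports."""
        self.fdb[src] = port
        out = self.fdb.get(dst)
        if out is None:
            # Unknown or broadcast destination: flood to every other port.
            return [p for p in self.ports if p != port]
        if out == port:
            # Destination was learned on the ingress port: drop the frame.
            return []
        return [out]


# Hypothetical addresses, for illustration only.
R1 = "fa:16:3e:00:00:01"
GW = "00:50:56:00:00:01"
BCAST = "ff:ff:ff:ff:ff:ff"


def simulate(echo_on_eth0):
    """Return the egress ports chosen for the gateway's ARP reply to r1."""
    br = LearningBridge(["tap0", "eth0"])
    br.receive("tap0", R1, BCAST)      # ARP who-has from r1
    if echo_on_eth0:
        br.receive("eth0", R1, BCAST)  # padded copy seen again on eth0
    return br.receive("eth0", GW, R1)  # ARP is-at from the gateway


print(len(pad_frame(bytes(48))))  # 60
print(simulate(False))            # ['tap0']
print(simulate(True))             # []
```

Without the echoed copy the reply is forwarded to the tap port. With it, r1's MAC has moved to eth0 in the table, so the reply arriving on eth0 is dropped.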

Any ideas how to fix this behavior? The packet MUST be padded, but it MUST NOT be reinserted into the network stack in such a way that the bridge sees it again on eth0. Does the fact that the network node runs as a VMware guest have something to do with it? Maybe it's the behavior of the VMware e1000 driver?


Comments

can you try a different vmware nic type - like a paravirtualized one?

darragh-oreilly ( 2016-05-26 14:54:17 -0500 )

I'm not sure I fully understand you (virtual machines are not my area of expertise). By paravirtualization, do you mean trying VMXNET3? If so, it's planned for tomorrow. By the way, I tried setting up a neutron node on a physical machine, and it works perfectly fine. Looks like it's the E1000.

sherv ( 2016-05-26 17:34:30 -0500 )

yes - try VMXNET3

darragh-oreilly ( 2016-05-27 01:22:10 -0500 )

1 answer


answered 2016-05-27 10:01:22 -0500

sherv

VMXNET3 didn't help. I tried this without any OpenStack packages: wiped the guest clean, reinstalled Ubuntu, created a bridge and a namespace, and connected them with a veth interface. I still had the same issue. I couldn't reproduce this on ESXi 6.0 or VMware Workstation 12, but it partially reproduces on another ESXi 5.1 host. It's seen only when promiscuous mode is enabled.
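The exact commands used for that OpenStack-free test are not given here. As a rough sketch, a setup like it could be built with standard `ip` and `brctl` commands. The Python below only prints such a plan and runs nothing. The interface, bridge, and namespace names and the 192.0.2.10/24 address are placeholders, and `repro_commands` is a hypothetical helper.

```python
import shlex


def repro_commands(uplink="eth0", bridge="br-test", ns="r1-test",
                   host_if="veth-host", ns_if="veth-ns",
                   ns_addr="192.0.2.10/24"):
    """Return argv lists for a bridge + namespace + veth test setup.

    All names and the address are placeholders. The commands need root
    and are only listed here, never executed.
    """
    cmds = [
        f"ip link add name {bridge} type bridge",
        f"ip netns add {ns}",
        f"ip link add {host_if} type veth peer name {ns_if}",
        f"ip link set {ns_if} netns {ns}",
        f"ip link set {uplink} master {bridge}",
        f"ip link set {host_if} master {bridge}",
        f"ip link set {bridge} up",
        f"ip link set {uplink} up",
        f"ip link set {host_if} up",
        f"ip netns exec {ns} ip link set {ns_if} up",
        f"ip netns exec {ns} ip addr add {ns_addr} dev {ns_if}",
        # After sending traffic from the namespace, check which port
        # the namespace MAC was learned on.
        f"brctl showmacs {bridge}",
    ]
    return [shlex.split(c) for c in cmds]


for argv in repro_commands():
    print(" ".join(argv))
```

If the namespace's MAC shows up on the uplink port instead of the veth port in the `brctl showmacs` output, that would match the MAC-table flip described in the question.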

Anyway, this is not an OpenStack issue, but some weird combination of VMware networking and Linux bridging. The issue is not seen on a physical server (without hypervisor software); the neutron node works perfectly fine there, so I'll stick with this option for now.
