# neutron/ovs intermittent network failures

We're having some very strange networking issues. After a fresh install of OpenStack (using Fuel), everything works fine for a couple of weeks. After that, new instances randomly fail to get an IP address, and some time later no instance has any connectivity at all. Rebooting the servers sometimes helps and sometimes doesn't (as I write this, I've just rebooted into a "working" state).

Using tcpdump, I can see the DHCP packet arrive on the VM's "tap" interface on the compute node, but it stops there and never reaches the physical interface. I'm guessing the problem lies somewhere in between, but I have no idea how or why.

I've been comparing the output of the OVS commands and the logs between a working and a non-working case, but I can't see any difference.
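
One thing that makes such a comparison hard is that `ovs-ofctl dump-flows` output contains counters and timers (`duration`, `n_packets`, `n_bytes`, `idle_age`, `hard_age`) that differ on every run. Below is a minimal sketch of how the two dumps could be normalized before diffing. The sample flow lines and the function names are made up for illustration and are not taken from our nodes.

```python
from __future__ import annotations

import difflib
import re

# Fields that change on every dump and would otherwise drown out real differences.
VOLATILE = re.compile(r"\b(duration|n_packets|n_bytes|idle_age|hard_age)=[^,\s]+,?\s*")


def normalize_flows(dump: str) -> list[str]:
    """Strip volatile fields and the reply header, then sort the flow lines."""
    flows = []
    for line in dump.splitlines():
        line = line.strip()
        if not line or line.startswith(("NXST_FLOW", "OFPST_FLOW")):
            continue
        flows.append(VOLATILE.sub("", line))
    return sorted(flows)


def diff_flows(working: str, broken: str) -> list[str]:
    """Return only the flow lines that were added (+) or removed (-)."""
    delta = difflib.unified_diff(
        normalize_flows(working), normalize_flows(broken),
        "working", "broken", lineterm="", n=0,
    )
    return [
        line for line in delta
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical dumps, e.g. saved from `ovs-ofctl dump-flows br-int` in each state.
WORKING = """NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=12.5s, table=0, n_packets=3, n_bytes=126, idle_age=4, priority=2,in_port=1 actions=drop
 cookie=0x0, duration=12.5s, table=0, n_packets=9, n_bytes=900, idle_age=1, priority=1 actions=NORMAL
"""
BROKEN = """NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=99.1s, table=0, n_packets=70, n_bytes=2940, idle_age=0, priority=2,in_port=1 actions=drop
"""

if __name__ == "__main__":
    for line in diff_flows(WORKING, BROKEN):
        print(line)
```

With the made-up input above, only the flow that is missing in the broken dump is reported. The changed counters on the remaining flow are ignored.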

How can I go about troubleshooting this issue? What might actually prevent the packets from getting past the bridge?

Currently we have a small setup with one controller node and one compute node, and we use VXLAN for tunneling. Example output from `ovs-vsctl show` (not sure what else might be of interest):

    root@node-13:~# ovs-vsctl show
    fb7a1361-742a-4389-bc8c-c05b285a6ff6
        Manager "tcp:192.168.3.3:6640"
            is_connected: true
        Bridge br-ex
            Port "p_aec4d661-0"
                Interface "p_aec4d661-0"
                    type: internal
            Port br-ex
                Interface br-ex
                    type: internal
        Bridge br-int
            Controller "tcp:192.168.3.3:6653"
                is_connected: true
            fail_mode: secure
            Port "tap5bab7f58-46"
                Interface "tap5bab7f58-46"
            Port "vxlan-192.168.16.2"
                Interface "vxlan-192.168.16.2"
                    type: vxlan
                    options: {key=flow, local_ip="192.168.16.3", remote_ip="192.168.16.2"}
            Port "tap184ce9d7-47"
                Interface "tap184ce9d7-47"
            Port "tap87de89b5-22"
                Interface "tap87de89b5-22"
            Port "tap962f2f6d-09"
                Interface "tap962f2f6d-09"
            Port br-int
                Interface br-int
                    type: internal
            Port "tape4f821b8-9b"
                Interface "tape4f821b8-9b"
        ovs_version: "2.5.90"
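
In case it helps with comparing snapshots of this output between the working and non-working states, here is a rough sketch that reduces `ovs-vsctl show` output to a set of port names per bridge. The function name is my own, the parsing only looks at the `Bridge` and `Port` keywords, and the sample is a shortened excerpt of the output above.

```python
from __future__ import annotations


def ports_by_bridge(show_output: str) -> dict[str, set[str]]:
    """Map each bridge name to the set of port names listed under it."""
    bridges: dict[str, set[str]] = {}
    current = None
    for raw in show_output.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0], parts[1].strip('"')
        if key == "Bridge":
            current = value
            bridges[current] = set()
        elif key == "Port" and current is not None:
            bridges[current].add(value)
    return bridges


# Shortened excerpt of the output shown above.
SAMPLE = '''
Bridge br-ex
    Port "p_aec4d661-0"
        Interface "p_aec4d661-0"
            type: internal
    Port br-ex
        Interface br-ex
            type: internal
Bridge br-int
    Port "tap5bab7f58-46"
        Interface "tap5bab7f58-46"
    Port "vxlan-192.168.16.2"
        Interface "vxlan-192.168.16.2"
            type: vxlan
'''

if __name__ == "__main__":
    for bridge, ports in ports_by_bridge(SAMPLE).items():
        print(bridge, sorted(ports))
```

Taking the set difference of two such results for the same bridge would show any port that is present in one state and missing in the other.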
