# Compute nodes added after network creation have dead OVS tap devices

In my OpenStack Icehouse deployment, when I configure a compute node before creating my flat network and then restart neutron-dhcp-agent on that node, OVS shows a correctly tagged tap device:

    [root@vbit10 ~]# ovs-vsctl show
    d6fa53f6-df49-47f8-ae8c-92e72eacdefc
        Bridge br-vm
            Port phy-br-vm
                Interface phy-br-vm
            Port "eth2"
                Interface "eth2"
            Port br-vm
                Interface br-vm
                    type: internal
        Bridge br-int
            fail_mode: secure
            Port int-br-vm
                Interface int-br-vm
            Port br-int
                Interface br-int
                    type: internal
            Port "tapf0048ae4-6f"
                tag: 1
                Interface "tapf0048ae4-6f"
                    type: internal
        ovs_version: "2.1.3"


If I then configure another compute node in exactly the same way after the network has been created, its OVS tap device is tagged 4095, the dead VLAN tag:

    [root@vbit11 ~]# ovs-vsctl show
    4f11a547-421e-49c3-ba81-9c403cab0955
        Bridge br-int
            fail_mode: secure
            Port int-br-vm
                Interface int-br-vm
            Port "tapda35c485-be"
                tag: 4095
                Interface "tapda35c485-be"
                    type: internal
            Port br-int
                Interface br-int
                    type: internal
        Bridge br-vm
            Port br-vm
                Interface br-vm
                    type: internal
            Port "eth2"
                Interface "eth2"
            Port phy-br-vm
                Interface phy-br-vm
        ovs_version: "2.1.3"
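
For anyone who wants to check for this condition from a script (for example, as a verification step in a playbook), here is a minimal sketch that scans `ovs-vsctl show` output for ports tagged 4095. The function name `find_dead_ports` is made up for illustration, and the parsing assumes the output layout shown above, where each `tag:` line follows the `Port` line it belongs to.

```shell
#!/usr/bin/env bash
# Print the name of every OVS port that carries the dead VLAN tag (4095).
# Reads `ovs-vsctl show` output on stdin, so it can also be run on saved text.
# find_dead_ports is a hypothetical helper name, not part of Open vSwitch.
find_dead_ports() {
    awk '
        # Remember the most recent port name, stripping optional quotes.
        $1 == "Port" { port = $2; gsub(/"/, "", port) }
        # A tag line belongs to the port seen just before it.
        $1 == "tag:" && $2 == "4095" { print port }
    '
}

# On a live node (needs root and a running Open vSwitch):
#   ovs-vsctl show | find_dead_ports
```

On the broken node above this should print the name of the tap port, and on the healthy node it should print nothing.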


No VM hosted on the broken compute node can be reached over SSH. I can work around the problem by recreating the subnet and network, recreating the OVS bridges, and then restarting neutron-dhcp-agent and neutron-ovs-agent. However, I am writing an Ansible playbook that brings up an instance of OpenStack Icehouse in one shot, and this is the last issue I'm dealing with. I've followed the Icehouse installation guide fairly closely, aside from using a tenant network, ext-network.

I've done some diagnosing and found that port binding is failing for these devices. When neutron-server goes to bind the port, it finds that the segment is "None". It looks up the segment in the ml2_port_bindings table of the neutron database. If it can't find the segment UUID, binding fails and the port is tagged 4095. I looked for other neutron code that writes to the table behind "models.PortBinding" to see what initially populates it, but I couldn't find anything.

Here's the table I'm talking about: http://pastebin.com/P2z59hVp

Any help on this? Thanks!


The problem arose because we had both the DHCP agent and the OVS agent running on the compute node (we don't have a network node). The DHCP agent on the compute node was telling neutron-server on the controller to attempt binding before the OVS agent had a chance to report its status to neutron-server.

The solution?

Start the OVS agent, wait 10 seconds, and then start the DHCP agent. A bug has been open for this for a long time, and it is being pushed off to the M release: https://bugs.launchpad.net/neutron/+bug/1399249
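
As a rough sketch, the ordering can be scripted like this. The service names (`neutron-openvswitch-agent`, `neutron-dhcp-agent`) and the use of the `service` command are assumptions about the distribution's packaging, so adjust them to whatever your nodes actually use. The function name `start_agents_in_order` is just for illustration.

```shell
#!/usr/bin/env bash
# Start the OVS agent, give it time to report in, then start the DHCP agent.
# ASSUMPTIONS: the service names and the `service` command below depend on
# the distribution; every value can be overridden from the environment.
OVS_AGENT="${OVS_AGENT:-neutron-openvswitch-agent}"
DHCP_AGENT="${DHCP_AGENT:-neutron-dhcp-agent}"
WAIT_SECONDS="${WAIT_SECONDS:-10}"
SERVICE_CMD="${SERVICE_CMD:-service}"

start_agents_in_order() {
    # Stop early if the OVS agent fails to start; starting the DHCP agent
    # without it would just reproduce the failed binding.
    "$SERVICE_CMD" "$OVS_AGENT" start || return 1
    # Fixed delay so the OVS agent can report its status to neutron-server.
    sleep "$WAIT_SECONDS"
    "$SERVICE_CMD" "$DHCP_AGENT" start
}

# On the compute node (as root):
#   start_agents_in_order
```

The fixed delay is a workaround rather than a guarantee. A more robust playbook could poll neutron-server until the OVS agent on that host is reported alive before starting the DHCP agent.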
