kolla/ubuntu-source-neutron-openvswitch-agent:pike keeps restarting

asked 2018-08-23 09:42:34 -0500

Timm

Dear Kolla-Users/Developers,

I am new to Kolla and OpenStack, but I could not find my problem answered.

Yesterday I installed kolla-ansible as all-in-one on a single node, as the first step towards a multinode deployment. The installation finished without errors. Today I noticed messages like "could not add network device br-tun to ofproto (Too many open files)" in the output of "ovs-vsctl show". lsof and netstat/ss did not finish in a reasonable time, probably because of the large number of open files. I stopped Docker and Open vSwitch to get lsof and the other tools working again. After restarting OVS and Docker everything came up again, but neutron_openvswitch_agent keeps restarting every few seconds (I don't know whether it did so before). The messages in "ovs-vsctl show" remain.
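Since lsof was too slow to be usable, per-process descriptor counts can also be read directly from /proc. The following is only a minimal sketch, assuming a Linux host and enough privileges (typically root) to read other processes' fd directories; the function names count_open_fds and top_fd_holders are made up for illustration and are not part of Kolla or OVS.

```python
import os


def count_open_fds(proc_root="/proc"):
    """Return {pid: (command_name, open_fd_count)} for every readable process."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        pid_dir = os.path.join(proc_root, entry)
        try:
            # Each entry in <pid>/fd is one open file descriptor.
            n_fds = len(os.listdir(os.path.join(pid_dir, "fd")))
        except OSError:
            continue  # process exited meanwhile, or permission denied
        try:
            with open(os.path.join(pid_dir, "comm")) as fh:
                name = fh.read().strip()
        except OSError:
            name = "?"  # command name not readable
        counts[int(entry)] = (name, n_fds)
    return counts


def top_fd_holders(counts, limit=10):
    """Return the `limit` processes with the most open descriptors, largest first."""
    ranked = sorted(counts.items(), key=lambda item: item[1][1], reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    for pid, (name, n_fds) in top_fd_holders(count_open_fds()):
        print(f"{pid:>8} {name:<24} {n_fds}")
```

Run as root on the host, this prints the ten processes holding the most descriptors, which should show whether the open files belong to an OVS daemon, to Docker, or to something else.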

In /var/log/kolla/neutron/neutron-openvswitch-agent.log on fluentd, I find:

2018-08-23 15:54:21.746 8 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp [-] Agent main thread died of an exception: TimeoutException: Commands [<ovsdbapp.schema.open_vswitch.commands.AddBridgeCommand object at 0x7f83771c2ad0>, <ovsdbapp.schema.open_vswitch.commands.DbAddCommand object at 0x7f83771c2bd0>] exceeded timeout 10 seconds
2018-08-23 15:54:21.746 8 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp Traceback (most recent call last):
2018-08-23 15:54:21.746 8 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_ryuapp.py", line 40, in agent_main_wrapper
...
2018-08-23 15:54:21.752 8 ERROR ovsdbapp.schema.open_vswitch.impl_idl TimeoutException: Commands [<ovsdbapp.schema.open_vswitch.commands.AddBridgeCommand object at 0x7f83771c2ad0>, <ovsdbapp.schema.open_vswitch.commands.DbAddCommand object at 0x7f83771c2bd0>] exceeded timeout 10 seconds
2018-08-23 15:54:21.752 8 ERROR ovsdbapp.schema.open_vswitch.impl_idl 
2018-08-23 15:54:21.757 8 CRITICAL neutron [-] Unhandled error: TimeoutException: Commands [<ovsdbapp.schema.open_vswitch.commands.AddBridgeCommand object at 0x7f83771c2ad0>, <ovsdbapp.schema.open_vswitch.commands.DbAddCommand object at 0x7f83771c2bd0>] exceeded timeout 10 seconds

Why is RYU started/required? I could not find anything about it in the Kolla documentation. Maybe my problem is caused by an OVS bridge that already existed on the host before I installed Kolla. It is an MLAG bond with two VLANs, one for Ceph storage (storage_interface) and one for SDN (tunnel_interface), intended for the planned multinode setup. As far as I can tell, Kolla overwrites my OVS configuration, which is fine for the all-in-one installation but probably requires further steps for the multinode setup.

So my questions are:

  • How can I get neutron-openvswitch-agent properly running?
  • Is the "too many open files" problem a consequence of too many restarts of this docker container?
  • Can I use OVS to configure bonds/VLANs that Kolla then uses as network interfaces?

Further details: Server: Dell PowerEdge R640, 40 cores, 128GB RAM, 128GB SSD; Ubuntu 18.04 LTS. Kolla/Kolla-Ansible from GitHub, branch pike, with minor changes so that the outdated Docker version without Ubuntu 18.04 support is not installed; I installed docker-ce (18.06.1~ce~3-0~ubuntu) instead. Pre-built ubuntu-source images of Kolla for Pike.

Thank you!

Timm
