Instances are not reachable after restarting one of the servers
After restarting one of the servers during some physical tests, the instances are no longer reachable, not even by ping. I accessed one of them through the OpenStack web console and found that they are not receiving any configuration from OpenStack. As a consequence they cannot reach the gateway, although they can ping each other because they are on the same LAN.
This OpenStack deployment was installed following the Ubuntu Autopilot guide from the home web page.
Using the Juju GUI, I found that the following applications are installed on the server I restarted: Neutron-gateway, Base-machine, Juju-gui, Ceph-osd, Ceph-radosgw, Ceilometer, Neutron-api, Keystone, Rabbitmq-server.
I found two services that are not running: neutron-openvswitch-agent and neutron-vpn-agent. Their logs follow.
Log for neutron-openvswitch-agent:
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] Failed reporting state!
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 322, in _report_state
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent True)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/neutron/agent/rpc.py", line 87, in report_state
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent return method(context, 'report_state', **kwargs)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent retry=self.retry)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent timeout=timeout, retry=retry)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 470, in send
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent retry=retry)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in _send
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent result = self._waiter.wait(msg_id, timeout)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 342, in wait
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent message = self.waiters.get(msg_id, timeout=timeout)
2017-02-08 18:23:25.312 137224 ERROR neutron.plugins.ml2 ...
Neither service gets a reply from the message queue. This can be a networking problem or a timing problem (NTP server). The VPN agent also can't access the DB at 10.232.213.67. I'd check the networking first: try reaching the DB and the MQ manually.
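Reaching the DB and MQ "manually" is most informative when done at the TCP level, since a host can answer ping while the service port still refuses connections. Below is a minimal sketch of such a check, not a definitive diagnostic. `check_tcp` is a hypothetical helper name, and the ports are assumptions: 3306 is the MySQL default and 5672 the standard AMQP port, so adjust them to what your deployment actually uses.

```python
import errno
import socket


def check_tcp(host, port, timeout=3.0):
    """Attempt a TCP connection and return a short status string.

    "open"    -> something accepted the connection
    "refused" -> host answered, but nothing is listening on that port
    "timeout" -> no answer at all (routing or firewall is more likely)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # Any other socket-level failure, e.g. no route to host.
        name = errno.errorcode.get(exc.errno) if exc.errno else None
        return "error: " + (name or str(exc))
```

Run from the restarted server (or from inside the container that hosts the failing agent), `check_tcp("10.232.213.67", 3306)` would test the database address mentioned above under the assumed port. A result of "refused" points at the service itself not listening, while "timeout" points more toward the network path.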
I tried reaching the DB and the MQ, and both are reachable. Both servers are also time-synchronized. The problem may be something else.
AMQP server on 10.232.213.58:5672 is unreachable: [Errno 111] ECONNREFUSED
is quite explicit. You need to find out why. So your services run in containers? Container networking may be a different cup of tea than normal networking.
I have a group of servers and all the applications are distributed across them. During installation many containers were created automatically, and everything was working before I restarted one of the servers. I can see all the applications distributed across the containers using the Juju GUI.
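Because the "AMQP server ... is unreachable" message names the endpoint and the errno, it can help to tally those messages across the agent logs and see whether every failure points at the same broker. The following is only a sketch: the pattern is modelled on the single line quoted in this thread, other oslo.messaging versions may word the message differently, and `count_unreachable` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this thread (assumption:
# other occurrences in the logs follow the same wording).
UNREACHABLE = re.compile(
    r"AMQP server on (?P<host>[\w.\-]+):(?P<port>\d+) is unreachable: "
    r"\[Errno (?P<errno>\d+)\] (?P<reason>\w+)"
)


def count_unreachable(lines):
    """Count 'AMQP server ... is unreachable' messages per (host, port, reason)."""
    counts = Counter()
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            key = (match.group("host"), int(match.group("port")), match.group("reason"))
            counts[key] += 1
    return counts


sample = [
    "AMQP server on 10.232.213.58:5672 is unreachable: [Errno 111] ECONNREFUSED",
    "Failed reporting state!",
]
for endpoint, hits in count_unreachable(sample).items():
    print(endpoint, hits)
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines. If every hit shows ECONNREFUSED for the same address, the broker on that host (or in that container) is the first thing to inspect after the reboot.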