Queens minimal install nova-compute dies after timeout

asked 2019-03-14 20:22:43 -0500

kyleaschmitt

When spinning up a new cluster (a minimal Queens deployment on Ubuntu 16.04), the nova-compute processes on the compute nodes error out and never attach to the cloud controller. nova-compute.log prints some XML describing the capabilities of the node, and then this:

2019-03-14 17:46:21.942 1670563 WARNING nova.virt.libvirt.driver [req-6c949149-1dc1-47d3-8be1-fdc6bd56e750 - - - - -] Cannot update service status on host "compute2" due to an unexpected exception.: MessagingTimeout: Timed out waiting for a reply to message ID a8a2e90c3cec4ffcbdc9940abc219025
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver Traceback (most recent call last):
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3779, in _set_host_enabled
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     service = objects.Service.get_by_compute_host(ctx, CONF.host)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 177, in wrapper
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     args, kwargs)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     args=args, kwargs=kwargs)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 174, in call
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     retry=self.retry)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 131, in _send
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     timeout=timeout, retry=retry)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in send
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     retry=retry)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 548, in _send
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     result = self._waiter.wait(msg_id, timeout)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 440, in wait
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     message = self.waiters.get(msg_id, timeout=timeout)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 328, in get
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver     'to message ID %s' % msg_id)
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver MessagingTimeout: Timed out waiting for a reply to message ID a8a2e90c3cec4ffcbdc9940abc219025
2019-03-14 17:46:21.942 1670563 ERROR nova.virt.libvirt.driver
2019-03-14 17:46:21.945 1670563 ERROR oslo_service.service [req-2ff6bfc9-020d-4a93-9b68-62f2a9030485 ...

Comments

Did you synchronize time? RabbitMQ and other message queues are very time-sensitive.

Bernd Bausch ( 2019-03-15 01:18:34 -0500 )

Yup. Synchronized with NTP for now, all to a stratum 1 server.

I see successful connections (no warnings) from neutron processes on the same box, but the nova-compute ones all end with a "WARNING REPORT <blah> client unexpectedly closed TCP connection"

kyleaschmitt ( 2019-03-15 10:35:03 -0500 )
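
Since neutron connects cleanly from the same box while nova-compute does not, one thing worth ruling out is that the two services point at different brokers. The following is a minimal sketch, not a fix: it assumes you copy the `transport_url` value by hand from nova.conf and neutron.conf, and the helper names `parse_transport_hosts` and `can_connect` are hypothetical, not part of nova or oslo.messaging. It only checks that the TCP port is reachable, so a wrong password or virtual host would still pass this test.

```python
import socket


def parse_transport_hosts(transport_url, default_port=5672):
    """Return [(host, port), ...] from a transport URL such as
    rabbit://user:pass@host1:5672,user:pass@host2:5672/vhost

    Assumption: plain hostnames or IPv4 addresses (no IPv6 literals).
    """
    # Drop the scheme ("rabbit://") and anything after the first "/" (the vhost).
    rest = transport_url.split("://", 1)[1]
    netloc = rest.split("/", 1)[0]

    hosts = []
    for entry in netloc.split(","):
        # Drop the "user:password@" prefix if there is one.
        hostport = entry.rsplit("@", 1)[-1]
        host, sep, port = hostport.rpartition(":")
        if not sep:
            # No explicit port given, fall back to the default AMQP port.
            host, port = hostport, default_port
        hosts.append((host, int(port)))
    return hosts


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        print("{}:{} unreachable: {}".format(host, port, exc))
        return False
    sock.close()
    return True
```

To use it on a compute node, call `parse_transport_hosts()` once with the value from nova.conf and once with the value from neutron.conf, compare the two lists, and pass each `(host, port)` pair to `can_connect()`. If the lists match and every port is reachable, the difference is more likely in credentials, the virtual host, or on the conductor side than in the network.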

This was a wild guess. I don't have an answer, but I found this rather interesting slideset on RabbitMQ troubleshooting: https://www.slideshare.net/michaelkli.... Caution: It's a few years old.

Bernd Bausch ( 2019-03-15 17:36:39 -0500 )