nova services fails intermittently to connect to rabbitmq with timeout error

asked 2018-05-31 02:23:11 -0500

VVM gravatar image

I am using Openstack Pike with RabbitMQ 3.6.5 on Erlang 18.3.4.4 as the messaging queue. When the nova services are restarted, one of the nova services (mostly nova-scheduler) will have error in the logs as shown below. It says the service cannot connect to the rabbitmq and the error appears on and on until it is restarted. Instance launch also fails due to this error. The service openstack-nova-<service> start commands exits successfully.

2018-05-31 06:19:44.737 5510 INFO nova.service [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Starting scheduler node (version 16.1.1-1.el7) 2018-05-31 06:19:44.745 5510 DEBUG nova.service [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Creating RPC server for service scheduler start /usr/lib/python2.7/site-packages/nova/service.py:179

2018-05-31 06:19:44.749 5510 DEBUG oslo.messaging._drivers.pool [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Pool creating new connection create /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/pool.py:143

2018-05-31 06:19:44.753 5510 DEBUG oslo.messaging._drivers.impl_rabbit [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] [57700d60-c761-4072-b4a3-d33706864086] Connecting to AMQP server on localhost:5672 __init__ /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py:597

2018-05-31 06:19:44.761 5510 DEBUG nova.scheduler.host_manager [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Found 2 cells: 00000000-0000-0000-0000-000000000000, 4963dd3d-246d-4f9b-9d4b-0cb1ab3032ae _load_cells /usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py:642 2018-05-31 06:19:44.761 5510 DEBUG nova.scheduler.host_manager [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] START:_async_init_instance_info _async_init_instance_info /usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py:421

2018-05-31 06:19:44.763 5510 DEBUG oslo_concurrency.lockutils [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Lock "00000000-0000-0000-0000-000000000000" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:270

2018-05-31 06:19:44.764 5510 DEBUG oslo_concurrency.lockutils [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Lock "00000000-0000-0000-0000-000000000000" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.001s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:282

2018-05-31 06:19:49.761 5510 DEBUG oslo.messaging._drivers.impl_rabbit [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] [57700d60-c761-4072-b4a3-d33706864086] Received recoverable error from kombu: on_error /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py:744

2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit Traceback (most recent call last): 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 494, in _ensured 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit return fun(args, *kwargs) 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 569, in __call__ 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self.revive(self.connection.default_channel) 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 819, in default_channel 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self.connection 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 802, in connection 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self._connection = self._establish_connection() 2018-05-31 06:19:49.761 5510 ERROR oslo.messaging ... (more)

edit retag flag offensive close merge delete