nova [Errno 111] ECONNREFUSED after migrating rabbitmq to a separate host

Because of a memory shortage on the control node (controller), I had to move the RabbitMQ message queue service to a separate server (rabbit). The message queue on the new server shows quite a few queues (150), but fewer than the control node had before the move (191).

The nova-conductor.log now shows connection problems whenever I try to create a new instance:

2020-03-29 15:55:22.884 5502 ERROR oslo.messaging._drivers.impl_rabbit [req-bfb76a17-41c6-45bb-b397-c1c575f43c10 e685349ef4ec43cba929131ccd7b81fa 89dd119cc70a4e35b2cb6a2dafcb2d02 - default default] Connection failed: [Errno 111] ECONNREFUSED (retrying in 32.0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED
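As far as I understand, errno 111 means the TCP connection itself was refused, so nothing was listening on the address and port the service tried. A minimal sketch for probing this from the host that logs the error; the function name `probe` is made up for illustration, and the host and port to test (for example `rabbit` and 5672) are assumptions that have to match the actual setup:

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try a plain TCP connect; return "OK" or a short error name."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "OK"
    except OSError as exc:
        # Map the numeric errno to its symbolic name (111 -> ECONNREFUSED).
        # Fall back to the exception class name when there is no matching
        # errno entry, e.g. for timeouts or name-resolution failures.
        return errno.errorcode.get(exc.errno, type(exc).__name__)
```

Running `probe("rabbit", 5672)` on the controller and on the compute host should show whether the refusal depends on where the connection comes from.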

My understanding is that I only need to change the transport_url parameter in /etc/nova etc. ...

root@control:~# find /etc -type f -exec grep ^transport_url {} \;
transport_url = rabbit://openstack:***@rabbit
transport_url = rabbit://openstack:***@rabbit
transport_url = rabbit://openstack:***@rabbit
transport_url = rabbit://openstack:***@rabbit
root@control:~# find /etc -type f -exec grep -l ^transport_url {} \;
/etc/heat/heat.conf
/etc/cinder/cinder.conf
/etc/nova/nova.conf
/etc/neutron/neutron.conf
root@control:~#

... but this is obviously not enough.
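Note that the URLs above carry neither an explicit port nor a virtual host. A minimal sketch of how such a URL breaks down, assuming the usual defaults of port 5672 and vhost "/" (these defaults are my assumption, not something read from the configs; `split_transport_url` is a made-up helper and the password is a placeholder):

```python
from urllib.parse import urlsplit

# Assumed defaults, not taken from the configuration files above.
DEFAULT_PORT = 5672
DEFAULT_VHOST = "/"


def split_transport_url(url):
    """Return (host, port, vhost) for a rabbit:// transport URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or DEFAULT_PORT
    # The vhost, if any, is the URL path without its leading slash.
    vhost = parts.path.lstrip("/") or DEFAULT_VHOST
    return host, port, vhost


print(split_transport_url("rabbit://openstack:secret@rabbit"))
# ('rabbit', 5672, '/')
```

If that reading is right, every service using these URLs talks to vhost "/" on the host named rabbit, which is consistent with the vhost '/' mentioned in the compute log.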

On the compute host I find messages like these in /var/log/nova/nova-compute.log:

2020-03-29 14:05:11.973 7780 ERROR oslo_service.periodic_task [req-1692dd36-2dd8-4a69-9372-50c190ad67db - - - - -] Error during ComputeManager._sync_scheduler_instance_info: oslo_messaging.exceptions.MessageDeliveryFailure: Unable to connect to AMQP server on rabbit:5672 after inf tries: Basic.publish: (404) NOT_FOUND - no exchange 'scheduler_fanout' in vhost '/'
2020-03-29 14:05:11.973 7780 ERROR oslo_service.periodic_task Traceback (most recent call last):

RabbitMQ on the new server does not seem to be hitting any limits:

root@master:~# rabbitmqctl status | grep 'file_descriptors' -A 10
 {file_descriptors,
     [{total_limit,65436},
      {total_used,136},
      {sockets_limit,58890},
      {sockets_used,134}]},
 {processes,[{limit,1048576},{used,2003}]},
 {run_queue,0},
 {uptime,673680},
 {kernel,{net_ticktime,60}}]

Any hint as to which config change I forgot?

Additional information requested in the - much appreciated! - comments:

root@master:~#
root@master:~# rabbitmqctl list_vhosts
Listing vhosts
/
root@master:~#
root@master:~# rabbitmqctl list_permissions -p openstack
Listing permissions in vhost "openstack"
Error: no_such_vhost: openstack

root@master:~#
root@master:~# rabbitmqctl list_queues -p openstack | grep scheduler_fanout
root@master:~#

Interesting: the scheduler fanout queue is missing from the queue list on the new server!

In the queue list before, it was there:

root@control:~# cat rabbitmq_queues.before.sorted | grep scheduler_fanout
cinder-scheduler_fanout_f6e79fd5faec42b7b39d340c40203a1d        0
scheduler_fanout_7d3b075d00b54bdd89f9a482adc76c2d       0
scheduler_fanout_d03c3305050c4e249c79e9e153dea10d       0
root@control:~#
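To compare the before and after lists more systematically, something like the following could help. It is only a sketch, assuming both lists are `rabbitmqctl list_queues` output with a queue name and a message count per line; the random hex suffix is stripped so that queues recreated after the move still match. The function names and the "after" entry are illustrative, not real output:

```python
import re

# Fanout queues end in a random 32-hex-digit suffix; strip it so that
# queues recreated after the move compare equal to the old ones.
_SUFFIX = re.compile(r"_[0-9a-f]{32}$")


def queue_kinds(lines):
    """Collect queue names (without random suffix) from list_queues output."""
    kinds = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines and headers such as "Listing queues".
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        kinds.add(_SUFFIX.sub("", fields[0]))
    return kinds


def missing_after(before_lines, after_lines):
    """Return the queue kinds present before the move but not after."""
    return sorted(queue_kinds(before_lines) - queue_kinds(after_lines))


before = [
    "cinder-scheduler_fanout_f6e79fd5faec42b7b39d340c40203a1d\t0",
    "scheduler_fanout_7d3b075d00b54bdd89f9a482adc76c2d\t0",
    "scheduler_fanout_d03c3305050c4e249c79e9e153dea10d\t0",
]
# Illustrative "after" list, not real output from the new server.
after = ["cinder-scheduler_fanout_0123456789abcdef0123456789abcdef\t0"]
print(missing_after(before, after))  # ['scheduler_fanout']
```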

To check the RabbitMQ connection, I followed the instructions here for a check client, and the check succeeded:

root@control:~# ./check_rabbitmq_connection.py --server master --username openstack --password ****
OK

I'll switch back to the old config in order to have more before/after information.