RabbitMQ: "vhost '/nova' is down" ("handshake_error")

asked 2020-03-02 17:07:30 -0500

CKi

After a complete system failure due to a broken switch, RabbitMQ no longer works. Nova and Neutron both report errors on Rabbit:

Nova Scheduler: ERROR oslo.messaging._drivers.impl_rabbit [req-df5ed105-8ace-4b46-8338-bef2703b53ba - - - - -] Unable to connect to AMQP server on 172.29.236.254:5671 after None tries: Connection.open: (541) INTERNAL_ERROR - access to vhost '/nova' refused for user 'nova': vhost '/nova' is down: InternalError: Connection.open: (541) INTERNAL_ERROR - access to vhost '/nova' refused for user 'nova': vhost '/nova' is down

Nova Conductor: ERROR oslo_service.service [req-966eb127-b744-4aff-8cef-d99c88180494 - - - - -] Error starting thread.: MessageDeliveryFailure: Unable to connect to AMQP server on 172.29.236.254:5671 after None tries: Connection.open: (541) INTERNAL_ERROR - access to vhost '/nova' refused for user 'nova': vhost '/nova' is down

Nova Auth: ERROR oslo_service.service MessageDeliveryFailure: Unable to connect to AMQP server on 172.29.236.254:5671 after None tries: Connection.open: (541) INTERNAL_ERROR - access to vhost '/nova' refused for user 'nova': vhost '/nova' is down

Neutron DHCP: ERROR oslo_messaging._drivers.amqpdriver [-] Failed to process incoming message, retrying...: MessageDeliveryFailure: Unable to connect to AMQP server on 172.29.236.254:5671 after None tries: Connection.open: (541) INTERNAL_ERROR - access to vhost '/neutron' refused for user 'neutron': vhost '/neutron' is down

What all of these errors have in common is that the vhost is reported as down. RabbitMQ itself logs similar errors:

2020-03-02 22:45:14.003 [error] <0.9336.110> Error on AMQP connection <0.9336.110> (172.29.238.25:41662 -> 172.29.236.254:5671 - neutron-l3-agent:19718:d44d78bb-3d13-45aa-8956-2c820feea111, vhost: 'none', user: 'neutron', state: opening), channel 0: {handshake_error,opening, {amqp_error,internal_error, "access to vhost '/neutron' refused for user 'neutron': vhost '/neutron' is down", 'connection.open'}}
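To confirm which vhosts and users are affected across many service logs, the refusal message can be extracted with a regular expression. This is only a rough sketch: the pattern is modelled on the messages quoted above, and the function name `count_down_vhosts` and the shortened sample lines are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the refusal message quoted in the logs above.
VHOST_DOWN = re.compile(
    r"access to vhost '(?P<vhost>[^']+)' refused for user '(?P<user>[^']+)': "
    r"vhost '(?P=vhost)' is down"
)


def count_down_vhosts(lines):
    """Count log lines per (vhost, user) pair that report the vhost as down."""
    counts = Counter()
    for line in lines:
        # One log line may repeat the same message; the set counts it once.
        pairs = {(m["vhost"], m["user"]) for m in VHOST_DOWN.finditer(line)}
        for pair in pairs:
            counts[pair] += 1
    return counts


# Shortened, illustrative lines in the same shape as the errors above.
sample = [
    "ERROR ... access to vhost '/nova' refused for user 'nova': vhost '/nova' is down: "
    "InternalError: ... access to vhost '/nova' refused for user 'nova': vhost '/nova' is down",
    "ERROR ... access to vhost '/neutron' refused for user 'neutron': vhost '/neutron' is down",
]
for (vhost, user), n in count_down_vhosts(sample).items():
    print(vhost, user, n)
```

If only '/nova' and '/neutron' show up, that points at the broker's vhost state rather than at the individual services.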

The "handshake_error" makes me think there is a problem with SSL. I tried deactivating SSL in nova.conf but it did not solve the errors. What can I do to solve the problem?


Comments

Is that RabbitMQ error from trying to restart RabbitMQ? Is RabbitMQ running on one control node or on several? Changing anything in Nova etc. won't help if RabbitMQ isn't running.

eblock ( 2020-03-03 03:10:47 -0500 )

@eblock I have three nodes which all run RabbitMQ. rabbitmqctl cluster_status reports all three as running_nodes. For some reason, node3 (and only node3) reports an alarm for node2's rabbit: {badrpc,nodedown}. Removing node3 from the cluster does not resolve the problem.

CKi ( 2020-03-03 04:31:32 -0500 )

Maybe you can inject the vhost config again. I don't have it at hand right now, but it should be in the deployment guide.

eblock ( 2020-03-03 23:43:50 -0500 )
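Before re-creating anything, it may help to see which node considers which vhost down. The sketch below assumes the RabbitMQ management plugin is enabled and that each entry returned by its /api/vhosts endpoint carries a cluster_state mapping of node name to state; the function names, URL, credentials and node names are placeholders, not values from this deployment.

```python
import base64
import json
import urllib.request


def fetch_vhosts(base_url, user, password):
    """GET <base_url>/api/vhosts (assumes the management plugin is enabled)."""
    request = urllib.request.Request(base_url.rstrip("/") + "/api/vhosts")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def vhosts_not_running(vhosts):
    """Return {vhost name: {node: state}} for nodes whose state is not 'running'."""
    problems = {}
    for vhost in vhosts:
        # Assumption: 'cluster_state' maps each node name to a state string.
        bad = {
            node: state
            for node, state in vhost.get("cluster_state", {}).items()
            if state != "running"
        }
        if bad:
            problems[vhost["name"]] = bad
    return problems


# Hypothetical usage; host, port and credentials are placeholders:
#   vhosts = fetch_vhosts("http://node1:15672", "monitoring", "secret")
#   print(vhosts_not_running(vhosts))
```

An empty result would suggest the vhosts are up again and the services only need a restart; a non-empty one shows which node to look at first.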