RabbitMQ: Queues filling up - No consumers

asked 2016-05-11 14:05:40 -0500


Hi, I am setting up Mitaka right now. I noticed that the message backlog in RabbitMQ keeps growing and growing, and I could see that certain queues are affected:

q-agent-notifier-tunnel-update_fanout*
q-agent-notifier-port-update_fanout*
q-agent-notifier-security_group-update_fanout*

None of these queues have any consumers but keep receiving messages. The security_group-update queues receive messages from the

q-agent-notifier-security_group-update_fanout

exchange, which is filled with messages by my neutron-server services. Interestingly, I have three neutron-server services, but only two of them are listed under "Incoming" for that exchange.
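(For anyone wanting to reproduce the checks: consumer counts per queue and the connections per user can be listed roughly like this on any broker node, assuming the "openstack" vhost from the configuration further down.)

rabbitmqctl -p openstack list_queues name messages consumers | grep q-agent-notifier
rabbitmqctl list_connections user peer_host state | grep neutron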

I am getting the feeling that something is really fishy in the entire RabbitMQ setup, so I'll describe it in full: I have three RabbitMQ nodes called amqp00, amqp01 and amqp02, running as a cluster:

Cluster status of node rabbit@amqp00 ...
[{nodes,[{disc,[rabbit@amqp00]},{ram,[rabbit@amqp02,rabbit@amqp01]}]},
 {running_nodes,[rabbit@amqp01,rabbit@amqp02,rabbit@amqp00]},
 {partitions,[]}]
...done.
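(That is the output of rabbitmqctl cluster_status on amqp00; amqp00 is the only disc node, while amqp01 and amqp02 joined as RAM nodes. In case that split turns out to be unintended, my understanding is that a RAM node can be converted in place roughly like this; a sketch I have not actually run on this cluster.)

# run locally on amqp01 (and likewise on amqp02)
rabbitmqctl stop_app
rabbitmqctl change_cluster_node_type disc
rabbitmqctl start_app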

This is the configuration of amqp01 and amqp02:

% This file managed by Puppet
% Template Path: rabbitmq/templates/rabbitmq.config
[
  {rabbit, [
    {cluster_nodes, {['rabbit@amqp00', 'rabbit@amqp01', 'rabbit@amqp02'], ram}},
    {cluster_partition_handling, ignore},
    {tcp_listen_options,
         [binary,
         {packet,        raw},
         {reuseaddr,     true},
         {backlog,       128},
         {nodelay,       true},
         {exit_on_close, false}]
    },
    {default_user, <<"root">>},
    {default_pass, <<"****************">>}
  ]},
  {kernel, [

  ]}
,
  {rabbitmq_management, [
    {listener, [
      {port, 15672}
    ]}
  ]}
].
% EOF

Amqp00 looks a bit different, and I am not sure whether that is how it is supposed to be:

% This file managed by Puppet
% Template Path: rabbitmq/templates/rabbitmq.config
[
  {rabbit, [
    {cluster_nodes, {[], ram}},
    {cluster_partition_handling, ignore},
    {tcp_listen_options,
         [binary,
         {packet,        raw},
         {reuseaddr,     true},
         {backlog,       128},
         {nodelay,       true},
         {exit_on_close, false}]
    },
    {default_user, <<"root">>},
    {default_pass, <<"****************">>}
  ]},
  {kernel, [

  ]}
,
  {rabbitmq_management, [
    {listener, [
      {port, 15672}
    ]}
  ]}
].
% EOF
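If the template were meant to be identical on all three nodes, I would have expected amqp00's clustering line to match the others, something like the line below (just my guess at what the Puppet template would produce, with disc since amqp00 is the cluster's only disc node):

    {cluster_nodes, {['rabbit@amqp00', 'rabbit@amqp01', 'rabbit@amqp02'], disc}},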

I also set a policy like this:

vhost: openstack
name: ha-all
pattern: .*
apply to: all
ha-mode: all
ha-sync-mode: automatic
priority: 0
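For reference, the same policy expressed as a rabbitmqctl command would look roughly like this (assuming it lives on the openstack vhost):

rabbitmqctl set_policy -p openstack --apply-to all --priority 0 ha-all '.*' '{"ha-mode":"all","ha-sync-mode":"automatic"}'
rabbitmqctl list_policies -p openstack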

My neutron-server services access these brokers directly with the following configuration:

[DEFAULT]
...
rpc_backend = rabbit
control_exchange = neutron

[oslo_messaging_amqp]

[oslo_messaging_notifications]

[oslo_messaging_rabbit]
rabbit_hosts = amqp00,amqp01,amqp02
rabbit_userid = neutron
rabbit_password = ****************
rabbit_virtual_host = openstack
rabbit_ha_queues = True
heartbeat_timeout_threshold = 20
heartbeat_rate = 2

Other neutron components such as the l3 agents, openvswitch agents, etc. can only reach RabbitMQ via a 3-node HAproxy cluster. This cluster uses Keepalived to fail over the VIP 10.10.32.20, which is dedicated to RabbitMQ. HAproxy listens there and forwards traffic to amqp00 through amqp02:

listen amqp
  bind 10.10.32.20:5672
  mode tcp
  balance roundrobin
  option tcpka
  option tcplog
  timeout client 3h
  timeout server 3h
  server amqp00 192.168.0.20:5672 check fall 3 inter 5s rise 2
  server amqp01 192.168.0.21:5672 check fall 3 inter 5s rise 2
  server amqp02 192.168.0.22:5672 check fall 3 inter 5s rise 2
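(To rule out the VIP or the proxy layer itself, a basic reachability test from one of the agent nodes could look like the line below, assuming netcat is installed.)

nc -zv 10.10.32.20 5672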

The configuration of those services looks like this:

[DEFAULT]
...
rpc_backend = rabbit
control_exchange = neutron

[oslo_messaging_amqp]

[oslo_messaging_notifications]

[oslo_messaging_rabbit]
rabbit_host = 10.10.32.20 # VIP of the HAproxy Cluster
rabbit_userid = neutron
rabbit_password = ***************
rabbit_virtual_host = openstack
rabbit_ha_queues = True
heartbeat_timeout_threshold = 20
heartbeat_rate = 2
rabbit_hosts=10.10.32.20
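(A quick way to see whether an agent currently holds a connection through the VIP is to look for established sessions to 10.10.32.20:5672 on the agent host, assuming ss is available.)

ss -tn | grep 10.10.32.20:5672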

At this point, I have read everything I was able to find on the internet about this and related problems, but I have not found a solution. Any advice is appreciated ;)

Cheers, Mathias

PS: As I write this, the q-agent-notifier-security_group-update_fanout* queues suddenly experienced a "wild" consumer ;) and the messages disappeared. Anything that is q-server-resource-versions_fanout* still hangs ...
