Error synchronizing routers on DVR l3_agent
I've set up OpenStack Juno on Ubuntu 14.04 with distributed virtual routers (DVR) on all compute nodes. On every compute node and on the neutron node (l3-sdvr) I see errors related to RPC timeouts.
I think this problem is related to some strange behaviour with floating IPs: once I associate a floating IP with an instance, it takes a long time to respond. It does work eventually, but only 5, 10 or even 20 minutes later.
The "neutron-server" process uses about 90% CPU. I have almost 40 routers and over 60 networks.
I see the following error message repeatedly:
2015-04-13 08:21:02.335 33360 ERROR neutron.agent.l3_agent [-] Failed synchronizing routers due to RPC error
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent Traceback (most recent call last):
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/neutron/agent/l3_agent.py", line 1896, in _sync_routers_task
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent context, router_ids)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/neutron/agent/l3_agent.py", line 105, in get_routers
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent router_ids=router_ids))
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/neutron/common/log.py", line 34, in wrapper
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent return method(*args, **kwargs)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/neutron/common/rpc.py", line 161, in call
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent context, msg, rpc_method='call', **kwargs)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/neutron/common/rpc.py", line 187, in __call_rpc_method
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent return func(context, msg['method'], **msg['args'])
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr ...g/_drivers/amqpdriver.py", line 408, in send
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent retry=retry)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 397, in _send
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent result = self._waiter.wait(msg_id, timeout)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 285, in wait
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent reply, ending = self._poll_connection(msg_id, timeout)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 235, in _poll_connection
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent % msg_id)
2015-04-13 08:21:02.335 33360 TRACE neutron.agent.l3_agent MessagingTimeout: Timed out waiting for a reply to message ID caa0f2dc36c54787b91fb2f3c9df8259
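To put a number on how often the sync fails, something like the following could count the errors per minute in the l3 agent log. This is only a rough sketch: the function name `count_sync_failures` is made up for illustration, and the regular expression assumes every line has the same layout as the excerpt above.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt:
# "<date> <time>.<ms> <pid> <LEVEL> <logger> [-] <message>"
SYNC_ERROR = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ \d+ ERROR "
    r"neutron\.agent\.l3_agent .*Failed synchronizing routers"
)


def count_sync_failures(lines):
    """Map 'YYYY-MM-DD HH:MM' to the number of sync failures logged in that minute.

    Hypothetical helper, not part of Neutron.
    """
    per_minute = Counter()
    for line in lines:
        match = SYNC_ERROR.match(line)
        if match:
            # Group 1 is the timestamp truncated to the minute.
            per_minute[match.group(1)] += 1
    return per_minute
```

Passing an open log file (or any iterable of lines) returns the per-minute counts, which makes it easier to check whether the failures line up with the floating IP delays.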
Adding more "neutron-servers" seems to alleviate the problem a bit.