Nova compute shows as "down" on the controller even though the compute service is running on the compute node

asked 2019-01-22 01:46:09 -0500

Manab

A compute node that had been working fine suddenly stopped working and now shows as "down" on the controller node.

On the compute node:

  1. Compute services are running.
  2. Neutron services are running.
  3. It can connect to the RabbitMQ servers.
  4. All nodes are NTP-synced.
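
For anyone who wants to repeat the connectivity check in item 3, here is a minimal sketch of a plain TCP reachability test. The function name `can_connect`, the host name in the usage comment, and port 5672 (the usual AMQP port) are assumptions for illustration, not values from this deployment. A successful TCP connect only shows that the port is reachable; it does not prove that the AMQP session nova-compute uses is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage from the compute node (host name is a placeholder):
# can_connect("rabbit-host.example", 5672)
```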

On the controller:

All other compute nodes are working fine; only this one is affected.

Logs on the controller node:

2019-01-03 11:49:56.413 2935 WARNING nova.scheduler.host_manager [req-d817453d-bb7a-40bd-9588-4b5373eb5556 3612616185a748f788fb0532729ccc3a 5896939c506840c2bf083cb1100bee2b - - -] Host xxxx has more disk space than database expected (8284gb > 8108gb)

2019-01-04 12:05:28.396 2935 DEBUG nova.servicegroup.drivers.db [req-38a21fe3-2ac4-44a6-bc2c-c4fa5e1fe24c 3612616185a748f788fb0532729ccc3a 5896939c506840c2bf083cb1100bee2b - - -] Seems service is down. Last heartbeat was 2019-01-03 22:24:08. Elapsed time is 67280.3963 is_up /usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py:80

2019-01-04 12:05:28.396 2935 WARNING nova.scheduler.filters.compute_filter [req-38a21fe3-2ac4-44a6-bc2c-c4fa5e1fe24c 3612616185a748f788fb0532729ccc3a 5896939c506840c2bf083cb1100bee2b - - -] (xxxx, xxxx) ram:562562 disk:8302592 io_ops:0 instances:48 has not been heard from in a while
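
To make the "Seems service is down" entry easier to read, here is a minimal sketch of the kind of check it reports: the elapsed time since the last recorded heartbeat is compared against a threshold. Two things are assumptions here. The 60-second threshold is what I understand to be nova's default `service_down_time`, and the UTC-5 offset for the log timestamp is inferred, because it is the offset that reproduces the logged elapsed time if the heartbeat is stored in UTC. The helper names are made up for illustration and are not nova's actual code.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold in seconds (believed to be nova's default service_down_time).
SERVICE_DOWN_TIME = 60


def elapsed_seconds(last_heartbeat, now):
    """Seconds between the last recorded heartbeat and 'now'."""
    return (now - last_heartbeat).total_seconds()


def is_up(last_heartbeat, now, service_down_time=SERVICE_DOWN_TIME):
    """A service counts as up only if its heartbeat is recent enough."""
    return abs(elapsed_seconds(last_heartbeat, now)) <= service_down_time


# Values taken from the DEBUG line; the UTC-5 offset is an inference.
last_heartbeat = datetime(2019, 1, 3, 22, 24, 8, tzinfo=timezone.utc)
log_time = datetime(2019, 1, 4, 12, 5, 28, 396000,
                    tzinfo=timezone(timedelta(hours=-5)))

elapsed = elapsed_seconds(last_heartbeat, log_time)
print(round(elapsed, 3))                # 67280.396, matching the log
print(is_up(last_heartbeat, log_time))  # False
```

Under these assumptions, the controller had not recorded a heartbeat from this host for about 18.7 hours, far beyond any 60-second threshold, even though the service appears to be running on the compute node.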

I do not see any error messages on either the compute node or the controller, only the warnings above.

Any help is highly appreciated.
