nova compute service down while launching instance (VM stuck in spawning state)
I have an all-in-one setup (Mitaka) with my controller and compute services running on the same node. All my Nova and other dependent services are up and running. However, when I try to launch an instance, the state of the nova-compute service becomes down. Because of this, the instance is stuck in the spawning state.
[root@localhost rabbitmq(keystone_admin)]# nova service-list
+----+------------------+-----------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----+------------------+-----------------------+----------+---------+-------+----------------------------+-----------------+
| 6 | nova-cert | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:47.000000 | - |
| 7 | nova-consoleauth | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:53.000000 | - |
| 8 | nova-scheduler | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:47.000000 | - |
| 9 | nova-conductor | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:52.000000 | - |
| 11 | nova-compute | localhost.localdomain | nova | enabled | **down** | 2016-11-04T10:11:40.000000 | - |
| 12 | nova-console | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:53.000000 | - |
=========
However, when I check the process with systemctl, it is running fine.
=========
[root@localhost rabbitmq(keystone_admin)]# systemctl status openstack-nova-compute.service -l
● openstack-nova-compute.service - OpenStack Nova Compute Server
Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2016-11-04 15:36:02 IST; 25min ago
Main PID: 31930 (nova-compute)
CGroup: /system.slice/openstack-nova-compute.service
└─31930 /usr/bin/python2 /usr/bin/nova-compute
==================
My rabbitmq service is also up and running.
=======
[root@localhost rabbitmq(keystone_admin)]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/rabbitmq-server.service.d
└─limits.conf
Active: active (running) since Thu 2016-11-03 12:32:08 IST; 1 day 3h ago
Main PID: 1835 (beam.smp)
CGroup: /system.slice/rabbitmq-server.service
├─1835 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq --...
├─1964 /usr/lib64/erlang/erts-5.10.4/bin/epmd -daemon
├─5873 inet_gethost 4
└─5875 inet_gethost 4
======= Neutron agent-list output
[root@localhost ~(keystone_admin)]# neutron agent-list
+--------------------------------------+--------------------+-----------------------+-------------------+-------+----------------+---------------------------+
| id | agent_type | host | availability_zone | alive | admin_state_up | binary |
+--------------------------------------+--------------------+-----------------------+-------------------+-------+----------------+---------------------------+
| 0a3b40c3-02f0-4866-9e1e-fa5a41d29bf0 | Metadata agent | localhost.localdomain | | :-) | True | neutron-metadata-agent |
| 2b73904f-6d14-433a-9876-42acf53d7bac | L3 agent | localhost.localdomain | nova | :-) | True | neutron-vpn-agent |
| 4b0b3d0e-1cd1-4b3a-809c-88457dd89120 | Open vSwitch agent | localhost.localdomain | | :-) | True | neutron-openvswitch-agent |
| a4d75bc1-ef40-4665-9911-7ee38bbcc004 | Metering agent | localhost.localdomain | | :-) | True | neutron-metering-agent |
| bfca8e8d-46c3-4943-957d-88b0b890e03c | Loadbalancer agent | localhost.localdomain | | :-) | True | neutron-lbaas-agent |
| c6dc21d4-ad15-4c43-94c6-8bca45370283 | DHCP agent | localhost.localdomain | nova | :-) | True | neutron-dhcp-agent |
+--------------------------------------+--------------------+-----------------------+-------------------+-------+----------------+---------------------------+
======= I am not getting any error messages in any of the Nova logs (compute, scheduler, conductor, or API).
There are some errors in the Neutron server log and in openvswitch-agent.log:
/var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
/var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
/var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher incoming.message))
/var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 100, in reply
/var/log ...
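A service is marked down when Nova hasn't received a heartbeat from it for a while (longer than service_down_time, which defaults to 60 seconds). That is why systemctl can show the process as running while nova service-list shows it as down: the Updated_at timestamp for nova-compute has stopped advancing. You should see this mentioned in nova-api.log, and nova-compute.log should complain that it can't access the message queue. It is hard to say whether this is linked to the Neutron problem.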
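To make the heartbeat check concrete, here is a rough sketch (not Nova's actual code) that reads the Updated_at column from the nova service-list output above and flags any service whose last report is older than a threshold. The 60-second threshold assumes the default service_down_time; check your nova.conf for the real value. The function names are made up for illustration, and the latest timestamp in the table is used as a stand-in for "now".

```python
from datetime import datetime, timedelta

# Assumption: the default service_down_time (60 s); adjust to match nova.conf.
SERVICE_DOWN_TIME = timedelta(seconds=60)
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

# Data rows copied from the `nova service-list` output in the question.
SAMPLE = """
| 6 | nova-cert | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:47.000000 | - |
| 7 | nova-consoleauth | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:53.000000 | - |
| 8 | nova-scheduler | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:47.000000 | - |
| 9 | nova-conductor | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:52.000000 | - |
| 11 | nova-compute | localhost.localdomain | nova | enabled | down | 2016-11-04T10:11:40.000000 | - |
| 12 | nova-console | localhost.localdomain | internal | enabled | up | 2016-11-04T10:27:53.000000 | - |
"""


def parse_service_rows(table_text):
    """Return (binary, updated_at) pairs for each data row of the table."""
    rows = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip borders, the header row, and blank lines.
        if len(cells) < 7 or not cells[0].isdigit():
            continue
        rows.append((cells[1], datetime.strptime(cells[6], TIMESTAMP_FORMAT)))
    return rows


def stale_services(rows, now, threshold=SERVICE_DOWN_TIME):
    """Return the binaries whose last heartbeat is older than the threshold."""
    return [binary for binary, updated_at in rows if now - updated_at > threshold]


rows = parse_service_rows(SAMPLE)
# Hypothetical "now": the most recent heartbeat seen in the table.
now = max(updated_at for _, updated_at in rows)
print(stale_services(rows, now))  # prints ['nova-compute']
```

In the output above, nova-compute last reported at 10:11:40 while every other service reported around 10:27:50, so it is roughly 16 minutes behind and well past any reasonable threshold.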