nova compute service down while launching instance (VM stuck in spawning state)

asked 2016-11-04 05:40:29 -0600

DarkKnight

updated 2016-11-10 12:10:21 -0600

I have an all-in-one setup (Mitaka) with the controller and compute services running on the same node. All my nova and other dependent services are up and running. However, when I try to launch an instance, the nova-compute service goes into the down state, and because of this the instance gets stuck in the spawning state.

    [root@localhost rabbitmq(keystone_admin)]# nova service-list
    +----+------------------+-----------------------+----------+---------+-------+----------------------------+-----------------+
    | Id | Binary           | Host                  | Zone     | Status  | State | Updated_at                 | Disabled Reason |
    +----+------------------+-----------------------+----------+---------+-------+----------------------------+-----------------+
    | 6  | nova-cert        | localhost.localdomain | internal | enabled | up    | 2016-11-04T10:27:47.000000 | -               |
    | 7  | nova-consoleauth | localhost.localdomain | internal | enabled | up    | 2016-11-04T10:27:53.000000 | -               |
    | 8  | nova-scheduler   | localhost.localdomain | internal | enabled | up    | 2016-11-04T10:27:47.000000 | -               |
    | 9  | nova-conductor   | localhost.localdomain | internal | enabled | up    | 2016-11-04T10:27:52.000000 | -               |
    | 11 | nova-compute     | localhost.localdomain | nova     | enabled | down  | 2016-11-04T10:11:40.000000 | -               |
    | 12 | nova-console     | localhost.localdomain | internal | enabled | up    | 2016-11-04T10:27:53.000000 | -               |
    +----+------------------+-----------------------+----------+---------+-------+----------------------------+-----------------+
=========
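Note that the State column is driven by the service heartbeat: nova-compute periodically writes an Updated_at timestamp, and the API flags the service as down once that timestamp is older than the configured threshold. In the listing above, nova-compute's Updated_at is already about 16 minutes behind the other services. A rough way to check the heartbeat settings and the lag, assuming a packstack-style layout where the report_interval and service_down_time options live in /etc/nova/nova.conf (defaults are 10 and 60 seconds):

    # Heartbeat interval and the threshold after which a service is shown as down
    grep -E '^(report_interval|service_down_time)' /etc/nova/nova.conf

    # Compare the compute heartbeat with the current UTC time;
    # a lag larger than service_down_time means the service is reported as down
    date -u
    nova service-list | grep nova-compute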

However, when I check the service with systemctl, the process is running fine.

=========

[root@localhost rabbitmq(keystone_admin)]# systemctl status openstack-nova-compute.service -l
● openstack-nova-compute.service - OpenStack Nova Compute Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2016-11-04 15:36:02 IST; 25min ago
 Main PID: 31930 (nova-compute)
   CGroup: /system.slice/openstack-nova-compute.service
           └─31930 /usr/bin/python2 /usr/bin/nova-compute

==================
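A running systemd unit only shows that the process exists; to report its heartbeat the service also has to reach the message queue. A minimal check of the compute log for messaging trouble, assuming the default packstack log location /var/log/nova/nova-compute.log:

    # Look for AMQP/RabbitMQ connection problems or timeouts around the time of the launch
    grep -iE 'error|amqp|rabbit|timed out' /var/log/nova/nova-compute.log | tail -n 40

    # Or watch the log live while retrying the instance launch
    tail -f /var/log/nova/nova-compute.log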

My rabbitmq service is also up and running.

=======

[root@localhost rabbitmq(keystone_admin)]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/rabbitmq-server.service.d
           └─limits.conf
   Active: active (running) since Thu 2016-11-03 12:32:08 IST; 1 day 3h ago
 Main PID: 1835 (beam.smp)
   CGroup: /system.slice/rabbitmq-server.service
           ├─1835 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq --...
           ├─1964 /usr/lib64/erlang/erts-5.10.4/bin/epmd -daemon
           ├─5873 inet_gethost 4
           └─5875 inet_gethost 4

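The broker being up does not by itself prove that nova-compute is still connected to it. A quick cross-check, assuming the usual oslo.messaging queue naming (a compute and a compute.<hostname> queue per compute node) and that the rabbit settings sit in /etc/nova/nova.conf:

    # The compute service should hold a consumer on its compute queues
    rabbitmqctl list_queues name consumers | grep '^compute'

    # Connections currently open against the broker
    rabbitmqctl list_connections user peer_host peer_port state

    # Broker address and credentials nova-compute is configured with
    grep -E 'rabbit_host|rabbit_port|rabbit_userid' /etc/nova/nova.conf
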
======= Neutron agent-list output

[root@localhost ~(keystone_admin)]# neutron agent-list
+--------------------------------------+--------------------+-----------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                  | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+-----------------------+-------------------+-------+----------------+---------------------------+
| 0a3b40c3-02f0-4866-9e1e-fa5a41d29bf0 | Metadata agent     | localhost.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 2b73904f-6d14-433a-9876-42acf53d7bac | L3 agent           | localhost.localdomain | nova              | :-)   | True           | neutron-vpn-agent         |
| 4b0b3d0e-1cd1-4b3a-809c-88457dd89120 | Open vSwitch agent | localhost.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| a4d75bc1-ef40-4665-9911-7ee38bbcc004 | Metering agent     | localhost.localdomain |                   | :-)   | True           | neutron-metering-agent    |
| bfca8e8d-46c3-4943-957d-88b0b890e03c | Loadbalancer agent | localhost.localdomain |                   | :-)   | True           | neutron-lbaas-agent       |
| c6dc21d4-ad15-4c43-94c6-8bca45370283 | DHCP agent         | localhost.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
+--------------------------------------+--------------------+-----------------------+-------------------+-------+----------------+---------------------------+

======= I am not getting any error messages in any of the nova logs (compute, scheduler, conductor, or api).

There are some errors in the neutron server log and in openvswitch-agent.log:

    /var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
    /var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
    /var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher     incoming.message))
    /var/log/neutron/server.log:2016-11-10 17:05:21.784 27793 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 100, in reply
    /var/log ...
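The paste above is cut off before the actual exception. The last lines of the traceback usually name the real failure (for example a messaging timeout or a reply-queue problem), so it is worth pulling them out with some context, e.g.:

    # Show each oslo_messaging traceback in the neutron server log with context
    grep -n -B 2 -A 30 'ERROR oslo_messaging.rpc.dispatcher Traceback' /var/log/neutron/server.log | less

    # Same idea for the Open vSwitch agent log mentioned above
    grep -n -B 2 -A 30 ' ERROR ' /var/log/neutron/openvswitch-agent.log | less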

Comments

A service is marked down when nova-api hasn't heard from it for a while (75 secs or so). You should see this mentioned in the nova-api.log. The nova-compute.log should complain that it can't access the message queue. Hard to say if this is linked to the Neutron problem.

Bernd Bausch ( 2016-11-10 20:35:07 -0600 )
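Following that comment, a couple of greps along those lines, assuming the default /var/log/nova/ layout (paths may differ on other installs):

    # nova-compute should complain here if it cannot reach the message queue
    grep -iE 'amqp|rabbit|timed out|error' /var/log/nova/nova-compute.log | tail -n 40

    # API/scheduler side: look for the compute service being reported as down
    grep -i 'nova-compute' /var/log/nova/nova-api.log /var/log/nova/nova-scheduler.log | grep -iw 'down' | tail -n 20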