Image stuck in spawning state

asked 2019-04-25 09:50:50 -0500

number9

updated 2019-04-26 12:03:10 -0500

I have a new OpenStack install on a compute node and a controller node, both bare metal. It all seems to work until I try to launch an instance. It just hangs in the spawning state.

On the compute node, I can find the instance under the /var/lib/nova/ directory, but its disk file is only about 200KB. I have read the FAQs and searched forum posts; most of them point to a lack of resources, but I have 900GB of disk and plenty of CPU cores and memory free. I logged some output at the time of spawning, and I will reproduce it here. Any help would be greatly appreciated.

Steps taken so far:

Checked resources on the compute node: it has nearly 1TB of free storage, 628GB of RAM free, and 76 CPUs free. Checked permissions on the directories. Turned on debug logging for nova on both the controller and the compute node. I cannot see anything out of the ordinary; the logs show no errors, only warnings.

I also followed some advice from here, to no avail:

https://stackoverflow.com/questions/47504867/openstack-vm-instance-stuck-in-spawning-state

I removed the original debug output, as I think it was a red herring. The network (I think) was not set up properly, so I went back and set it up as per these docs: https://computingforgeeks.com/creating-openstack-network-and-subnets/ Now at least when I spawn an instance the dashboard shows the correct IPs.

There are still no libvirt logs, which user eblock asked for, but I do see this in the neutron-linuxbridge-agent.log:

2019-04-26 11:56:11.394 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent MessagingTimeout:    Timed out waiting for a reply to message ID 900a069b1c5144ce810da1bae9066159
2019-04-26 11:56:11.394 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent 
2019-04-26 11:56:11.395 2417 WARNING oslo.service.loopingcall [-] Function 'neutron.plugins.ml2.drivers.agent._common_agent.CommonAgentLoop._report_state' run outlasted interval by  30.00 sec
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent [-] Failed reporting state!: MessagingTimeout: Timed out waiting for a reply to message ID 7bbd26ab7c624dbeb24f60a1f827f119
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent Traceback (most recent call last):
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 129, in _report_state
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent     True)
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/rpc.py", line 97, in report_state
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent     return method(context, 'report_state', **kwargs)
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 179, in call
2019-04-26 11:57:11.396 2417 ERROR neutron.plugins ...
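
A MessagingTimeout in an agent log generally means an RPC reply did not arrive in time, so it can help to see how often and when the timeouts occur. The following is a minimal sketch, assuming the log follows the timestamp / PID / level / logger layout seen in the excerpt above; the function name `summarize` is made up for illustration and is not part of neutron or oslo.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Layout assumed from the excerpt: "<date> <time> <pid> <LEVEL> <logger> <message>"
_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) ?(?P<msg>.*)$"
)
# Message IDs in the excerpt are 32 lowercase hex characters.
_MSG_ID = re.compile(r"message ID ([0-9a-f]{32})")


def summarize(lines: Iterable[str]) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Count lines per log level and collect (timestamp, message ID) pairs
    for every line that mentions a timed-out message ID."""
    levels: Counter = Counter()
    timeouts: List[Tuple[str, str]] = []
    for line in lines:
        match = _LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        levels[match["level"]] += 1
        hit = _MSG_ID.search(match["msg"])
        if hit:
            timeouts.append((match["ts"], hit.group(1)))
    return levels, timeouts
```

Timeouts that repeat at a regular interval, as in the excerpt, suggest the agent cannot get replies from the controller at all, rather than a one-off slowdown.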

Comments

To limit the amount of information, first search the logs for the instance's UUID. "stuck in spawning state" indicates, to me, that the nova-compute log file is the right place. Also check the libvirt log on the compute host.

A disk file of 200K tells me that storage has not yet been created.

Bernd Bausch (2019-04-25 21:11:37 -0500)
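
The suggestion to search the logs for the instance's UUID can be sketched as a small filter. This is only an illustration: the function name `lines_for_instance` is made up, and the log path in the usage comment is an assumption about a typical Ubuntu install, not something stated in this thread.

```python
from typing import Iterable, Iterator


def lines_for_instance(lines: Iterable[str], instance_uuid: str) -> Iterator[str]:
    """Yield only the log lines that mention the given instance UUID.

    The comparison is case-insensitive because UUIDs may be logged in
    either case.
    """
    needle = instance_uuid.lower()
    for line in lines:
        if needle in line.lower():
            yield line.rstrip("\n")


# Example usage (path and UUID are placeholders for your own values):
# with open("/var/log/nova/nova-compute.log", errors="replace") as fh:
#     for hit in lines_for_instance(fh, "e501e9b6-e4ac-464f-b1d6-9a5439c5c269"):
#         print(hit)
```

Running the same filter over the nova-compute log and the libvirt log keeps the output limited to the one instance that is stuck.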

Is there a corresponding libvirt VM? As admin, run openstack server show and find the VM's name, instance-xxxxxxxx. Then check if this VM exists on the compute node.

Bernd Bausch (2019-04-25 21:13:04 -0500)

Can you check if /var/lib/nova/instances/_base/abc261a53dd1e62f06a0d7cd2fc8da800b68383a has a valid size? For file-based instances, nova downloads the base image to the compute node, then creates the instance's disk on top of that base image. Your logs show this:

eblock (2019-04-26 04:28:11 -0500)

qemu-img create -f qcow2 -o backing_file=/var/lib/nova/instances/_base/abc... /var/lib/nova/instances/e501e9b6.../disk. But in the end the logs say "Cannot resize image /var/lib/nova/instances/e501e9b6-e4ac-464f-b1d6-9a5439c5c269/disk.swap to a smaller size.", so this is probably one hint.

eblock (2019-04-26 04:30:09 -0500)
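
A qcow2 disk created with a backing file stores only the differences from its base image, so a freshly created one is typically only a couple of hundred KB; the small size alone does not prove the disk is broken. The usual tool for inspecting such a file is qemu-img info. As a rough sketch of what it reports, the snippet below reads the first fields of the qcow2 header (magic, version, backing file location, virtual size). The names `Qcow2Info`, `parse_qcow2_header`, and `read_qcow2_info` are made up for illustration.

```python
import struct
from typing import NamedTuple, Optional

QCOW2_MAGIC = b"QFI\xfb"
# Big-endian fields at the start of a qcow2 file: magic, version,
# backing_file_offset, backing_file_size, cluster_bits, virtual size.
_HEADER = struct.Struct(">4sIQIIQ")


class Qcow2Info(NamedTuple):
    version: int
    virtual_size: int            # size the guest sees, in bytes
    backing_file: Optional[str]  # None when the image has no backing file


def parse_qcow2_header(data: bytes) -> Qcow2Info:
    """Parse the leading qcow2 header fields from the start of an image."""
    if len(data) < _HEADER.size:
        raise ValueError("data too short to hold a qcow2 header")
    magic, version, bf_offset, bf_size, _cluster_bits, size = _HEADER.unpack_from(data)
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    backing = None
    if bf_offset and bf_size:
        # The backing file name is stored as raw bytes without a terminator.
        raw = data[bf_offset:bf_offset + bf_size]
        backing = raw.decode("utf-8", errors="replace")
    return Qcow2Info(version, size, backing)


def read_qcow2_info(path: str) -> Qcow2Info:
    """Read the first 64 KiB of a file and parse it as a qcow2 header.

    Assumes the backing file name lies within that range, which is the
    common layout; a name stored further in would come back truncated.
    """
    with open(path, "rb") as fh:
        return parse_qcow2_header(fh.read(65536))
```

If the instance disk reports the expected base image as its backing file and a sensible virtual size, the small on-disk size is normal and the problem is more likely elsewhere.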

Maybe something's wrong with the base image and its swap? Maybe you can find more log entries that explain why the root disk can't be created. The glance connection seems fine.

eblock (2019-04-26 04:31:50 -0500)

1 answer


answered 2019-04-28 21:23:41 -0500

number9

Solved this problem. As embarrassing as this sounds, it was the following configuration error (I found no others):

In /etc/default/etcd, the line:

ETCD_INITIAL_CLUSTER='default=http://controller:2380'

was to be changed to match your configuration. Unfortunately, I read that as telling me to put:

ETCD_INITIAL_CLUSTER='default=http://hpccloud:2380'

where "hpccloud" was my controller. It actually wants the configuration to say:

ETCD_INITIAL_CLUSTER='hpccloud=http://hpccloud:2380'

I suppose I read it wrong, although it still does not make sense to me why someone would write a configuration line that way. Either way, my instance does spawn to completion now. I still have to solve a networking issue, as the instance is not accessible, but it is spawning with an IP assigned from the DHCP range and it shows up as active.

Thanks for all of the help so far.
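
The initial-cluster value is a comma-separated list of name=peer-URL pairs, and etcd expects the local member's own name to appear as one of those names. A minimal sketch of that check follows, assuming the member name (ETCD_NAME in the same file, which is not shown in this answer) was set to "hpccloud". The helper names `parse_initial_cluster` and `name_in_initial_cluster` are made up for illustration and are not part of etcd.

```python
from typing import Dict


def parse_initial_cluster(value: str) -> Dict[str, str]:
    """Split 'name1=url1,name2=url2' into a {name: url} mapping.

    If a name is listed more than once, the last URL wins, which is
    enough for a simple membership check.
    """
    members: Dict[str, str] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, url = entry.partition("=")
        if not sep or not name or not url:
            raise ValueError(f"malformed member entry: {entry!r}")
        members[name] = url
    return members


def name_in_initial_cluster(etcd_name: str, initial_cluster: str) -> bool:
    """Return True when the local member name is a key in the cluster list."""
    return etcd_name in parse_initial_cluster(initial_cluster)


# The value that failed: the member is called "default", not "hpccloud".
print(name_in_initial_cluster("hpccloud", "default=http://hpccloud:2380"))
# The value that worked: member name and key agree.
print(name_in_initial_cluster("hpccloud", "hpccloud=http://hpccloud:2380"))
```

Under that assumption, the first call prints False and the second prints True, which matches the behavior described in the answer: the key on the left of the equals sign is the member name, not a fixed keyword.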

Stats

Asked: 2019-04-25 09:18:25 -0500

Seen: 110 times

Last updated: Apr 26