I'm trying to get resume_guests_state_on_host_boot to work on my system. I have a 3-computer setup: 1 controller with Horizon and some compute nodes. The controller is running the devstack version (2013.2) and the compute nodes are running 2013.1.3 (Grizzly).
I added the line
resume_guests_state_on_host_boot = True
and restarted nova-compute. On the Horizon dashboard I started an instance and made sure (via the admin) that it was on the host I wanted. I then logged into that host (Ubuntu 12.04 LTS) and issued a reboot command. On Horizon the instance went to 'Error' and 'Shutdown'. Normally, without the resume_guests line, I could go into the host and issue a nova reboot command, but with the instance in an error state I could only terminate it.
In the nova-compute logs I got 2 libvirt errors. The first says 'no domain', but the instance is then created anyway, so I don't get that one...
libvir: QEMU Driver error : Domain not found: no domain with matching name 'instance-0000024b'
2013-10-17 17:32:25.557 AUDIT nova.compute.manager [req-ca4374f5-9d90-4ae2-b8a8-d75dec18140d admin demo] [instance: 2b015a45-8901-4d42-bb2c-64e419a0d33c] Starting instance...
2013-10-17 17:32:25.966 AUDIT nova.compute.claims [req-ca4374f5-9d90-4ae2-b8a8-d75dec18140d admin demo] [instance: 2b015a45-8901-4d42-bb2c-64e419a0d33c] Attempting claim: memory 512 MB, disk 0 GB, VCPUs 1
...
2013-10-17 17:32:33.710 INFO nova.virt.libvirt.driver [instance: 2b015a45-8901-4d42-bb2c-64e419a0d33c] Instance spawned successfully.
The other prevents the instance from being restarted after the SIGTERM event and causes the error.
2013-10-17 17:32:49.255 INFO nova.service Caught SIGTERM, exiting
2013-10-17 17:33:20.844 INFO nova.manager Skipping periodic task _periodic_update_dns because its interval is negative
2013-10-17 17:33:20.940 INFO nova.virt.driver Loading compute driver 'libvirt.LibvirtDriver'
2013-10-17 17:33:23.982 ERROR nova.openstack.common.rpc.common [req-ec003c5e-a5ec-496a-b94d-60c6c4806caa None None] AMQP server on 99.999.23.200:5672 is unreachable: [Errno 113] EHOSTUNREACH. Trying again in 1 seconds.
2013-10-17 17:33:24.983 INFO nova.openstack.common.rpc.common [req-ec003c5e-a5ec-496a-b94d-60c6c4806caa None None] Reconnecting to AMQP server on 99.999.23.200:5672
2013-10-17 17:33:25.986 INFO nova.openstack.common.rpc.common [req-ec003c5e-a5ec-496a-b94d-60c6c4806caa None None] Connected to AMQP server on 99.999.23.200:5672
2013-10-17 17:33:26.084 AUDIT nova.service Starting compute node (version 2013.1.3)
2013-10-17 17:33:26.788 INFO nova.compute.manager [req-34b54920-43cf-44ed-aea6-b56210745306 None None] [instance: 2b015a45-8901-4d42-bb2c-64e419a0d33c] Rebooting instance after nova-compute restart.
libvir: QEMU Driver error : Requested operation is not valid: domain is not running
2013-10-17 17:33:26.887 INFO nova.virt.libvirt.driver [instance: 2b015a45-8901-4d42-bb2c-64e419a0d33c] Instance destroyed successfully.
I've got logs at
- http://www.kevinmeek.ca/openstack/libvir_error.txt (nova_compute log)
Can anyone help me understand why libvirt is preventing the guest from coming back up automatically after the host reboots?