
How do I restart a suspended VM after a reboot?

asked 2013-08-12 15:25:59 -0500

netflare

updated 2013-08-14 12:54:31 -0500

Jobin

My setup is Ubuntu 13.04 with Grizzly, using grizzly and grizzly-backports in /etc/apt/sources.list.d/grizzly.list

I was experimenting with managing server recovery after a reboot and put several VMs in suspended status. To my horror I discovered that I cannot resume them. The problem appears to be described here ( https://bugs.launchpad.net/nova/+bug/1052696 ), but the bug report does not offer any suggestion on how to recover from this scenario. I have tried 'Resume instance' from Horizon, and this is what appears in nova-api.log:

2013-08-12 21:16:40.069 INFO nova.osapi_compute.wsgi.server [req-58ab0c59-7070-4ae3-af51-e12853cd6c30 3036f84a152d4b9e9a8e456e386c2d7d d31d9dff59e34b0d86dac87abcb05a0f] 172.30.1.24 "GET /v2/d31d9dff59e34b0d86dac87abcb05a0f/flavors/2542733c-e219-4e17-8830-136ba823190d HTTP/1.1" status: 200 len: 703 time: 0.0099411

I have tried to manually reset the state ( http://docs.openstack.org/trunk/openstack-compute/admin/content/reset-state.html ):

nova reset-state <id>

I have tried using virsh shutdown <ID>

but so far the VM instances show a status of Shutdown or Suspended with a power state of Shutdown, and I cannot get them to boot.


3 answers


answered 2013-08-12 21:06:42 -0500

updated 2013-08-12 21:07:58 -0500

My suggestion is to first run:

virsh destroy [instance_id]

Then go to the target instance folder, e.g. /instances-000000xxx, where there is a file named libvirt.xml, and run:

virsh create libvirt.xml

virsh list

You will see that the instance has been manually hard rebooted.
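The destroy step above only applies to a domain that is still active. A minimal sketch of a guard for it, assuming the state string comes from `virsh domstate`; `needs_destroy` is a hypothetical helper name, not part of virsh:

```shell
#!/bin/sh
# Hypothetical helper: decide whether `virsh destroy` applies, given the
# state string printed by `virsh domstate <domain>`.
needs_destroy() {
    case "$1" in
        "shut off"|"") return 1 ;;  # nothing to destroy (or state unknown)
        *)             return 0 ;;  # any other state is treated as active
    esac
}

# Intended use (not run here, since it stops the guest):
#   state=$(virsh domstate "$dom")
#   if needs_destroy "$state"; then virsh destroy "$dom"; fi
#   virsh create libvirt.xml
```

Skipping the destroy for a domain that is already shut off avoids the "domain is not running" error, but it does not by itself deal with a managed save image.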


Comments

Thanks for the suggestion. Unfortunately the system returns errors. Here's how I found the instance_id:

1) nova list shows: | cambridge | SUSPENDED
2) select id, display_name from instances shows: | 17 | cambridge
3) convert 17 to hex: 11
4) virsh list --all shows: instance-00000011 shut off
5) virsh destroy instance-00000011 fails with:

error: Failed to destroy domain instance-00000011
error: Requested operation is not valid: domain is not running

netflare ( 2013-08-13 02:21:04 -0500 )
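The id-to-hex conversion in the comment above can be scripted. A minimal sketch, assuming nova's default `instance_name_template` of `instance-%08x` (a different template in nova.conf would give different names); `domain_name` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: map a nova database id (decimal) to the libvirt
# domain name, assuming the default template instance-%08x.
domain_name() {
    printf 'instance-%08x\n' "$1"
}

domain_name 17   # prints instance-00000011
```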

answered 2013-08-16 04:01:20 -0500

netflare

I have managed to recover a suspended instance after a reboot - what an incredible journey! My lesson learned is: NEVER PUT YOUR INSTANCE IN SUSPEND STATE!!!

My solution, outlined below, fixes two problems: a) getting instances out of suspend state, and b) getting the network configuration back.

I am still suffering from residual unsolved problems: i) cinder volumes previously attached to a suspended instance are stuck in detaching state after attempting to detach them using Horizon, and ii) noVNC now only works in Chrome from a workstation in the office.

Here are the steps I took, and why.

1) Removing the saved state of the instance: First I did some research on the virsh commands to see whether anything could be done directly from the command line. I discovered that the virsh destroy / virsh start commands wouldn't work because the instance had a saved state. The following command lists the instances and shows whether each one has a saved state:

virsh list --all --managed-save

 Id    Name                 State
 1     instance-00000050    running
 2     instance-00000051    running
 -     instance-00000004    shut off
 -     instance-00000011    shut off
 -     instance-00000012    shut off
 -     instance-0000002d    saved

So I removed the saved-state using the command:

virsh managedsave-remove <instance-name>

This changes the state from 'saved' to 'shut off'
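With several suspended instances, picking the saved domains out of the listing by hand gets tedious. A rough sketch that filters the `virsh list --all --managed-save` output for rows whose state is `saved`; `saved_domains` is a hypothetical helper name, and the parsing assumes the three-column layout shown above:

```shell
#!/bin/sh
# Hypothetical helper: read `virsh list --all --managed-save` output on
# stdin and print the names of domains whose state is "saved".
saved_domains() {
    # The state is the last field; the domain name is the second field.
    awk '$NF == "saved" { print $2 }'
}

# Intended use (not run here, since it discards the saved state):
#   virsh list --all --managed-save | saved_domains |
#       while read -r dom; do virsh managedsave-remove "$dom"; done
```

Removing a managed save image throws away the guest's suspended memory, so the instance will come back as a cold boot.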

2) Synchronising the nova state in mysql: On the instance list on horizon, the status still said 'Suspended' so having nothing to lose, I logged into mysql (use nova database) and changed the status:

SELECT id, vm_state, hostname FROM instances;

Find the instance in question and change its vm_state to stopped:

UPDATE instances SET vm_state = 'stopped' WHERE id = <id obtained from previous query>;

Now the list of instances on Horizon shows a status of 'Shutoff'.
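Editing the nova database by hand is easy to get wrong, so it may help to generate the statement rather than type it. A small sketch, assuming the database is named nova as in the step above; `stopped_sql` is a hypothetical helper that only accepts a plain decimal id:

```shell
#!/bin/sh
# Hypothetical helper: print the UPDATE statement for one instance id,
# refusing anything that is not a plain decimal number.
stopped_sql() {
    case "$1" in
        ''|*[!0-9]*) echo "invalid id: $1" >&2; return 1 ;;
    esac
    printf "UPDATE instances SET vm_state = 'stopped' WHERE id = %s;\n" "$1"
}

# Intended use (not run here, since it modifies the nova database):
#   stopped_sql 17 | mysql nova
```

Validating the id keeps a typo from producing an UPDATE that touches more than one row.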

3) Tear down the network: Before attempting to hard reboot the instance, I checked the log files and discovered that quantum wasn't working; its log contained a myriad of errors I had no experience in resolving. From the clues in the bug report I mentioned in the question, I suspected the network configuration was broken because the saved state could not be rebuilt. So I deleted all L3 and L2 configuration (routers, subnets, ports) using Horizon where possible and then removed all remaining ports from the command line:

quantum port-delete <port-id>

Yep - I deleted everything. Just to be sure, I then dropped and re-created the database:

DROP DATABASE quantum;
CREATE DATABASE quantum;
GRANT ALL ON quantum.* TO 'quantumUser'@'%' IDENTIFIED BY 'quantumPass';

Then restart quantum:

cd /etc/init.d/; for i in $( ls quantum-* ); do sudo service $i restart; done

service dnsmasq restart

Checking the logs, all seemed good again, so I rebuilt my network using Horizon.

4) Hard reboot the instance (from Horizon) and create a snapshot: Even though the instance had no port allocation (no IP address or MAC address), I clicked 'Hard reboot instance'. This allowed me to click on 'Create snapshot' even though I couldn't access the instance.

5) Launch new instance ... (more)


answered 2015-07-30 22:58:01 -0500

virsh dompmwakeup vm_name might help. At least this is what allowed me to unlock the machine, force it off and then start it.

Stats

Asked: 2013-08-12 15:25:59 -0500

Seen: 5,997 times

Last updated: Aug 16 '13