I have managed to recover a suspended instance after a reboot - what an incredible journey! My lesson learned is: NEVER PUT YOUR INSTANCE IN SUSPEND STATE!!!

My solution, outlined below, fixes these problems: a) getting instances out of suspend state b) getting the network configuration back

but I am still suffering from residual unsolved problems: i) cinder volumes previously attached to a suspended instance are stuck in detaching state after attempting to detach using horizon, ii) noVnc now only works on chrome from a workstation in the office

Here are the steps I took, and why....

1) Removing the saved state of the instance: Firstly I did some research on the virsh commands to see whether anything could be done directly from the command line. I discovered that virsh destroy followed by virsh start wouldn't work because the instance had a saved state. So I found the following command, which lists the instances and tells you whether each one has a saved state:

virsh list --all --managed-save

 Id    Name                 State
 1     instance-00000050    running
 2     instance-00000051    running
 -     instance-00000004    shut off
 -     instance-00000011    shut off
 -     instance-00000012    shut off
 -     instance-0000002d    saved

So I removed the saved-state using the command:

virsh managedsave-remove <instance-name>

This changes the state from 'saved' to 'shut off'
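If more than one instance is stuck in the saved state, a small filter along these lines might save some typing. It is only a sketch: the function name saved_domains is made up for illustration, and it assumes the listing looks like the one above, with the name in the second column and "saved" as the last word on the line.

```shell
# Print the names of domains whose State column reads "saved".
# Reads `virsh list --all --managed-save` output on stdin.
# ASSUMPTION: the name is the second whitespace-separated field and
# "saved" is the last field, as in the listing above.
saved_domains() {
    awk '$NF == "saved" { print $2 }'
}

# Example use on the compute host (not run here):
#   for d in $(virsh list --all --managed-save | saved_domains); do
#       virsh managedsave-remove "$d"
#   done
```

Running the pipeline without the loop first shows which instances would be touched before anything is removed.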

2) Synchronising the nova state in mysql: On the instance list on horizon, the status still said 'Suspended' so having nothing to lose, I logged into mysql (use nova database) and changed the status:

SELECT id, vm_state, hostname FROM instances;

Find the instance in question and change the vm_state to stopped:

UPDATE instances SET vm_state = 'stopped' WHERE id = <id obtained from previous query>;

Now the list of instances on horizon shows a status of 'Shutoff'
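Editing the nova database by hand is risky, so one possible precaution is to generate the UPDATE from a checked id rather than typing it. The sketch below is an illustration only: stopped_state_sql is a hypothetical helper, and it assumes the id column is a plain integer as returned by the SELECT.

```shell
# Print an UPDATE statement that marks one instance as stopped.
# ASSUMPTION: the id is a plain integer. Anything else is rejected,
# so a typo cannot widen the WHERE clause.
stopped_state_sql() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: stopped_state_sql <numeric-id>" >&2
            return 1
            ;;
    esac
    printf "UPDATE instances SET vm_state = 'stopped' WHERE id = %s;\n" "$1"
}

# Example use (not run here; the root account and the database name
# "nova" are assumptions about the local setup):
#   mysqldump -u root -p nova instances > instances-backup.sql
#   stopped_state_sql 45 | mysql -u root -p nova
```

Taking a dump of the instances table first means the change can be undone if the instance ends up in a worse state.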

3) Tear down the network: Before attempting to hard reboot the instance, I checked the log files and discovered that quantum wasn't working: its logfile contained a myriad of errors I had no experience in resolving. From the clues in the posted bug (the original one I mentioned), I suspected the network configuration was screwed because the saved state could not be rebuilt. So I deleted all l3 and l2 configuration (routers, subnets, ports) using horizon where possible and then removed all remaining ports using the command line:

quantum port-delete <port-id>

Yep - I deleted everything. Just to be sure, I then dropped and re-created the database:

DROP DATABASE quantum;
CREATE DATABASE quantum;
GRANT ALL ON quantum.* TO 'quantumUser'@'%' IDENTIFIED BY 'quantumPass';

Then restart quantum:

cd /etc/init.d/; for i in $( ls quantum-* ); do sudo service $i restart; done

service dnsmasq restart
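The restart loop above parses the output of ls, which can misbehave if the directory holds unexpected file names. A glob-based variant could look like the sketch below; quantum_services is a hypothetical helper name, and the default directory /etc/init.d is taken from the command above.

```shell
# List the quantum init scripts found in a directory (default /etc/init.d).
# Uses a shell glob instead of parsing `ls`, and prints nothing when no
# quantum-* script exists.
quantum_services() {
    dir=${1:-/etc/init.d}
    for path in "$dir"/quantum-*; do
        [ -e "$path" ] || continue   # the glob matched nothing
        basename "$path"
    done
}

# Example use (not run here):
#   for svc in $(quantum_services); do sudo service "$svc" restart; done
```

Calling quantum_services on its own first lists which services the loop would restart.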

Checking the logs, all seemed good again, so I rebuilt my network using horizon.

4) Hard reboot the instance (from horizon) and create a snapshot: Even though the instance had no port allocation (no ip address or mac address), I clicked 'Hard reboot instance'. This allowed me to click on 'Create snapshot' even though I couldn't access the instance.

5) Launch new instance from the snapshot: I selected the same flavor, security and networking settings and launched the new instance. I gave it a floating IP and all looked good on the horizon dashboard. I then set up the access and security rules to allow port 22 for ssh (these were wiped when I reset the database).

6) Login from the console [noVnc seems to have problems now, but it works locally in Chrome for some reason]: I could see the server had no network configuration, saying eth0 does not seem to be present. My images do not have the 70-persistent-net.rules file, so I copied one in from a working 'proper' server and obtained the ip address from the command:

quantum port-list

and set the mac address in the 70-persistent-net.rules file to that of the port that had the same ip address. Then I rebooted..... and it worked! I can now ssh back in!
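Matching the ip address to a mac address by eye is error-prone when there are many ports, so the lookup could be scripted roughly as below. This is a sketch under assumptions: mac_for_ip and persistent_net_rule are hypothetical helpers, the port-list table is assumed to have the columns id, name, mac_address, fixed_ips with fixed_ips printed as "ip_address": "x.x.x.x", and the rule layout follows the usual generated udev entries rather than the exact file I copied.

```shell
# Print the MAC address of the port that owns the given fixed IP.
# Reads `quantum port-list` table output on stdin.
# ASSUMPTION: columns are | id | name | mac_address | fixed_ips |,
# so after splitting on "|" the MAC is field 4 and fixed_ips is field 5.
mac_for_ip() {
    awk -F'|' -v ip="$1" '
        index($5, "\"ip_address\": \"" ip "\"") {
            gsub(/[ \t]/, "", $4)   # strip the table padding
            print $4
        }'
}

# Print a udev rule that pins a MAC address to an interface name
# (default eth0). Compare it with the rule in the file copied from
# the working server before using it.
persistent_net_rule() {
    printf 'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="%s", KERNEL=="eth*", NAME="%s"\n' "$1" "${2:-eth0}"
}

# Example use (not run here; 10.0.0.2 is a placeholder address):
#   mac=$(quantum port-list | mac_for_ip 10.0.0.2)
#   persistent_net_rule "$mac"
```

The ip address is matched together with its surrounding quotes, so looking up 10.0.0.2 does not also pick up the port for 10.0.0.20.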

Now to solve the stuck cinder volumes and work out why noVnc isn't working.... any pointers would be gratefully received.