Unable to start instances after host reboot
I have a two-node environment in which one host acts as the controller, the network node, and a compute node. A second host acts as an additional compute node. Both hosts run CentOS 7.
At the moment, I have excluded the second node from the setup to simplify troubleshooting.
I am able to create and start instances. I use "Boot from image (creates new volume)" when creating these.
After a reboot of the host, none of the instances that were created before the reboot will start. When I try to start them from the dashboard, I get an exception in the nova-api log and the status remains 'Shutoff'. The full output is at http://pastebin.com/CFgb5HbY but what I believe to be the key parts are:
2014-12-24 13:38:37.778 3058 ERROR oslo.messaging.rpc.dispatcher [req-695fe6a8-245b-443e-9d4c-28563c4e31ba ] Exception during message handling: Unexpected error while running command.
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf iscsiadm -m node -T iqn.2010-10.org.openstack:volume-400c6b9e-4c22-4b2e-80b6-8875771a7f97 -p 192.168.10.120:3260 --rescan
Exit code: 21
Stdout: u''
Stderr: u'iscsiadm: No session found.\n'
If I create a new instance with a new volume, that starts.
After doing that, the error I get when trying to start any of the instances that were created before the reboot changes. The full output is at http://pastebin.com/sfQvyjux and the key line is:
2014-12-24 13:36:00.113 2871 ERROR oslo.messaging.rpc.dispatcher [req-506b75df-4c7a-4fbd-8c94-b0178d0a109b ] Exception during message handling: iSCSI device not found at /dev/disk/by-path/ip-192.168.10.120:3260-iscsi-iqn.2010-10.org.openstack:volume-78fcae28-99b5-41fb-aba8-6a56f7bb04cf-lun-0
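For anyone checking the same thing, the device path from that error can be rebuilt from the portal, the target IQN, and the LUN, and then tested for existence on the compute host. This is only a minimal sketch, assuming the by-path naming pattern shown in the error above; the helper name `iscsi_by_path` is hypothetical and is not part of nova.

```python
import os


def iscsi_by_path(portal: str, iqn: str, lun: int = 0) -> str:
    """Build the /dev/disk/by-path name for an iSCSI LUN.

    Assumes the naming pattern seen in the nova error above:
    ip-<portal>-iscsi-<iqn>-lun-<lun>
    """
    return f"/dev/disk/by-path/ip-{portal}-iscsi-{iqn}-lun-{lun}"


if __name__ == "__main__":
    # Portal and IQN copied from the error message above.
    path = iscsi_by_path(
        "192.168.10.120:3260",
        "iqn.2010-10.org.openstack:volume-78fcae28-99b5-41fb-aba8-6a56f7bb04cf",
    )
    # Report whether the device node is currently present on this host.
    print(path, "exists" if os.path.exists(path) else "missing")
```

If the path is reported as missing, that matches the "iSCSI device not found" message rather than explaining it; it only confirms that the device node is absent on the host where the script runs.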
Both volumes are visible under Admin -> Volumes.
I don't see anything in the cinder api log or cinder server log while attempting to restart the instances.
The status of the target service is as follows:
service target status -l
Redirecting to /bin/systemctl status -l target.service
target.service - Restore LIO kernel target configuration
Loaded: loaded (/usr/lib/systemd/system/target.service; enabled)
Active: active (exited) since Wed 2014-12-24 13:37:36 GMT; 6h ago
Process: 1028 ExecStart=/usr/bin/targetctl restore (code=exited, status=0/SUCCESS)
Main PID: 1028 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/target.service
Dec 24 13:37:35 controller.penguinpowered.org systemd[1]: Starting Restore LIO kernel target configuration...
Dec 24 13:37:36 controller.penguinpowered.org target[1028]: No saved config file at /etc/target/saveconfig.json, ok, exiting
Dec 24 13:37:36 controller.penguinpowered.org systemd[1]: Started Restore LIO kernel target configuration.
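The "No saved config file" line in that output can be checked directly on the storage node. The following is a minimal sketch, assuming the path printed in the log line; `saved_target_config` is a hypothetical helper, and treating the file as JSON is an assumption based only on its extension.

```python
import json
import os

# Path taken from the target.service log line above.
SAVECONFIG = "/etc/target/saveconfig.json"


def saved_target_config(path=SAVECONFIG):
    """Return the parsed saved config, or None if the file is absent."""
    if not os.path.exists(path):
        return None
    with open(path) as fh:
        return json.load(fh)


if __name__ == "__main__":
    config = saved_target_config()
    if config is None:
        print("no saved config at", SAVECONFIG)
    else:
        print("saved config found at", SAVECONFIG)
```

Reading files under /etc/target may require root, so the script is best run with the same privileges as the target service.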
Please, post
on Storage node
Sorry, that exceeds the comment limit again. The pastebin is at http://pastebin.com/S4md4akB
I'll also amend the question to include that.
I need :-
The output of targetcli ls is at http://pastebin.com/SAgnYCFV and was taken on the storage node.
The iscsid status is at http://pastebin.com/t9J9u4mU. As I'm only running the combined controller / storage / compute node right now, that was run on the controller node, which is also a compute node.
I asked
not iscsi