Instances with NFS-backed Cinder volumes hang after network failure

asked 2016-10-28 04:11:01 -0500

elenhil

Hello guys!
We currently have an installation of OpenStack Liberty + KVM + Ubuntu 14.04, mostly default.
Cinder and Nova are backed by qcow2 over NFS, and the Cinder volumes reside on the same servers that run nova-compute. The problem is that we have power issues in our building, and we can't always shut the servers down properly when this happens.
So, when the power goes down and the network goes down, all instances that have Cinder volumes attached hang. The NFS share simply becomes unavailable, even if the share is mounted locally (from 192.168.1.29 onto 192.168.1.29, for example). I can't even ls there; it hangs.
And it never comes back until the server is rebooted, even if the network is back.
I can't even restart an instance from KVM locally; it says that the device is busy.
So every instance that has a Cinder volume mounted hangs and won't come back up until the hypervisor is rebooted.

Is there a way to fix this issue?
Or at least to keep instances with local Cinder volumes (on the same host where Nova runs the instance) from hanging?
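
To narrow down which mounts are actually stuck without hanging the shell the way a plain ls does, something like the following might help. It is only a minimal sketch: the function names and the 5-second default are made up for illustration, and it assumes a `stat` binary is on the PATH. The check runs in a child process, so only the child blocks if the share does not answer.

```python
import subprocess


def responds_within(cmd, timeout_s=5.0):
    """Return True if cmd exits with status 0 within timeout_s seconds."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Send SIGKILL but do not wait for the child afterwards: a process
        # blocked on an unresponsive NFS mount can sit in uninterruptible
        # sleep and may not exit until the server answers again.
        proc.kill()
        return False


def mount_is_responsive(path, timeout_s=5.0):
    """Hypothetical helper: probe a path with stat in a child process."""
    return responds_within(["stat", "--", path], timeout_s)
```

Calling `mount_is_responsive()` on each NFS mount point (the paths depend on the deployment) returns False for a share that hangs, while the calling script keeps running.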


Comments

Are there any ERROR messages in the NFS server log on the compute node? Same question for the NFS client log. I would expect the issues detected on reboot to show up somewhere in the NFS-related logs or in /var/log/messages (dmesg). In general, your compute nodes' filesystems probably have incorrect inodes (fsck might help).

dbaxps (2016-10-28 07:31:02 -0500)
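
As a rough aid for the log check suggested in the comment above, here is a small sketch that picks NFS timeouts out of kernel log text. It assumes the Linux NFS client reports them in the usual "nfs: server <host> not responding" form; the function name is hypothetical, and the lines would come from dmesg output or /var/log/messages.

```python
import re

# Assumed format of the kernel's NFS client timeout message.
NFS_TIMEOUT = re.compile(r"nfs: server (\S+) not responding")


def servers_not_responding(log_lines):
    """Return the NFS servers reported as not responding, in first-seen order."""
    servers = []
    for line in log_lines:
        match = NFS_TIMEOUT.search(line)
        if match and match.group(1) not in servers:
            servers.append(match.group(1))
    return servers
```

Passing in the lines of the kernel log gives a list of server addresses that timed out at least once, which could then be compared against the hosts whose instances hung.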