Ceph VM volumes remain locked after host crash

asked 2019-08-13 04:16:19 -0500

ecki

One of our OpenStack hosts crashed. After the reboot, the VMs on that machine could not start because of filesystem errors. After some investigation we noticed that the VMs' ephemeral volumes (which we store in Ceph RBD) still had a write lock.

Manually removing the lock allowed us to start the VMs again, but I wonder whether there should be an automated process for this. Should the Ceph client normally detect that it is the same host retrying the lock? (We run some of the OpenStack and Ceph services in Docker containers, so the problem might be caused by new "host" names.)
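
For reference, the manual step we performed can be sketched roughly as below. This is only a minimal sketch, not an official procedure: it wraps the `rbd lock ls` and `rbd lock rm` commands and prints the removal commands instead of running them. The image spec passed on the command line (for example `vms/<instance-uuid>_disk`) is a hypothetical placeholder, and the two JSON shapes handled by `parse_locks` are assumptions about what `rbd lock ls --format json` emits on different Ceph releases, so check the real output on your cluster first.

```python
import json
import shlex
import subprocess
import sys


def parse_locks(json_text):
    """Normalize the JSON from `rbd lock ls --format json` into a list of dicts.

    Assumption: the output is either a list of {"id", "locker", "address"}
    objects or a dict keyed by lock id. Verify this against your Ceph version.
    """
    data = json.loads(json_text) if json_text.strip() else []
    if isinstance(data, dict):
        # Assumed dict shape: {"<lock id>": {"locker": ..., "address": ...}}
        return [
            {"id": lock_id, "locker": info.get("locker"), "address": info.get("address")}
            for lock_id, info in data.items()
        ]
    return [
        {"id": e.get("id"), "locker": e.get("locker"), "address": e.get("address")}
        for e in data
    ]


def remove_command(image_spec, lock):
    """Build the argv for `rbd lock rm <image-spec> <lock-id> <locker>`."""
    return ["rbd", "lock", "rm", image_spec, lock["id"], lock["locker"]]


def list_locks(image_spec):
    """Ask the rbd CLI for the locks currently held on one image."""
    result = subprocess.run(
        ["rbd", "lock", "ls", image_spec, "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return parse_locks(result.stdout)


if __name__ == "__main__" and len(sys.argv) == 2:
    image = sys.argv[1]  # hypothetical example: vms/<instance-uuid>_disk
    for lock in list_locks(image):
        # Dry run: only print the command. Lock ids can contain spaces,
        # so shlex.join is used to quote them correctly.
        print(shlex.join(remove_command(image, lock)))
```

The script deliberately stops at printing the commands: a lock should only be removed once it is certain that the client holding it (here, the crashed host's QEMU process) is really gone, otherwise two writers could end up on the same image.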
