Cinder volumes suddenly go read-only
We had a power outage at work that outlasted the battery runtime, so the OpenStack cluster shut down hard. Everything came back up fine, but now, after a week of the cluster running, the Cinder volumes spontaneously go into read-only mode (this has happened 3 times already). When I list the volumes in Cinder, they are not shown as read-only. The only fix I have found is logging into every VM and remounting the volumes. I checked the Cinder logs and they show no errors or strange output. Has anyone seen this before, or have any idea where I can start debugging?
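Since Cinder itself reports the volumes as writable, the read-only state has to be observed from inside each VM. A minimal sketch of such a check follows, assuming a Linux guest where `/proc/mounts` reflects the current mount options. The function name `read_only_mounts` and the sample device names are made up for illustration and do not come from the affected cluster.

```python
def read_only_mounts(mounts_text, only_block_devices=True):
    """Return (device, mount_point) pairs whose mount options include 'ro'.

    mounts_text is text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Pseudo filesystems are often legitimately read-only, so by
        # default only entries backed by a /dev/ node are reported.
        if only_block_devices and not device.startswith("/dev/"):
            continue
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found


# Made-up sample input, for illustration only.
SAMPLE = """\
/dev/vda1 / ext4 rw,relatime 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec 0 0
/dev/vdb /mnt/data ext4 ro,relatime 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE))
```

Inside a guest, the same function could be fed the live table with `read_only_mounts(open("/proc/mounts").read())`, which would avoid finding out about the problem only when writes start failing.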
Just wanted to bump this. I have checked the integrity of the drives serving the volumes with smartctl, and the short tests passed. The drives don't seem to be degrading (they are very new). I have put Cinder in DEBUG log mode, but I haven't seen anything unusual.
Hi Glen, are you using the LVM driver? You might try looking in the n-cpu and KVM logs on the system. It sounds like the instance/KVM is detecting something and setting the state independently of Cinder. I'm looking on my side to try to reproduce it.
I'm pretty sure I'm using the LVM driver. My Cinder config points to a volume group, and all the volumes are attached to it. The volume group is called cinder-volumes. Here is the output of lsblk: https://pastee.org/4zpw2
Also, I think the kernel logs from the day of the failure pretty much sum it up. Can you take a look at this log and confirm? https://pastee.org/knmdh Looks like the volume is corrupt? I rebooted the controller node, which is where the volumes are, and running fsck shows no errors.
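For picking the relevant lines out of a long kernel log, a rough sketch like the one below might help. The patterns are assumptions based on messages that ext4 and the block layer commonly print when a filesystem is remounted read-only. They are not taken from the pasted log, and other filesystems or kernel versions may word things differently. The name `suspect_lines` is hypothetical.

```python
import re
import sys

# Assumed patterns: typical ext4 / block-layer wording, not an exhaustive list.
SUSPECT_PATTERNS = [
    re.compile(r"Remounting filesystem read-only", re.IGNORECASE),
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"EXT4-fs error", re.IGNORECASE),
    re.compile(r"Aborting journal", re.IGNORECASE),
]


def suspect_lines(log_text):
    """Return the log lines that match any of the suspect patterns, in order."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in SUSPECT_PATTERNS)
    ]


if __name__ == "__main__":
    # Usage: python scan_kernel_log.py <path-to-kernel-log>
    with open(sys.argv[1], errors="replace") as handle:
        for line in suspect_lines(handle.read()):
            print(line)
```

Running this against the kernel log on both the guest and the host serving the volumes could show whether the I/O errors start on the host side or only appear inside the instance.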
@glenbot Were you able to fix this error?