
How could adding a compute node cause my VMs' disks to go read-only (RO)?

asked 2015-01-09 18:25:12 -0500

Joshua Miller

A few days ago I added another compute node to one of my OpenStack clusters, and immediately afterwards the disks in all of my VMs went read-only, e.g. "touch: cannot touch `/tmp/testing123456789': Read-only file system".
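For anyone wanting to confirm the same symptom on their own guests, here is a minimal sketch of a check for read-only mounts. It assumes a Linux guest and the standard `/proc/mounts` layout (device, mount point, filesystem type, comma-separated options); the function name `read_only_mounts` is made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs whose mount options include 'ro'.

    mounts_text is expected to be in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            # Skip blank or malformed lines.
            continue
        _device, mount_point, fs_type, options = fields[:4]
        # Options are comma-separated; 'ro' marks a read-only mount.
        if "ro" in options.split(","):
            result.append((mount_point, fs_type))
    return result
```

Inside a guest, this could be fed the contents of `/proc/mounts`, for example `read_only_mounts(open("/proc/mounts").read())`. A root filesystem that normally mounts read-write but shows up in the result would match the symptom described above.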

My environment uses shared storage over NFS. The hypervisors run Ubuntu with KVM. All VMs hit this same problem at roughly the same time (inferred from the timestamps of the last logs they successfully wrote), across all of the compute hypervisors. I can't find any indication that the hypervisors themselves were ever unable to access the storage. The same storage array serves other services, and none of them were impacted.

I was able to reboot the VMs and they all came up without issue. The problem seems to have occurred point-in-time, and only during the initial addition of a new compute node to the cluster. The new compute node has been a happy, functional member of the cluster with no issues following the initial add. Anybody have ideas on why this happened?


1 answer


answered 2015-01-10 18:26:26 -0500

This scenario sounds like something to trace in the communication between the compute nodes and the NFS storage. It is unlikely to be an OpenStack issue and more likely related to the NFS solution you are using. I would first confirm on the storage side that there was no major event (I am thinking of fail-over events, as with NetApp filers and/or clustered NFS file servers). If the evidence is still inconclusive, I would set up packet traces when you add the next compute node: capture the NFS activity between an existing node and the NFS storage, and also capture it on the new compute node being added. You might find that there is some initial file locking going on that is tripping up the entire environment.
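As a rough sketch of the packet-trace step, the helper below builds a `tcpdump` command line that writes all traffic between a node and the NFS server to a capture file. The function name, the interface `eth0`, the address `192.0.2.10`, and the output path are placeholders, not values from this environment. The filter matches on the server host rather than only on port 2049, on the assumption that lock and mount traffic may use other ports depending on the NFS version.

```python
def build_capture_command(interface, nfs_server, output_path):
    """Build a tcpdump argv list capturing all traffic to/from the NFS server.

    All three arguments are placeholders to be replaced with real values.
    """
    return [
        "tcpdump",
        "-i", interface,     # interface that carries the NFS traffic
        "-s", "0",           # capture full packets instead of truncating them
        "-w", output_path,   # write raw packets to a pcap file for later analysis
        "host", nfs_server,  # filter: only traffic to or from the NFS server
    ]


# Example (hypothetical values); running the capture typically requires root:
#   import subprocess
#   subprocess.run(build_capture_command("eth0", "192.0.2.10", "/tmp/nfs.pcap"))
```

Starting the capture on an existing compute node and on the new node before the add, then stopping both afterwards, gives two traces that can be compared around the time the guests go read-only.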




Asked: 2015-01-09 18:25:12 -0500

Seen: 195 times

Last updated: Jan 10 '15