Occasional permission denied creating /var/lib/nova/instances/.directio.test during live migration

I'm setting up a shared GlusterFS filesystem that Nova (Kilo) will use as its instance storage backend. This will give us live migrations and some data redundancy. We're using the RDO release on CentOS 7.

The migration usually works, but every once in a while we'll get this error:

   ERROR oslo_messaging.rpc.dispatcher [req-b8b14414-c7a5-4e06-b3e7-7ff9cbb72016 76a2e5e3849646a0bf525d632ba15836 e010a6ef41fd4c08a2e8f3b5d63c6210 - - -] Exception during message handling: [Errno 13] Permission denied: '/var/lib/nova/instances/.directio.test'

We can do fifty live migrations back and forth without issue, and then this error occurs and the instance ends up in an inconsistent state: the database thinks it's running on one hypervisor, but the hypervisor thinks it's running somewhere else. Of course, the instance goes down at that point.

I've made the directory world-writable (for testing) and the UIDs are the same across all servers. Any ideas?
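
For comparing hosts, the permission and ownership facts mentioned above can be captured with a small helper like the one below. This is only a sketch: the function name `describe_path` is hypothetical, and it reports what the local mount shows, which may not be the whole story on a shared filesystem.

```python
import os
import stat


def describe_path(path):
    """Summarise mode and ownership of path as seen from this host."""
    st = os.stat(path)
    return {
        # Permission bits only, e.g. '0o777' on Python 3
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "uid": st.st_uid,
        "gid": st.st_gid,
        # True when the "other" write bit is set
        "world_writable": bool(st.st_mode & stat.S_IWOTH),
    }


if __name__ == "__main__":
    # Path taken from the error message in this post
    print(describe_path("/var/lib/nova/instances"))
```

Running it on each hypervisor and diffing the output is one way to confirm that every node sees the same mode, UID, and GID for the instances directory.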

Also, for what it's worth, is there any problem with simply removing the direct I/O test from /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py in our case? We support direct I/O, so I'm fine just setting 'hasDirectIO = True' and bypassing the problematic code. Thoughts?
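
For context, a direct I/O probe of this kind can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in driver.py: the function name `supports_direct_io` is hypothetical, and the per-call unique file name is an assumption added here on the guess (not confirmed by the log above) that several hosts touching one fixed file name on a shared mount could collide.

```python
import errno
import mmap
import os
import uuid


def supports_direct_io(dirpath):
    """Return True if a file in dirpath can be opened and written with O_DIRECT."""
    if not hasattr(os, "O_DIRECT"):
        return False  # this platform has no O_DIRECT flag

    # Assumption: a unique name per call, so hosts sharing the mount
    # never create or unlink each other's test file.
    testfile = os.path.join(dirpath, ".directio.test.%s" % uuid.uuid4().hex)
    fd = None
    buf = None
    try:
        fd = os.open(testfile, os.O_CREAT | os.O_WRONLY | os.O_DIRECT)
        # O_DIRECT requires an aligned buffer; an anonymous mmap is page-aligned.
        buf = mmap.mmap(-1, 4096)
        buf.write(b"x" * 4096)
        os.write(fd, buf)
        return True
    except OSError as e:
        if e.errno == errno.EINVAL:
            return False  # filesystem rejects O_DIRECT
        raise  # e.g. EACCES: surface permission problems to the caller
    finally:
        if buf is not None:
            buf.close()
        if fd is not None:
            os.close(fd)
        try:
            os.unlink(testfile)
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise
```

Note that hardcoding the result instead of probing would skip both the EINVAL check and the permission error, so it hides the symptom rather than explaining why the create intermittently fails.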

Thanks!
