Cold migration on Queens with local disks fails due to permission issues

asked 2019-10-21 13:21:40 -0600

updated 2019-10-22 14:12:40 -0600

Hello,

I am running OpenStack Queens on Fedora 28 as a test environment to understand how it works.

One of the items I have been struggling with is cold migration with local disks.

Here are my steps:

  1. Launch a VM with 20GB local disk and 20GB ephemeral storage. The VM comes up and is functional.
  2. Initiate a cold migration using nova migrate <instance-uuid> --poll
  3. The VM is assigned to a new hypervisor and the disk is copied as well. However, the ownership and permissions of the disk files are different on the destination hypervisor.

Here is the output of openstack server show <uuid> after launching the VM:

[root@control2015 ~]# openstack server show ebde3173-cc7f-401f-8d6c-16f89621a285 -f json | jq .
{
  "OS-EXT-STS:task_state": null,
  "addresses": "network-72=10.189.72.102",
  "image": "OL-7 (b5add240-b20b-452e-855f-ef01ed49d138)",
  "OS-EXT-STS:vm_state": "active",
  "OS-EXT-SRV-ATTR:instance_name": "instance-00000024",
  "OS-SRV-USG:launched_at": "2019-10-21T18:00:40.000000",
  "flavor": "test-flavor (7ec79eb8-dc43-4b4a-8ac2-81e18b67ae82)",
  "id": "ebde3173-cc7f-401f-8d6c-16f89621a285",
  "security_groups": "name='open'",
  "volumes_attached": "",
  "user_id": "c17f28d0bd654d9ba04671ca72ee625f",
  "OS-DCF:diskConfig": "AUTO",
  "accessIPv4": "",
  "accessIPv6": "",
  "progress": 0,
  "OS-EXT-STS:power_state": "Running",
  "OS-EXT-AZ:availability_zone": "devstack2",
  "config_drive": "",
  "status": "ACTIVE",
  "updated": "2019-10-21T18:00:40Z",
  "hostId": "674690363f457e075023e145885db5f1b8f174a516891854fcc1c7f0",
  "OS-EXT-SRV-ATTR:host": "compute2004",
  "OS-SRV-USG:terminated_at": null,
  "key_name": "kkanjee-general",
  "properties": "",
  "project_id": "8bba4dea354648d0ada7c4781c6306a5",
  "OS-EXT-SRV-ATTR:hypervisor_hostname": "compute2004",
  "name": "test",
  "created": "2019-10-21T18:00:28Z"
}

The disk permissions and ownership on compute2004 where the VM was originally launched:

[root@compute2004 ~]# ls -alrt /var/lib/nova/instances/ebde3173-cc7f-401f-8d6c-16f89621a285
total 41943100
-rw-r--r-- 1 nova nova       162 Oct 21 14:00 disk.info
drwxrwxr-x 7 nova nova       193 Oct 21 14:00 ..
drwxr-xr-x 2 nova nova        71 Oct 21 14:00 .
-rw------- 1 root root     55760 Oct 21 14:01 console.log
-rw-r--r-- 1 qemu qemu    393216 Oct 21 14:01 disk.eph0
-rw-r--r-- 1 qemu qemu 121438208 Oct 21 14:06 disk
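To compare ownership and modes between the source and destination hypervisors without eyeballing two ls listings, something like the following could be run on each host. This is only a sketch: the function names describe and report are made up for illustration, and it assumes Python 3 is available on the compute nodes and that the script runs as a user that can read the instance directory.

```python
import grp
import os
import pwd
import stat


def describe(path):
    """Return (owner, group, mode string) for a single path."""
    st = os.lstat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid has no passwd entry on this host
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid has no group entry on this host
    return owner, group, stat.filemode(st.st_mode)


def report(instance_dir):
    """Map each file name in instance_dir to its (owner, group, mode)."""
    rows = {}
    for name in sorted(os.listdir(instance_dir)):
        rows[name] = describe(os.path.join(instance_dir, name))
    return rows
```

Calling report("/var/lib/nova/instances/<instance-uuid>") on both hosts and comparing the two dictionaries should show exactly which files changed owner, group, or mode during the migration.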

The migration command fails:

[root@control2015 ~]# nova migrate ebde3173-cc7f-401f-8d6c-16f89621a285 --poll

Server migrating... 0% complete
Error migrating server
ERROR (ResourceInErrorState):

The VM is scheduled on a different hypervisor (compute2001), as can be seen below. The disk is also transferred, but its permissions and ownership are different, which causes the permission denied error:

[root@control2015 ~]# openstack server show ebde3173-cc7f-401f-8d6c-16f89621a285 -f json | jq .
{
  "OS-EXT-STS:task_state": null,
  "addresses": "network-72=10.189.72.102",
  "image": "OL-7 (b5add240-b20b-452e-855f-ef01ed49d138)",
  "OS-EXT-STS:vm_state": "error",
  "OS-EXT-SRV-ATTR:instance_name": "instance-00000024",
  "OS-SRV-USG:launched_at": "2019-10-21T18:00:40.000000",
  "flavor": "test-flavor (7ec79eb8-dc43-4b4a-8ac2-81e18b67ae82)",
  "id": "ebde3173-cc7f-401f-8d6c-16f89621a285",
  "security_groups": "name='open'",
  "volumes_attached": "",
  "user_id": "c17f28d0bd654d9ba04671ca72ee625f",
  "OS-DCF:diskConfig": "AUTO",
  "accessIPv4": "",
  "accessIPv6": "",
  "OS-EXT-STS:power_state": "Running",
  "OS-EXT-AZ:availability_zone": "devstack2",
  "config_drive": "",
  "status": "ERROR",
  "updated": "2019-10-21T18:08:54Z",
  "hostId": "1209685e863f9c7368ac3407e593289ef88a9cc2aa3f12f6f7037fbd",
  "OS-EXT-SRV-ATTR:host": "compute2001",
  "OS-SRV-USG:terminated_at": null,
  "key_name": "kkanjee-general",
  "properties": "",
  "project_id": "8bba4dea354648d0ada7c4781c6306a5",
  "OS-EXT-SRV-ATTR:hypervisor_hostname": "compute2001",
  "name": "test",
  "created": "2019-10-21T18:00:28Z",
  "fault": {
    "message": "libvirtError",
    "code": 500,
    "details": "Traceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 203, in decorated_function\n    return function(self, context, *args, **kwargs)\n  File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 4570, in finish_resize\n    self._revert_allocation(context, instance, migration)\n  File \"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py\", line 220, in __exit__\n    self.force_reraise()\n  File \"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py\", line 196, in force_reraise\n    six.reraise(self.type_ ...
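The traceback in the "fault" field is hard to read because it is a single JSON string with escaped newlines. A small helper along these lines could pull it apart. It is a sketch only: fault_summary is a hypothetical name, and it assumes the complete output of openstack server show <uuid> -f json has been captured as a string (the truncated text pasted above would not parse).

```python
import json


def fault_summary(server_json):
    """Extract status, host, and fault information from server show JSON."""
    server = json.loads(server_json)
    fault = server.get("fault") or {}  # healthy servers have no fault key
    details = fault.get("details", "")
    return {
        "status": server.get("status"),
        "host": server.get("OS-EXT-SRV-ATTR:host"),
        "message": fault.get("message"),
        "code": fault.get("code"),
        # One list entry per traceback line instead of one long string.
        "traceback": details.splitlines(),
    }
```

Printing the "traceback" list one entry per line gives the Python traceback in its usual layout, which makes it easier to spot the underlying libvirt error.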

Comments

I made some progress by changing nova.conf to use rsync as the remote_filesystem_transport:

[libvirt]
...
remote_filesystem_transport = rsync

The above allows cold migrations to complete successfully. However, the permissions on the destination are not the same as on the source.

Komail Kanjee (2019-10-28 08:24:40 -0600)

Check that the SELinux permissions are correct for the /var/lib/nova directory on the destination host.

srelf (2019-11-15 13:49:25 -0600)
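One way to follow up on the SELinux suggestion is to read the label stored on each file and compare the results between source and destination. The sketch below assumes a Linux host with Python 3 and relies on the label being stored in the security.selinux extended attribute. The function names are hypothetical, and the code deliberately does not assume what the correct label should be; it only reports what is there.

```python
import os


def selinux_context(path):
    """Return the SELinux label stored on path, or None if unavailable."""
    getxattr = getattr(os, "getxattr", None)  # os.getxattr is Linux-only
    if getxattr is None:
        return None
    try:
        raw = getxattr(path, "security.selinux")
    except OSError:
        # No label, filesystem without xattr support, or access denied.
        return None
    return raw.rstrip(b"\x00").decode("utf-8", "replace")


def contexts_in(directory):
    """Map each entry in directory to its SELinux label (or None)."""
    result = {}
    for name in sorted(os.listdir(directory)):
        result[name] = selinux_context(os.path.join(directory, name))
    return result
```

Running contexts_in on the instance directory on both hypervisors and comparing the output would show whether the labels differ after the migration. The same information is available interactively from ls -Z.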