Cold migration only works from one host to the other, but not the other way around [closed]

asked 2016-10-18 05:16:57 -0500

AlvaroG

Hi,

I have a setup with two compute hosts, node-10 and node-12. I'm trying 'nova migrate', and migration only works from node-12 to node-10, not from node-10 to node-12. In the failing direction the command returns no error message, but the resize step of the migration never happens and the VM stays in the ACTIVE state.

I suspect that my problem is a misconfiguration of host aggregates and availability zones, but the more documentation I read about these two concepts, the less I understand.

This is my setup:

root@node-5:~# nova hypervisor-list
+----+---------------------+-------+---------+
| ID | Hypervisor hostname | State | Status  |
+----+---------------------+-------+---------+
| 1  | node-12.domain.tld  | up    | enabled |
| 4  | node-10.domain.tld  | up    | enabled |
+----+---------------------+-------+---------+

root@node-5:~# nova aggregate-list
+----+---------+--------------------+
| Id | Name    | Availability Zone  |
+----+---------+--------------------+
| 1  | node-10 | node-10.domain.tld |
| 4  | node-12 | node-12.domain.tld |
+----+---------+--------------------+

root@node-5:~# nova aggregate-details 1
+----+---------+--------------------+-------+----------------------------------------+
| Id | Name    | Availability Zone  | Hosts | Metadata                               |
+----+---------+--------------------+-------+----------------------------------------+
| 1  | node-10 | node-10.domain.tld |       | 'availability_zone=node-10.domain.tld' |
+----+---------+--------------------+-------+----------------------------------------+

root@node-5:~# nova aggregate-details 4
+----+---------+--------------------+--------------------------------------------+----------------------------------------+
| Id | Name    | Availability Zone  | Hosts                                      | Metadata                               |
+----+---------+--------------------+--------------------------------------------+----------------------------------------+
| 4  | node-12 | node-12.domain.tld | 'node-12.domain.tld', 'node-10.domain.tld' | 'availability_zone=node-12.domain.tld' |
+----+---------+--------------------+--------------------------------------------+----------------------------------------+
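
In the two listings above, aggregate 1 shows an empty Hosts column while aggregate 4 lists both compute hosts. To compare memberships without reading the tables by eye, the Hosts column can be extracted with a small filter. This is only a sketch: `aggregate_hosts` is a hypothetical helper name, and the parsing assumes the table layout printed by `nova aggregate-details` as shown here.

```shell
#!/bin/sh
# Hypothetical helper: read `nova aggregate-details` table output on stdin
# and print one host name per line (nothing if the aggregate has no hosts).
aggregate_hosts() {
    awk -F'|' -v q="'" '
        # Border rows contain no "|" (NF is 1); the header row has "Id" in column 2.
        NF >= 6 && $2 !~ /Id/ {
            n = split($5, hosts, ",")            # column 5 is "Hosts"
            for (i = 1; i <= n; i++) {
                gsub("[ " q "]", "", hosts[i])   # strip spaces and quote marks
                if (hosts[i] != "") print hosts[i]
            }
        }'
}
```

Possible usage on the controller: `nova aggregate-details 1 | aggregate_hosts` and `nova aggregate-details 4 | aggregate_hosts`.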

root@node-5:~# nova availability-zone-list
+-----------------------+----------------------------------------+
| Name                  | Status                                 |
+-----------------------+----------------------------------------+
| internal              | available                              |
| |- node-5.domain.tld  |                                        |
| | |- nova-conductor   | enabled :-) 2016-10-18T09:15:13.000000 |
| | |- nova-consoleauth | enabled :-) 2016-10-18T09:15:25.000000 |
| | |- nova-scheduler   | enabled :-) 2016-10-18T09:15:29.000000 |
| | |- nova-cert        | enabled :-) 2016-10-18T09:15:14.000000 |
| |- node-6.domain.tld  |                                        |
| | |- nova-conductor   | enabled :-) 2016-10-18T09:15:35.000000 |
| | |- nova-consoleauth | enabled :-) 2016-10-18T09:15:42.000000 |
| | |- nova-scheduler   | enabled :-) 2016-10-18T09:15:41.000000 |
| | |- nova-cert        | enabled :-) 2016-10-18T09:15:42.000000 |
| |- node-7.domain.tld  |                                        |
| | |- nova-conductor   | enabled :-) 2016-10-18T09:15:20.000000 |
| | |- nova-scheduler   | enabled :-) 2016-10-18T09:15:24.000000 |
| | |- nova-consoleauth | enabled :-) 2016-10-18T09:15:29.000000 |
| | |- nova-cert        | enabled :-) 2016-10-18T09:15:29.000000 |
| node-12.domain.tld    | available                              |
| |- node-10.domain.tld |                                        |
| | |- nova-compute     | enabled :-) 2016-10-18T09:15:33.000000 |
| |- node-12.domain.tld |                                        |
| | |- nova-compute     | enabled :-) 2016-10-18T09:15:37.000000 |
+-----------------------+----------------------------------------+

I've also seen the following exception in the nova-compute log on node-10:

<182>Oct 18 10:10:43 node-10 nova-compute: 2016-10-18 10:10:43.301 22177 INFO nova.compute.manager [req-83dab701-cd15-4873-999c-fc9f311d6021 a2efad47d7b6475f9e28d876d8d24891 1f6ab78398a9494ba1b92864ecbb8fd1 - - -] [instance: 17b8aaaa-349b-48f5-a195-6b8834f8a7a6] Setting instance back to ACTIVE after: Instance rollback performed due to: Resize error: not able to execute ssh command: Unexpected error while running command.
Command: ssh 192.168.0.3 mkdir -p /var/lib/nova/instances/17b8aaaa-349b-48f5-a195-6b8834f8a7a6
Exit code: 255
Stdout: u''
Stderr: u"Warning: Permanently added '192.168.0.3' (ECDSA) to the list of known hosts.\r\nPermission denied (publickey).\r\n"
<179>Oct 18 10:10:43 node-10 nova-compute: 2016-10-18 10:10:43.656 22177 ERROR oslo_messaging.rpc.dispatcher [req-83dab701-cd15-4873-999c-fc9f311d6021 a2efad47d7b6475f9e28d876d8d24891 1f6ab78398a9494ba1b92864ecbb8fd1 - - -] Exception during message handling: Resize error: not able to execute ssh command: Unexpected error while running command.
Command: ssh 192.168.0.3 mkdir -p /var/lib/nova/instances/17b8aaaa-349b-48f5-a195-6b8834f8a7a6
Exit code: 255
Stdout: u''
Stderr: u"Warning: Permanently added '192.168.0.3' (ECDSA) to the list of known hosts.\r\nPermission denied (publickey).\r\n"
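
Both log entries end in `Permission denied (publickey)`, which suggests the rollback is triggered at the SSH step between the compute hosts rather than by scheduling. A minimal sketch for listing which hosts nova-compute failed to reach over SSH follows. `failed_ssh_targets` is a hypothetical helper name, and both the log path and the `nova` service user in the usage comments are assumptions about this deployment, not something shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read nova-compute log text on stdin and print each
# distinct host that appears as the target of a "Command: ssh <host> ..." line.
failed_ssh_targets() {
    awk '$1 == "Command:" && $2 == "ssh" { print $3 }' | sort -u
}

# Possible usage on the source compute host (log path is an assumption):
#   failed_ssh_targets < /var/log/nova/nova-compute.log
#
# For each host printed, a non-interactive login attempt as the service user
# (assumed here to be "nova") shows whether key-based SSH works at all:
#   sudo -u nova ssh -o BatchMode=yes 192.168.0.3 true; echo "exit code: $?"
```

If the login attempt fails in one direction only, that would match the one-way behaviour described at the top of this question.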

Thanks, Alvaro

Closed for the following reason: duplicate question, by AlvaroG
close date 2016-10-18 07:22:21.700488