
Instance in error status after failed migrate

I have a Juno environment with four nodes: a controller node (in a VM), plus a neutron node, nova-compute-1, and nova-compute-2 on physical servers.

I'm able to live migrate instances from compute-1 to compute-2 and vice versa, but when I choose the "migrate" option in the dashboard (instead of "live migrate"), the migration fails and the instance goes into an error status that I can't find a way to recover from.
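For reference, the dashboard buttons correspond (as far as I can tell) to these nova CLI calls; the instance name and target host below are just the ones from my test:

# live migration from compute-2 to compute-1 -- this path works for me
nova live-migration test-c97c5b3d-fb91-4e7f-8981-61d1743a0edb compute-1

# cold migration (the scheduler picks the target host) -- this is the path that fails
nova migrate --poll test-c97c5b3d-fb91-4e7f-8981-61d1743a0edb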

For example, starting with an instance running on node compute-2:

This is the error:

"Error: Failed to launch instance "test-c97c5b3d-fb91-4e7f-8981-61d1743a0edb": Please try again later [Error: Unexpected error while running command. Command: ssh <ip of compute-1> mkdir -p /var/lib/nova/instances/c97c5b3d-fb91-4e7f-8981-61d1743a0edb Exit code: 1 Stdout: u'This account is currently not available.\n' Stderr: u'******************************************]." <-- this seems to be because the default shell for the nova user is /sbin/nologin
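In case it helps, this is roughly how I checked (the compute-1 IP is a placeholder here, and the su invocation assumes nova's ssh keys are already set up for migration):

# on compute-1: show the nova user's login shell (the last field of the passwd entry)
getent passwd nova | cut -d: -f7

# on compute-2: reproduce the failing ssh as the nova user, overriding its nologin shell locally
su -s /bin/sh -c "ssh <ip-of-compute-1> true" nova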

I tried nova reset-state, but that doesn't recover the status. I can still ssh to the instance and see its console, so the VM is up, but because the instance is now in an error state, the dashboard options are limited.
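For reference, this is roughly what I ran; if I understand the CLI correctly, reset-state defaults to forcing the error state, so --active is the variant that should push it back to active:

# force the vm_state back to active instead of the default (error)
nova reset-state --active c97c5b3d-fb91-4e7f-8981-61d1743a0edb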

nova show c97c5b3d-fb91-4e7f-8981-61d1743a0edb

+--------------------------------------+--------------------------------------------------------------------------------------------------------+
| Property                             | Value                                                                                                  |
+--------------------------------------+--------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                    | AUTO                                                                                                   |
| OS-EXT-AZ:availability_zone          | nova                                                                                                   |
| OS-EXT-STS:power_state               | 1                                                                                                      |
| OS-EXT-STS:task_state                | -                                                                                                      |
| OS-EXT-STS:vm_state                  | error                                                                                                  |
| OS-SRV-USG:launched_at               | 2014-11-21T17:41:04.000000                                                                             |
| OS-SRV-USG:terminated_at             | -                                                                                                      |
| accessIPv4                           |                                                                                                        |
| accessIPv6                           |                                                                                                        |
| config_drive                         |                                                                                                        |
| created                              | 2014-11-21T17:40:46Z                                                                                   |
| dev-internal network                 | 192.168.1.12, 10.x.x.x                                                                             |
| fault                                | {"message": "Unexpected error while running command.                                                   |
|                                      | Command: ssh 10.x.x.x mkdir -p /var/lib/nova/instances/c97c5b3d-fb91-4e7f-8981-61d1743a0edb        |
|                                      | Exit code: 1                                                                                           |
|                                      | Stdout: u'This account is currently not available.\                                                    |
|                                      | '                                                                                                      |
|                                      | Stderr: u'******************************************", "code": 500, "created": "2014-11-21T17:42:54Z"} |
| flavor                               | m1.tiny (1)                                                                                            |
| hostId                               | 3d65aa27b9b5b98e90d25ac3cd3ffa5abeb826e7500f8e4f3696c13c                                               |
| id                                   | c97c5b3d-fb91-4e7f-8981-61d1743a0edb                                                                   |
| image                                | cirros-0.3.3-x86_64 (f577a8fc-6cfe-4a3e-9cb1-f78d6880dde1)                                             |
| key_name                             | -                                                                                                      |
| metadata                             | {}                                                                                                     |
| name                                 | test-c97c5b3d-fb91-4e7f-8981-61d1743a0edb                                                              |
| os-extended-volumes:volumes_attached | []                                                                                                     |
| security_groups                      | default                                                                                                |
| status                               | ERROR                                                                                                  |
| tenant_id                            | d125de30bb1a42ec8586615b5eb27ab7                                                                       |
| updated                              | 2014-11-21T17:42:54Z                                                                                   |
| user_id                              | 68c51e8e08f84bb1b84101dcf45396c8                                                                       |
+--------------------------------------+--------------------------------------------------------------------------------------------------------+

Has anyone seen this migrate issue before? Or, how can I recover the instance from the error status?

Thanks!