Why is live migration failing with 'Domain not found'?

asked 2014-03-27 11:21:23 -0500

tuxtard

updated 2014-03-31 08:46:21 -0500

Hi guys. I am trying to live-migrate an instance from node1 (stacktest) to node2 (ctest), and it fails with the following messages.

node1:

2014-03-27 16:45:21.027 3478 INFO nova.compute.manager [-] Lifecycle event 2 on VM e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391
2014-03-27 16:45:21.205 3478 INFO nova.compute.manager [-] [instance: e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391] During sync_power_state the instance has a pending task. Skip.
2014-03-27 16:45:21.341 3478 INFO nova.compute.manager [-] Lifecycle event 3 on VM e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391
2014-03-27 16:45:21.350 3478 ERROR nova.virt.libvirt.driver [-] [instance: e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391] Live Migration failure: Domain not found: no domain with matching name 'instance-00000003'

node2:

{"vendor": "AMD", "model": "qemu64", "arch": "x86_64", "features": ["sse4a", "abm", "lahf_lm", "hypervisor", "popcnt"], "topology": {"cores": 1, "threads": 1, "sockets": 1}}
2014-03-27 16:44:15.393 414 INFO nova.virt.libvirt.firewall [req-c39dd3e3-8502-4ed8-9dc6-cb79289635d8 748d046631224150acbb31deda725d50 d42f7f63f51340af9a80f6d5554ef20c] [instance: e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391] Called setup_basic_filtering in nwfilter
2014-03-27 16:44:15.394 414 INFO nova.virt.libvirt.firewall [req-c39dd3e3-8502-4ed8-9dc6-cb79289635d8 748d046631224150acbb31deda725d50 d42f7f63f51340af9a80f6d5554ef20c] [instance: e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391] Ensuring static filters
2014-03-27 16:44:17.989 414 INFO nova.compute.manager [-] Lifecycle event 0 on VM e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391
2014-03-27 16:44:18.601 414 INFO nova.compute.manager [req-0e8d567f-7a2c-49eb-8240-190bdc6abeab None None] [instance: e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391] During the sync_power process the instance has moved from host ctest to host stacktest
2014-03-27 16:44:18.896 414 INFO nova.compute.manager [-] Lifecycle event 1 on VM e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391
2014-03-27 16:44:19.220 414 INFO nova.compute.manager [-] [instance: e8ae2a2c-3bdf-4322-ba2b-feb0d13ba391] During the sync_power process the instance has moved from host ctest to host stacktest
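Since the two hosts log the same migration separately, it can help to merge both logs into one timeline for the instance. Below is a minimal sketch, assuming the log lines follow the `timestamp pid LEVEL module message` layout seen in the excerpts above. The names `parse` and `merged_timeline` are made up for illustration and are not part of nova.

```python
import re
from typing import Dict, List, NamedTuple, Optional

# Matches lines such as:
# 2014-03-27 16:45:21.350 3478 ERROR nova.virt.libvirt.driver [-] ...
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<module>\S+) (?P<rest>.*)$"
)


class Entry(NamedTuple):
    ts: str      # timestamp kept as text; the fixed-width format sorts correctly
    host: str    # label of the host whose log the line came from
    level: str
    module: str
    rest: str


def parse(host: str, line: str) -> Optional[Entry]:
    """Return an Entry for a recognizable log line, or None otherwise."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return Entry(m.group("ts"), host, m.group("level"),
                 m.group("module"), m.group("rest"))


def merged_timeline(logs: Dict[str, List[str]], instance_id: str) -> List[Entry]:
    """Merge per-host log lines that mention instance_id, ordered by time."""
    entries = []
    for host, lines in logs.items():
        for line in lines:
            entry = parse(host, line)
            if entry is not None and instance_id in entry.rest:
                entries.append(entry)
    return sorted(entries, key=lambda e: e.ts)
```

Run against the excerpts above, this would place the destination-side events from 16:44 before the source-side failure at 16:45. The comparison is only meaningful if both hosts have synchronized clocks.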

There are no errors in the libvirt log on either side, nor any other errors. I get similar behavior when I try block migration. Here is my setup:

Distro: openSUSE 13.1
Kernel: 3.11.10-7-default
OpenStack: 2013.2
Libvirt version: 1.2.2-379.2
Qemu version: 1.7.90-221.1
Network: OVS with GRE tunneling
Environment: nested KVM

/etc/libvirt/libvirt.conf:

listen_tls = 0
listen_tcp = 1
auth_tcp = "none"

/etc/sysconfig/libvirtd:

LIBVIRTD_CONFIG=/etc/libvirt/libvirtd.conf
LIBVIRTD_ARGS="--listen"
LIBVIRTD_NOFILES_LIMIT=2048

Migration-related nova.conf settings:

live_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE
vncserver_listen = 0.0.0.0
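For readers unfamiliar with `live_migration_flag`: the value is a comma-separated list of libvirt flag names that end up combined into a single bitmask passed to libvirt's migration call. The sketch below illustrates that combination only. The numeric values are my transcription of libvirt's `virDomainMigrateFlags` enum and should be checked against your libvirt headers, and `parse_migration_flags` is a hypothetical helper, not nova's actual code.

```python
# Subset of libvirt's virDomainMigrateFlags enum (assumed values; verify
# against the libvirt version you run).
MIGRATE_FLAGS = {
    "VIR_MIGRATE_LIVE": 1,
    "VIR_MIGRATE_PEER2PEER": 2,
    "VIR_MIGRATE_TUNNELLED": 4,
    "VIR_MIGRATE_PERSIST_DEST": 8,
    "VIR_MIGRATE_UNDEFINE_SOURCE": 16,
    "VIR_MIGRATE_PAUSED": 32,
    "VIR_MIGRATE_NON_SHARED_DISK": 64,
    "VIR_MIGRATE_NON_SHARED_INC": 128,
}


def parse_migration_flags(value: str) -> int:
    """OR together the flags named in a comma-separated config value."""
    mask = 0
    for name in value.split(","):
        name = name.strip()
        if not name:
            continue  # tolerate stray commas
        if name not in MIGRATE_FLAGS:
            raise ValueError("unknown migration flag: %r" % name)
        mask |= MIGRATE_FLAGS[name]
    return mask
```

Rejecting unknown names early is a design choice of this sketch: a misspelled flag in nova.conf is easier to spot as an explicit error than as a migration that behaves unexpectedly.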

Any help is greatly appreciated.

EDIT: I am attaching OpenStack and libvirt debug logs.

Nova logs:

Source node (stacktest): http://pastebin.com/vidp3QUA
Destination node (compute): http://pastebin.com/XF5BJvmh

Libvirt logs:

Source node (stacktest): http://pastebin.com/GgB0iZET
Destination node (ctest): http://pastebin.com/qhNYJY7B

It seems that live migration is failing because of the following libvirt error:

"error": {"class": "DeviceNotActive", "desc": "No active block job on device 'drive-virtio-disk0'"}}

I do not know what is causing it, and I have not yet found a solution.


Comments

<code> is not a good way to format blocks of code. Use proper markdown syntax (the formatting bar helps, and there is also a live preview).

smaffulli ( 2014-03-27 16:26:44 -0500 )

Ok thanks, I will keep that in mind.

tuxtard ( 2014-03-28 03:13:24 -0500 )

Can you please point me to a document that explains how to set up live migration?

ritesh.singh.aricent@gmail.com ( 2014-03-28 06:06:59 -0500 )

I am not sure exactly what you are asking, but if you are looking for a manual on how to set up a system for live migration, you can take a look at this:

http://www.mirantis.com/blog/tutorial...

tuxtard ( 2014-03-28 07:54:07 -0500 )

2 answers

answered 2014-03-28 09:08:35 -0500

tuxtard

Thank you for your reply.

I initiated the migration with:

nova live-migration instance_name host_name

and

nova live-migration instance_id host_name

After I run either command, the instance hangs for a second and then resumes operating on the same (source) node.

I have all host names in /etc/hosts, and the nodes can ping each other. The firewall and AppArmor are disabled. Shared storage is NFSv4 exported with no_root_squash, and it is writable by the users root and openstack-nova. I also ran chmod o+x on /var/lib/nova/instances as instructed and tried running QEMU as root, but the same problem persists.
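A successful ping does not prove that the libvirt daemon on the other node is reachable. The sketch below checks name resolution and TCP reachability separately. It assumes libvirtd is listening on its default unencrypted TCP port, 16509, which is what `listen_tcp = 1` together with `--listen` normally gives; `resolves` and `tcp_reachable` are illustrative helper names.

```python
import socket

LIBVIRT_TCP_PORT = 16509  # libvirtd default for unencrypted TCP (assumed unchanged)


def resolves(hostname: str) -> bool:
    """Return True if the host name resolves (via /etc/hosts or DNS)."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def tcp_reachable(host: str, port: int = LIBVIRT_TCP_PORT,
                  timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

Running `tcp_reachable("ctest")` on stacktest and `tcp_reachable("stacktest")` on ctest covers both directions. A `False` result points at libvirtd's listen configuration or a packet filter rather than at nova.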


Comments

Can you please increase the logging level by setting verbose=True and debug=True in nova.conf on both nodes, "ctest" and "stacktest", retry the migration, and then post the detailed logs from both servers?

gmi ( 2014-03-28 09:28:18 -0500 )

Yes, thank you.

Source node (stacktest): http://pastebin.com/vidp3QUA

Destination node (compute): http://pastebin.com/XF5BJvmh

tuxtard ( 2014-03-28 10:46:16 -0500 )

Unfortunately the logs do not show anything obvious.

gmi ( 2014-03-28 12:52:40 -0500 )

Yes, that's because the problem doesn't seem to be in nova, but rather in libvirt. I found a few errors in the libvirt debug log attached above, and I am investigating them.

tuxtard ( 2014-03-31 08:49:36 -0500 )

This could be a bug in KVM, in which case updating to the latest version might help. Insufficient disk space is another possible cause.

gmi ( 2014-04-01 08:34:24 -0500 )

answered 2014-03-28 08:42:33 -0500

gmi

Can you provide more info? Which exact command did you use to initiate the live migration? What happened to the instance after the migration failed? Is it still functional on the initial host?



2 followers

Stats

Asked: 2014-03-27 11:21:23 -0500

Seen: 1,381 times

Last updated: Mar 31 '14