live migration fails

asked 2013-08-08 08:56:15 -0500

Ramkumar Raghavan

updated 2013-08-08 10:16:41 -0500

Jobin

I have configured 1 controller, 1 Quantum network node and 3 compute nodes, all running Ubuntu 13.04 and OpenStack Grizzly.

Currently I am trying to do a live migration of a VM instance from one compute node (stackcmpt3) to another (stackcmpt2).

I am using shared storage, with /var/lib/nova/instances mounted from an NFS share.

As soon as the live migration command is executed, the source machine goes down with the following kernel panic/exception:
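
One way to confirm on each compute node that the instances directory really sits on NFS is to look it up in the node's mount table. The following is only a minimal sketch: it assumes the mount table is available as /proc/mounts-style text, and the function name `filesystem_type` and the sample NFS server name are hypothetical, not taken from this setup.

```python
import os


def filesystem_type(mounts_text, path):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format:
    device, mount point, filesystem type, options, ...
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest matching mount point; on a tie the
            # later entry wins, since later mounts shadow earlier ones.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


# Hypothetical sample; on a real node, read /proc/mounts instead.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "nfs-server:/export/instances /var/lib/nova/instances nfs4 rw 0 0\n"
)

if __name__ == "__main__":
    print(filesystem_type(SAMPLE_MOUNTS, "/var/lib/nova/instances"))
```

Running the same check on the source and the destination node should report an NFS type (for example nfs or nfs4) on both.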

Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357003] general protection fault: 0000 [#1] SMP
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357026] Modules linked in: nfsv3(F) nfsv4(F) vhost_net macvtap(F) macvlan(F) nf_conntrack_ipv6(F) nf_defrag_ipv6(F) iptable_nat(F) nf_nat_ipv4(F) nf_nat(F) xt_mac(F) xt_tcpudp(F) nf_conntrack_ipv4(F) nf_defrag_ipv4(F) xt_state(F) nf_conntrack(F) xt_physdev(F) veth(F) bridge(F) stp(F) llc(F) ip6table_filter(F) ip6_tables(F) iptable_filter(F) ip_tables(F) ebtable_nat(F) ebtables(F) x_tables(F) openvswitch(OF) nbd(F) ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp(F) libiscsi_tcp(F) libiscsi(F) scsi_transport_iscsi(F) nfsd(F) auth_rpcgss(F) nfs_acl(F) nfs(F) lockd(F) sunrpc(F) fscache(F) ext2(F) coretemp kvm_intel kvm ghash_clmulni_intel(F) aesni_intel(F) aes_x86_64(F) xts(F) lrw(F) gf128mul(F) ablk_helper(F) cryptd(F) joydev(F) ppdev(F) gpio_ich parport_pc(F) snd_hda_codec_realtek i915 video(F) drm_kms_helper drm snd_hda_intel mei snd_hda_codec snd_hwdep(F) i2c_algo_bit snd_pcm(F) snd_page_alloc(F) mac_hid snd_timer(F) snd(F) soundcore(F) lpc_ich psmouse(F)
Aug  8 19:19:33 stackcmpt3 kernel: serio_raw(F) microcode(F) lp(F) parport(F) dm_multipath(F) scsi_dh(F) hid_generic usbhid hid 8139too(F) 8139cp(F) e1000e(F)
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357384] CPU 0
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357392] Pid: 6109, comm: sudo Tainted: GF          O 3.8.0-27-generic #40-Ubuntu                  /DH61WW
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357419] RIP: 0010:[<ffffffff81153931>]  [<ffffffff81153931>] anon_vma_interval_tree_remove+0x141/0x250
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357449] RSP: 0018:ffff8801ed5f3af8  EFLAGS: 00010282
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357464] RAX: ffff880200000020 RBX: ffff8801b0519a80 RCX: ff8801b0594a1021
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357483] RDX: 0000000000000001 RSI: ffff880206f9b330 RDI: ffff8801b0519a80
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357501] RBP: ffff8801ed5f3b08 R08: 00007f4c4d2d8000 R09: ffff880200000020
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357520] R10: ffff8802115f8c60 R11: ffff8801b0519aa0 R12: ffff880206f9b300
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357538] R13: ffff880206f9b300 R14: ffff880206d4c400 R15: ffff880206f9b300
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357557] FS:  0000000000000000(0000) GS:ffff88021f200000(0000) knlGS:0000000000000000
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357578] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357593] CR2: 00007fcc897e5274 CR3: 0000000206f2d000 CR4: 00000000000427f0
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357612] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357630] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug  8 19:19:33 stackcmpt3 kernel: [ 6767.357649] Process sudo ...

1 answer

answered 2013-09-01 15:14:42 -0500

Ramkumar Raghavan

I was able to complete the live migration successfully.

There are 2 possible reasons for the above issue during live migration of VMs:

  1. The uid/gid of the nova, quantum, nfs, etc. users differed between the 2 compute nodes.

  2. The compute nodes had different CPU configurations.
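
The first suspected cause can be checked by comparing the uid/gid of the service users across nodes. Below is a minimal sketch, assuming the contents of each node's /etc/passwd have already been collected as text; the helper names `parse_passwd` and `id_mismatches` and the sample entries are hypothetical.

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from /etc/passwd-style text."""
    ids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Expected layout: name:password:uid:gid:gecos:home:shell
        if len(fields) < 4:
            continue
        if not (fields[2].isdigit() and fields[3].isdigit()):
            continue  # skip entries without numeric ids
        ids[fields[0]] = (int(fields[2]), int(fields[3]))
    return ids


def id_mismatches(passwd_a, passwd_b, users):
    """Return {user: (ids_on_a, ids_on_b)} for users whose ids differ.

    A user missing on one node shows up with None on that side.
    """
    a, b = parse_passwd(passwd_a), parse_passwd(passwd_b)
    return {u: (a.get(u), b.get(u)) for u in users if a.get(u) != b.get(u)}


# Hypothetical sample data; the real values come from each node.
NODE_A = "nova:x:107:113::/var/lib/nova:/bin/sh\n"
NODE_B = "nova:x:110:116::/var/lib/nova:/bin/sh\n"

if __name__ == "__main__":
    print(id_mismatches(NODE_A, NODE_B, ["nova", "quantum"]))
```

An empty result for the users of interest suggests that files written on the shared NFS directory by one node will carry the same ownership on the other.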

What I have done now is:

  1. The 2 compute nodes now have the same CPU configuration.
  2. Copied the /etc/passwd file from the controller to the 2 compute nodes and only then installed the services. This ensured all the compute nodes have the same uid/gid.
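
The answer does not say what exactly was aligned in the CPU configuration. As one possible reading, the sketch below compares the CPU model name and feature flags of two nodes from /proc/cpuinfo-style text. The function names `cpu_summary` and `flag_differences` are hypothetical, and the sample text is invented for illustration.

```python
def cpu_summary(cpuinfo_text):
    """Return (model name, set of flags) for the first CPU listed."""
    model, flags = None, set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPUs
        key = key.strip()
        if key == "model name" and model is None:
            model = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
    return model, flags


def flag_differences(cpuinfo_a, cpuinfo_b):
    """Return (flags only on a, flags only on b), each sorted."""
    _, flags_a = cpu_summary(cpuinfo_a)
    _, flags_b = cpu_summary(cpuinfo_b)
    return sorted(flags_a - flags_b), sorted(flags_b - flags_a)


# Invented sample data; on a real node, read /proc/cpuinfo instead.
NODE_A = "model name\t: Example CPU A\nflags\t\t: fpu vmx sse4_2 avx\n"
NODE_B = "model name\t: Example CPU B\nflags\t\t: fpu vmx sse4_2\n"

if __name__ == "__main__":
    print(cpu_summary(NODE_A)[0], cpu_summary(NODE_B)[0])
    print(flag_differences(NODE_A, NODE_B))
```

Flags present on the source but missing on the destination are the ones most likely to matter, since a running guest may already depend on them.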

Thanks, Ramkumar Raghavan
