An interesting behavior in VM live migration

asked 2016-08-17

fifi

updated 2016-08-17

I ran some experiments with VM block live migration. I have two VMs. VM 1 was launched from an image (Ubuntu 14.04 server 64-bit). VM 2 was created from a snapshot of VM 1. When I live migrate VM 1, the total migration time is around 100 ms, while the total migration time for VM 2 is around 10 ms. I repeated the migration for both VMs, and the results were nearly the same each time. VM 1 and VM 2 are otherwise identical; the only difference is how they were launched. I also installed more applications on VM 2 and repeated the migration. Again, VM 2 migrated in less time than VM 1.

Does anyone know what causes this difference in migration time and how it can be explained?


answered 2016-08-21

I suppose your instances are on Nova storage and not on Cinder. An instance on Nova storage is backed by two qcow2 files: one is the Glance image, and the second holds the instance's differences from that image. When you start a live migration, if the destination nova-compute node (hypervisor) doesn't have the Glance image in its cache, for example because no instances based on that image are running there, it first has to download the image from Glance, and only then can it migrate your instance.
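To see the two-file layout described above on a compute node, one option is to read the backing file reference straight out of the instance disk's qcow2 header. The sketch below is only an illustration: the function name `read_backing_file` is made up for this example, and it relies on the qcow2 header layout (4-byte magic, then big-endian version, backing file offset, and backing file size).

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of every qcow2 image


def read_backing_file(path):
    """Return the backing file path stored in a qcow2 header, or None.

    Hypothetical helper for illustration; it reads only the first 20
    bytes of the header and the backing file name they point to.
    """
    with open(path, "rb") as f:
        header = f.read(20)
        if len(header) < 20 or header[:4] != QCOW2_MAGIC:
            raise ValueError("not a qcow2 image: %s" % path)
        # Big-endian fields: version (u32), backing_file_offset (u64),
        # backing_file_size (u32).
        version, offset, size = struct.unpack(">IQI", header[4:20])
        if offset == 0:
            return None  # standalone image, no backing file
        f.seek(offset)
        return f.read(size).decode("utf-8")
```

On a default libvirt/KVM deployment the instance disk is usually something like /var/lib/nova/instances/<instance-uuid>/disk, but that path is an assumption about your configuration. Running `qemu-img info --backing-chain` on the same file should report the same backing file.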

But your environment could be different... HTH Amedeo

Could you please clarify a bit more? As far as I understand, you're saying that downloading a snapshot takes less time than downloading an image from Glance. Is that what you mean?

fifi ( 2016-08-21 )

No, snapshots are also Glance images, just private ones (in your tenant). What I mean is only how Nova storage works behind the scenes. To put it better: to live migrate your instance, the destination nova-compute first has to download your image (private or public).

amedeo-salvati ( 2016-08-21 )

Thanks for your reply. It's helpful, but it still does not explain why live migration takes more time when I use an image and it takes less time when I use the snapshot. Any idea on that?

fifi ( 2016-08-21 )

Probably because one is cached on the hypervisor and the other one is not.

amedeo-salvati ( 2016-08-21 )
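The caching explanation in this thread can be checked on the destination host before migrating. The sketch below is a rough illustration, not Nova's own code: the helper names are invented, and it assumes the libvirt driver's default layout, where cached base images live under `<instances_path>/_base` and are named after the SHA-1 hex digest of the Glance image ID. Both the path and the naming scheme may differ with your release or configuration.

```python
import hashlib
import os


def cached_base_path(image_id, instances_path="/var/lib/nova/instances"):
    """Build the expected cache path for a Glance image ID.

    Assumption: the cache file name is sha1(image_id) in hex, stored in
    the _base directory under the Nova instances path.
    """
    name = hashlib.sha1(image_id.encode("utf-8")).hexdigest()
    return os.path.join(instances_path, "_base", name)


def is_image_cached(image_id, instances_path="/var/lib/nova/instances"):
    """Return True if the expected cache file exists on this host."""
    return os.path.isfile(cached_base_path(image_id, instances_path))
```

Running `is_image_cached` on the destination node for both the original image ID and the snapshot's image ID would show whether one is already present and the other is not, which is the difference suggested above.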

Last updated: Aug 21 '16