How to reduce downtime for VM live migrations?

I set up an OpenStack system with four nodes using kolla-ansible: one controller node, one network node, and two compute nodes. The two compute nodes share NFS storage. I created an instance named ubuntu0 from the Ubuntu 16 cloud image, then live-migrated it between the two compute nodes. During the migration, I pinged the instance:

sudo ping 10.0.2.159 -i 0.01 >> ping.log

When the migration finished, I calculated the downtime from the total time and the packet loss. In theory, the downtime should be less than 1 second. However, my downtime is always more than 4 seconds. This result confuses me. I hope someone with experience in this area can give me some advice. Thank you very much!
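
For reference, the calculation I mean can be sketched roughly as follows. This is only a minimal sketch, not an exact measurement: it assumes the summary line has the usual Linux iputils form ("N packets transmitted, M received, ..."), that ping really sent one packet every 0.01 s, and that all lost packets belong to the migration. The function name estimate_downtime is made up for illustration.

```python
import re

# Matches the iputils-style summary line written when ping exits, e.g.
# "1000 packets transmitted, 600 received, 40% packet loss, time 10989ms"
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def estimate_downtime(summary_line, interval_s=0.01):
    """Estimate downtime in seconds as (lost packets) * (send interval).

    Assumption: every lost packet corresponds to one interval during
    which the instance was unreachable.
    """
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("not a ping summary line: %r" % summary_line)
    transmitted = int(match.group(1))
    received = int(match.group(2))
    lost = transmitted - received
    return lost * interval_s
```

For example, 400 lost packets at an interval of 0.01 s gives an estimate of about 4 seconds, which is the order of magnitude I am seeing.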

@Bernd Bausch Thanks for your answer. I thought about your point carefully, and it sounds reasonable. However, others have run the same experiment, for example the article https://blog.zhaw.ch/icclab/an-analysis-of-the-performance-of-live-migration-in-openstack/. If the migration requires message queue and API communication, how did they get a downtime of less than 1 s? They calculated the downtime with ping, and I used the same method. Maybe you are right, but I don't know how to exclude the API communication time. Any advice? To answer your questions: I reviewed ping.log and found that the lost icmp_seq values are concentrated, i.e. the packets were lost during one continuous period. When the instance is not migrating, the packet loss is zero.
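
To check that the losses really form one continuous block, something like the sketch below could be run over ping.log. It is only an illustration under assumptions: reply lines contain "icmp_seq=<number>" as in Linux iputils output, the sequence number does not wrap around during the run, and the send interval was really 0.01 s. The helper name longest_gap is hypothetical.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def longest_gap(lines, interval_s=0.01):
    """Find the longest run of consecutive missing icmp_seq values.

    Returns (first_missing_seq, missing_count, approx_seconds).
    If no reply is missing, returns (None, 0, 0.0).
    """
    # Collect the sequence numbers of all replies that were received.
    seqs = []
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            seqs.append(int(match.group(1)))

    best = (None, 0, 0.0)
    # Compare each received reply with the next one; a jump larger
    # than 1 means the replies in between were lost.
    for prev, cur in zip(seqs, seqs[1:]):
        missing = cur - prev - 1
        if missing > best[1]:
            best = (prev + 1, missing, missing * interval_s)
    return best
```

Calling longest_gap(open("ping.log")) would then report where the largest gap starts and roughly how long it lasted. If that one gap accounts for nearly all lost packets, the loss is a single continuous outage rather than scattered drops.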