Booting multiple VMs concurrently increases boot latency

asked 2015-12-15 04:06:59 -0600

During our scale tests we observed that when we increase the concurrency of the Rally boot-vm test case for a given number of iterations, the overall nova boot latency also increases. For example, at concurrency 8 the boot time is 6-7 seconds, while at concurrency 64 it is 21-22 seconds. Note: we have 8 compute nodes and 6 controller nodes in total.

We traced the latency variation to the compute side, in the create_image method (https://github.com/openstack/nova/blob/master/nova/virt/libvirt/imagebackend.py).

The create_image method is a member of the Qcow2 class and creates a copy-on-write (COW) image for an instance from a given base image file. The method is synchronized on the base image. As we understand it, the synchronization prevents other processes or users from creating or modifying the base image while someone is creating a qcow2 layer on top of it.
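
To make the effect concrete, here is a minimal sketch of that kind of per-base-image exclusive lock. It is not nova's code: `named_lock` and `run_boots` are hypothetical helpers, and plain threads stand in for concurrent boot requests. It only shows that every request sharing one base image passes through the critical section one at a time, so waiting time grows with concurrency.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical registry: one exclusive lock per base image name.
_named_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


@contextmanager
def named_lock(name):
    """Hold an exclusive lock keyed by name (hypothetical helper)."""
    with _registry_guard:
        lock = _named_locks[name]
    with lock:
        yield


def run_boots(base_paths, work):
    """Run work(base) once per entry in base_paths, each in its own thread.

    Returns, per base path, the peak number of threads that were inside
    the critical section at the same time.
    """
    active = defaultdict(int)
    peak = defaultdict(int)
    stats_guard = threading.Lock()

    def boot(base):
        with named_lock(base):
            with stats_guard:
                active[base] += 1
                peak[base] = max(peak[base], active[base])
            work(base)  # stands in for creating the qcow2 overlay
            with stats_guard:
                active[base] -= 1

    threads = [threading.Thread(target=boot, args=(b,)) for b in base_paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(peak)
```

With this scheme, requests that use different base images do not block each other, but a test that boots every VM from the same image serializes completely on each compute node.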

Has anybody observed the same behavior? Is it possible to use a reader/writer lock, so that only creating the base image or changing its size would take the writer lock, while all the qcow2 creations could proceed in parallel under the reader lock?
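
The idea could look roughly like the sketch below. `ReaderWriterLock` is a hypothetical class written for illustration, not something taken from nova, and it only coordinates threads inside one process. If the real lock also has to span processes, a file-based equivalent would be needed (for example shared and exclusive modes of `fcntl.flock`); that part is not shown here. Writers are given priority so that a pending base image change is not starved by a steady stream of readers.

```python
import threading
from contextlib import contextmanager


class ReaderWriterLock:
    """Many concurrent readers or one writer (illustrative sketch)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # readers currently holding the lock
        self._writer = False       # True while a writer holds the lock
        self._writers_waiting = 0  # writers queued; new readers wait for them

    @contextmanager
    def read_lock(self):
        with self._cond:
            while self._writer or self._writers_waiting:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def write_lock(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                while self._writer or self._readers:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()
```

Under this assumption, the code path that fetches or resizes the base image would run inside `write_lock()`, and each qcow2 overlay creation would run inside `read_lock()`, so overlay creations on the same base image no longer wait for one another.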
