Nova can't provision instance after image name change with a ceph backend [closed]
This is a weird issue that does not happen in our Juno setup, but happens in our Kilo setup. The configuration between the two setups is pretty much the same, with only kilo-specific changes done (namely, moving lines around to new sections).
Here's how to reproduce:
1. Upload an image into glance.
2. Rename that image either through command line or horizon.
3. Try to provision an instance with that image using the boot from image option.
Now, here's what the default behavior is supposed to be, from my observations:
- When the image is uploaded into ceph, a snapshot is created automatically inside ceph (this is NOT an instance snapshot per se, but a ceph internal snapshot).
- When an instance is booted from image in nova, this snapshot gets a clone in the nova ceph pool. Nova then uses that clone as the instance's disk. This is called copy-on-write cloning.
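To make the cloning precondition concrete, here is a minimal sketch of the check that, as far as I understand it, decides whether a copy-on-write clone is possible: the image location glance reports has to be of the form rbd://<fsid>/<pool>/<image>/<snapshot>, the image has to be raw, and the fsid has to match the local cluster. This is my own illustration, not nova's actual code. The function names and the fsid value "abc" are hypothetical, and it is written for Python 3 rather than the Python 2.7 that Kilo runs on.

```python
from urllib.parse import unquote


def parse_rbd_location(url):
    """Split 'rbd://<fsid>/<pool>/<image>/<snapshot>' into its four parts.

    Hypothetical helper for illustration; raises ValueError on any other shape.
    """
    prefix = "rbd://"
    if not url.startswith(prefix):
        raise ValueError("not an rbd:// location: %s" % url)
    pieces = [unquote(p) for p in url[len(prefix):].split("/")]
    if len(pieces) != 4 or "" in pieces:
        raise ValueError("expected rbd://fsid/pool/image/snapshot: %s" % url)
    return dict(zip(("fsid", "pool", "image", "snapshot"), pieces))


def is_cloneable(url, disk_format, local_fsid):
    """Return True only if a copy-on-write clone looks possible.

    Assumptions: the image must be raw and must live in the same ceph
    cluster (same fsid) that the compute node talks to.
    """
    try:
        location = parse_rbd_location(url)
    except ValueError:
        return False
    return disk_format == "raw" and location["fsid"] == local_fsid


if __name__ == "__main__":
    # "abc" stands in for a real cluster fsid; the pool name is an assumption.
    url = "rbd://abc/images/3fc7d726-2863-475a-8b3d-2eecfee32b32/snap"
    print(parse_rbd_location(url))
    print(is_cloneable(url, "raw", "abc"))
```

If the location of the renamed image no longer passes a check like this one, that would be consistent with the clone step being skipped, but I have not confirmed that this is what happens here.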
Here's when things get funky:
- After an image is renamed and an instance is booted from it, the copy-on-write cloning does not happen. Nova looks for the disk and, of course, fails to find it in its pool, so the instance fails to provision. There's no trace anywhere of the copy-on-write clone failing (in part because ceph doesn't log client commands, from what I can see).
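Since ceph itself leaves no trace, the state of the snapshot and its clones can be inspected by hand. Below is a small sketch that only builds the rbd command lines I would run for that; it does not execute anything. The pool names ("images" for glance, "vms" for nova), the snapshot name "snap" and the "<instance uuid>_disk" naming are assumptions about a typical setup, so adjust them to match your configuration. The helper name is hypothetical.

```python
def rbd_check_commands(image_id, instance_uuid,
                       glance_pool="images", nova_pool="vms"):
    """Build rbd CLI invocations for inspecting a glance image and its clone.

    Assumed conventions: glance snapshots the image as '@snap', and nova
    names the root disk '<instance uuid>_disk' in its own pool.
    """
    image_spec = "%s/%s" % (glance_pool, image_id)
    return [
        # Does the image still carry its internal snapshot?
        ["rbd", "snap", "ls", image_spec],
        # Which clones, if any, hang off that snapshot?
        ["rbd", "children", "%s@snap" % image_spec],
        # Does the disk nova is looking for exist in the nova pool?
        ["rbd", "info", "%s/%s_disk" % (nova_pool, instance_uuid)],
    ]


if __name__ == "__main__":
    # IDs taken from the log excerpt in this post.
    commands = rbd_check_commands(
        "3fc7d726-2863-475a-8b3d-2eecfee32b32",
        "b8aa993c-6526-4565-b6e6-ee0a82d0a73a",
    )
    for command in commands:
        print(" ".join(command))
```

Running the printed commands for an image before and after the rename should show whether the snapshot is still there and whether a clone was ever created for the failed instance.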
Here's the error in the compute logs:
/var/log/upstart/nova-compute.log:2015-06-18 18:34:06.507 10070 DEBUG nova.compute.manager [req-4e2ab669-abdb-419a-b9ca-1740af96e154 - - - - -] [instance: b8aa993c-6526-4565-b6e6-ee0a82d0a73a] Synchronizing instance power state after lifecycle event "Stopped"; current vm_state: active, current task_state: rebuilding, current DB power_state: 1, VM power_state: 4 handle_lifecycle_event /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1241
/var/log/upstart/nova-compute.log:2015-06-18 18:34:06.539 10070 INFO nova.compute.manager [req-4e2ab669-abdb-419a-b9ca-1740af96e154 - - - - -] [instance: b8aa993c-6526-4565-b6e6-ee0a82d0a73a] During sync_power_state the instance has a pending task (rebuilding). Skip.
/var/log/upstart/nova-compute.log:2015-06-18 18:34:06.802 10070 DEBUG nova.virt.libvirt.vif [req-d0ff8267-74b1-4d7e-9d8e-5df65f5782b4 8ca1f02ca85e4211ad943acce23488ac 5f4ecb18ba284692baae27b87abf1f2b - - -] vif_type=ovs instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone=None,cell_name=None,cleaned=False,config_drive='',created_at=2015-06-18T22:10:32Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description='snapme',display_name='bug',ephemeral_gb=0,ephemeral_key_uuid=None,fault=<?>,flavor=Flavor(58),host='compute4.prod.cloud.gtcomm.net',hostname='snapme',id=331,image_ref='3fc7d726-2863-475a-8b3d-2eecfee32b32',info_cache=InstanceInfoCache,instance_type_id=58,kernel_id='',key_data=None,key_name=None,launch_index=0,launched_at=2015-06-18T22:10:41Z,launched_on='compute4.prod.cloud.gtcomm.net',locked=False,locked_by=None,memory_mb=2048,metadata={meta_var='example'},new_flavor=None,node='compute4.prod.cloud.gtcomm.net',numa_topology=<?>,old_flavor=None,os_type=None,pci_devices=<?>,pci_requests=<?>,power_state=1,progress=0,project_id='5f4ecb18ba284692baae27b87abf1f2b',ramdisk_id='',reservation_id='r-f7lxpv8y',root_device_name='/dev/vda',root_gb=20,scheduled_at=None,security_groups=SecurityGroupList,shutdown_terminate=False,system_metadata={image_base_image_ref='14d59af7-8e21-4e26-82dc-290308edd899',image_container_format='bare',image_disk_format='raw',image_image_location='snapshot',image_image_state='available',image_image_type='snapshot',image_instance_uuid='b8aa993c-6526-4565-b6e6-ee0a82d0a73a',image_min_disk='20',image_min_ram='0',image_network_allocated='True',image_owner_id='5f4ecb18ba284692baae27b87abf1f2b',image_user_id='8ca1f02ca85e4211ad943acce23488ac',network_allocated='True'},tags=<?>,task_state='rebuilding',terminated_at=None,updated_at=2015-06-18T22:34:04Z,user_data='I2Nsb3VkLWNvbmZpZwpkaXNhYmxlX3Jvb3Q6IDAKc3NoX3B3YXV0aDogMQpwY
XNzd29yZDogdGVzdHRlc3QKY2hwYXNzd2Q6IHsgZXhwaXJlOiBGYWxzZSB9CmJvb3RjbWQ6CiAtIGlmY29uZmlnIGV0aDEgMTczLjIwOS40NC4xMjcgbmV0bWFzayAyNTUuMjU1LjI1NS4wCiAtIHJvdXRlIGFkZCBkZWZhdWx0IGd3IDE3My4yMDkuNDQuMQogLSByb3V0ZSBkZWwgZGVmYXVsdCBndyAxMC4wLjAuMQogLSBlY2hvICduYW1lc2VydmVyIDguOC44LjgnID4gL2V0Yy9yZXNvbHYuY29uZg==',user_id='8ca1f02ca85e4211ad943acce23488ac',uuid=b8aa993c-6526-4565-b6e6-ee0a82d0a73a,vcpu_model=<?>,vcpus=2,vm_mode=None,vm_state='active') vif=VIF({'profile ...
I can also add that booting an instance from a snapshot does not work anymore.