3

Instance pauses while taking snapshot

asked 2015-09-09 08:46:36 -0500

jgalvin2015

updated 2015-09-10 09:34:34 -0500

OpenStack Kilo release, Ceph storage backend

I have run into an issue lately: when I create a snapshot of a running instance, I can't access the instance while the snapshot state is "image pending upload" and the image is in the "Queued" state in the image service.

I can't access it via the console or SSH, and a continuous ping to the floating IP times out, but the instance still shows as running on the dashboard.

As soon as the instance state is "Image uploading" and the image service shows "saving", the instance becomes available again.

Is this a known issue? All of this is done via the Horizon dashboard.

I can see the following in the logs on the compute:

2015-09-09 14:33:39.265 23261 INFO nova.compute.manager [req-0252f823-73f5-4c37-aa86-efbe6536e4f6 d2b1cc9566d44a909de46689569118e3 3b5e03b8a83e44dd9a7140d868d28a9e - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] instance snapshotting
2015-09-09 14:33:39.877 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Paused (Lifecycle Event)
2015-09-09 14:33:40.049 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] During sync_power_state the instance has a pending task (image_snapshot). Skip.
2015-09-09 14:33:50.756 23261 INFO nova.virt.libvirt.driver [req-0252f823-73f5-4c37-aa86-efbe6536e4f6 d2b1cc9566d44a909de46689569118e3 3b5e03b8a83e44dd9a7140d868d28a9e - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] Beginning cold snapshot process
2015-09-09 14:33:50.759 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Stopped (Lifecycle Event)
2015-09-09 14:33:50.939 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] During sync_power_state the instance has a pending task (image_snapshot). Skip.
2015-09-09 14:35:57.831 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Started (Lifecycle Event)
2015-09-09 14:35:57.990 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] During sync_power_state the instance has a pending task (image_pending_upload). Skip.
2015-09-09 14:35:57.991 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Resumed (Lifecycle Event)
2015-09-09 14:35:58.151 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] During sync_power_state the instance has a pending task (image_pending_upload). Skip.
2015-09-09 14:35:58.498 23261 INFO nova.virt.libvirt.driver [req-0252f823-73f5-4c37-aa86-efbe6536e4f6 d2b1cc9566d44a909de46689569118e3 3b5e03b8a83e44dd9a7140d868d28a9e - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] Snapshot extracted, beginning image upload
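
To put a number on the outage, the gap between the "VM Paused" and "VM Resumed" lifecycle events in the log above can be measured. The following is only a minimal sketch in Python; the function names are illustrative and not part of Nova, and the regular expression assumes the log line layout shown above.

```python
import re
from datetime import datetime

# Captures the timestamp and the lifecycle event name from a nova-compute
# log line. The layout is assumed to match the excerpt above.
EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"VM (Paused|Stopped|Started|Resumed) \(Lifecycle Event\)"
)


def lifecycle_events(log_text):
    """Return (timestamp, event name) pairs in the order they appear."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            events.append((stamp, match.group(2)))
    return events


def pause_duration(log_text):
    """Seconds between the first 'Paused' event and the next 'Resumed'."""
    events = lifecycle_events(log_text)
    paused = next(ts for ts, name in events if name == "Paused")
    resumed = next(
        ts for ts, name in events if name == "Resumed" and ts >= paused
    )
    return (resumed - paused).total_seconds()


# The four lifecycle lines copied from the log excerpt above.
SAMPLE_LOG = """\
2015-09-09 14:33:39.877 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Paused (Lifecycle Event)
2015-09-09 14:33:50.759 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Stopped (Lifecycle Event)
2015-09-09 14:35:57.831 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Started (Lifecycle Event)
2015-09-09 14:35:57.991 23261 INFO nova.compute.manager [req-1a3446d4-c183-4729-86b3-d63e15fe38d7 - - - - -] [instance: a0e855e5-c205-4d61-bd48-99384d6310f5] VM Resumed (Lifecycle Event)
"""

if __name__ == "__main__":
    print("Instance unavailable for %.3f s" % pause_duration(SAMPLE_LOG))
```

For the excerpt above this comes to a little over two minutes (14:33:39 to 14:35:57), which matches the window in which the ping timed out.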

Any help with this would be appreciated :)

Thanks


3 answers

0

answered 2015-12-01 11:04:16 -0500

capsali

Is Ceph used as the ephemeral storage backend as well? There is a problem when snapshotting with Ceph as the backend: instead of relying on Ceph snapshots, Nova calls QEMU to take the snapshot. It first downloads the instance image to local storage and then uploads it into Glance, which is a long and bandwidth-intensive process. I read somewhere that this is fixed in Liberty, but I cannot confirm it because we do not use ephemeral storage for our instances, only Cinder volumes.

As a workaround, you could use Cinder volumes backed by Ceph RBD as the instance boot volume. Taking a snapshot of such an instance is instant because of Ceph's copy-on-write feature, so you will see no downtime on the instance!
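
As a rough illustration of the workaround, booting from a volume means asking Nova to create the root disk as a Cinder volume through the `block_device_mapping_v2` field of the server create request. The sketch below only builds the JSON body in Python and sends nothing; the helper name and the IDs in the usage example are hypothetical placeholders, and whether the volume actually lands on Ceph RBD depends on how the Cinder backend is configured.

```python
import json


def boot_from_volume_request(name, flavor_ref, image_id, volume_size_gb,
                             network_id):
    """Build a server create body whose root disk is a new Cinder volume.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "server": {
            "name": name,
            "flavorRef": flavor_ref,
            "networks": [{"uuid": network_id}],
            "block_device_mapping_v2": [
                {
                    # Boot from this device.
                    "boot_index": 0,
                    # Glance image to copy into the new volume.
                    "uuid": image_id,
                    "source_type": "image",
                    # Create a Cinder volume instead of an ephemeral disk.
                    "destination_type": "volume",
                    "volume_size": volume_size_gb,
                    # Keep the volume when the instance is deleted.
                    "delete_on_termination": False,
                }
            ],
        }
    }


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    body = boot_from_volume_request(
        name="web01",
        flavor_ref="2",
        image_id="IMAGE-UUID-PLACEHOLDER",
        volume_size_gb=20,
        network_id="NETWORK-UUID-PLACEHOLDER",
    )
    print(json.dumps(body, indent=2))
```

The same choice is available in Horizon when launching an instance by selecting a boot source that creates a new volume from the image.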

0

answered 2015-12-01 08:25:25 -0500

jgalvin2015

Me again,

I have tried all the above options for live-snapshotting the instance, to no avail :(

Freezing the instance with linux-utils, syncing, etc.

The libvirt and QEMU versions are also fine.

No matter what I try, the instance is paused and stopped momentarily before the image is uploaded to Glance.

My question is: is it just me having this issue, or is this supposed to happen?

We are going into production shortly and I seem to be stuck on this issue.

nova-compute.log

2015-12-01 14:13:53.010 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] VM Paused (Lifecycle Event)
2015-12-01 14:13:53.324 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] During sync_power_state the instance has a pending task (image_snapshot). Skip.
2015-12-01 14:13:57.295 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] VM Stopped (Lifecycle Event)
2015-12-01 14:13:57.296 3762 INFO nova.virt.libvirt.driver [req-8491c6b2-840e-434d-baab-b3a0a6776d35 d2b1cc9566d44a909de46689569118e3 3b5e03b8a83e44dd9a7140d868d28a9e - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] Beginning cold snapshot process
2015-12-01 14:13:57.504 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] During sync_power_state the instance has a pending task (image_snapshot). Skip.
2015-12-01 14:14:38.324 3762 INFO nova.compute.resource_tracker [req-faf5b055-117c-49df-abf3-4137ded40be6 - - - - -] Auditing locally available compute resources for node kloud-compute6
2015-12-01 14:14:38.946 3762 INFO nova.compute.resource_tracker [req-faf5b055-117c-49df-abf3-4137ded40be6 - - - - -] Total usable vcpus: 8, total allocated vcpus: 4
2015-12-01 14:14:38.947 3762 INFO nova.compute.resource_tracker [req-faf5b055-117c-49df-abf3-4137ded40be6 - - - - -] Final resource view: name=kloud-compute6 phys_ram=24110MB used_ram=10752MB phys_disk=33337GB used_disk=100GB total_vcpus=8 used_vcpus=4 pci_stats=<nova.pci.stats.PciDeviceStats object at 0x7f132eecc310>
2015-12-01 14:14:39.021 3762 INFO nova.scheduler.client.report [req-faf5b055-117c-49df-abf3-4137ded40be6 - - - - -] Compute_service record updated for ('kloud-compute6', 'kloud-compute6')
2015-12-01 14:14:39.021 3762 INFO nova.compute.resource_tracker [req-faf5b055-117c-49df-abf3-4137ded40be6 - - - - -] Compute_service record updated for kloud-compute6:kloud-compute6
2015-12-01 14:15:05.390 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] VM Started (Lifecycle Event)
2015-12-01 14:15:05.571 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] During sync_power_state the instance has a pending task (image_pending_upload). Skip.
2015-12-01 14:15:05.572 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] VM Resumed (Lifecycle Event)
2015-12-01 14:15:05.739 3762 INFO nova.compute.manager [req-76473785-012f-4618-b1a9-36a9611ef622 - - - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] During sync_power_state the instance has a pending task (image_pending_upload). Skip.
2015-12-01 14:15:05.959 3762 INFO nova.virt.libvirt.driver [req-8491c6b2-840e-434d-baab-b3a0a6776d35 d2b1cc9566d44a909de46689569118e3 3b5e03b8a83e44dd9a7140d868d28a9e - - -] [instance: 610726f8-757b-4c32-9990-e605eb4acd42] Snapshot extracted, beginning image upload

libvirtd.log

2015-12-01 14:07:42.278+0000: 3372: error : virStorageFileBackendForTypeInternal:1229 : internal error: missing storage backend for network files using rbd protocol
2015-12-01 14:08:14.493+0000: 25128: warning : processNicRxFilterChangedEvent:4221 : ignore NIC_RX_FILTER_CHANGED event for network device net0 in domain instance-00001751
2015-12-01 14:13:53.491+0000: 3371: warning : AppArmorSetFDLabel:966 : could not find path for descriptor /proc/self/fd/33, skipping
2015-12-01 14:13:56.622+0000: 3371: warning : virFileWrapperFdClose:329 : iohelper reports:

And pinging the instance from the time I create the snapshot:

Reply from 212.147.172.198: bytes=32 time=1ms TTL=59
Reply from 212.147.172.198: bytes ...

3

answered 2015-09-10 08:31:38 -0500

fgorbat

There is a distinction between snapshots and live snapshots.

Generally speaking, to ensure the consistency of a snapshot, the system has to be frozen or paused; otherwise file system (FS) inconsistency or file corruption may occur.

For more information, check out info1 and info2.


Comments

Thanks, I must check my QEMU and libvirt versions. But consider a customer who has a production system running and wants to live-snapshot it before making a change to, say, an application running on that system: live snapshots should be available to them.

jgalvin2015 (2015-09-10 11:01:01 -0500)


Stats

Asked: 2015-09-09 08:46:36 -0500

Seen: 3,796 times

Last updated: Dec 01 '15