Nova instance crash after reattaching volume

asked 2018-04-12 07:29:33 -0600

crazik

I recently realized that there is a problem with my demo installation (Ocata). I have ceph-backed instances (boot from volume only, no ephemerals).
When I create a new empty volume, then attach, detach, and attach it again, the instance crashes in most cases. Sometimes it takes a second or third try; in a few cases it happened already on the first attach.
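Roughly, the reproduction looks like this with the OpenStack CLI (a minimal sketch; the volume size, volume name and server name are just placeholders):

    openstack volume create --size 1 testvol
    openstack server add volume myserver testvol      # first attach
    openstack server remove volume myserver testvol   # detach
    openstack server add volume myserver testvol      # re-attach, this is usually where the instance crashes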

Looks like a bug in qemu/libvirt connected with Ceph storage.

Has anyone seen something similar?


Comments

We're using Ceph Luminous with Ocata in a production environment and haven't had any issues with volumes. Although we mostly use ephemeral disks for instances, we also have volume-backed instances and I can't remember having any problems. Is there anything in the logs (cinder, ceph, nova)?

eblock ( 2018-04-12 07:57:30 -0600 )

@crazik What do the ceph/cinder/nova error logs say when the attachment fails? Can you share the logs?

Deepa ( 2018-04-13 00:02:48 -0600 )

@eblock, @Deepa:
kernel: qemu-system-x86[30720]: segfault at 100 ip 000056258ba78144 sp 00007fca010c1eb0 error 4 in qemu-system-x86_64[56258b47a000+842000]
libvirtd: Unable to remove drive drive-virtio-disk1 [...] after failed qemuMonitorAddDevice

crazik ( 2018-04-13 02:22:25 -0600 )

And on the ceph monitor, logs from the cinder node: e11 handle_command mon_command({"prefix": "df", "format": "json"} v 0) v1

crazik ( 2018-04-13 02:22:48 -0600 )

Could you please update your question (you are not limited in characters there) with more detailed info like a stack trace or something? I can't tell anything from this little bit. Logs from nova, cinder, libvirt etc. would be useful.
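For example (a rough sketch, assuming the default packaged log locations, which may differ in your deployment):

    grep -i error /var/log/nova/nova-compute.log
    grep -i error /var/log/cinder/cinder-volume.log
    grep -i qemuMonitorAddDevice /var/log/libvirt/libvirtd.log
    less /var/log/libvirt/qemu/instance-*.log    # per-instance qemu log, often shows the crash itself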

eblock ( 2018-04-13 07:59:41 -0600 )