Instances reboot and go into kernel panic

asked 2014-10-30 08:07:20 -0500

updated 2014-10-30 09:59:08 -0500

On Icehouse, I have an environment where instances randomly reboot and go into kernel panic.

Nova correctly spawns instances from qcow2 images (Linux 3.0.75 x86_64) but, at unpredictable intervals, libvirtd gets a signal 15 (SIGTERM) and restarts them.

nova.conf:

virt_type=kvm

/var/log/libvirt/qemu/instance-<id>.log:

qemu: terminating on signal 15 from pid 38461
2014-10-27 15:05:45.524+0000: shutting down
2014-10-27 15:06:15.337+0000: starting up
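
One way to narrow this down is to collect the sender pid from every such line across all instance logs and then check which process owns that pid on the compute host. The following is a minimal sketch, assuming the message format shown in the excerpt above; the function name `find_terminations` is hypothetical and not part of libvirt or Nova.

```python
import re

# Matches the line qemu writes when it is terminated by a signal,
# e.g. "qemu: terminating on signal 15 from pid 38461".
TERMINATE_RE = re.compile(r"terminating on signal (\d+) from pid (\d+)")


def find_terminations(log_text):
    """Return a list of (signal_number, sender_pid) tuples found in log_text."""
    return [(int(sig), int(pid)) for sig, pid in TERMINATE_RE.findall(log_text)]


# Sample taken from the instance log excerpt in this post.
sample = """qemu: terminating on signal 15 from pid 38461
2014-10-27 15:05:45.524+0000: shutting down
2014-10-27 15:06:15.337+0000: starting up
"""

for sig, pid in find_terminations(sample):
    print(f"signal {sig} sent by pid {pid}")
```

If the sending process is still running, `ps -p 38461 -o comm=` on the compute host should print its name, which would show whether the signal comes from libvirtd itself or from something else.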

The libvirt logs contain no other relevant information.

No external API call or human intervention is involved.

I inspected the logs of the OpenStack services and couldn't find any relevant information about crashes or problems.

Do you have suggestions on how to further troubleshoot such an issue?

/var/log/kern.log of a crashed instance:
Oct 27 13:08:46 lc-20 kernel: ctx4008000f: no IPv6 routers present
Oct 27 13:10:03 lc-20 kernel: tipc: Resetting link <1.1.20:ethSw0-1.1.10:ethSw0>, peer not responding
Oct 27 13:10:03 lc-20 kernel: tipc: Lost link <1.1.20:ethSw0-1.1.10:ethSw0> on network plane A
Oct 27 13:10:03 lc-20 kernel: tipc: Lost contact with <1.1.10>
Oct 27 13:10:14 lc-20 kernel: tipc: Established link <1.1.20:ethSw0-1.1.10:ethSw0> on network plane A
Oct 27 13:11:48 lc-20 kernel: tipc: Resetting link <1.1.20:ethSw0-1.1.10:ethSw0>, peer not responding
Oct 27 13:11:48 lc-20 kernel: tipc: Lost link <1.1.20:ethSw0-1.1.10:ethSw0> on network plane A
Oct 27 13:11:48 lc-20 kernel: tipc: Lost contact with <1.1.10>
Oct 27 13:11:58 lc-20 kernel: tipc: Established link <1.1.20:ethSw0-1.1.10:ethSw0> on network plane A
Oct 27 13:13:56 lc-20 kernel: tipc: Resetting link <1.1.20:ethSw0-1.1.10:ethSw0>, peer not responding
Oct 27 13:13:56 lc-20 kernel: tipc: Lost link <1.1.20:ethSw0-1.1.10:ethSw0> on network plane A
Oct 27 13:13:56 lc-20 kernel: tipc: Lost contact with <1.1.10>
Oct 27 13:14:06 lc-20 kernel: tipc: Established link <1.1.20:ethSw0-1.1.10:ethSw0> on network plane A
Oct 27 13:17:26 lc-20 kernel: tipc: Resetting link <1.1.20:ethSw0-1.1.11:ethSw0>, peer not responding
Oct 27 13:17:26 lc-20 kernel: tipc: Lost link <1.1.20:ethSw0-1.1.11:ethSw0> on network plane A
Oct 27 13:17:26 lc-20 kernel: tipc: Lost contact with <1.1.11>
Oct 27 13:17:36 lc-20 kernel: tipc: Established link <1.1.20:ethSw0-1.1.11:ethSw0> on network plane A
Oct 27 14:00:14 lc-20 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 27 14:00:14 lc-20 kernel: Initializing cgroup subsys cpuset
Oct 27 14:00:14 lc-20 kernel: Linux version 3.0.75-1263-g337a2d1 (gcc version 4.3.2) #2 SMP PREEMPT Wed ...
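
To correlate guest reboots with the host-side qemu log, it may help to list every point where the guest log goes silent and then shows a fresh boot. The following is a rough sketch, assuming the syslog timestamp format of the excerpt above and assuming that a `klogd ... started` line marks a boot; the helper name `boot_gaps` is hypothetical.

```python
from datetime import datetime


def boot_gaps(lines):
    """Yield (last_seen, boot_time, gap) for each klogd start line
    that is preceded by an earlier timestamped entry."""
    previous = None
    for line in lines:
        try:
            # Syslog timestamps carry no year, so strptime defaults to 1900;
            # that is fine for computing differences within one log.
            stamp = datetime.strptime(line[:15], "%b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        if "klogd" in line and "started" in line and previous is not None:
            yield previous, stamp, stamp - previous
        previous = stamp


# Two consecutive lines taken from the kern.log excerpt in this post.
sample_lines = [
    "Oct 27 13:17:36 lc-20 kernel: tipc: Established link <1.1.20:ethSw0-1.1.11:ethSw0> on network plane A",
    "Oct 27 14:00:14 lc-20 kernel: klogd 1.4.1, log source = /proc/kmsg started.",
]

for last_seen, boot_time, gap in boot_gaps(sample_lines):
    print(f"no entries between {last_seen:%H:%M:%S} and {boot_time:%H:%M:%S} (gap {gap})")
```

The guest and host clocks may use different time zones, so the resulting times need to be normalised before comparing them with the timestamps in the qemu instance log.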