
Packets to VM are dropped in case of multiple senders

asked 2014-08-30 08:07:31 -0500 by lukas.pustina

updated 2014-09-01 05:46:18 -0500

Setup

We are running OpenStack Icehouse on Ubuntu 14.04 with a dedicated network controller for Neutron networking and four compute nodes using Open vSwitch and GRE tunneling. GRO is disabled on all bare-metal hosts. All VMs run Ubuntu 14.04 as well.
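
For reference, this is roughly how we check and disable GRO on each host (a minimal sketch; eth0 as the tunnel-facing NIC is an assumption):

$ ethtool -k eth0 | grep generic-receive-offload   # should report: off
$ sudo ethtool -K eth0 gro off                     # disable it if it is still on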

Problem

A VM called time01 runs an NTP server. It has a floating IP so it can be reached via the external network. The corresponding security group permits ICMP, UDP, and TCP ingress connections on all ports. All bare-metal hosts and the other VMs sync their clocks against this server using ntpdate, triggered by cron at the same time on every host. We observed that this time sync does not work correctly.
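
The ingress rules were created along these lines (a sketch; the security group name time01-secgroup is a placeholder, not the actual name):

$ neutron security-group-rule-create --direction ingress --protocol icmp time01-secgroup
$ neutron security-group-rule-create --direction ingress --protocol udp --port-range-min 1 --port-range-max 65535 time01-secgroup
$ neutron security-group-rule-create --direction ingress --protocol tcp --port-range-min 1 --port-range-max 65535 time01-secgroup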

Reduced Problem

We were able to reduce the problem to the following scenario. If we use mtr (http://www.bitwizard.nl/mtr/) with ICMP, TCP, or UDP packets from any single host (VM or bare metal) to traceroute time01, everything works fine. As soon as we start a second mtr from another host, the first mtr process suddenly starts to show packet loss. Over time, the same happens to the second mtr process.
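
To reproduce, run one of the following from two hosts in parallel (a sketch; <floating-ip> stands for time01's floating IP, -n disables DNS lookups, and the --udp/--tcp flags depend on the mtr version):

$ mtr -n <floating-ip>          # ICMP echo, the default mode
$ mtr -n --udp <floating-ip>    # UDP datagrams
$ mtr -n --tcp <floating-ip>    # TCP SYN probes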

If we run tcpdump on time01, we only see the first ICMP, UDP, or TCP packet arrive; subsequent packets never show up in tcpdump.
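
For completeness, the capture we run on time01 is along these lines (eth0 as the VM's interface is an assumption; NTP uses UDP port 123):

$ sudo tcpdump -ni eth0 'icmp or udp port 123'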

It seems as if the NAT router for time01 gets confused. We appreciate any help and can provide whatever additional information is necessary to narrow down the problem.
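
One thing we can inspect on the network node is the connection-tracking state inside the Neutron router's namespace, where the DNAT for the floating IP happens (a sketch; the qrouter UUID and time01's fixed IP are placeholders, and conntrack-tools must be installed):

$ ip netns list | grep qrouter
$ sudo ip netns exec qrouter-<uuid> conntrack -L -d <fixed-ip-of-time01>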

Update 1

  • baremetal network: 10.102.2.0/24
  • os-mgmt network: 10.102.6.0/24
  • os-data network: 10.102.7.0/24
  • os-ext network: 10.102.8.0/24

The iptables chain for the security group of the time01 tenant is neutron-openvswi-ieb60ba28-b.

Output of iptables -L -v on the network node:

Chain INPUT (policy ACCEPT 176M packets, 97G bytes)
 pkts bytes target     prot opt in     out     source               destination         
 176M   97G neutron-openvswi-INPUT  all  --  any    any     anywhere             anywhere            
 176M   97G nova-api-INPUT  all  --  any    any     anywhere             anywhere            
    0     0 ACCEPT     udp  --  virbr0 any     anywhere             anywhere             udp dpt:domain
    0     0 ACCEPT     tcp  --  virbr0 any     anywhere             anywhere             tcp dpt:domain
    0     0 ACCEPT     udp  --  virbr0 any     anywhere             anywhere             udp dpt:bootps
    0     0 ACCEPT     tcp  --  virbr0 any     anywhere             anywhere             tcp dpt:bootps

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 neutron-filter-top  all  --  any    any     anywhere             anywhere            
    0     0 neutron-openvswi-FORWARD  all  --  any    any     anywhere             anywhere            
    0     0 nova-filter-top  all  --  any    any     anywhere             anywhere            
    0     0 nova-api-FORWARD  all  --  any    any     anywhere             anywhere            
    0     0 ACCEPT     all  --  any    virbr0  anywhere             192.168.122.0/24     ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  virbr0 any     192.168.122.0/24     anywhere            
    0     0 ACCEPT     all  --  virbr0 virbr0  anywhere             anywhere            
    0     0 REJECT     all  --  any    virbr0  anywhere             anywhere             reject-with icmp-port-unreachable
    0     0 REJECT     all  --  virbr0 any     anywhere             anywhere             reject-with icmp-port-unreachable

Chain OUTPUT (policy ACCEPT 167M packets, 41G bytes)
 pkts bytes target     prot opt in     out     source               destination         
 167M   41G neutron-filter-top  all  --  any    any     anywhere             anywhere            
 167M   41G neutron-openvswi-OUTPUT  all  --  any    any     anywhere             anywhere            
 167M   41G nova-filter-top  all  --  any ...
(more)
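
To look at just the security-group chain mentioned above instead of the full rule set, something like this works:

$ sudo iptables -nvL neutron-openvswi-ieb60ba28-b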

Comments

Please also share your OVS dump and iptables rules.

SGPJ ( 2014-08-31 03:49:53 -0500 )

@SGPJ I updated the question with the requested data.

lukas.pustina ( 2014-08-31 06:05:25 -0500 )

1 answer


answered 2014-09-17 04:26:18 -0500 by lukas.pustina

I found the answer. In short, Ubuntu 14.04 ships with an unintuitive default setting for the paravirtualized network driver vhost_net: it is disabled.

$ cat /etc/default/qemu-kvm
# To disable qemu-kvm's page merging feature, set KSM_ENABLED=0 and
# sudo restart qemu-kvm
KSM_ENABLED=1
SLEEP_MILLISECS=200
# To load the vhost_net module, which in some cases can speed up
# network performance, set VHOST_NET_ENABLED to 1.
VHOST_NET_ENABLED=0

# Set this to 1 if you want hugepages to be available to kvm under
# /run/hugepages/kvm
KVM_HUGEPAGES=0

To solve the problem, I just had to set VHOST_NET_ENABLED to 1 and restart all qemu-kvm processes, i.e., restart the virtual machines.
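
For completeness, the steps boil down to the following on each compute node (the instance name time01 is just an example; "restart qemu-kvm" refers to the Upstart job on Ubuntu 14.04, which loads the vhost_net module when VHOST_NET_ENABLED=1):

$ sudo sed -i 's/^VHOST_NET_ENABLED=0/VHOST_NET_ENABLED=1/' /etc/default/qemu-kvm
$ sudo restart qemu-kvm        # re-reads the config and loads vhost_net
$ lsmod | grep vhost_net       # verify the module is loaded
$ nova reboot --hard time01    # hard-reboot each VM so qemu is recreated with vhost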

I blogged about how I found and fixed the problem in detail here: https://blog.codecentric.de/en/2014/0...

