Tenant GRE network broken

I have an OpenStack TripleO deployment with network isolation. I have a bridge called "br-tun" which I have set up for the 'Tenant' network.

I have one main problem: I cannot ping the instance VMs using the floating IP. I think I have traced this back to the tenant GRE network not working, and further back to something amiss with the bridge, but follow me here and you'll see why.

For the instances I have this network layout: Real network switch ---- OpenStack router ---- local (GRE network)

I can ping the OpenStack router, which is using 192.168.20.108/24. I have assigned a floating IP to the instance, which is 192.168.20.102.

When I try to ping the floating IP of the instance, I get a 'Destination Host Unreachable' response, which says that the OpenStack router can't communicate with the instance. As the instance is running on the compute host while the floating IP and external network are connected to the controller node, it is the GRE tunnel that provides the connectivity to the instance.
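In case it helps, a check along these lines from the controller should confirm whether the router namespace itself can reach the instance over the tunnel (the qrouter UUID and the instance's fixed IP are placeholders for my actual values):

# Find the L3 agent's router namespace on the controller
ip netns list
# Ping the instance's fixed IP from inside the router namespace
ip netns exec qrouter-<router-uuid> ping -c 3 <instance-fixed-ip>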

What I found is that I cannot ping the br-tun interface of one node from the other, i.e. the controller cannot ping the compute node and the compute node cannot ping the controller. I don't know if this is normal, but it doesn't seem normal, as I can ping all of the other interfaces with IPs. In addition, neither node can ping the network switch on the tenant network. I have the switch set up as a layer 3 gateway, although the nodes do not have a gateway on this network; I was just using the switch IP to test connectivity. Also, running 'arp -an' shows that all IPs on the br-tun interface are <incomplete>, which means ARP broadcasts are not traversing.
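A capture along these lines should show whether the ARP requests ever make it out of the bridge onto the wire (eth8 being the NIC attached to br-tun on my controller; run while pinging from the other node):

# Watch for ARP on the OVS bridge itself
tcpdump -ni br-tun arp
# And on the underlying NIC, to see if the broadcasts reach the physical network
tcpdump -ni eth8 arp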

I find that if I do 'ifdown br-tun' and then 'ifdown eth8', followed by 'ifup eth8' and 'ifup br-tun', I can then ping the network switch and ARP now populates in the table, but "ovs-vsctl show" shows that there is no GRE tunnel.

So I decided to reboot the nodes, thinking that there was no deeper issue and all would be fixed with a mass service restart. But once the nodes were up again I found I was back to square one, with no working tenant network.

Before I down and up the interfaces:

Controller ping:

[root@overcloud-controller-0 heat-admin]# ping 192.168.12.1
PING 192.168.12.1 (192.168.12.1) 56(84) bytes of data.
^C
--- 192.168.12.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

And the arp:

[root@overcloud-controller-0 heat-admin]# arp -an | grep 192.168.12
? (192.168.12.107) at <incomplete> on br-tun
? (192.168.12.1) at <incomplete> on br-tun

'ovs-vsctl show' output for br-tun:

Bridge br-tun
    Controller "tcp:127.0.0.1:6633"
        is_connected: true
    fail_mode: secure
    Port "gre-c0a80c6b"
        Interface "gre-c0a80c6b"
            type: gre
            options: {df_default="true", in_key=flow, local_ip="192.168.12.103", out_key=flow, remote_ip="192.168.12.107"}
    Port "eth8"
        Interface "eth8"
    Port br-tun
        Interface br-tun
            type: internal
    Port patch-int
        Interface patch-int
            type: patch
            options: {peer=patch-tun}
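Since fail_mode is secure, my understanding is that br-tun only forwards what the agent-installed OpenFlow rules allow, so the flow table is probably worth checking too (the interface name is copied from the output above):

# Dump the OpenFlow rules programmed on br-tun
ovs-ofctl dump-flows br-tun
# Inspect the GRE interface record itself
ovs-vsctl list Interface gre-c0a80c6b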

ip addr show:

[root@overcloud-controller-0 heat-admin]# ip addr show br-tun
17: br-tun: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether 00:1a:4a:16:01:cc brd ff:ff:ff:ff:ff:ff
    inet 192.168.12.103/24 brd 192.168.12.255 scope global br-tun
       valid_lft forever preferred_lft forever
    inet6 fe80::21a:4aff:fe16:1cc/64 scope link
       valid_lft forever preferred_lft forever

[root@overcloud-controller-0 heat-admin]# ip addr show eth8
10: eth8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP qlen 1000
    link/ether 00:1a:4a:16:01:cc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::21a:4aff:fe16:1cc/64 scope link
       valid_lft forever preferred_lft forever

So it seems the br-tun is not working. But if I simply down and then up the interfaces, it resolves IP connectivity while breaking the GRE endpoints:

[root@overcloud-controller-0 heat-admin]# ifdown br-tun
[root@overcloud-controller-0 heat-admin]# ifdown eth8
[root@overcloud-controller-0 heat-admin]# ifup eth8
[root@overcloud-controller-0 heat-admin]# ifup br-tun

Then ping works right away:

[root@overcloud-controller-0 heat-admin]# ping 192.168.12.1
PING 192.168.12.1 (192.168.12.1) 56(84) bytes of data.
64 bytes from 192.168.12.1: icmp_seq=1 ttl=255 time=3.94 ms
64 bytes from 192.168.12.1: icmp_seq=2 ttl=255 time=1.24 ms
^C

But when checking the OVS bridge, the GRE is no longer set up there. I am not sure which services to restart to trigger it back to life:

Bridge br-tun
    fail_mode: standalone
    Port br-tun
        Interface br-tun
            type: internal
    Port "eth8"
        Interface "eth8"

I'm at a loss on this one.

I've compared the br-tun interface config with the other bridge interfaces and they are identical except for the name and IP address. Furthermore, this is affecting both nodes: downing and then upping the interfaces restores IP connectivity once you repeat the step on the other node. But likewise the GRE info is removed from the ovs-vsctl output, so GRE is still not working.
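For reference, the files I compared are the standard ifcfg scripts (RHEL-style paths; the second bridge name is a placeholder for my other bridges):

cat /etc/sysconfig/network-scripts/ifcfg-br-tun
cat /etc/sysconfig/network-scripts/ifcfg-<other-bridge>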

After doing the ifdown and ifup on both nodes, they can then ping each other. So I know the issue is software-related on the node itself rather than a network issue per se.
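Once the endpoints can ping each other again, something like this should show whether encapsulated traffic actually flows between the nodes (eth8 being the NIC that carries the tenant network on the controller):

# Watch for GRE-encapsulated packets between the tunnel endpoints
tcpdump -ni eth8 ip proto gre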

Any ideas how I can proceed on this one?