Why are packets not being forwarded from physical port to bridge?

asked 2014-10-02 05:56:12 -0500

Krist

updated 2014-10-02 07:15:24 -0500

We are having some problems on our OpenStack networking node. This is a Havana setup, and we are using Open vSwitch with GRE tunnels.

A bridge "br700" is created on this Open vSwitch. It is connected to an interface that tags with VLAN 700. This bridge looks like this:

   Bridge "br700"
        Port "qg-67a742dc-dc"
            Interface "qg-67a742dc-dc"
                type: internal
        Port "bond0.700"
            Interface "bond0.700"
        Port "br700"
            Interface "br700"
                type: internal
        Port "phy-br700"
            Interface "phy-br700"
        Port "qg-fe9ca3a1-9f"
            Interface "qg-fe9ca3a1-9f"
                type: internal

I have defined a router in neutron that has its gateway set to an external network, and there is an L3 agent to connect this network with this br700 bridge. This router gets created, and when I use ip netns exec qrouter-<id> bash to get into the namespace, it looks like this:

# ip a
245: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
246: qg-67a742dc-dc: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:23:4d:7d brd ff:ff:ff:ff:ff:ff
    inet 10.255.10.1/16 brd 10.255.255.255 scope global qg-67a742dc-dc
    inet6 fe80::f816:3eff:fe23:4d7d/64 scope link 
       valid_lft forever preferred_lft forever
247: qr-2c770d48-9d: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether fa:16:3e:fb:70:45 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.2/24 brd 192.168.0.255 scope global qr-2c770d48-9d
    inet6 fe80::f816:3eff:fefb:7045/64 scope link 
       valid_lft forever preferred_lft forever

As far as I can see, this router instance has a port in br700. I can ping hosts in 192.168.0.0/24 without any problem. I can also ping any other routing instances attached to 10.255.0.0/16, and any networks attached to them. So it appears that the br700 bridge is forwarding packets between ports. However, when I try to ping an external host on the 10.255.0.0/16 network, this fails.

We investigated this with tcpdump. I started a ping to 10.255.1.110 from inside the router namespace on the networking node. What we saw was:

- ARP packets are being sent out, asking who has 10.255.1.110. These packets are sent out over the physical network.
- The host 10.255.1.110 does receive these packets, and answers them.
- On the network node the packets are seen as well, and when doing a tcpdump on bond0.700 we see them too. However, they don't seem to reach the virtual router:

# arp -an
? (10.255.1.110) at <incomplete> on qg-67a742dc-dc

Adding a static ARP entry didn't solve it.
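For anyone repeating this kind of check, here is a rough sketch of the two captures, one on the host side and one inside the router namespace. The function names are made up for illustration; the tcpdump options used are -n (no name resolution), -e (print MAC addresses) and -i (interface), and the filter simply restricts the capture to ARP traffic involving the target IP.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; they only wrap standard tcpdump / ip netns calls.

# Watch ARP traffic for one target IP on a host interface.
# -n: no name resolution, -e: show link-level (MAC) headers, -i: interface.
watch_arp() {
    local iface="$1" target="$2"
    tcpdump -n -e -i "$iface" arp and host "$target"
}

# Same capture, but run inside a network namespace.
watch_arp_in_netns() {
    local netns="$1" iface="$2" target="$3"
    ip netns exec "$netns" tcpdump -n -e -i "$iface" arp and host "$target"
}
```

With the names from this question, that would be something like `watch_arp bond0.700 10.255.1.110` in one terminal and `watch_arp_in_netns qrouter-<id> qg-67a742dc-dc 10.255.1.110` in another (both need root). If the ARP reply shows up in the first capture but never in the second, it is being lost somewhere between the physical port and the bridge port.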

So I have the impression that somehow packets are not passed back from the interface bond0.700 into the bridge. I am at ... (more)


1 answer


answered 2014-10-03 04:39:00 -0500

Krist

As I could see in the output of fdb/show, the switch was learning the wrong port number for the MAC addresses of virtual machines connected to it.
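A small sketch of how the learned port for one MAC address could be pulled out of that output. It assumes the `ovs-appctl fdb/show <bridge>` layout of a header line followed by port, VLAN, MAC and age columns; the helper name `port_for_mac` and the sample table are invented for illustration and are not real output from this node.

```shell
#!/usr/bin/env bash
# Print the OVS port(s) on which a MAC address is currently learned.
# Reads `ovs-appctl fdb/show <bridge>` output on stdin; the columns are
# assumed to be: port, VLAN, MAC, age (first line is a header).
port_for_mac() {
    local mac="$1"
    awk -v mac="$mac" 'NR > 1 && tolower($3) == tolower(mac) { print $1 }'
}

# Made-up sample in the fdb/show layout, for illustration only.
sample_fdb=' port  VLAN  MAC                Age
    8     0  fa:16:3e:23:4d:7d    1
    1     0  00:11:22:33:44:55    4'

printf '%s\n' "$sample_fdb" | port_for_mac fa:16:3e:23:4d:7d
```

On a live node the input would come from `ovs-appctl fdb/show br700 | port_for_mac <mac>`. Running that a few times while the VM is sending traffic should show whether the port keeps changing, and `ovs-ofctl show br700` maps the port numbers back to interface names.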

I have now found out why.

We have two interfaces on the host, connected to different switches. They are bonded in active-passive mode, so packets only get sent out on one, and whatever traffic comes in on the standby link is ignored. That works.

Now when a VM tries to connect to another host it sends out an ARP packet, which eventually gets sent out over the active link of the bonded pair. But because switches flood broadcast traffic out of all their ports, it is received again on the standby link. I always assumed that it would just get dropped there. But Open vSwitch puts both interfaces in promiscuous mode, which ignores the fact that they are bonded. And there was the root of our problem.

So server AAA wants to contact BBB, and sends out a "who has x.x.x.x, tell y.y.y.y" ARP packet, with source AAA and destination FFF.

Open vSwitch first sees this packet on port 8, and enters "8 AAA" in its MAC table. The packet is flooded out of all ports. Via port 1 it ends up on our switch, which floods it out of all its ports, and then via the second switch this packet arrives back on port 1 of Open vSwitch, which then changes "8 AAA" into "1 AAA". And this messes up everything...
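The sequence above can be replayed with a toy model of a learning switch. This is only a sketch of generic MAC learning (remember the last port a source address was seen on), not Open vSwitch's actual code; the `learn` and `lookup` helpers are made up, and AAA, port 8 and port 1 are the placeholders from the description.

```shell
#!/usr/bin/env bash
# Toy model of a learning switch's MAC table (needs bash 4.2+).
# -g keeps the table global even if this file is sourced inside a function.
declare -gA mac_table

# learn PORT SRC_MAC: remember the port on which SRC_MAC was last seen.
learn() {
    local port="$1" src="$2"
    mac_table[$src]="$port"
}

# lookup MAC: print the port frames for MAC would be sent to, or "flood"
# if the address has not been learned yet.
lookup() {
    local mac="$1"
    echo "${mac_table[$mac]:-flood}"
}

learn 8 AAA   # the VM's ARP request arrives on port 8
echo "after first sighting: AAA -> port $(lookup AAA)"
learn 1 AAA   # the same frame comes back in via port 1
echo "after the echo:       AAA -> port $(lookup AAA)"
```

After the second `learn`, anything addressed to AAA (including the ARP reply) is sent out of port 1 towards the physical network instead of port 8, which matches the `<incomplete>` ARP entry seen in the router namespace.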

I disabled one of the links in the bonded pair and the problem went away. But this is just a workaround.

I will have to do this differently. I'm thinking along the lines of moving the bonding into Open vSwitch. But this requires a reconfiguration of the network, and I therefore first need to enable an out-of-band connection to the host.
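As a rough idea of what that could look like, the sketch below only prints the commands for an OVS-managed active-backup bond instead of running them, since applying them over the same link would cut the connection. The interface names eth0 and eth1 and the port name bond1 are placeholders, not taken from this setup, and the sketch does not cover removing the existing Linux bond or how VLAN 700 would be handled.

```shell
#!/usr/bin/env bash
# Print (do not run) the commands for an OVS-managed active-backup bond.
# Usage: ovs_bond_plan BRIDGE BOND_PORT IFACE...
# Interface and port names passed in below are placeholders.
ovs_bond_plan() {
    local bridge="$1" bond="$2"
    shift 2
    # Create the bond port on the bridge from the given interfaces.
    echo "ovs-vsctl add-bond $bridge $bond $* bond_mode=active-backup"
    # Afterwards, check which member is active.
    echo "ovs-appctl bond/show $bond"
}

ovs_bond_plan br700 bond1 eth0 eth1
```

Reviewing the printed commands first, and running them from the out-of-band connection, seems the safer order.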


Stats

Asked: 2014-10-02 05:56:12 -0500

Seen: 2,417 times

Last updated: Oct 03 '14