OVS doesn't forward packets from tap-device to br-int

asked 2013-04-12 17:50:06 -0500

markus-sendingthesea

Hi all,

I'm really stuck with my network setup. I'm using Quantum with the OVS plugin. After installation everything worked, but since a reboot it no longer does: I can't reach the instance IPs any more. So I followed the packets from the compute node to the network node and could figure out where it breaks.

From compute-node I can see DHCP packets start their journey:

root@compute1:~# tcpdump -nn -i br-int
tcpdump: WARNING: br-int: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-int, link-type EN10MB (Ethernet), capture size 65535 bytes
19:41:15.067727 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:4a:33:9d, length 286
19:41:18.070965 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:4a:33:9d, length 286
19:41:22.588759 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:c4:29:fa, length 300

Those packets also make it to the network-node...

root@network:~# tcpdump -nn -i br-int
tcpdump: WARNING: br-int: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-int, link-type EN10MB (Ethernet), capture size 65535 bytes
19:43:11.132307 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:4a:33:9d, length 286
19:43:14.135490 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:4a:33:9d, length 286
19:43:17.434270 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:c4:29:fa, length 300

...and even get to the tap66178edc-18 device that is configured for DHCP. As far as I can see, the DHCP server also answers the DHCP request:

root@network:~# tcpdump -nn -i tap66178edc-18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap66178edc-18, link-type EN10MB (Ethernet), capture size 65535 bytes
19:44:20.399708 ARP, Request who-has 10.5.5.3 tell 10.5.5.2, length 28
19:44:21.399712 ARP, Request who-has 10.5.5.3 tell 10.5.5.2, length 28
19:44:22.403720 ARP, Request who-has 10.5.5.3 tell 10.5.5.2, length 28
19:44:27.684714 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:c4:29:fa, length 300
19:44:27.685000 IP 10.5.5.2.67 > 10.5.5.3.68: BOOTP/DHCP, Reply, length 308
19:44:32.687714 ARP, Request who-has 10.5.5.3 tell 10.5.5.2, length 28
19:44:33.687725 ARP, Request who-has 10.5.5.3 tell 10.5.5.2, length 28
19:44:34.687720 ARP, Request who-has 10.5 ... (more)

20 answers

answered 2013-04-16 07:34:04 -0500

darragh-oreilly

Where are you pinging from? Can you ping the vm from the network node first? To ping from outside will mean configuring quantum to use an external network, setting up a router and floating ip. So it could take a bit of time yet.

I have written an automatic installer for a basic 3-node quantum setup (controller, netnode, compute1). There is a diagram and a link to the git repository here: http://techbackground.blogspot.ie/ - there are separate branches for Folsom and Grizzly. If you are already using VirtualBox and have 2.5GB RAM free, then it should not take long to install the prereqs (git, Vagrant and Ansible) and have a system up and running.

But if you want to continue debugging, then I would need the output of the following on the network node:

ovs-vsctl show
ip a
route -n
iptables-save

It seems the ovs agent didn't start properly after reboot. Maybe the l3 agent needs to be restarted too. And check /var/log/quantum/* to see if the quantum services are ok.
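The outputs requested above could be gathered in one pass and pasted into a reply. The following is only a sketch: the helper name collect_diagnostics is made up for illustration, and it simply runs whatever command strings it is given.

```shell
# Run each command string given as an argument and print its output
# under a header line, so the whole report can be pasted in one piece.
# collect_diagnostics is a hypothetical helper name, not an OpenStack tool.
collect_diagnostics() {
    for cmd in "$@"; do
        printf '===== %s =====\n' "$cmd"
        # Run through sh -c so a command string with arguments works,
        # and keep going even if one of the commands fails.
        sh -c "$cmd" 2>&1 || printf '(command exited with status %s)\n' "$?"
    done
}
```

On the network node, as root, something like collect_diagnostics 'ovs-vsctl show' 'ip a' 'route -n' 'iptables-save' > netnode-report.txt would produce a single file with all four outputs. Restarting the agents is left out on purpose, since the service names depend on the packaging.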

Darragh.

answered 2013-04-16 19:18:41 -0500

darragh-oreilly

I was wondering how dhcp and router could run on the same node without namespaces and a single route table, so I searched the doc and found this at the end:

http://docs.openstack.org/folsom/open... "If you run both L3 + DHCP services on the same node, you should enable namespaces to avoid conflicts with routes :"

So I think the easiest thing for you is to get another node to run your l3-agent. You will need to set the router_id in quantum.conf.

Darragh.

answered 2013-04-15 09:55:30 -0500

darragh-oreilly

Hi Dunkel,

that's from the network node - how about the iptables on the compute node?

Darragh.

answered 2013-04-15 09:42:50 -0500

markus-sendingthesea

Hi Darragh,

thanks a lot for your reply. Unfortunately there isn't any chain containing "quantum-openvswi*". I saved the iptables settings to a file before I did the reboot, so I checked that one, too. It seems such a chain never existed in my setup, and before the reboot L2 communication worked: the instance got an IP from DHCP and the metadata service was also reachable. So I ran "diff" to compare the iptables settings before and after the reboot, but couldn't find any differences besides the current counters.

Here is the output from iptables-save:

root@network:~# iptables-save -c
# Generated by iptables-save v1.4.12 on Mon Apr 15 11:26:03 2013
*nat
:PREROUTING ACCEPT [21:6588]
:INPUT ACCEPT [21:6588]
:OUTPUT ACCEPT [11987:885036]
:POSTROUTING ACCEPT [0:0]
:quantum-l3-agent-OUTPUT - [0:0]
:quantum-l3-agent-POSTROUTING - [0:0]
:quantum-l3-agent-PREROUTING - [0:0]
:quantum-l3-agent-float-snat - [0:0]
:quantum-l3-agent-snat - [0:0]
:quantum-postrouting-bottom - [0:0]
[21:6588] -A PREROUTING -j quantum-l3-agent-PREROUTING
[11987:885036] -A OUTPUT -j quantum-l3-agent-OUTPUT
[11987:885036] -A POSTROUTING -j quantum-l3-agent-POSTROUTING
[0:0] -A POSTROUTING -j quantum-postrouting-bottom
[11987:885036] -A quantum-l3-agent-POSTROUTING ! -i qg-9ba11a5d-c3 ! -o qg-9ba11a5d-c3 -m conntrack ! --ctstate DNAT -j ACCEPT
[0:0] -A quantum-l3-agent-POSTROUTING -s 10.5.5.0/24 -d 192.168.0.231/32 -j ACCEPT
[0:0] -A quantum-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.0.231:8775
[0:0] -A quantum-l3-agent-snat -j quantum-l3-agent-float-snat
[0:0] -A quantum-l3-agent-snat -s 10.5.5.0/24 -j SNAT --to-source 80.237.213.208
[0:0] -A quantum-postrouting-bottom -j quantum-l3-agent-snat
COMMIT
# Completed on Mon Apr 15 11:26:03 2013
# Generated by iptables-save v1.4.12 on Mon Apr 15 11:26:03 2013
*filter
:INPUT ACCEPT [673961:174108084]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [508948:124713276]
:quantum-filter-top - [0:0]
:quantum-l3-agent-FORWARD - [0:0]
:quantum-l3-agent-INPUT - [0:0]
:quantum-l3-agent-OUTPUT - [0:0]
:quantum-l3-agent-local - [0:0]
[715645:187312200] -A INPUT -j quantum-l3-agent-INPUT
[41684:13204116] -A INPUT -p gre -j ACCEPT
[0:0] -A FORWARD -j quantum-filter-top
[0:0] -A FORWARD -j quantum-l3-agent-FORWARD
[508948:124713276] -A OUTPUT -j quantum-filter-top
[508948:124713276] -A OUTPUT -j quantum-l3-agent-OUTPUT
[508948:124713276] -A quantum-filter-top -j quantum-l3-agent-local
[0:0] -A quantum-l3-agent-INPUT -d 192.168.0.231/32 -p tcp -m tcp --dport 8775 -j ACCEPT
COMMIT
# Completed on Mon Apr 15 11:26:03 2013
root@network:~#

answered 2013-04-15 10:51:19 -0500

darragh-oreilly

Hi Dunkel,

I hear what you are saying, but I'm not sure about your configuration. What version are you using? The iptables from the compute node show nova for snat and filters. Is it your intention to mix quantum and nova like that? I don't know if that's possible, but I try to keep everything quantum to make life easier.

Darragh.

answered 2013-04-15 08:55:39 -0500

darragh-oreilly

I guess the packets are being dropped by the filter. Have a look at 'iptables -vnL' or 'iptables-save -c', and at the chain whose name starts with quantum-openvswi-i. Is there a rule to allow udp spt:67 dpt:68? What do its counters say? Or are the packets going to quantum-openvswi-sg-fallback, which does a DROP?
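To read the counters without scanning the whole dump by eye, the rule lines can be filtered by a substring. This is a minimal sketch, assuming the input has the 'iptables-save -c' format with a [packets:bytes] prefix on each rule; rules_with_counters is a made-up helper name.

```shell
# Print rules from `iptables-save -c` output (read on stdin) that contain
# the given fixed substring, as: packets bytes rule
# rules_with_counters is a hypothetical helper name used for illustration.
rules_with_counters() {
    pattern="$1"
    awk -v pat="$pattern" '
        # Rule lines look like: [packets:bytes] -A CHAIN ...
        /^\[[0-9]+:[0-9]+\] -A / && index($0, pat) > 0 {
            counters = substr($1, 2, length($1) - 2)   # drop the [ and ]
            split(counters, c, ":")
            rule = $0
            sub(/^\[[0-9]+:[0-9]+\] /, "", rule)       # drop the counter prefix
            print c[1], c[2], rule
        }'
}
```

Run as root, for example: iptables-save -c | rules_with_counters quantum-openvswi. A DHCP allow rule that stays at 0 packets while the fallback chain's counter grows would point at the filter; no output at all means no rule contains that substring.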

answered 2013-04-15 10:19:01 -0500

markus-sendingthesea

Here is the iptables output from my compute node. But I don't understand why the compute node should be a suspect for my problem. If the network node were working correctly, I should see the DHCP reply packet on br-int on my network node, shouldn't I?

I can see it on the device tap66178edc-18, which is a member of br-int. So why can I see the packet on that tap device but not on the bridge it is connected to? It does work the other way round: packets I can see on br-int, I can also see on the tap device.

That makes me believe that the root of the problem lies in OVS. But I'm really new to OpenStack, so please let me know if I misunderstand anything here.

root@compute1:~# iptables-save -c
# Generated by iptables-save v1.4.12 on Mon Apr 15 12:02:24 2013
*nat
:PREROUTING ACCEPT [52447:9171960]
:INPUT ACCEPT [20:1332]
:OUTPUT ACCEPT [156238:9580558]
:POSTROUTING ACCEPT [208665:18751186]
:nova-compute-OUTPUT - [0:0]
:nova-compute-POSTROUTING - [0:0]
:nova-compute-PREROUTING - [0:0]
:nova-compute-float-snat - [0:0]
:nova-compute-snat - [0:0]
:nova-postrouting-bottom - [0:0]
[52106:9123565] -A PREROUTING -j nova-compute-PREROUTING
[154492:9463387] -A OUTPUT -j nova-compute-OUTPUT
[206590:18586312] -A POSTROUTING -j nova-compute-POSTROUTING
[207841:18692599] -A POSTROUTING -j nova-postrouting-bottom
[206590:18586312] -A nova-compute-snat -j nova-compute-float-snat
[206590:18586312] -A nova-postrouting-bottom -j nova-compute-snat
COMMIT
# Completed on Mon Apr 15 12:02:24 2013
# Generated by iptables-save v1.4.12 on Mon Apr 15 12:02:24 2013
*mangle
:PREROUTING ACCEPT [3325771:2192960480]
:INPUT ACCEPT [3212526:2171644761]
:FORWARD ACCEPT [382617:103977435]
:OUTPUT ACCEPT [3700045:1240719998]
:POSTROUTING ACCEPT [4082662:1344697433]
COMMIT
# Completed on Mon Apr 15 12:02:24 2013
# Generated by iptables-save v1.4.12 on Mon Apr 15 12:02:24 2013
*filter
:INPUT ACCEPT [3035823:2121673103]
:FORWARD ACCEPT [357777:95580751]
:OUTPUT ACCEPT [3690400:1239848204]
:nova-compute-FORWARD - [0:0]
:nova-compute-INPUT - [0:0]
:nova-compute-OUTPUT - [0:0]
:nova-compute-inst-3 - [0:0]
:nova-compute-inst-4 - [0:0]
:nova-compute-local - [0:0]
:nova-compute-provider - [0:0]
:nova-compute-sg-fallback - [0:0]
:nova-filter-top - [0:0]
[3161946:1874674482] -A INPUT -j nova-compute-INPUT
[160848:31011194] -A INPUT -p gre -j ACCEPT
[382617:103977435] -A FORWARD -j nova-filter-top
[356806:95416570] -A FORWARD -j nova-compute-FORWARD
[3672308:1233787427] -A OUTPUT -j nova-filter-top
[3652067:1227608280] -A OUTPUT -j nova-compute-OUTPUT
[0:0] -A nova-compute-inst-3 -m state --state INVALID -j DROP
[13905:4635628] -A nova-compute-inst-3 -m state --state RELATED,ESTABLISHED -j ACCEPT
[20:2612] -A nova-compute-inst-3 -j nova-compute-provider
[4:1364] -A nova-compute-inst-3 -s 10.5.5.2/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
[10:768] -A nova-compute-inst-3 -s 10.5.5.0/24 -j ACCEPT
[1:60] -A nova-compute-inst-3 -p tcp -m tcp --dport 22 -j ACCEPT
[5:420] -A nova-compute-inst-3 -p icmp -j ACCEPT
[0:0] -A nova-compute-inst-3 -j nova-compute-sg-fallback
[0:0] -A nova-compute-inst-4 -m state --state INVALID -j DROP
[2:168] -A nova-compute-inst-4 -m state --state RELATED,ESTABLISHED -j ACCEPT
[10710:3704874] -A nova-compute-inst-4 -j nova-compute-provider
[10707:3704622] -A nova-compute-inst-4 -s 10.5.5.2/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
[3:252] -A nova-compute-inst-4 -s 10 ... (more)

answered 2013-04-15 11:43:41 -0500

markus-sendingthesea

Hi Darragh,

No, it's not my intention to mix them. What can I do to fix this? I followed the basic install guide for Ubuntu 12.04: http://docs.openstack.org/folsom/basic-install/content/basic-install_intro.html

As suggested, I used the packages from:

root@network:~# cat /etc/apt/sources.list.d/cloud-archive.list
deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/folsom main

Currently installed on the network node:

root@network:~# dpkg -l|grep -i cloud0
ii python-cliff 1.1.2-0ubuntu2~cloud0 Command Line Interface Formulation Framework
ii python-eventlet 0.9.17-0ubuntu1.1~cloud0 concurrent networking library for Python
ii python-greenlet 0.3.3-1ubuntu2~cloud0 Lightweight in-process concurrent programming
ii python-keystone 2012.2.1-0ubuntu1.3~cloud0 OpenStack identity service - Python library
ii python-keystoneclient 1:0.1.3-0ubuntu1.1~cloud0 Client libary for Openstack Keystone API
ii python-prettytable 0.6-1ubuntu1~cloud0 library to represent tabular data in visually appealing ASCII tables
ii python-quantum 2012.2.1-0ubuntu1~cloud0 Quantum is a virutal network service for Openstack. (python library)
ii python-quantumclient 1:2.1-0ubuntu1~cloud0 client - Quantum is a virtual network service for Openstack
ii python-sqlalchemy 0.7.8-1ubuntu1~cloud0 SQL toolkit and Object Relational Mapper for Python
ii python-sqlalchemy-ext 0.7.8-1ubuntu1~cloud0 SQL toolkit and Object Relational Mapper for Python - C extension
ii quantum-common 2012.2.1-0ubuntu1~cloud0 common - Quantum is a virtual network service for Openstack.
ii quantum-dhcp-agent 2012.2.1-0ubuntu1~cloud0 Quantum is a virtual network service for Openstack. (dhcp agent)
ii quantum-l3-agent 2012.2.1-0ubuntu1~cloud0 Quantum is a virtual network service for Openstack. (l3 agent)
ii quantum-plugin-openvswitch 2012.2.1-0ubuntu1~cloud0 Quantum is a virtual network service for Openstack. (openvswitch plugin)
ii quantum-plugin-openvswitch-agent 2012.2.1-0ubuntu1~cloud0 Quantum is a virtual network service for Openstack. (openvswitch plugin agent)
root@network:~#

And on the compute node:

root@compute1:~# dpkg -l|grep -i cloud0
ii librados2 0.48.2-0ubuntu2~cloud0 RADOS distributed object store client library
ii librbd1 0.48.2-0ubuntu2~cloud0 RADOS block device client library
ii libvirt-bin 0.9.13-0ubuntu12.2~cloud0 programs for the libvirt library
ii libvirt0 0.9.13-0ubuntu12.2~cloud0 library for interfacing with different virtualization systems
ii nova-common 2012.2.1+stable-20121212-a99a802e-0ubuntu1.4~cloud0 OpenStack Compute - common files
ii nova-compute 2012.2.1+stable-20121212-a99a802e-0ubuntu1.4~cloud0 OpenStack Compute - compute node
ii nova-compute-kvm 2012.2.1+stable-20121212-a99a802e-0ubuntu1.4~cloud0 OpenStack Compute - compute node (KVM)
ii python-cinderclient 1:1.0.0-0ubuntu1~cloud0 python bindings to the OpenStack Volume API
ii python-cliff 1.1.2-0ubuntu2~cloud0 Command Line Interface Formulation Framework
ii python-eventlet 0.9.17-0ubuntu1.1~cloud0 concurrent networking library for Python
ii python-glance 2012.2.1-0ubuntu1.2~cloud0 OpenStack Image Registry and Delivery Service - Python library
ii python-glanceclient 1:0.5.1-0ubuntu1~cloud0 Client library for Openstack glance server.
ii python-greenlet 0.3.3-1ubuntu2~cloud0 Lightweight in-process concurrent programming
ii python-jsonschema 0.2-1ubuntu1~cloud0 An(other) implementation of JSON Schema (Draft 3) for Python
ii python-keystone 2012.2.1-0ubuntu1.3~cloud0 OpenStack identity service - Python ... (more)

answered 2013-04-15 13:12:59 -0500

darragh-oreilly

Markus,

sorry - I assumed it was Grizzly. It's probably OK and you have overlapping IPs disabled. http://docs.openstack.org/folsom/open...

So when I tried dumping on br-int I got this:

tcpdump -nn -i br-int

tcpdump: br-int: That device is not up

Now this is a system that is working fine.

ip link show dev br-int

7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether 5a:78:e3:6f:5f:4a brd ff:ff:ff:ff:ff:ff

So I put it up

ip link set dev br-int up

and booted a VM, but it did not get an IP. So I put it down again and booted another, and it worked again. So it seems like br-int needs to be down. I'd give that a try.
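Checking the administrative state before and after a capture can be scripted. This is a sketch only, assuming the usual 'ip link show' layout with the flag list between angle brackets on the first line; link_admin_state is a made-up helper name.

```shell
# Report whether an interface is administratively up, given the text of
# `ip link show dev <name>` on stdin. Prints "up" or "down".
# link_admin_state is a hypothetical helper name used for illustration.
link_admin_state() {
    # The flag list sits between < and > on the first line,
    # e.g. <BROADCAST,MULTICAST,UP,LOWER_UP>. Upper-case it to be tolerant.
    flags=$(sed -n '1s/^[^<]*<\([^>]*\)>.*$/\1/p' | tr '[:lower:]' '[:upper:]')
    case ",$flags," in
        *,UP,*) echo up ;;      # exact UP flag present
        *)      echo down ;;    # LOWER_UP alone does not count
    esac
}
```

For example: ip link show dev br-int | link_admin_state. If it prints "up" after a tcpdump session on the bridge, 'ip link set dev br-int down' restores the state described in this answer.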

Darragh.

answered 2013-04-15 14:50:28 -0500

markus-sendingthesea

Hi Darragh,

Thanks for that hint. The instance isn't getting an IP yet, but it's getting warmer. I now capture on the physical devices, so that I don't disturb the virtual switches. What I can see so far is that DHCP packets make it from the compute node to the network node and also back. I captured on eth1 on both nodes, as those interfaces are connected to my data LAN:

network-node:

root@network:~# tcpdump -nn -i eth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
16:15:37.546449 IP 10.10.10.233 > 10.10.10.232: GREv0, key=0x1, length 354: IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:ac:91:66, length 300
16:15:37.547182 IP 10.10.10.232 > 10.10.10.233: GREv0, key=0x0, length 362: IP 10.5.5.2.67 > 10.5.5.3.68: BOOTP/DHCP, Reply, length 308
16:15:42.559841 IP 10.10.10.232 > 10.10.10.233: GREv0, key=0x0, length 54: ARP, Request who-has 10.5.5.3 tell 10.5.5.2, length 28

So the DHCP request and offer packets come and go through my GRE tunnel. That looks good to me. I can see the corresponding packets on my compute node as well:

root@compute1:~# tcpdump -nn -i eth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
16:15:09.561187 IP 10.10.10.233 > 10.10.10.232: GREv0, key=0x1, length 354: IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:ac:91:66, length 300
16:15:09.562266 IP 10.10.10.232 > 10.10.10.233: GREv0, key=0x0, length 362: IP 10.5.5.2.67 > 10.5.5.3.68: BOOTP/DHCP, Reply, length 308
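One detail in these captures is that the requests travel with key=0x1 while the replies travel with key=0x0. Whether that difference matters here is not established in this thread, but it is easy to tabulate. The following is a sketch that assumes 'tcpdump -nn' lines in the format shown above; gre_key_summary is a made-up helper name.

```shell
# Summarise GRE tunnel keys per direction from `tcpdump -nn` lines on stdin.
# Prints one line per distinct (source > destination, key) with a packet count.
# gre_key_summary is a hypothetical helper name used for illustration.
gre_key_summary() {
    awk '
        / GREv0, key=/ {
            src = $3                        # outer source IP
            dst = $5; sub(/:$/, "", dst)    # outer destination, trailing colon removed
            key = ""
            for (i = 1; i <= NF; i++)
                if ($i ~ /^key=/) { key = $i; sub(/,$/, "", key); break }
            count[src " > " dst " " key]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}
```

For example, as root: tcpdump -nn -c 20 -i eth1 proto gre | gre_key_summary. Each output line shows one direction, the key it used, and how many packets were seen.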

So the DHCP offer reaches the compute node, but doesn't make it back to the instance's network interface. BTW: do you know of a document that describes why an instance creates 4 network devices and how they work together?

root@compute1:~# tcpdump -nn -i vnet0
tcpdump: WARNING: vnet0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vnet0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:15:37.546014 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:ac:91:66, length 300
16:15:37.546067 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:ac:91:66, length 300
16:15:51.320523 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:ac:91:66, length 300
16:15:51.320580 ... (more)

Stats

Asked: 2013-04-12 17:50:06 -0500

Seen: 1,602 times

Last updated: Apr 28 '13