Multiple routes to external network

asked 2012-10-26 21:04:12 -0600

graham-hemingway

I am very close to getting Folsom + Quantum working, but I am having problems with external network connectivity. I am trying to set up the "Provider Router with Private Networks" case (see http://docs.openstack.org/trunk/openstack-network/admin/content/use_cases_single_router.html). I am also following EmilienM's guide, though I know there are some differences.

So, here is the issue. I can ping and SSH into the VM, but with lots of problems. First, my SSH connection gets dropped all the time. Second, the VM cannot reach back out to the external network: no ping, no DNS lookup.

I think the issue stems from how I have set up my OVS bridges. I think that multiple conflicting routes are being set up. Here is some debug output for reference:

foo@cloud1:~# ip r
10.0.0.0/24 dev eth4 proto kernel scope link src 10.0.0.3
10.5.5.0/24 dev tapdbd89f9a-05 proto kernel scope link src 10.5.5.2
10.5.5.0/24 dev qr-c89b0922-f7 proto kernel scope link src 10.5.5.1
99.59.104.0/23 dev qg-21f69e18-c0 proto kernel scope link src 99.59.105.185
99.59.104.0/23 dev br-ex proto kernel scope link src 99.59.105.184
192.168.49.0/24 dev eth3 proto kernel scope link src 192.168.49.244
192.168.50.0/24 dev eth2 proto kernel scope link src 192.168.50.244

For reference, 99.59.x.x is my public network, 192.168.x.x is my management network, and 10.x.x.x is for VMs.
As far as I can tell, the interfaces are the following:
tapdbd89f9a-05 (10.5.5.2) is the DHCP agent
qr-c89b0922-f7 (10.5.5.1) is the port connecting the tenant network to the provider router
qg-21f69e18-c0 (99.59.105.185) is the port connecting the provider router to the external network
br-ex (99.59.105.184) is obviously the port for the br-ex bridge
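The naming pattern in that list can be written down as a small lookup. This is only a sketch of the convention as it appears in this thread; `port_role` is a made-up helper name, and the prefix-to-role mapping is an assumption that may not hold for every plugin or node type (on other setups a `tap` device may belong to a VM rather than to the DHCP agent).

```shell
#!/bin/sh
# Hypothetical helper: guess the role of an interface from its name prefix.
# The mapping mirrors the interfaces listed in this question and is an
# assumption, not something Quantum guarantees.
port_role() {
    case "$1" in
        tap*) echo "DHCP agent port" ;;
        qr-*) echo "router port on the tenant network" ;;
        qg-*) echo "router gateway port on the external network" ;;
        br-*) echo "OVS bridge" ;;
        *)    echo "other" ;;
    esac
}

# Example: port_role qg-21f69e18-c0
```

Running it over the device names from `ip r` gives a quick way to see which side of the router each route belongs to.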

Following EmilienM's guide, I manually added the 99.59.105.184 address to br-ex; otherwise it would not have one. I don't see this step in any other guides.

Also as reference, here is the OVS config:

foo@cloud1:~# ovs-vsctl show
ba000344-9e7f-468d-9eeb-6d455be4938a
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "tapdbd89f9a-05"
            tag: 1
            Interface "tapdbd89f9a-05"
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qr-c89b0922-f7"
            tag: 1
            Interface "qr-c89b0922-f7"
                type: internal
    Bridge br-tun
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-2"
            Interface "gre-2"
                type: gre
                options: {in_key=flow, out_key=flow, remote_ip="10.0.0.26"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "eth0"
            Interface "eth0"
        Port "qg-21f69e18-c0"
            Interface "qg-21f69e18-c0"
                type: internal
    ovs_version: "1.4.0+build0"

Any help would be appreciated. I am very sorry I can't be more specific on the issue itself. Thank you, Graham


8 answers


answered 2012-10-29 15:44:33 -0600

graham-hemingway

Ha, I meant

ip addr del 99.59.105.184/23 dev br-ex


answered 2012-10-29 22:59:19 -0600

danwent

There are no explicit timeouts in Quantum. My guess would be that you're hitting something with iptables, but 5 seconds is insanely low for that. I think the iptables timeout for an established connection defaults to days.


answered 2012-10-30 12:29:02 -0600

graham-hemingway

Well, I feel a bit sheepish. Turns out I had allocated a public IP that was already in use by someone else in our network. Once I moved to a free IP all was good.

Thanks Dan.
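For anyone hitting the same symptom, a conflict like this can usually be spotted by checking who answers ARP for the address before allocating it. The sketch below assumes the reply-line format printed by the iputils `arping` tool (a MAC address in square brackets on each reply line); `count_responders` is a hypothetical helper name, and the interface and address in the usage comment are just the ones from this thread.

```shell
#!/bin/sh
# Hypothetical helper: count the distinct MAC addresses that appear in
# square brackets in arping output read from stdin.
# Assumes reply lines look like:
#   Unicast reply from 99.59.105.185 [78:2B:CB:07:27:ED]  0.6ms
count_responders() {
    awk '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^\[[0-9A-Fa-f:]+\]$/) seen[toupper($i)] = 1
        }
        END { n = 0; for (m in seen) n++; print n }
    '
}

# Typical use (run as root; -D sends duplicate-address-detection probes):
#   arping -D -I br-ex -c 3 99.59.105.185 | count_responders
```

If the address is supposed to be free, any result other than 0 suggests that some other machine on the segment already holds it.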


answered 2012-10-26 21:39:45 -0600

graham-hemingway

I found this: https://answers.launchpad.net/quantum/+question/208377. After enough fooling around, things started working. I am going to close this question, not because I know how it started working, but because it did start working.

G


answered 2012-10-26 22:28:07 -0600

danwent

Thanks for the detailed write-up.

It seems likely that you have set use_namespaces = False, which results in an overlapping route, which in turn confuses forwarding.

In particular, the overlapping routes are:

99.59.104.0/23 dev qg-21f69e18-c0 proto kernel scope link src 99.59.105.185
99.59.104.0/23 dev br-ex proto kernel scope link src 99.59.105.184

Basically, if you are not using namespaces, you should not assign an IP address to br-ex. If you are using namespaces, assigning the IP is required.
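A quick way to spot this kind of overlap is to look for the same destination prefix on more than one device in the routing table. The following is a minimal sketch, not an official tool: `overlapping_routes` is a made-up name, and it only catches identical prefixes (it does not detect a smaller subnet nested inside a larger one).

```shell
#!/bin/sh
# Hypothetical helper: read `ip route` style lines on stdin and print every
# destination prefix that is routed through more than one device, followed
# by the comma-separated list of those devices.
overlapping_routes() {
    awk '
        {
            prefix = $1
            dev = ""
            # The device name is the token that follows the word "dev".
            for (i = 2; i < NF; i++) if ($i == "dev") dev = $(i + 1)
            if (dev == "") next
            if (count[prefix]++) devs[prefix] = devs[prefix] "," dev
            else devs[prefix] = dev
        }
        END {
            for (p in count) if (count[p] > 1) print p, devs[p]
        }
    ' | sort
}

# Typical use on a live host:
#   ip route | overlapping_routes
```

On the routing table from the question this would report 99.59.104.0/23 on both qg-21f69e18-c0 and br-ex, and also 10.5.5.0/24 on both the tap and qr interfaces, since without namespaces all of these ports sit in the host's single routing table.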


answered 2012-10-29 15:41:13 -0600

graham-hemingway

Dan,

Thanks for the response. I do indeed have use_namespaces = False. I am going to try removing the IP address from br-ex. Right now it looks like this (via ip a):

9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 78:2b:cb:07:27:ed brd ff:ff:ff:ff:ff:ff
    inet 99.59.105.184/23 scope global br-ex
    inet6 fe80::7a2b:cbff:fe07:27ed/64 scope link
       valid_lft forever preferred_lft forever

I am going to run this:

ip addr del 129.59.105.184/23 dev br-ex

Is there anything else I need to do? Thanks, Graham


answered 2012-10-29 15:55:04 -0600

graham-hemingway

I can launch an instance and associate a floating IP with it. I can ping and SSH into the instance, but I cannot get back out from the instance to anything outside of the cloud. The instance gets metadata fine. Also, my SSH connection gets dropped frequently.


answered 2012-10-29 19:37:42 -0600

graham-hemingway

I think that maybe I just needed to let the network settle after I made that ip addr del call. Now I can SSH into the instance and it has external connectivity.

I still get a broken pipe error if I let the connection sit open for more than 5 seconds without traffic. If I just let it sit there with top running it works fine, so it seems I am hitting a timeout somewhere. I tried setting the timeouts for both the sshd config and for the ipv4/tcp_keepalive_time, but neither of these seemed to help.

Is there a keepalive setting for quantum/OVS somewhere?

Thanks, Graham



Stats

Asked: 2012-10-26 21:04:12 -0600

Seen: 245 times

Last updated: Oct 30 '12