
No network communication on a 3 node Havana setup

asked 2013-12-23 11:16:24 -0500 by Diego Lima

updated 2013-12-27 13:20:32 -0500 by smaffulli

I've been struggling with neutron for the past few days and I can't get instances to receive addresses via DHCP or to reach anything outside their physical host. My setup consists of three nodes, all running Ubuntu Server 12.04 with Havana installed from the Ubuntu Cloud Archive (per http://docs.openstack.org/trunk/install-guide/install/apt/content/ ):

  • Network node: Connected to the WAN (eth0) and LAN (eth1, 10.130.10.201) networks running neutron dhcp-agent, l3-agent, metadata-agent and plugin-openvswitch-agent.
  • Management node: Connected only to the LAN network (eth0, 10.130.10.202) and running neutron (server, plugin-openvswitch-agent), postgresql, rabbitmq, keystone, glance, nova (api, conductor, cert, vnc proxy and scheduler) and horizon.
  • Compute node: Connected only to the LAN network (eth0, 10.130.10.11) and running nova-compute and neutron-plugin-openvswitch-agent.

Everything aside from networking works as expected so far, and two instances running on the same compute node can reach each other if I log in on the console and assign addresses manually.

In the neutron server log (on the management node) I see the following message whenever I launch an instance:

2013-12-23 12:52:13.151 862 WARNING neutron.db.agentschedulers_db [-] Fail scheduling network {'status': u'ACTIVE', 'subnets': [u'be9659ca-1ee7-4c35-b36c-d082a581495f'], 'name': u'Alpha', 'provider:physical_network': None, 'admin_state_up': True, 'tenant_id': u'04968d54151d4bb29d477f754e099728', 'provider:network_type': u'local', 'router:external': False, 'shared': False, 'id': u'031dcf57-c259-41f2-be2a-1541f88d3238', 'provider:segmentation_id': None}
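For what it's worth, the network in that warning reports provider:network_type 'local', which never leaves a single host. One way to double-check what neutron has recorded for that network (assuming the stock Havana python-neutronclient; the network ID below is copied from the log line above):

    # show the network's type, segmentation id and subnets
    neutron net-show 031dcf57-c259-41f2-be2a-1541f88d3238
    # list the agents available to host its DHCP server
    neutron agent-list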

General info:

Configuration files and logs follow:

Network node:

Compute node:

Management Node:

Tcpdump output when launching a new instance:


Comments

That error is most likely innocuous (I get it too); see https://bugs.launchpad.net/neutron/+bug/1192786 . Can you post your nova.conf file?

SamYaple ( 2013-12-23 13:43:44 -0500 )

I've added the nova.conf files to the description (for both the compute and management nodes).

Diego Lima ( 2013-12-23 14:12:01 -0500 )

You are missing some information in your neutron.conf file under the [DEFAULT] header: the keystone auth settings need to go there (though I'm not sure why; I ran into the same issue). See http://pastebin.com/A0ZS33rD . Also, your neutron.conf files conflict with each other on the namespace setting; use one file.

SamYaple ( 2013-12-23 14:31:51 -0500 )
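For illustration only, a rough sketch of the kind of neutron.conf fragment being described above (the pastebin content is not reproduced here; the address is assumed to be the management node's from the question, and the password is a placeholder):

    [DEFAULT]
    auth_strategy = keystone
    # ... other [DEFAULT] options from the install guide ...

    [keystone_authtoken]
    auth_host = 10.130.10.202
    auth_port = 35357
    auth_protocol = http
    admin_tenant_name = service
    admin_user = neutron
    admin_password = NEUTRON_PASS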

I've made the suggested changes but still get no networking. When launching my VM I can see some traffic (DHCP requests) on the compute node's br-int interface, but no traffic on the network node. I'll edit my main post and add the tcpdump results there. I've also updated the conf files on pastebin.

Diego Lima ( 2013-12-23 15:26:34 -0500 )

Well, you are on the right track; this is a bridging issue now. I still see an entry for br-ex on your network node, have you removed that one? Apply the changes from http://pastebin.com/A0ZS33rD . Once you have removed br-ex from all nodes, retry.

SamYaple ( 2013-12-23 15:37:57 -0500 )
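For reference, a sketch of how a stray bridge can be found and removed with the standard Open vSwitch commands (run on each node; whether br-ex belongs in this topology at all is exactly what the pastebin above changes):

    # list this node's bridges and the ports attached to them
    ovs-vsctl show
    # remove the external bridge if it is not supposed to exist on this node
    ovs-vsctl del-br br-ex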

2 answers


answered 2014-01-27 17:46:36 -0500 by anemic

Do not put your physical interface (eth3) into br-int if you're using GRE tunnels. That causes the DHCP packets to flow out onto the wire via the bridge without being GRE-encapsulated.

It also pays off to tcpdump the raw interface (eth3) to see the GRE tunneling in effect, and to use -e to see the MAC addresses.
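For example, something along these lines on the compute node (GRE is IP protocol 47; eth3 here stands in for whatever interface the tunnel traffic actually leaves on):

    # watch encapsulated tenant traffic leaving the raw interface,
    # printing MAC addresses (-e) and skipping name resolution (-n)
    tcpdump -e -n -i eth3 'ip proto 47'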

The packet flow should look like this:

  • compute, br-int: packets tagged with VLAN 1, identifying the tenant network
  • compute, br-tun: GRE encapsulation happens here
  • compute, eth3: a normal network interface without any bridging; packets should be GRE-tunneled at this point
  • the wire
  • network, eth3: GRE-tunneled, again with no bridging here
  • network, br-tun: removes the encapsulation and attaches VLAN tag 1 (for broadcast packets it may send them to several VLANs if present)
  • network, br-int: VLAN 1

You can check ovs-ofctl dump-flows br-tun to see the rules for the GRE<->VLAN switching.
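Spelled out (standard Open vSwitch tooling; the GRE ports and flows only appear once the openvswitch agents have set up the tunnels):

    # confirm that the GRE tunnel ports hang off br-tun
    ovs-vsctl show
    # inspect the rules that translate between tunnel IDs and local VLAN tags
    ovs-ofctl dump-flows br-tun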


answered 2014-02-11 14:51:10 -0500 by ironhardchaw

I've had the OP's warning "Fail scheduling network" before, but it never seemed to affect the operation of my networking. That said, I'm also having a similar problem, where my networking dies after a while and machines randomly fail to get a DHCP address, among other things.

If I run ip netns exec qdhcp-... tcpdump -i any on my various namespaces, I can see the traffic from my VM, but it doesn't make it across the OVS flows, which seem to be missing for me.
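A sketch of those checks, spelled out (the qdhcp namespace name is abbreviated in the answer; substitute the real network UUID, and run the OVS command on the node being examined):

    # list the dhcp/router namespaces on the network node
    ip netns
    # watch DHCP traffic inside a dhcp namespace
    ip netns exec qdhcp-<network-id> tcpdump -i any -n 'port 67 or port 68'
    # compare against the flows (or lack of them) on the tunnel bridge
    ovs-ofctl dump-flows br-tun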


Comments

That happens to me on a 4-node RDO-based installation too. I either have to restart all services or, in some cases, request remote-hands support to reset one or two compute servers. I guess the smallest stable installation needs at least 6 servers, although Red Hat provides a 3-node PoC, so I'm curious.

cloudssky ( 2014-02-11 15:52:40 -0500 )

The thing is, restarting isn't working for me this time. My installation is completely down, no matter how many times I restart/reboot, and the OVS flows are never rebuilt.

ironhardchaw ( 2014-02-11 16:05:11 -0500 )

