
How can I fix a compute node network problem, in a Fedora 19 Havana 1 controller/1 compute node setup, 3 NICs per server, using Neutron OVS VLANs, built with packstack? [closed]

asked 2013-12-16 15:43:50 -0500 by syspimp

updated 2013-12-27 13:07:03 -0500 by smaffulli

I have a 2-node (1 controller/compute, 1 compute) CentOS 6 Folsom OpenStack cluster that has been up and working great for a long time, and now I'm building a sister Fedora 19 Havana 2-node (1 controller/compute, 1 compute) setup using existing VLANs on my router. I got this working successfully in Grizzly using provider networks, but I'm having trouble getting Havana to work.

My problem is: the instances launched on the compute node don't have external network access, can't ping any gateways, floating IPs do not work, and metadata is unavailable. They successfully get a DHCP address, and I can ssh to one via ip netns exec. To get metadata working, based upon info in the metadata_agent.ini file, I used this iptables command on the compute node:

iptables -I PREROUTING -t nat -s 0.0.0.0/0 -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination ${CONTROLLER}:${METADATAPORT}.

The route to 169.254.0.0 already exists in the routing table on all interfaces. The metadata agent is not listening for requests on the compute node; dnsmasq is running in the process list. tcpdump showed the metadata request going out, but not getting any replies until I added this rule. I thought the metadata agent should forward this request to the target in metadata_agent.ini; is this correct?
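For reference, the metadata proxying is configured in /etc/neutron/metadata_agent.ini on the controller/network host; the sketch below shows the kind of settings involved, with example values (the controller IP, credentials and secret here are illustrative, not taken from my setup):

[DEFAULT]
# Keystone credentials the metadata agent uses (example values)
auth_url = http://10.55.2.100:35357/v2.0
auth_region = RegionOne
admin_tenant_name = services
admin_user = neutron
admin_password = example_password
# Where the proxied 169.254.169.254 requests are forwarded
nova_metadata_ip = 10.55.2.100
nova_metadata_port = 8775
# Must match the shared secret configured on the nova side in nova.conf
metadata_proxy_shared_secret = example_secret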

I used packstack to configure both the controller and compute nodes; this one is with CONFIG_NEUTRON_L3_EXT_BRIDGE=provider. Here is my answer file.

Goal: I would like em1 as the management NIC (VLAN 2), em2 as the data trunk NIC, and p1p1 as the external network NIC (VLAN 3), with my physical router handling traffic flow and the Neutron router handling floating IPs, using VLANs, in a one-controller/multiple-compute-node setup.

If I have to ditch using my own router in the setup, that is fine.

Background:

For the Havana setup, each node has three NICs:

  • em1 (in management VLAN 2 on the switch, IP 10.55.2.156),
  • em2 (trunked on the switch, no IP, used for the data network bridge br-em2),
  • p1p1 (in external VLAN 3, no IP; I want to use this for external network access).

The router has the first IP in each subnet: 10.55.2.1 (management), 10.55.3.1 (DMZ), 10.55.4.1 (VM Private), 10.55.5.1 (Ops). All IPs are pingable, and the router is in a known good state.

Since I have a router on my network with gateway IPs for each VLAN, I think I should use provider networking for CONFIG_NEUTRON_L3_EXT_BRIDGE. When I tried using br-em2 in the answer file for the L3 bridge I had a different problem: the VMs were unable to get anything network-related, although I could see the requests going out the interfaces with tcpdump.
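For comparison, the provider-network variant of the answer file involves settings along these lines (an illustrative excerpt, not my exact answer file; the physical network name, bridge, NIC and VLAN range are just examples matching the layout above):

# No dedicated external bridge; the physical router provides the gateways
CONFIG_NEUTRON_L3_EXT_BRIDGE=provider
# Tenant networks are VLANs carried on the data NIC
CONFIG_NEUTRON_OVS_TENANT_NETWORK_TYPE=vlan
CONFIG_NEUTRON_OVS_VLAN_RANGES=inter-vlan:2:10
CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=inter-vlan:br-em2
CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-em2:em2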

Although I reference p1p1 above, I can't get a compute node to work with just the management (em1) and data (em2) interfaces. When I tried to use p1p1, I created ... (more)


Closed for the following reason: the question is answered, right answer was accepted by dheeru (close date 2013-12-22 10:32:52).

1 answer


answered 2013-12-22 10:02:26 -0500 by syspimp

I fixed this. There were several problems; here are my fixes. I now have it just the way I want it, with external traffic going out one interface (p1p1, VLAN 3) and internal VM traffic handled on another (p1p2, VLANs 4 and 5).

Solutions were:

  1. activating the VLAN interfaces on the Fedora hosts
  2. putting just one VLAN interface into the OVS bridge
  3. bringing the VLAN interfaces on the switch admin up ('no shutdown')
  4. setting the MTU size via dnsmasq.conf

Changes to the packstack answer file were:

  1. CONFIG_NEUTRON_L3_EXT_BRIDGE=br-p1p1
  2. CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-p1p2:p1p2
  3. CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=inter-vlan:br-p1p2
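For what it's worth, those answers end up roughly as the settings below on each node (a sketch of the Havana-era config files as I'd expect packstack to generate them, not copied verbatim from my nodes):

# /etc/neutron/plugin.ini (the openvswitch plugin config), roughly:
[OVS]
tenant_network_type = vlan
network_vlan_ranges = inter-vlan:4:5
bridge_mappings = inter-vlan:br-p1p2

# /etc/neutron/l3_agent.ini, from CONFIG_NEUTRON_L3_EXT_BRIDGE:
external_network_bridge = br-p1p1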

Then, I had to create the VLAN interfaces and load the 8021q kernel module:

[root@compute4 ~]# cat /etc/sysconfig/network-scripts/ifcfg-p1p1.3
DEVICE=p1p1.3
VLAN=yes
VLAN_NAME_TYPE=DEV_PLUS_VID_NO_PAD
PHYSDEV=p1p1
ONBOOT=yes

[root@compute4 ~]# cat /etc/sysconfig/network-scripts/ifcfg-p1p1
TYPE=Ethernet
BOOTPROTO=none
DEVICE=p1p1
NAME=p1p1
DEFROUTE=no
ONBOOT=yes

[root@compute4 ~]# cat /etc/sysconfig/network-scripts/ifcfg-p1p2.4
DEVICE=p1p2.4
VLAN=yes
VLAN_NAME_TYPE=DEV_PLUS_VID_NO_PAD
PHYSDEV=p1p2
ONBOOT=yes

[root@compute4 ~]# cat /etc/sysconfig/network-scripts/ifcfg-p1p2.5
DEVICE=p1p2.5
VLAN=yes
VLAN_NAME_TYPE=DEV_PLUS_VID_NO_PAD
PHYSDEV=p1p2
ONBOOT=yes

[root@compute4 ~]# cat /etc/sysconfig/network-scripts/ifcfg-p1p2
TYPE=Ethernet
NAME=p1p2
DEVICE=p1p2
BOOTPROTO=none
DEFROUTE=no
ONBOOT=yes
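Loading the 8021q module and bringing the new sub-interfaces up was along these lines (a sketch; the modules-load.d file name is my own choice for persisting the module on Fedora 19):

# Load the 802.1Q VLAN module now
modprobe 8021q
# Persist it across reboots (systemd module loading)
echo 8021q > /etc/modules-load.d/8021q.conf
# Bring up the VLAN sub-interfaces defined above
ifup p1p1.3
ifup p1p2.4
ifup p1p2.5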

I had to change which interfaces were in the bridges: I added the VLAN interfaces instead of the base interfaces. Caveat: it seems that you only need to add one VLAN interface, even though you might have activated more than one on a particular NIC. When I added both p1p2.4 and p1p2.5 to br-p1p2, I caused a packet storm on my switch.
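In OVS terms, the wiring ends up along these lines (a sketch; packstack creates most of it, the point is that only p1p2.4 goes into br-p1p2, not p1p2.5 as well). The resulting layout is shown in the ovs-vsctl output below.

# Create the bridges if they don't already exist
ovs-vsctl --may-exist add-br br-p1p1
ovs-vsctl --may-exist add-br br-p1p2
# Exactly one VLAN sub-interface per bridge; adding both p1p2.4 and
# p1p2.5 to br-p1p2 is what caused the packet storm
ovs-vsctl --may-exist add-port br-p1p1 p1p1.3
ovs-vsctl --may-exist add-port br-p1p2 p1p2.4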

[root@compute3 ~]# ovs-vsctl show
c3b0d272-16f3-44b4-a38c-4840bde464c9
    Bridge "br-p1p2"
        Port "br-p1p2"
            Interface "br-p1p2"
                type: internal
        Port "p1p2.4"
            Interface "p1p2.4"
        Port "phy-br-p1p2"
            Interface "phy-br-p1p2"
    Bridge "br-p1p1"
        Port "qg-a01da6d6-27"
            Interface "qg-a01da6d6-27"
                type: internal
        Port "br-p1p1"
            Interface "br-p1p1"
                type: internal
        Port "p1p1.3"
            Interface "p1p1.3"
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "int-br-p1p2"
            Interface "int-br-p1p2"
        Port "tapee78ce58-f5"
            tag: 1
            Interface "tapee78ce58-f5"
                type: internal
        Port "tap75d3f322-49"
            tag: 2
            Interface "tap75d3f322-49"
                type: internal
        Port "qvob1b9a2b7-b6"
            tag: 1
            Interface "qvob1b9a2b7-b6"
        Port "qr-bb90214d-35"
            tag: 1
            Interface "qr-bb90214d-35"
                type: internal
    ovs_version: "1.11.0"

But it still didn't work, so I checked the switch. 'show interface trunk' said only a couple of VLANs were up. I had to go into each VLAN interface and perform a 'no shutdown' to activate it, then used 'show interface trunk' to verify the VLANs were up and being passed on the trunk interfaces.
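On the switch that was roughly the following (a Cisco-style IOS sketch; exact syntax depends on your switch, and VLANs 3, 4 and 5 are the ones from my layout):

switch# configure terminal
switch(config)# interface vlan 3
switch(config-if)# no shutdown
switch(config-if)# interface vlan 4
switch(config-if)# no shutdown
switch(config-if)# interface vlan 5
switch(config-if)# no shutdown
switch(config-if)# end
! verify the VLANs are now active and allowed on the trunk ports
switch# show interface trunk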

Once that was all settled, all VMs could get a DHCP IP, but floating IPs wouldn't work on the compute node that didn't have a qrouter-xxx network namespace. I could ping, but ssh just hung. tcpdump showed traffic being received remotely and within the VM, and the VM would show the TCP session as ESTABLISHED in netstat, but tcpdump showed it kept retransmitting a large packet of available algorithms (presumably the ssh key exchange).

I had to add this line to /etc/neutron/dhcp_agent.ini:

dnsmasq_config_file = /etc/neutron/dnsmasq.conf

and this to /etc/neutron/dnsmasq.conf (DHCP option 26 sets the interface MTU):

dhcp-option=26,1454

in order ... (more)
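To apply the MTU change, the DHCP agent has to be restarted so that dnsmasq is respawned with the new config file, and the effect can be checked from inside a VM after it renews its lease (a sketch, assuming the Fedora 19 / RDO service name):

# on the node running the DHCP agent
systemctl restart neutron-dhcp-agent.service

# inside a VM, after a lease renewal the interface should show the lower MTU
ip link show eth0    # look for "mtu 1454"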

