Essex - Quantum - OVS - Multi-Node Architecture -> Working Partially !

asked 2012-06-08 11:11:56 -0500

emilienm

Hi Stackers,

I will be as precise as possible.

I'm working on a multi-node architecture with Ubuntu 12.04 and Essex, both up to date.

My architecture is clean and everything was working in VLAN Manager mode. I have now switched to Quantum Manager.

My reference docs:

http://docs.openstack.org/trunk/openstack-network/admin/content/index.html
http://openvswitch.org/openstack/documentation/

  • Node 1 : Controller

MySQL, RabbitMQ, nova-volume, nova-api, nova-network, nova-scheduler, quantum-server with OVS plugin

nova.conf : http://paste.openstack.org/show/18401/

ovs-vsctl add-br br-int
ovs-vsctl add-port br-int eth1

ovs-vsctl br-set-external-id br-int bridge br-int (useful?) [Edit: I've rebuilt my bridge without this command]

I use Quantum's default mode (without tunneling).

nova-manage network create --label=public --fixed_range_v4=192.168.15.0/24

dnsmasq is running fine on the controller node (which also runs nova-network).

/etc/network/interfaces with eth1:

[..]

iface eth1 inet manual
    up ifconfig $IFACE 0.0.0.0 up
    up ip link set $IFACE promisc on
    down ip link set $IFACE promisc off
    down ifconfig $IFACE down

  • Node 2 : Compute1 and Node 3 : Compute2

nova.conf -> same as controller

nova-compute.conf -> http://paste.openstack.org/show/18403/


Here is what I have observed so far:

  • When I create an instance, it does not get an IP address from dnsmasq. After many hours of looking for the cause, I can see that I'm not alone in this situation. I did not find anyone in the OpenStack community with Essex + Quantum + OVS working in a multi-node architecture! That's why I'm investigating as best I can, and I think I have localized the issue.

  • On the compute node:

root@compute1:~# ovs-vsctl show
Bridge br-int
    Port "eth1"
        Interface "eth1"
    Port br-int
        Interface br-int
            type: internal
    Port "tap771bf804-eb"
        tag: 4095
        Interface "tap771bf804-eb"
ovs_version: "1.4.0+build0"
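For anyone checking many ports at once, the listing above can be scanned mechanically. This is only a rough sketch: the function names and the UNBOUND_TAG constant are made up for illustration, and it assumes the text layout of `ovs-vsctl show` matches what is pasted in this post.

```python
import re

UNBOUND_TAG = 4095  # hypothetical name; the tag value comes from the output above


def ports_with_tags(show_output):
    """Map each port name in `ovs-vsctl show` text to its VLAN tag (or None)."""
    tags = {}
    current = None
    for raw in show_output.splitlines():
        line = raw.strip()
        port = re.match(r'Port\s+"?([^"\s]+)"?$', line)
        if port:
            current = port.group(1)
            tags[current] = None
            continue
        tag = re.match(r'tag:\s*(\d+)$', line)
        if tag and current is not None:
            tags[current] = int(tag.group(1))
    return tags


def unbound_ports(show_output):
    """Return the ports carrying tag 4095, in order of appearance."""
    return [name for name, tag in ports_with_tags(show_output).items()
            if tag == UNBOUND_TAG]


# Sample text copied from the compute node output in this post.
SAMPLE = '''\
Bridge br-int
    Port "eth1"
        Interface "eth1"
    Port br-int
        Interface br-int
            type: internal
    Port "tap771bf804-eb"
        tag: 4095
        Interface "tap771bf804-eb"
ovs_version: "1.4.0+build0"
'''

if __name__ == "__main__":
    print(unbound_ports(SAMPLE))
```

In practice the text would come from running `ovs-vsctl show` on each compute node; here it is a pasted sample so the sketch runs without OVS installed.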

My first question:

Why do we have tag 4095 on the TAP interface (which is the vNIC of the VM)?

What I found:

If I delete the TAP interface after VM creation and recreate it, my VM gets an IP!

ovs-vsctl del-port tap771bf804-eb
ovs-vsctl add-port br-int tap771bf804-eb

After that, if my VM asks for an IP, it gets one.

I know that's not clean, but I'm trying to find what's wrong with the OVS plugin in https://github.com/openstack/quantum/blob/master/quantum/plugins/openvswitch/agent/ovs_quantum_agent.py

Maybe an issue with:

self.int_br.add_flow(priority=2, in_port=p.ofport, actions="drop")

?

  • Another problem: with this trick I can connect to the VM, but only from my controller (which is also the nova-network node), not from other hosts. Also, my VM does not have Internet access.

Second question:

What's wrong with iptables? My security groups allow SSH + ICMP.

I think I have isolated the issue, but now we have to debug it and understand what's wrong with OVS + Quantum in a multi-node architecture.

Thanks for your help, and please let me know ...

29 answers

answered 2012-06-10 00:52:02 -0500

danwent

Hi Emilien,

You are correct, the OVS doc page was missing the comment that you need to create an integration bridge and run the ovs_quantum_agent.py on the nova-network node.... sorry about that. I've updated the page. Thanks for letting us know!

If you run into issues in the future and want to see a working setup, you should be able to use the instructions for multi-node devstack, as those are used pretty regularly and should be up to date: http://wiki.openstack.org/QuantumDevstack

Here are a couple suggestions for debugging:

  • Look at the tables in the ovs_quantum database. There should be tables for networks, ports, and vlan_bindings. The vlan_bindings table describes which VLAN ID has been allocated to a particular network. When a device appears on br-int, ovs_quantum_agent.py finds the external-ids:iface-id attribute of that device in the ovsdb Interfaces table (this is the per-host OVS database that comes with OVS, not the OVS plugin database), and then queries the centralized OVS plugin database to find the associated port, network, and VLAN. It then sets the VLAN of the port to the VLAN from vlan_bindings. If you see a port with a VLAN of 4095, this means we were unable to find a Quantum port associated with that external-ids:iface-id value.
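
The lookup chain described above can be sketched roughly as follows. This is not the agent's actual code: the function name, the UNBOUND_TAG constant, and the dictionaries standing in for the ports and vlan_bindings tables are all hypothetical, and the sample IDs are invented.

```python
UNBOUND_TAG = 4095  # tag described above for devices with no matching Quantum port


def resolve_vlan(iface_id, ports, vlan_bindings):
    """Return the VLAN for a device, or UNBOUND_TAG when no port matches.

    ports:         iface-id -> network id (stand-in for the plugin's ports table)
    vlan_bindings: network id -> VLAN id  (stand-in for the vlan_bindings table)
    """
    network_id = ports.get(iface_id)
    if network_id is None:
        # No Quantum port is associated with this external-ids:iface-id value.
        return UNBOUND_TAG
    # A network without a binding is not covered by the description above;
    # this sketch simply lets the KeyError propagate in that case.
    return vlan_bindings[network_id]


# Invented sample data, for illustration only.
PORTS = {"iface-a": "net-1"}
VLAN_BINDINGS = {"net-1": 7}

if __name__ == "__main__":
    print(resolve_vlan("iface-a", PORTS, VLAN_BINDINGS))
    print(resolve_vlan("iface-unknown", PORTS, VLAN_BINDINGS))
```

Read this way, a port stuck at 4095 points at the first step failing: either the iface-id is not in the plugin database, or the agent could not query that database at all.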

answered 2012-06-11 18:42:10 -0500

danwent

You could actually just change the VLAN ID in the vlan_bindings table of the ovs_quantum database on the controller node to something else. My guess is that VLAN 1 is not being trunked by your physical network.

answered 2012-06-11 19:08:25 -0500

askstack

Emilien,

Have you tried connecting the two eth1 ports directly with an Ethernet cable? That would bypass the switch, so no packets would get dropped.

answered 2012-06-11 19:46:51 -0500

emilienm

@Dan:

I changed to VLAN 7, but the result is the same:

root@compute1:~# tcpdump -n -e -vv -ttt -i eth1 | grep "vlan 7"
00:00:00.254417 fa:16:3e:38:08:c8 > 33:33:00:00:00:02, ethertype 802.1Q (0x8100), length 74: vlan 7, p 0, ethertype IPv6, (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fe38:8c8 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16

root@controller:~# tcpdump -n -e -vv -ttt -i eth1 | grep "vlan 7"

... nothing
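
For reference, the "vlan 7" that tcpdump prints comes from the 802.1Q tag that follows the two MAC addresses in the frame. A minimal sketch of reading that tag from raw bytes is below. The function name is made up, and the sample frame is hand-built from the addresses in the capture above (headers only, no payload), not real captured data.

```python
import struct

ETHERTYPE_8021Q = 0x8100  # the ethertype shown in the tcpdump line above


def vlan_id(frame):
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    (tci,) = struct.unpack("!H", frame[14:16])
    # The low 12 bits of the tag control field hold the VLAN ID.
    return tci & 0x0FFF


# Hand-built header: dst MAC, src MAC, 0x8100, tag with VLAN 7, inner ethertype IPv6.
TAGGED = bytes.fromhex("333300000002" "fa163e3808c8" "8100" "0007" "86dd")
UNTAGGED = bytes.fromhex("333300000002" "fa163e3808c8" "86dd")

if __name__ == "__main__":
    print(vlan_id(TAGGED), vlan_id(UNTAGGED))
```

Since the VLAN ID field is 12 bits wide, 4095 is the largest value it can hold.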

@askstack:

No, not yet. I use 2 compute nodes and 1 controller. I want to keep this configuration and solve my issues with real hardware.

answered 2012-06-10 21:28:25 -0500

danwent

Hi Emilien,

"ovs-vsctl --no-wait br-set-external-id $OVS_BRIDGE bridge-id br-int" is required for use with the NVP plugin, but not currently required for correct operation with the OVS plugin. For simplicity we try to keep a single set of instructions for setting up br-int whether it's being used with the OVS plugin or the NVP plugin, which is why it's included there.

answered 2012-06-08 13:11:15 -0500

emilienm

Here are the agent logs from when I create a VM:

DEBUG:root:## running command: sudo ovs-vsctl --timeout=2 list-ports br-int
DEBUG:root:## running command: sudo ovs-vsctl --timeout=2 get Interface eth1 external_ids
DEBUG:root:## running command: sudo ovs-vsctl --timeout=2 get Interface eth1 ofport
DEBUG:root:## running command: sudo ovs-vsctl --timeout=2 get Interface tap2d31bc15-08 external_ids
DEBUG:root:## running command: sudo ovs-vsctl --timeout=2 get Interface tap2d31bc15-08 ofport
DEBUG:root:## running command: sudo ovs-vsctl --timeout=2 set Port tap2d31bc15-08 tag=4095
DEBUG:root:## running command: sudo ovs-ofctl add-flow br-int priority=2,in_port=2,actions=drop

Does anyone know if actions=drop is normal?

answered 2012-06-09 23:35:31 -0500

emilienm

After more investigation with Pedro Navarro Pérez, we found why we have "actions=drop": my compute nodes were not able to access the database (my fault).

But my VMs still do not have access to the network.

In the ovs_quantum database, I can see the ports (my VM TAP and the gateway). My gateway's port has state "ACTIVE" but op_status "DOWN". Is that normal?

What can I do?

I tried several times to delete and recreate the networks, but the situation is always the same.

I'm continuing to investigate.

answered 2012-06-09 23:46:45 -0500

emilienm

My gateway's port had state "ACTIVE" but op_status "DOWN" because I was not running the OVS agent on the controller (which is also the nova-network node).

So now I run the agent on the controller, and my gateway is ACTIVE & UP.

Reading http://openvswitch.org/openstack/documentation, it says the agent must be run on the compute nodes, or maybe I misunderstood :-).

My VMs still do not have access to the network. Coming soon, I hope!

answered 2012-06-10 20:49:10 -0500

emilienm

@Dan

As usual, thanks for the support.

  • For the doc, don't worry, I'm glad my work can be useful for everybody.

  • I like DevStack, but I want to build my architecture from scratch, with the goal of becoming a Jedi in Quantum :-) (and OpenStack too). I always use DevStack as a reference when I need to confirm a configuration detail or something else. In my case, everything looks the same as DevStack, except that I don't use Melange (should I?) and I did not run "ovs-vsctl --no-wait br-set-external-id $OVS_BRIDGE bridge-id br-int" (is it useful? Can you explain why DevStack uses it?)

  • For debugging, of course I use the databases, and tomorrow I will continue to post news about my work here.

Thanks again Dan!

answered 2012-06-11 15:38:53 -0500

emilienm

Dan,

  • Today I set up DevStack and compared everything: I have exactly the same configuration.

  • When I dump the traffic, I can see something strange:

    * tape3f5c3b2-c5 sends a DHCP DISCOVER

    * Every eth1 (connected to br-int) can see this DHCP packet

    * gw-385e6c3d-18 can't see the DHCP packet.

On my controller / nova-network / quantum-server node:

Bridge br-int
    Port br-int
        Interface br-int
            type: internal
    Port "eth1"
        Interface "eth1"
    Port "gw-385e6c3d-18"
        tag: 1
        Interface "gw-385e6c3d-18"
            type: internal

On my compute node :

Bridge br-int
    Port br-int
        Interface br-int
            type: internal
    Port "eth1"
        Interface "eth1"
    Port "tape3f5c3b2-c5"
        tag: 1
        Interface "tape3f5c3b2-c5"

dnsmasq is running on the controller -> http://paste.openstack.org/show/18447/

And on the controller I can see my VM's IP configuration in /var/lib/nova/networks/nova-gw-385e6c3d-18.conf:

fa:16:3e:72:48:2d,host-192.168.22.3.novalocal,192.168.22.3
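
A quick way to confirm that a given VM MAC has a lease entry is to parse that file. The sketch below assumes every line has exactly the three comma-separated fields shown above (MAC, hostname, IP); the helper names are made up for illustration, and other line formats that dnsmasq accepts are not handled.

```python
from collections import namedtuple

HostEntry = namedtuple("HostEntry", "mac hostname ip")


def parse_hosts(text):
    """Parse 'mac,hostname,ip' lines; blank lines are skipped."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mac, hostname, ip = line.split(",")
        entries.append(HostEntry(mac.lower(), hostname, ip))
    return entries


def ip_for_mac(entries, mac):
    """Return the IP reserved for a MAC address, or None if it has no entry."""
    for entry in entries:
        if entry.mac == mac.lower():
            return entry.ip
    return None


# The single entry quoted in this post.
SAMPLE = "fa:16:3e:72:48:2d,host-192.168.22.3.novalocal,192.168.22.3\n"

if __name__ == "__main__":
    print(ip_for_mac(parse_hosts(SAMPLE), "fa:16:3e:72:48:2d"))
```

If the MAC is present here but the VM still gets no address, the DHCP request is probably not reaching dnsmasq at all, which matches the traffic dump above.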

In my ovs_quantum database, all the ports are ACTIVE / UP (gateway + TAP).

So... what's wrong?

Is the nova.conf of my controller correct? -> http://paste.openstack.org/show/18448/

nova-compute.conf of my compute nodes: http://paste.openstack.org/show/18449/

Thanks

Stats

Asked: 2012-06-08 11:11:56 -0500

Seen: 479 times

Last updated: Jun 15 '12