# routing on different interfaces on the network node

Hello,

We have successfully built a 10-node Grizzly OpenStack cloud (4 compute nodes and 6 storage nodes providing the different services). Our VMs can reach the outside through the network node, which runs Quantum with the OpenvSwitch plugin.

Here is a list of the services running on the network node:

- quantum-server
- quantum-dhcp-agent
- quantum-l3-agent
- quantum-metadata-agent
- quantum-plugin-openvswitch-agent
- openvswitch-switch

On the compute nodes:

- quantum-plugin-openvswitch-agent
- openvswitch-switch

Each node has:

- one 1 Gb interface for server management (say eth0 on 10.0.0.0/24)
- an Infiniband adapter (IPoIB) for the "Openstack traffic"; we don't use the Mellanox plugin (yet?) (say ib0 on 172.16.0.0/24)
- another Infiniband adapter for storage; we use Ceph as the backend storage for Glance and Cinder (say ib1 on 172.16.1.0/24)

One dedicated 1 Gb link on the network node is used for VM traffic (eth1, in the bridge br-ex, 10.10.0.0/24).

We can assign floating IPs to VMs, ssh to them, and so on. The VMs reach the outside through the dedicated link on the network node.

However, we also need to reach the storage network (172.16.1.0/24 on ib1) from the VMs.

If I ping the storage network from a VM, I see all the traffic going out through eth1 and hitting the default gateway.

We have set up some rules:

```
# ip rule
0:      from all lookup local
1000:   from 10.10.0.0/24 lookup vm
32766:  from all lookup main
32767:  from all lookup default

# ip route list table vm
default via 10.10.0.1 dev br-ex
```

If we add a route for 172.16.1.0/24 in that table (or in the default one), the packets are still sent to the default gateway. We also tried adding routes on the virtual router attached to the public network (10.10.0.0/24) and on the subnet, without success.
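For reference, the route we tried to add to the `vm` table looked roughly like this (the exact device/next-hop choice is an illustration; 172.16.1.0/24 is directly attached on ib1 on the network node):

```shell
# Send storage-network traffic from the VM range out through ib1
# instead of the default gateway in table "vm"
ip route add 172.16.1.0/24 dev ib1 table vm

# Verify the table contents
ip route list table vm
```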

So my question is: how can we reach other networks attached to other interfaces of the network node? Is it a matter of putting the right routes in the right place, or should we build a new public network (with a new L3 agent, which is less desirable)? If that is impossible, since the compute nodes are also on the storage network, can we simply add a virtual interface on those nodes (as we would on a standard hypervisor)? Could it be specified during VM creation?

You may ask why we don't just create volumes and attach them to the VMs: we need shared storage that is accessible read-write from multiple VMs for distributed computing.



No I didn't, so I tried a flat provider network.

The main problem with Infiniband is that you can't put an IPoIB interface into a bridge (standard Linux or OVS), because IP over Infiniband is used rather than Ethernet over Infiniband, so the layer 2 is not the same as on an Ethernet interface.

I came up with the idea of creating a virtual ethernet interface with:
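The exact command is omitted in the post; presumably it was something like the following iproute2 invocation, which creates a connected veth pair (the names veth0/veth1 match what is described below):

```shell
# Create a veth pair: packets entering one end come out the other
ip link add veth0 type veth peer name veth1
ip link set veth0 up
ip link set veth1 up
```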

This created two interfaces (veth0 and veth1).

I created a bridge in OVS, added one of the veth interfaces to it, and assigned it an IP address (different from the storage network). Everything seems fine: the routes are good and I can ping the bridge's IP from another host.
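A minimal sketch of those steps (the bridge name br-ceph is taken from later in the thread; the address 172.17.1.1/24 is an assumption based on the subnet that appears in the port output below):

```shell
ovs-vsctl add-br br-ceph               # create the OVS bridge
ovs-vsctl add-port br-ceph veth1       # plug one end of the veth pair into it
ip addr add 172.17.1.1/24 dev br-ceph  # an address outside the storage network
ip link set br-ceph up
```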

In the OVS plugin settings (/etc/quantum/plugins/openvswitch/ovs_quantum_plugin.ini), I added the following lines:

```
network_vlan_ranges = ceph-net:4090:4095  # dummy numbers, as we use GRE and no VLANs
bridge_mappings = ceph-net:br-ceph
```

I had to do this on each compute node and on the network node (every node running the OVS quantum agent) to avoid errors such as:

```
ERROR [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Cannot provision flat network for net-id=... - no bridge for physical_network br-ceph
```

After restarting the OVS agent, I created the network in quantum:

```
quantum net-create ceph-net --tenant_id ... --provider:network_type flat --provider:physical_network br-ceph
```

I also shared it, and created the subnet in the range of br-ceph's IP (no DHCP yet).
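Sharing the network and creating the subnet would look something like this with the Grizzly-era quantum client (the 172.17.1.0/24 CIDR is inferred from the port output later in the thread; the subnet name is made up):

```shell
# Make the network usable by all tenants
quantum net-update ceph-net --shared True

# Subnet in br-ceph's range, DHCP disabled for now
quantum subnet-create ceph-net 172.17.1.0/24 --disable-dhcp --name ceph-subnet
```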

When I boot a VM with one interface on our "standard" network and one on ceph-net, I can access the VM as usual, but if I set an IP address on the interface connected to ceph-net I can't ping anything (i.e. I don't see any traffic on br-ceph or on the veth attached to it). I also don't see any new interface added to the bridge. On the network node: http://pastebin.com/mmFU5UbF

On the compute node: http://pastebin.com/kvgCv86Z
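For context, booting the VM with one NIC on each network was presumably done with something like the following (network IDs are placeholders, not values from the thread):

```shell
nova boot myvm --flavor m1.small --image <image-id> \
  --nic net-id=<standard-net-id> \
  --nic net-id=<ceph-net-id>
```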

Am I missing something, for example an agent/plugin to add on the compute nodes?


( 2016-04-16 01:52:11 -0500 )

Alexandre - I have a similar setup, except with only one IB interface, and I'm also using GRE tunnels.

I've been trying to get the nova/quantum setup to use IB as the transport for the GRE tunnels. It seems to do that (according to tcpdump), but the bandwidth I'm getting is about 1 Gbit/s (bare-metal use of the IB interface gives ~9 Gbit/s). Do you get better performance than that between VMs?
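A common way to measure that VM-to-VM throughput for comparison against bare metal is an iperf run such as the following (addresses and stream count are illustrative, not from the thread):

```shell
# On the receiving VM
iperf -s

# On the sending VM: 4 parallel streams for 30 seconds
iperf -c <receiver-ip> -t 30 -P 4
```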

I'm also curious how you got boot-from-volume working with Ceph/RBD and libvirt.

With Ubuntu 13.10 I seem to suffer from two problems with Ceph: one is a libvirt race condition that causes snapshots to fail ( https://bugs.launchpad.net/nova/+bug/1244694 ) and the other is that, at least in Grizzly, I can't use Cinder to make a volume from anything that isn't a RAW image (a minor annoyance).

Are you using Ubuntu or some other base distro?



We followed the installation steps in the OpenStack documentation to set up the network tunnels in OpenvSwitch, so we use GRE for the tenant networks. Here is the plugin configuration on the network node (without the database settings):

```
[OVS]
tenant_network_type = gre
enable_tunneling = True
tunnel_id_ranges = 1:1000
local_ip = 172.16.0.23
network_vlan_ranges = ceph-net:4090:4095
bridge_mappings = ceph-net:br-ceph

[AGENT]
polling_interval = 2

[SECURITYGROUP]
firewall_driver = quantum.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
```

Let's recap a little:

- Our Infiniband cards are dual-port, so we have two interfaces on our nodes: ib0 and ib1.
- OpenStack traffic (inter-service traffic as well as the GRE tunnels created by OpenvSwitch) goes through one Infiniband network: ib0, 172.16.0.0/24.
- Ceph traffic goes over another Infiniband network: ib1, 172.16.1.0/24.
- If I add ib1 to a bridge, it doesn't work.
- Glance and Cinder use Ceph; we boot the VMs from volumes created from images (so we can use copy-on-write in Ceph), but I don't think that's relevant to this problem.

What we want to achieve is the fastest possible access from the VMs to the storage for MapReduce-style processing (maybe using Savanna for Hadoop). I thought mounting CephFS directly in the VM might be the solution. I really like your idea of creating a provider network.
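Once a VM can reach the Ceph monitors over the storage network, mounting CephFS inside it with the kernel client might look like this (the monitor address and cephx secret are placeholders):

```shell
# Kernel CephFS client; the VM must be able to reach the monitor on port 6789
mkdir -p /mnt/cephfs
mount -t ceph <mon-ip>:6789:/ /mnt/cephfs \
  -o name=admin,secret=<cephx-secret>
```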

Do you think that the steps I took to create the provider network are OK?


That makes more sense, as we can narrow down the issue to the Open vSwitch agent.

When you posted your configuration above, I did not notice that ceph-net was an OVS bridge, and that you're trying to plug the VIF directly into an OVS bridge other than br-int; the nova VIF drivers are unfortunately unable to do that.

The port directed to ceph-net indeed landed on br-int:

```
    Port "tap5645e5d5-36"
        tag: 2
        Interface "tap5645e5d5-36"
```


You already said that you are unable to plug the Infiniband interface directly into ceph-net. But I reckon you should be able to create another veth pair between the br-int and ceph-net OVS bridges. That might work, but it might also start adding some serious latency.
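That suggestion could be sketched as follows (the interface names are made up for illustration):

```shell
# Create a veth pair to link the two OVS bridges
ip link add int-to-ceph type veth peer name ceph-to-int

# Plug one end into each bridge
ovs-vsctl add-port br-int int-to-ceph
ovs-vsctl add-port br-ceph ceph-to-int

ip link set int-to-ceph up
ip link set ceph-to-int up
```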

You also mentioned that your VLAN tags are dummies because you're using GRE. Can you clarify whether you were referring to the tenant networks, or whether you're using GRE tunnels to reach into your storage network?


Sorry about the confusion... I meant I can only see a single port for ceph-net, but the VM does have two ports: one on our standard network (which works, floating IP and so on) and the other on ceph-net. In the VM I can see two interfaces, eth0 and eth1. eth0 is configured by DHCP from our standard network, and I configure eth1 manually.


If you can only see a single port for your VM, then nova is not correctly interpreting your request to boot the VM with two NICs. I don't think it's failing to create the second port; in that case you would obtain a failure while booting the instance, as network provisioning is currently processed synchronously with the create-server request.

I am not sure how this could be happening. In your nova-compute logs you should be able to see the requests made to the quantum server, and in the nova-api log you should be able to see the received request, so you can check the network interface spec in the JSON request body.
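One way to check is to grep the logs for the boot request (the log paths below assume a stock Ubuntu packaged install and may differ on your nodes):

```shell
# Look for the "networks" spec in the create-server request body
grep -i "networks" /var/log/nova/nova-api.log

# And for the port-create calls nova-compute made to the quantum server
grep -i "ports" /var/log/quantum/server.log
```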


Yeah, I can see one port:

```
# quantum port-show 5645e5d5-362a-498c-9864-b19db1034531
+-----------------+-------------------------------------------------------------------------------------+
| Field           | Value                                                                               |
+-----------------+-------------------------------------------------------------------------------------+
| admin_state_up  | True                                                                                |
| device_id       | 38eac7dd-2a5b-4b26-9ea4-3ba9c63d0315                                                |
| device_owner    | compute:None                                                                        |
| fixed_ips       | {"subnet_id": "f1225ed7-1560-4e0f-a70d-d66200c1a282", "ip_address": "172.17.1.102"} |
| id              | 5645e5d5-362a-498c-9864-b19db1034531                                                |
| mac_address     | fa:16:3e:00:68:69                                                                   |
| name            |                                                                                     |
| network_id      | bf16ed0b-ecd4-4992-b19c-7d1d2c7dc76c                                                |
| security_groups | 9c6b116a-83e1-4c27-84bd-14a216f0a854                                                |
| status          | ACTIVE                                                                              |
| tenant_id       | 080b784baeea487587786b160e2f30b5                                                    |
+-----------------+-------------------------------------------------------------------------------------+
```

On Horizon:

- Name: None
- ID: 5645e5d5-362a-498c-9864-b19db1034531
- Network ID: bf16ed0b-ecd4-4992-b19c-7d1d2c7dc76c
- Project ID: 080b784baeea487587786b160e2f30b5
- Fixed IP: IP address: 172.17.1.102, Subnet ID: f1225ed7-1560-4e0f-a70d-d66200c1a282
- Mac Address: fa:16:3e:00:68:69
- Status: ACTIVE
- Admin State: UP
- Attached Device: Device Owner: compute:None, Device ID: 38eac7dd-2a5b-4b26-9ea4-3ba9c63d0315

The network ID corresponds to ceph-net.


Can you at least see the logical neutron ports for the VM interfaces on ceph-net? This will help us debug the root cause of the issue.


Have you considered provider networks? I was thinking that:

- If you can boot VMs with multiple NICs, each VM could have a second NIC on a provider network mapped to the storage network, giving it direct access.
- If you don't want to change the VM layout for accessing the storage network, you can still try to map it to a provider network and then attach that network to the logical routers.

Even if you will not launch any VM on this provider network, you should still be able to reach the hosts already running on it.
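The second option (mapping the storage network to a provider network and attaching it to the logical router) could be sketched as follows; the network name, CIDR, and IDs are placeholders drawn from the thread, not tested values:

```shell
# Flat provider network mapped to the storage bridge
quantum net-create storage-provider \
  --provider:network_type flat \
  --provider:physical_network br-ceph

# Subnet matching the storage network, no DHCP
quantum subnet-create storage-provider 172.16.1.0/24 --disable-dhcp

# Attach it to the existing logical router
quantum router-interface-add <router-id> <subnet-id>
```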
