[CLOSED] Neutron SNAT not forwarding return packets

asked 2016-04-18 14:22:56 -0600

Depa77

updated 2016-04-19 14:37:45 -0600

Hi everyone, I'm having problems configuring dvr_snat in Neutron on OpenStack Liberty. I have 4 compute nodes and 3 controller nodes that also carry the network role. The VXLAN tunnels are set up correctly on each node: inside a tenant network I can ping any instance and also the outside interface (qg-*) of the snat namespace. To connect the nodes to the network I'm using two 10G interfaces bonded together with LACP, carrying 4 VLANs:

- 1: management
- 2: OpenStack internal communication
- 3: VXLAN tunnel
- 4: public network

I'm also using the Open vSwitch package.
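For reference, a bond with VLAN subinterfaces can be expressed in ifcfg files roughly as below. This is only a sketch: the device names, bonding options, and the PHYSDEV/VLAN directives are assumptions based on the setup described, not the poster's actual files.

    # /etc/sysconfig/network-scripts/ifcfg-bond0  (names assumed)
    DEVICE=bond0
    TYPE=Bond
    BONDING_MASTER=yes
    BONDING_OPTS="mode=802.3ad miimon=100"   # LACP
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-vlan4  (public VLAN on top of the bond)
    DEVICE=vlan4
    VLAN=yes
    PHYSDEV=bond0
    ONBOOT=yes
    BOOTPROTO=none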

Since I'm not an expert in Open vSwitch, I preferred managing the interfaces with the ifcfg-* files, and the public VLAN is presented to Neutron as a flat network (the Linux kernel correctly tags the packets).

The problem is that if I ping anything outside OpenStack from an instance and run tcpdump on the inside interface (sg-*) of the snat namespace, I see the packets arriving from the instance. They are correctly forwarded to my real router with the source address rewritten to the public address of the snat namespace (verified with tcpdump as well). Those packets reach my real router, are forwarded to the internet, and I can see the replies on the public interface of the snat namespace. At this point each reply should be forwarded back to the instance that originally generated the traffic, but this does not happen: the packet is lost inside the snat namespace.
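For anyone walking through the same debugging, the checks described above can be reproduced with ip netns exec. The namespace and port names below are placeholders for whatever ip netns and ip link report on the node hosting the snat namespace:

    # find the snat namespace on the controller currently hosting it
    ip netns | grep snat

    # inside interface: traffic arriving from the instance
    ip netns exec snat-<router-uuid> tcpdump -n -e -i sg-<port-id> icmp

    # public interface: SNATed requests leaving, replies coming back
    ip netns exec snat-<router-uuid> tcpdump -n -e -i qg-<port-id> icmp

The -e flag prints link-level headers, which is handy for checking whether frames carry an 802.1Q tag or plain Ethernet II.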

Both NetworkManager and firewalld are stopped.

Any hints on how to proceed? If you need some configuration file or command output I'll be glad to post them.

Update 1

As asked by dbaxps: the controller nodes are in HA, using keepalived to maintain the virtual IP, and the load is balanced by HAProxy installed on each node. The MySQL database (MariaDB) is clustered with Galera. If the active controller goes down, HAProxy forwards every request to the second node.

The DVR role is actually carried by every node (both controller and compute); in fact, I have a qrouter-* namespace everywhere. When I add a router with a gateway in Neutron, a snat-* namespace is created on one controller node. If I shut that node down (or simply disconnect it from the network), a new snat-* namespace with the same addresses is created on another controller.

My only problem is that, even with the following entries in the nat table (of iptables) in the snat namespace, the return packets (ICMP replies, or any others -- TCP ACKs, DNS replies, etc.) are lost in the snat namespace: I see them coming in on the qg-* interface, but no DNAT is performed on them, nor are they forwarded out of the sg-* interface.

-A neutron-l3-agent-snat -o qg-bcd07598-ef -j SNAT --to-source x.y.z.5
-A neutron-l3-agent-snat -m mark ! --mark 0x2/0xffff -m conntrack --ctstate DNAT -j SNAT --to-source x.y.z.5

(x.y.z ... (more)
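Given those rules, one thing worth checking is whether conntrack still holds the NAT entry for the flow when the reply arrives. A sketch (requires the conntrack-tools package; the namespace name and address are placeholders):

    # list source-NATed conntrack entries inside the snat namespace
    ip netns exec snat-<router-uuid> conntrack -L --src-nat | grep x.y.z.5

If the entry is present but the reply is still dropped, the problem is more likely below conntrack (e.g. at the link layer) than in the iptables rules themselves.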



As long as the Neutron router is not an HA router, in other words not driven by VRRP, it won't come to the active state on the second controller node. At best this system might be a DVR cluster. I don't see any HA 3-node (or 2-node) controller implementation here. Go ahead and prove me wrong.

jasonwg ( 2016-04-19 15:09:17 -0600 )

It would be a much stronger step to actually close the post rather than just print [CLOSED] in the header. What are you afraid of in closing a post which has an obvious issue?

jasonwg ( 2016-04-19 15:14:26 -0600 )

2 answers


answered 2016-04-19 04:56:24 -0600

Depa77

updated 2016-04-19 06:25:20 -0600

I've finally come to a solution. In the /etc/sysconfig/network-scripts/ifcfg-vlan4 file I changed the REORDER_HDR directive from 0 to 1, and now everything is working.

I think the packets weren't DNAT'ed because they were delivered to the snat namespace with the 802.1Q header still attached, while they were expected with a plain Ethernet II header.
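For completeness, the relevant change and a way to verify it (the directive shown is the only line touched; the verification assumes the iproute2 ip tool is available):

    # /etc/sysconfig/network-scripts/ifcfg-vlan4  (excerpt)
    REORDER_HDR=1

    # apply and check: the vlan details should now list REORDER_HDR among the flags
    ifdown vlan4 && ifup vlan4
    ip -d link show vlan4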

Update 1

The router was not created via the neutron CLI but via the Horizon dashboard.

This is the output of neutron router-show adm-rtr -f json|python -m json.tool

    [
        {"Field": "admin_state_up", "Value": true},
        {"Field": "distributed", "Value": true},
        {"Field": "external_gateway_info", "Value": "{\"network_id\": \"58ebef8f-2e2c-4e35-bbc4-7a6ea1e6f029\", \"enable_snat\": true, \"external_fixed_ips\": [{\"subnet_id\": \"9a18bd24-565a-4cd8-870e-973730532708\", \"ip_address\": \"x.y.z.5\"}]}"},
        {"Field": "ha", "Value": false},
        {"Field": "id", "Value": "acc896a5-6bbb-43da-9fe8-cb7da7f21ec5"},
        {"Field": "name", "Value": "adm-rtr"},
        {"Field": "routes", "Value": ""},
        {"Field": "status", "Value": "ACTIVE"},
        {"Field": "tenant_id", "Value": "b611e034000b42da89e257213c207661"}
    ]

On controller 1: ip netns


On controller 2: ip netns


On compute 1: ip netns



Could you post

neutron router-show your-router-name.
dbaxps ( 2016-04-19 05:36:36 -0600 )

Please also post, from the Controller/Network and Compute nodes, the output of:

ip netns
dbaxps ( 2016-04-19 05:52:55 -0600 )

As far as I understand, REORDER_HDR=1 makes the kernel strip the VLAN header from arriving packets, so SNAT forwarding starts working, nothing more. You are still effectively claiming that your router supports VRRP && DVR on Liberty at the same time.

dbaxps ( 2016-04-19 06:15:16 -0600 )

neutron router-show works for any router created via the dashboard.

dbaxps ( 2016-04-19 06:50:15 -0600 )

answered 2016-04-18 16:38:07 -0600

dbaxps

updated 2016-04-19 07:11:14 -0600

UPDATE after Depa77's answer update was provided.
Quoting Depa77 -- the router doesn't support HA:

        "Field": "ha",
        "Value": false

On the compute node the fip namespace is absent; on compute 1, ip netns shows only:

    qrouter-acc896a5-6bbb-43da-9fe8-cb7da7f21ec5

    Final conclusion:
    No HA controller has been created, and at the moment it is not clear whether assigning a FIP to a VM will create a fip namespace on the compute node to perform DNAT routing via the "fg" interface. Attempting a router-update to --ha True will conflict on Liberty with

        "Field": "distributed",
        "Value": true

UPDATE 04/19/2016
Per your feedback:

1. "Hi everyone, I'm having problems configuring dvr-snat on neutron on Openstack Liberty."
2. "As asked from dbaxps, the Controller nodes are in HA using keepalived to maintain the virtual IP."
3. "The dvr is actually done by every node (both controller and compute)."

On OpenStack Liberty a Neutron router is unable to run VRRP (keepalived) and DVR (distributed mode) at the same time.
Thus your system is not functional.

Known bug
Resolved in Mitaka M3 by running keepalived inside the SNAT namespace. All details are in the link posted above.
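If testing this on Mitaka, one way to confirm that keepalived really runs inside the SNAT namespace would be the following sketch (the router UUID and PID are placeholders):

    # PIDs of processes attached to the snat namespace
    ip netns pids snat-<router-uuid>

    # then inspect one of the reported PIDs; a keepalived process
    # should appear among them on Mitaka
    ps -p <pid> -o pid,cmd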

Per update 2, you created a neutron router which supports neither VRRP nor DVR; that seems to me to be your core mistake, made while following the guide.

Your command was:

neutron router-create tenant-router

A short explanation with snapshots from http://docs.openstack.org/liberty/net...
is here: http://bderzhavets.blogspot.com/2016/...
Check the guide one more time (the neutron router-create commands). They create either an "ha" or a "distributed" router, but not one with both features at a time.

On Liberty, for an HA controller, I have to run as admin:

#   neutron router-create --ha True --tenant_id  xxxxxxxx Router01      
Doing so creates a router based on the VRRP protocol, which becomes active on the new MASTER after a failure is caught by keepalived (the Linux implementation of VRRP). This router is NOT distributed and cannot support compute nodes running in DVR mode.

To support a DVR cluster I would need to issue:

#  neutron router-create --ha True --distributed True --tenant_id  xxxxxxxx Router02

This is possible only on Mitaka and was tested here: https://www.linux.com/blog/ha-support...
I saw the link http://docs.openstack.org/liberty/net...
It suggests VRRP and DVR as separate options, silently skipping the question of their compatibility, which has been a long-standing issue: it started in the Kilo cycle and was resolved in Mitaka.
To be safe I would choose "Pacemaker, corosync, HAProxy, Galera" to support the cluster on Liberty (or Mitaka).

    I just guess that on Mitaka HAProxy/keepalived could work as well: on Mitaka the router should be activated not by the normal keepalived service promoting a new MASTER, but by the keepalived process running inside the SNAT namespaces, so it might become active on the third BACKUP controller (in the keepalived sense), not necessarily on the keepalived MASTER. Everything I wrote here needs to be carefully tested.


It sounds like you have a 3-node HA controller. From your post it's not clear why you need dvr_snat for L3 routers on the controllers. Usually this step ... (more)



I think we misunderstood each other. Keepalived is used only for the APIs; the SNAT is not behind VRRP. I've followed the "High Availability using Distributed Virtual Routing (DVR)" guide, placing the network node configuration on the controller nodes.

Depa77 ( 2016-04-19 03:08:46 -0600 )

Please paste the CLI command used to create the Neutron router on the HA controller, and the link you have been following.

dbaxps ( 2016-04-19 03:18:29 -0600 )

I've added update 2 with all the information you asked for.

Depa77 ( 2016-04-19 03:50:38 -0600 )




Seen: 1,534 times

Last updated: Apr 19 '16