Problem with external/public network, even though I can see the floating IPs in the "attached devices" list of my physical router

Hi all,

I'd like to request your help in troubleshooting a simple OpenStack deployment problem. The setup is a single-node deployment (a virtual machine running CentOS 7 on top of an ESXi hypervisor), and I use my home LAN as the external network of the deployment.

I have created two networks, a private and a public one:

(openstack) network list
+--------------------------------------+---------+--------------------------------------+
| ID                                   | Name    | Subnets                              |
+--------------------------------------+---------+--------------------------------------+
| 4a4a3a00-2942-42ea-b08e-d0c71497fea5 | private | 31913d7c-5d25-44ed-ad57-e28a1f60c6dc |
| 3f076faa-429f-4e8e-b697-c3f7d659e6be | public  | db73586f-4a35-433b-aeee-d97fb0e243bd |
+--------------------------------------+---------+--------------------------------------+

Everything on the private network seems to be working fine, meaning that the two instances I've launched can ping each other and the router's private interface. Here's what the router looks like:

(openstack) router show "Router 1"
+-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                                                                                                    |
+-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                                                                                                       |
| availability_zone_hints |                                                                                                                                                                                          |
| availability_zones      | nova                                                                                                                                                                                     |
| description             |                                                                                                                                                                                          |
| distributed             | False                                                                                                                                                                                    |
| external_gateway_info   | {"network_id": "6e28d867-10e8-48e0-ac02-46c0d7d18d67", "enable_snat": true, "external_fixed_ips": [{"subnet_id": "af5c4630-edce-461b-bf13-274ca833b68a", "ip_address": "192.168.2.20"}]} |
| ha                      | False                                                                                                                                                                                    |
| id                      | ebc1a5f4-28f3-4d9c-9773-ef9e9c17c086                                                                                                                                                     |
| name                    | Router 1                                                                                                                                                                                 |
| routes                  | []                                                                                                                                                                                       |
| status                  | ACTIVE                                                                                                                                                                                   |
| tenant_id               | 16a72508575e4615b0a0b6e1806d6f84                                                                                                                                                         |
+-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The problem is with the public network. I can't ping any of the floating IPs of the two VM instances from my LAN, and the instances can't ping the LAN either. I can't ping the router's public interface either. I think I've set up the host's bridging correctly, so there should be no issue there:

[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-br-ex 
TYPE="OVSBridge"
DEVICETYPE="ovs"
BOOTPROTO="static"
DEVICE="br-ex"
ONBOOT="yes"
IPADDR="192.168.2.205"
PREFIX="24"
GATEWAY="192.168.2.50"
DNS1="192.168.2.50"

[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens32 
DEVICE="ens32"
TYPE="OVSPort"
DEVICETYPE="ovs"
OVS_BRIDGE="br-ex"
ONBOOT="yes"

Here's what the public network looks like:

(openstack) network show public
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | UP                                   |
| availability_zone_hints   |                                      |
| availability_zones        | nova                                 |
| created_at                | 2016-08-15T19:27:45                  |
| description               |                                      |
| id                        | 6e28d867-10e8-48e0-ac02-46c0d7d18d67 |
| ipv4_address_scope        | None                                 |
| ipv6_address_scope        | None                                 |
| is_default                | False                                |
| mtu                       | 1450                                 |
| name                      | public                               |
| project_id                | 16a72508575e4615b0a0b6e1806d6f84     |
| provider:network_type     | vxlan                                |
| provider:physical_network | None                                 |
| provider:segmentation_id  | 0                                    |
| router_external           | Internal                             |
| shared                    | True                                 |
| status                    | ACTIVE                               |
| subnets                   | af5c4630-edce-461b-bf13-274ca833b68a |
| tags                      | []                                   |
| updated_at                | 2016-08-15T19:27:45                  |
+---------------------------+--------------------------------------+


(openstack) subnet show public-subnet
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| allocation_pools  | 192.168.2.20-192.168.2.30            |
| cidr              | 192.168.2.0/24                       |
| created_at        | 2016-08-15T19:28:58                  |
| description       |                                      |
| dns_nameservers   | 192.168.2.50                         |
| enable_dhcp       | False                                |
| gateway_ip        | 192.168.2.50                         |
| host_routes       |                                      |
| id                | af5c4630-edce-461b-bf13-274ca833b68a |
| ip_version        | 4                                    |
| ipv6_address_mode | None                                 |
| ipv6_ra_mode      | None                                 |
| name              | public-subnet                        |
| network_id        | 6e28d867-10e8-48e0-ac02-46c0d7d18d67 |
| project_id        | 16a72508575e4615b0a0b6e1806d6f84     |
| subnetpool_id     | None                                 |
| updated_at        | 2016-08-15T19:28:58                  |
+-------------------+--------------------------------------+
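As a quick sanity check on the addressing shown above, here is a minimal Python sketch that confirms which addresses fall inside the subnet and inside the allocation pool. The values are copied from the outputs in this post; the labels and the `describe` helper are only illustrative names, not part of any OpenStack tooling.

```python
import ipaddress

# Values copied from the ifcfg-br-ex, router and subnet outputs above.
cidr = ipaddress.ip_network("192.168.2.0/24")
pool_start = ipaddress.ip_address("192.168.2.20")
pool_end = ipaddress.ip_address("192.168.2.30")

addresses = {
    "br-ex (host)": ipaddress.ip_address("192.168.2.205"),
    "gateway": ipaddress.ip_address("192.168.2.50"),
    "router external IP": ipaddress.ip_address("192.168.2.20"),
}


def describe(addr):
    """Return (is inside the subnet CIDR, is inside the allocation pool)."""
    in_subnet = addr in cidr
    in_pool = pool_start <= addr <= pool_end
    return in_subnet, in_pool


report = {name: describe(addr) for name, addr in addresses.items()}
for name, (in_subnet, in_pool) in report.items():
    print(f"{name}: in subnet={in_subnet}, in allocation pool={in_pool}")
```

This only checks that the host and gateway addresses sit in the same /24 and outside the pool Neutron allocates from; it says nothing about whether traffic actually flows.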

The router's external gateway interface shows "Status = BUILD", though; I was expecting something like "Active".

And here's the weird part: on my physical router, I can see the Neutron router's external interface and the two floating IPs in the list of attached devices. So maybe something is wrong in the L3 networking? These are the ERROR entries I can see in the neutron logs:

l3-agent.log:2016-08-15 12:56:12.642 30079 ERROR neutron.common.rpc [req-de919f92-1c78-43bf-8c59-88896b8d700a - - - - -] Timeout in RPC method get_service_plugin_list. Waiting for 19 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.
openvswitch-agent.log:2016-08-15 12:56:12.707 30005 ERROR neutron.common.rpc [-] Timeout in RPC method report_state. Waiting for 39 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.
openvswitch-agent.log:2016-08-15 12:56:13.420 30005 ERROR neutron.common.rpc [req-8308450a-13d8-4733-95fd-105318d4b4f4 - - - - -] Timeout in RPC method tunnel_sync. Waiting for 36 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] Failed reporting state!
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 313, in _report_state
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     True)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 86, in report_state
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return method(context, 'report_state', **kwargs)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/common/rpc.py", line 155, in call
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     time.sleep(wait)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     self.force_reraise()
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     six.reraise(self.type_, self.value, self.tb)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/common/rpc.py", line 136, in call
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return self._original_context.call(ctxt, method, **kwargs)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 158, in call
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=self.retry)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     timeout=timeout, retry=retry)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 470, in send
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=retry)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in _send
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     result = self._waiter.wait(msg_id, timeout)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 342, in wait
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     message = self.waiters.get(msg_id, timeout=timeout)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 244, in get
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     'to message ID %s' % msg_id)
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent MessagingTimeout: Timed out waiting for a reply to message ID 54d66b3bd43040f48361a411aba28aa5
openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
openvswitch-agent.log:2016-08-15 13:24:49.154 30005 ERROR neutron.agent.linux.async_process [-] Error received from [ovsdb-client monitor Interface name,ofport,external_ids --format=json]: None
openvswitch-agent.log:2016-08-15 13:24:49.156 30005 ERROR neutron.agent.linux.async_process [-] Process [ovsdb-client monitor Interface name,ofport,external_ids --format=json] dies due to the error: None
server.log:2016-08-15 15:25:36.463 30460 ERROR oslo.messaging._drivers.impl_rabbit [req-a38f093c-67fc-49d8-92dc-80c7ac46d060 bc133262a322431cb7ec2eec3640aa51 16a72508575e4615b0a0b6e1806d6f84 - - -] AMQP server on 192.168.2.205:5672 is unreachable: [Errno 110] Connection timed out. Trying again in 1 seconds.
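To see at a glance which RPC calls are timing out in output like the above, a small Python sketch can count the "Timeout in RPC method" lines per log file and method. It assumes the `file.log:` prefix format shown in this post; the function name `summarize_rpc_timeouts` is hypothetical, and the sample lines are shortened copies of the entries above.

```python
import re
from collections import Counter

# Matches lines such as:
#   l3-agent.log:<timestamp> ... Timeout in RPC method report_state. Waiting ...
RPC_TIMEOUT = re.compile(
    r"^(?P<log>[\w.-]+\.log):.*Timeout in RPC method (?P<method>\w+)\."
)


def summarize_rpc_timeouts(lines):
    """Count RPC timeout errors per (log file, RPC method)."""
    counts = Counter()
    for line in lines:
        match = RPC_TIMEOUT.search(line)
        if match:
            counts[(match.group("log"), match.group("method"))] += 1
    return counts


# Shortened copies of the log lines quoted above.
sample = [
    "l3-agent.log:2016-08-15 12:56:12.642 30079 ERROR neutron.common.rpc "
    "[req-de919f92-1c78-43bf-8c59-88896b8d700a - - - - -] "
    "Timeout in RPC method get_service_plugin_list. Waiting for 19 seconds",
    "openvswitch-agent.log:2016-08-15 12:56:12.707 30005 ERROR neutron.common.rpc "
    "[-] Timeout in RPC method report_state. Waiting for 39 seconds",
    "openvswitch-agent.log:2016-08-15 12:56:13.420 30005 ERROR neutron.common.rpc "
    "[req-8308450a-13d8-4733-95fd-105318d4b4f4 - - - - -] "
    "Timeout in RPC method tunnel_sync. Waiting for 36 seconds",
    "openvswitch-agent.log:2016-08-15 12:56:52.195 30005 ERROR "
    "neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] "
    "Failed reporting state!",
]

counts = summarize_rpc_timeouts(sample)
for (log, method), n in sorted(counts.items()):
    print(f"{log}: {method} timed out {n} time(s)")
```

In practice the lines would come from the combined log output rather than a hard-coded list.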

Last but not least, what makes it even weirder is that I followed the same simple deployment approach in a CentOS 7 VM installed on my laptop, and everything worked fine. The only visible difference I could spot is that, for some reason, the VM's network interface was named "enp2s0" on the laptop, whereas it is named "ens32" when running on top of ESXi.

All suggestions are very welcome and much appreciated. :)

Alex.