OSA playbooks fail at galera - no cross-host container connectivity
I am attempting to install OpenStack-Ansible (16.0.5) on a few machines, but I can never get past the "Get Modern PIP" task of the pip_install role, which runs during the galera-install playbook.
TASK [pip_install : Get Modern PIP] *********************************************************************************
Wednesday 27 December 2017 10:38:37 -0700 (0:00:00.089) 0:00:30.794 ****
FAILED - RETRYING: Get Modern PIP (5 retries left).
FAILED - RETRYING: Get Modern PIP (4 retries left).
FAILED - RETRYING: Get Modern PIP (3 retries left).
FAILED - RETRYING: Get Modern PIP (2 retries left).
FAILED - RETRYING: Get Modern PIP (1 retries left).
fatal: [infra1_galera_container-e563a45d]: FAILED! => {"attempts": 5, "changed": false, "dest": "/opt/get-pip.py", "failed": true, "gid": 0, "group": "root", "mode": "0644", "msg": "Request failed: <urlopen error [Errno 111] Connection refused>", "owner": "root", "size": 1595408, "state": "file", "uid": 0, "url": "http://192.168.178.55:8181/os-releases/16.0.1/ubuntu-16.04-x86_64/get-pip.py"}
The problem seems to center on the inability of the galera container (and all other containers) to reach containers located on a different host. I have three infrastructure nodes (infra1-3), a deployment host, and a load balancer on the following network:
192.168.178.0/23
+--------+----------------+
| deploy | 192.168.178.50 |
+--------+----------------+
| Infra1 | 192.168.178.61 |
+--------+----------------+
| Infra2 | 192.168.178.62 |
+--------+----------------+
| Infra3 | 192.168.178.63 |
+--------+----------------+
| LB | 192.168.178.55 |
+--------+----------------+
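A /23 spans two adjacent /24 ranges, so addresses in 192.168.179.x (such as the repo address used in the telnet tests further down) sit on the same subnet as the hosts listed above. Here is a minimal sketch that checks this with plain shell arithmetic. The function names ip_to_int and in_subnet are made up for illustration and are not part of OSA.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if address $1 lies inside CIDR block $2, non-zero otherwise.
in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Addresses taken from this post: the LB, infra1, and the repo address
# used in the telnet tests.
for addr in 192.168.178.55 192.168.178.61 192.168.179.110; do
    if in_subnet "$addr" 192.168.178.0/23; then
        echo "$addr is inside 192.168.178.0/23"
    else
        echo "$addr is outside 192.168.178.0/23"
    fi
done
```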
All infra nodes can ping all other nodes, including the LB. Every node can ping every container, and every node can reach the repo server on port 8181 when the repo container runs on that same node, but cannot connect to containers on a different node. All VLANs appear to be operating, as the deployment host has br-mgmt connectivity to all other nodes.
- Infra1 can ping every IP, LB, for all nodes and all containers.
- Infra1 can telnet infra1_repo_container-9cd7a69e 8181
- Infra1 cannot hit infra2_repo, infra3_repo...

- Infra2 can ping every IP, LB, for all nodes and all containers.
- Infra2 can telnet infra2_repo_container-2ca31bc1 8181
- Infra2 cannot hit infra1_repo, infra3_repo...

- Infra3 can ping every IP, LB, for all nodes and all containers.
- Infra3 can telnet infra3_repo_container-69326509 8181
- Infra3 cannot hit infra1_repo, infra2_repo...
So it seems that all containers have host-only connectivity.
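The per-host checks above could be repeated in one pass with a small script like the following. This is only a sketch: it assumes netcat (nc) is installed on the node it runs on, and the function name check_port is made up for illustration.

```shell
#!/bin/sh
# Probe one TCP port and print a one-line verdict.
# nc -z connects without sending data; -w 3 gives up after three seconds.
# "unreachable" is also printed if nc itself is missing.
check_port() {
    if nc -z -w 3 "$1" "$2" >/dev/null 2>&1; then
        echo "$1:$2 open"
    else
        echo "$1:$2 unreachable"
    fi
}

# Targets are passed as arguments so the same script can be run
# unchanged on infra1, infra2, infra3 and the deployment host.
for host in "$@"; do
    check_port "$host" 8181
done
```

Saved under a hypothetical name such as check_repo.sh, it would be run on each node with the three repo container names and the LB address (192.168.178.55) as arguments, which gives a full who-can-reach-whom matrix for port 8181.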
Here's a copy of infra2's network interfaces: https://pastebin.com/cTFGbDfq. Infra1 and infra3 have identical /etc/network/interfaces definitions, except for the .61/.62/.63 addresses.
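One way to confirm that the three files really differ only in the host address is to mask that address before diffing. This is a rough sketch: the function name normalize_interfaces and the /tmp file names are hypothetical, and it only masks the 192.168.178.61/.62/.63 addresses mentioned here, so any other per-host address would still show up as a difference.

```shell
#!/bin/sh
# Replace the per-host address (.61/.62/.63) with a placeholder so that
# otherwise identical interfaces files compare equal.
normalize_interfaces() {
    sed -E 's/192\.168\.178\.6[123]/192.168.178.HOST/g'
}

# Hypothetical usage, run once on each infra node and then compared:
#   normalize_interfaces < /etc/network/interfaces > /tmp/interfaces.norm
#   diff /tmp/infra1.norm /tmp/infra2.norm
```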
My infra1 repo server is 192.168.179.110:
root@infra1-repo-container-9cd7a69e:~# telnet 192.168.179.110 8181
Trying 192.168.179.110...
Connected to 192.168.179.110.
root@cg18-1:~# telnet 192.168.179.110 8181 #this is infra1 host node
Trying 192.168.179.110...
Connected to 192.168.179.110.
[root@nfs1:~]# telnet 192.168.179.110 8181 #this is the deployment node
Trying 192.168.179.110...
Connected to 192.168.179.110.
root@infra2-repo-container-2ca31bc1:~# telnet 192.168.179.110 8181
Trying 192.168.179.110...
Connected to 192.168.179.110.
root@cg18-2:~# telnet 192.168.179.110 8181 #this is infra2 host node
Trying 192.168.179.110...
Connected to 192.168.179.110.
root@infra2-galera-container-b2694938:~# telnet 192.168.179.110 8181
Trying ...