Load-balancer not passing traffic correctly when accessed via floating-IP

asked 2015-04-30 19:10:19 -0600

Nate_McKay

Hi all,

I have a multi-node Juno lab on CentOS 7.1 with one controller node, one network node, and two compute nodes.

I am using HAProxy for LBaaS. I can create a working VIP, pool, and members that pass traffic on the internal VIP address, but the floating IP I have associated with the VIP fails to respond or pass any traffic. The floating IPs associated with the individual instances pass traffic without any problem.

I see that the VIP floating IP responds to ARP requests, but it does not answer pings or return a SYN-ACK on the configured ports, even though the security group allows that traffic. I am not seeing any log errors that point to this specifically. I do see some errors in ovs-vswitchd.log relating to non-existent devices, but I don't think they are related (I need to sort those out separately).

Any ideas where to start looking in order to track this down? Any help would be much appreciated!

Curl against the VIP from a peer instance:

[fedora@web-84ffe0f7-9169-4c29-8be5-d2f4ba443019 ~]$ curl -v http://192.168.1.200/server.txt
* Hostname was NOT found in DNS cache
*   Trying 192.168.1.200...
* Connected to 192.168.1.200 (192.168.1.200) port 80 (#0)
> GET /server.txt HTTP/1.1
> User-Agent: curl/7.37.0
> Host: 192.168.1.200
> Accept: */*
>
< HTTP/1.1 200 OK
* Server nginx/1.6.3 is not blacklisted
< Server: nginx/1.6.3
< Date: Thu, 30 Apr 2015 23:53:14 GMT
< Content-Type: text/plain
< Content-Length: 41
< Last-Modified: Thu, 30 Apr 2015 18:28:58 GMT
< ETag: "5542746a-29"
< Accept-Ranges: bytes
<
web-6744ff64-8b4c-4c60-9823-ee891d37adc3
* Connection #0 to host 192.168.1.200 left intact
[fedora@web-84ffe0f7-9169-4c29-8be5-d2f4ba443019 ~]$

Curl against the pool member instance's floating-IP:

[nmckay@bistromath ~]$ curl -v http://10.12.21.204/server.txt
* Hostname was NOT found in DNS cache
*   Trying 10.12.21.204...
* Connected to 10.12.21.204 (10.12.21.204) port 80 (#0)
> GET /server.txt HTTP/1.1
> User-Agent: curl/7.36.0
> Host: 10.12.21.204
> Accept: */*
>
< HTTP/1.1 200 OK
* Server nginx/1.6.3 is not blacklisted
< Server: nginx/1.6.3
< Date: Thu, 30 Apr 2015 23:52:01 GMT
< Content-Type: text/plain
< Content-Length: 41
< Last-Modified: Thu, 30 Apr 2015 18:28:58 GMT
< Connection: keep-alive
< ETag: "5542746a-29"
< Accept-Ranges: bytes
<
web-6744ff64-8b4c-4c60-9823-ee891d37adc3
* Connection #0 to host 10.12.21.204 left intact
[nmckay@bistromath ~]$

Curl against the VIP's floating IP:

[nmckay@bistromath ~]$ curl -v http://10.12.21.205/server.txt
* Hostname was NOT found in DNS cache
*   Trying 10.12.21.205...
* connect to 10.12.21.205 port 80 failed: Operation timed out
* Failed to connect to 10.12.21.205 port 80: Operation timed out
* Closing connection 0
curl: (7) Failed to connect to 10.12.21.205 port 80: Operation timed out
[nmckay@bistromath ~]$

Tcpdump output showing ARP response:

[nmckay@bistromath ~]$ sudo tcpdump -i em2 host 10.12.21.205 ...

Comments

Can you ping the VIP from some other node in the same subnet?

uts9 ( 2015-05-01 23:40:20 -0600 )

I can ping the VIP fixed address from an instance on the same subnet.

I cannot ping the floating address associated with the VIP from a node on the same subnet.

The network node is responding to ARP requests for the floating address however.

Nate_McKay ( 2015-05-02 00:42:39 -0600 )

LBaas Service quick setup check test

Free OpenStack Consultant ( 2015-05-12 10:38:56 -0600 )

2 answers


answered 2015-07-08 00:20:38 -0600

Nate_McKay

Sorted this a while back and neglected to post the cause/solution.

What I found was that the qlbaas namespace under which the haproxy load-balancer was running did not have a default route set. Consequently the haproxy process did not know how to route packets back to the client, and thus the client never received any kind of response from the VIP.

Manually setting the default route inside the qlbaas namespace resolved the problem, but the route did not persist across reboots. Eventually the problem sorted itself out and stopped happening. I was able to reproduce it once on a stack cloned from the one where I first hit the problem, but not reliably or predictably.
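
A minimal sketch of how the default route could be checked and set by hand on the network node. It assumes the namespace is named qlbaas-<pool-id> (confirm with `ip netns list`); the helper names are hypothetical, and the gateway address must be replaced with the gateway of the VIP subnet, which is not given in this thread.

```shell
#!/bin/sh
# Sketch only: helpers for inspecting and repairing the default route
# inside an LBaaS namespace. Assumption: the namespace is named
# qlbaas-<pool-id>; verify the real name with `ip netns list`.

# Build the namespace name from a pool ID (hypothetical helper).
lbaas_ns_name() {
    printf 'qlbaas-%s\n' "$1"
}

# Print the command that adds a default route inside the namespace.
# $1 = pool ID, $2 = gateway IP of the VIP subnet.
# Printing instead of executing lets you review the command first.
default_route_cmd() {
    printf 'ip netns exec %s ip route add default via %s\n' \
        "$(lbaas_ns_name "$1")" "$2"
}

# Return 0 if the namespace already has a default route.
# Needs root and an existing namespace.
has_default_route() {
    ip netns exec "$(lbaas_ns_name "$1")" ip route show | grep -q '^default '
}
```

To use it, source the file as root, run `has_default_route <pool-id>`, and if it returns non-zero, run the command printed by `default_route_cmd <pool-id> <gateway-ip>`. A route added this way lives only in the running namespace, which matches the behaviour described above where it was lost on reboot.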

For what it's worth...


Comments

@Nate_McKay I seem to have run into the same problem you describe here, but am not sure where to set the default route. Could you point me in the correct direction for setting it? Thank you.

Sean_Gray ( 2016-07-26 11:39:06 -0600 )

answered 2016-05-12 11:17:02 -0600

Vinoth

Hi,

Adding an "ALLOW all traffic" rule to the existing default security group solved the issue for me. In my case, only the "default" security group is applied to the LBaaS daemon that gets created.

Normally, whatever security group we add to the VMs (the pool member VMs) should be applied to the VIP daemon as well, but in our case only the "default" security group applies.

So the fix is to add a rule to the default security group that allows the appropriate traffic.
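
A minimal sketch of adding such a rule with the neutron CLI, assuming the VIP listens on a TCP port (80 in the question) and that the group can be referenced by the name "default". The helper name is hypothetical; if several tenants each have a "default" group, pass the group ID instead.

```shell
#!/bin/sh
# Sketch only: print a neutron command that opens one TCP port for
# ingress in a security group.
# $1 = TCP port, $2 = security group name or ID (defaults to "default").
# Assumption: the neutron CLI is installed and credentials are sourced.
allow_tcp_ingress_cmd() {
    port="$1"
    group="${2:-default}"
    printf 'neutron security-group-rule-create --direction ingress --protocol tcp --port-range-min %s --port-range-max %s --remote-ip-prefix 0.0.0.0/0 %s\n' \
        "$port" "$port" "$group"
}
```

Running `allow_tcp_ingress_cmd 80` prints the command for review before it is executed. Opening only the VIP's port is a narrower choice than allowing all traffic; whether that is sufficient in a given deployment is untested here.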

Thanks,

Vinoth Kumar Selvaraj


Comments

Hi, Vinoth!

Your comment here about the default security group resolved several hours of frustration for me. Thanks!

However, that still leaves the question open: why is the load balancer only inheriting from the default SG and not the one created for the LB (in my case, a heat config)? Any ideas?

MatthewSecaur ( 2017-07-26 10:04:35 -0600 )


Last updated: May 12 '16