How to run a Windows NLB cluster on OpenStack?

asked 2020-03-05 03:24:40 -0500

corey

I have two Nova instances running as Windows RD Gateway servers, and I would like to build a high-availability architecture using Windows Network Load Balancing (NLB). I set up NLB according to this article: https://us.informatiweb-pro.net/system-admin/win-server/82--windows-server-2012-2012-r2-rds-implement-high-availability-for-your-rds-gateways.html. Here is some information about my NLB cluster:

Cluster Name: rdgw (virtual-ip: 192.168.0.100)
Host-1: rdgw-1 (192.168.0.91) (NLB installed, RD Gateway Server Farm joined, static IP, Converged, priority=1)
Host-2: rdgw-2 (192.168.0.92) (NLB installed, RD Gateway Server Farm joined, static IP, Converged, priority=2)
Cluster Operation Mode: Multicast
Filtering Mode: Single Host (My main purpose is HA)
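For reference, NLB is documented to give the cluster its own MAC address derived from the virtual IP: 02-BF plus the four IP octets in unicast mode, 03-BF plus the four IP octets in multicast mode, and 01-00-5E-7F plus the last two IP octets in IGMP multicast mode. A minimal sketch of that derivation follows; the function name nlb_cluster_mac is only illustrative, and the result is worth comparing against what NLB Manager actually shows for the cluster.

```python
import ipaddress

# Prefixes NLB is documented to use when deriving the cluster MAC.
_PREFIXES = {
    "unicast": (0x02, 0xBF),
    "multicast": (0x03, 0xBF),
}


def nlb_cluster_mac(virtual_ip: str, mode: str = "multicast") -> str:
    """Return the cluster MAC derived from the virtual IP (illustrative helper)."""
    octets = ipaddress.IPv4Address(virtual_ip).packed  # 4 raw bytes of the IP
    if mode == "igmp":
        # IGMP multicast: 01-00-5E-7F followed by the last two IP octets
        mac = bytes((0x01, 0x00, 0x5E, 0x7F)) + octets[2:]
    else:
        mac = bytes(_PREFIXES[mode]) + octets
    return ":".join(f"{b:02x}" for b in mac)


print(nlb_cluster_mac("192.168.0.100"))  # 03:bf:c0:a8:00:64
```

With the virtual IP above, the multicast-mode cluster MAC would therefore be 03:bf:c0:a8:00:64, which differs from the fa:16:3e:... style addresses that Neutron assigns to ports.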

Here are the settings on the OpenStack side:
I created three network ports, all under the same tenant and with the same security groups, and set allowed_address_pairs for the two RDGW nodes.

allowed_address_pairs | {"ip_address": "192.168.0.100", "mac_address": "fa:16:3e:a8:9c:0e"}
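To make the setting concrete, here is a minimal sketch of the request body that the Neutron Networking API v2.0 accepts for a port update (PUT /v2.0/ports/{port_id}) when setting allowed_address_pairs. The helper name allowed_address_pairs_body is hypothetical; the IP and MAC values are the ones from my port. As far as I understand the API, mac_address is optional and defaults to the port's own MAC when omitted.

```python
import json


def allowed_address_pairs_body(virtual_ip, mac_address=None):
    """Build the JSON body for a Neutron port update (hypothetical helper)."""
    pair = {"ip_address": virtual_ip}
    if mac_address is not None:
        # Only include the MAC when one is given explicitly.
        pair["mac_address"] = mac_address
    return {"port": {"allowed_address_pairs": [pair]}}


body = allowed_address_pairs_body("192.168.0.100", "fa:16:3e:a8:9c:0e")
print(json.dumps(body, indent=2))
```

The same body would be sent once for each node's port, since allowed_address_pairs is a per-port attribute.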

So traffic using the virtual IP 192.168.0.100 should be allowed on the ports of both nodes (rdgw-1 and rdgw-2).

But it doesn't work as expected: connections succeed only when one of the two nodes is down. When I shut down rdgw-1, end users can establish a connection via the virtual IP (NLB redirects them to rdgw-2). When both nodes are online, end users cannot establish a connection via the virtual IP.

Troubleshooting:
1. rdgw-1 and rdgw-2 can ping each other
2. From outside, ping virtual-ip => DUP (both nodes respond to the ICMP message)
3. From outside, telnet virtual-ip 443 => Connection closed by foreign host
(I can telnet rdgw-1-ip 443 & telnet rdgw-2-ip 443 directly)
4. I installed Windows Network Monitor to debug this issue. I can see the heartbeat packets (Protocol Name: NLBHB), so communication between the two nodes appears to be working. However, the node receives duplicate packets (TCP:[DUP Ack......]) when I run "telnet virtual-ip 443". What does this mean? It seems that NLB did not direct the packets to the node with the highest priority.
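The telnet checks in step 3 can be repeated with a small script. This is a minimal sketch using only the Python standard library; the function name tcp_port_open is just illustrative, and the addresses in the usage comment are the ones from my setup.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage from an outside client (virtual IP first, then each node):
# for host in ("192.168.0.100", "192.168.0.91", "192.168.0.92"):
#     print(host, tcp_port_open(host, 443))
```

Note that this only tests whether the handshake completes. A connection that is accepted and then immediately closed, like the "Connection closed by foreign host" result above, would still be reported as open.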

I am posting the question here because I cannot tell whether the problem is caused by NLB or by OpenStack Neutron. My main questions are:

1. Does OpenStack Neutron support multicast with Open vSwitch?
2. How should the Neutron settings be modified to support Windows NLB multicast?
3. Does anyone have experience setting up a Windows NLB cluster on OpenStack?

OpenStack Version: Pike
Windows Version: Windows Server 2012 R2
