Neutron Router HA: Failover Issues

asked 2016-12-05 09:40:39 -0600

mathias

Hi, I have a really strange problem that I cannot seem to get to the bottom of. I run the L3 agent on (currently) two network nodes and enable HA by default for every router. Everything seems to work, but from time to time weird stuff happens: I have an instance with a floating IP assigned and a web server running on it. When I make an HTTP request from the outside, I can see traffic going into the virtual router on network node 0 (net00) and leaving the router towards the instance. The next thing I see is the response coming from the instance, BUT this response traffic hits network node 1 (net01)!

This is the "inner" interface (towards the instance) on net00:

qr-4858d71b-cf Link encap:Ethernet  HWaddr fa:16:3e:90:1e:a3  
          inet addr:172.16.100.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe90:1ea3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:8950  Metric:1
          RX packets:10818 errors:0 dropped:15 overruns:0 frame:0
          TX packets:13010 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2295246 (2.2 MB)  TX bytes:2868936 (2.8 MB)

This is on net01:

qr-4858d71b-cf Link encap:Ethernet  HWaddr fa:16:3e:90:1e:a3  
          UP BROADCAST RUNNING MULTICAST  MTU:8950  Metric:1
          RX packets:930414 errors:0 dropped:44 overruns:0 frame:0
          TX packets:433183 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1212518844 (1.2 GB)  TX bytes:200079736 (200.0 MB)

You can see that the gateway IP sits on net00 but, as said, the response traffic from the instance arrives on net01. Of course, the response never reaches the client that requested the website.
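For anyone who wants to repeat this comparison across nodes, here is a minimal Python sketch of the check. It assumes the legacy ifconfig output format shown above (with the `inet addr:` prefix); the helper names `interface_ipv4` and `active_nodes` are made up for illustration, and the sample text is trimmed from the two outputs in this post.

```python
import re

def interface_ipv4(ifconfig_text):
    """Return the IPv4 address found in legacy ifconfig output, or None."""
    # "inet6 addr:" lines do not match, because the pattern requires "inet addr:".
    match = re.search(r"inet addr:(\d+\.\d+\.\d+\.\d+)", ifconfig_text)
    return match.group(1) if match else None

def active_nodes(outputs):
    """Given {node_name: ifconfig_text}, list the nodes whose interface has an IPv4 address."""
    return sorted(node for node, text in outputs.items() if interface_ipv4(text))

# Trimmed copies of the qr- interface output from the two network nodes.
NET00 = """qr-4858d71b-cf Link encap:Ethernet  HWaddr fa:16:3e:90:1e:a3
          inet addr:172.16.100.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe90:1ea3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:8950  Metric:1"""
NET01 = """qr-4858d71b-cf Link encap:Ethernet  HWaddr fa:16:3e:90:1e:a3
          UP BROADCAST RUNNING MULTICAST  MTU:8950  Metric:1"""

holders = active_nodes({"net00": NET00, "net01": NET01})
print(holders)
```

With the outputs above, only net00 is reported as holding the gateway address. If more than one node showed up here, that would point at both router instances believing they are active, which is a different problem from the one described in this post.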

I also noticed that the MAC addresses of the two vrouters are the same! My first instinct was to reboot the instance to clear its ARP cache, but once I saw this it made sense that clearing the ARP cache didn't help.

Are the MACs supposed to be equal? If yes, what's the solution to this?


Comments

I just found out that the MACs are supposed to be equal. So what I suspect is that Open vSwitch is doing it wrong... No other clue, though...

mathias ( 2016-12-05 11:00:43 -0600 )