Return ACK and FIN packets from a certain host are not forwarded to VM TAP Interface

asked 2020-02-05 11:35:51 -0500

I've got a weird one: for connections to one specific external host, return ACK and FIN packets are not forwarded to the VM's TAP interface. PSH and SYN/ACK packets in the same connection are forwarded correctly, however. All packets reach the geneve interface on the same compute node.

Some clients (curl, for example) handle this without issue, but in my case PHP hangs for 60 seconds before resetting the connection, even though it does receive the data. I know I could work around this by changing the FIN timeout behavior or using a different client, but I would like to understand why these packets aren't forwarded and fix the underlying cause.

This behavior reproduces 100% of the time, with the exact same packets missing from the TAP interface. I have four clusters (in two separate data centers), and all VMs on all clusters exhibit the same behavior. Using http://google.com in the repro case behaves correctly, so it's something specific to this remote host.

I infer that the failure comes from the OpenFlow routing rules, but I'm unsure how to debug the problem further. I'm also stumped as to why this specific remote host triggers the behavior, and why only the ACK and FIN packets are missing while SYN/ACK and PSH/ACK are forwarded correctly. Any guidance would be appreciated.

Technical details follow.

Reproduction Case (run in the VM)

<?php
$context = stream_context_create(
    array(
        'http' => array(
            'follow_location' => false
        )
    )
);
$result = file_get_contents('http://api.bincodes.com/', false, $context);
print($result);
?>

Then simply run php repro.php. Note that the curl equivalent shows the same missing-packet behavior, but curl handles it better by sending its own FIN after the data is received.

tcpdump results (taken on compute node)

These were captured with tcpdump -i <interface> host api.bincodes.com
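To see exactly which packets reach one interface but not the other, the two text captures can be diffed once their timestamps are stripped. The following is only a minimal sketch: the helper names strip_ts and missing_from are hypothetical, and it assumes both files contain plain one-line-per-packet tcpdump output with identical addresses and relative sequence numbers on both interfaces (which may not hold if anything rewrites the packets in between).

```shell
# Sketch: list packets present in one tcpdump text capture but absent
# from another. Helper names are illustrative, not part of any tool.

# Drop the leading timestamp so the same packet compares equal in both
# captures, then sort so the output is usable with comm.
strip_ts() {
    sed -E 's/^[0-9:.]+ //' "$1" | LC_ALL=C sort
}

# Print packets that appear in capture $1 but not in capture $2.
missing_from() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    strip_ts "$1" > "$tmp_a"
    strip_ts "$2" > "$tmp_b"
    # -23 suppresses lines unique to the second file and lines common to both.
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

For example, missing_from genevetcpdump.txt taptcpdump.txt would print the packets seen on the geneve interface that never showed up on the TAP interface (taptcpdump.txt is an assumed file name for the TAP capture).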

Geneve interface:

[root@staging-worker01 ~]# cat genevetcpdump.txt 
09:38:13.435199 IP 10.0.1.8.36858 > 104.18.44.130.http: Flags [S], seq 1762052365, win 28040, options [mss 1402,sackOK,TS val 1581242613 ecr 0,nop,wscale 7], length 0
09:38:13.437925 IP 104.18.44.130.http > 10.0.1.8.36858: Flags [S.], seq 2490990199, ack 1762052366, win 65535, options [mss 1400,nop,nop,sackOK,nop,wscale 10], length 0
09:38:13.438411 IP 10.0.1.8.36858 > 104.18.44.130.http: Flags [.], ack 1, win 220, length 0
09:38:13.438411 IP 10.0.1.8.36858 > 104.18.44.130.http: Flags [P.], seq 1:17, ack 1, win 220, length 16: HTTP: GET / HTTP/1.0
09:38:13.440702 IP 104.18.44.130.http > 10.0.1.8.36858: Flags [.], ack 17, win 64, length 0
09:38:13.645354 IP 10.0.1.8.36858 > 104.18.44.130.http: Flags [P.], seq 17:62, ack 1, win 220, length 45: HTTP
09:38:13.647509 IP 104.18.44.130.http > 10.0.1.8.36858: Flags [.], ack 62, win 64, length 0
09:38 ...