Unable to PXE boot for overcloud deployment

asked 2020-07-14 06:05:37 -0500

arnsong

updated 2020-07-15 20:06:50 -0500

Hi all,

I'm having some trouble with overcloud deployment using the TripleO Train release and wondered if this looks familiar to anyone.

The nodes to be provisioned as overcloud nodes all pass introspection cleanly and I’m able to move them into the available state. I had set profile:compute and profile:control in the instackenv.json file. The provisioning of the compute and controller nodes times out.

It seems pretty obvious to me that the DHCP requests from the nodes are being ignored by the ironic-inspector-dnsmasq service, but I can't for the life of me figure out what is misconfigured to cause this behavior.

Just for reference, when PXE booting is working during introspection, the tcpdump output looks like this:

17:28:39.035851 c8:1f:66:c2:f6:63 > Broadcast, ethertype IPv4 (0x0800), length 442: 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from c8:1f:66:c2:f6:63, length 400
17:28:39.036251 c8:1f:66:c3:0e:87 > Broadcast, ethertype IPv4 (0x0800), length 373: 10.232.16.7.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 331

Here you see a BOOTP/DHCP request from the PXE-enabled NIC (in this case, c8:1f:66:c2:f6:63), followed shortly by a reply from the PXE boot server (10.232.16.7).

In the ironic-inspector-dnsmasq container, during introspection you first see this:

Jul 13 17:25:25 dnsmasq[7]: inotify, new or changed file /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63
Jul 13 17:25:25 dnsmasq-dhcp[7]: read /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63

This is when the ignore flag is removed from /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63. The change is detected and the file is then read by dnsmasq-dhcp. This allows the PXE boot request to be received by the PXE boot server, and you then see the following:

Jul 13 17:28:28 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) c8:1f:66:c2:f6:63
Jul 13 17:28:28 dnsmasq-dhcp[7]: DHCPOFFER(br-ctlplane) 10.232.18.2 c8:1f:66:c2:f6:63
Jul 13 17:28:32 dnsmasq-dhcp[7]: DHCPREQUEST(br-ctlplane) 10.232.18.2 c8:1f:66:c2:f6:63
Jul 13 17:28:32 dnsmasq-dhcp[7]: DHCPACK(br-ctlplane) 10.232.18.2 c8:1f:66:c2:f6:63

Once introspection completes, the /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63 file changes again and is re-read by dnsmasq-dhcp, now with the ignore flag included.
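To check the state of the ignore flag at any given moment, a small script along these lines could be run inside the container. This is only a sketch: it assumes each file in dhcp-hostsdir is named after the MAC and holds a single entry of the form `MAC` (allowed) or `MAC,ignore` (ignored), which I have not confirmed against the inspector source. The function names are my own.

```python
from pathlib import Path

# Directory taken from the dnsmasq log lines above.
HOSTSDIR = Path("/var/lib/ironic-inspector/dhcp-hostsdir")


def is_ignored(entry: str) -> bool:
    """Return True if a dhcp-hostsdir entry carries the 'ignore' flag.

    Assumption: an entry looks like 'MAC' or 'MAC,ignore'.
    """
    fields = [field.strip() for field in entry.strip().split(",")]
    return "ignore" in fields[1:]


def ignored_macs(hostsdir: Path = HOSTSDIR) -> dict:
    """Map each file name (assumed to be the MAC) to its ignore state."""
    result = {}
    if not hostsdir.is_dir():
        return result
    for path in sorted(hostsdir.iterdir()):
        if path.is_file():
            result[path.name] = is_ignored(path.read_text())
    return result


if __name__ == "__main__":
    for mac, ignored in ignored_macs().items():
        print(f"{mac}: {'ignored' if ignored else 'allowed'}")
```

Running it once during introspection and once during the heat stack build would show whether the flag is ever cleared for the node's MAC in the second case.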

When things aren’t working during the heat stack build, in the tcpdump output you see the BOOTP/DHCP request as before, but no reply from the PXE boot server.

You also don’t see any of the files in /var/lib/ironic-inspector/dhcp-hostsdir change, so you see the following:

Jul 13 17:51:28 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) c8:1f:66:c2:f6:63 ignored

This is repeated until the provisioning times ... (more)
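For anyone who wants to confirm the same pattern on their own undercloud, here is a rough sketch that counts ignored DHCPDISCOVER messages per MAC. It assumes the dnsmasq log lines have the same shape as the ones quoted above and are piped in on standard input; the regular expression and function name are mine, not part of any TripleO tooling.

```python
import re
import sys
from collections import Counter

# Matches lines such as:
#   dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) c8:1f:66:c2:f6:63 ignored
IGNORED_DISCOVER = re.compile(
    r"DHCPDISCOVER\((?P<iface>[^)]+)\)\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+ignored",
    re.IGNORECASE,
)


def count_ignored(lines):
    """Count ignored DHCPDISCOVER log entries, keyed by lower-case MAC."""
    counts = Counter()
    for line in lines:
        match = IGNORED_DISCOVER.search(line)
        if match:
            counts[match.group("mac").lower()] += 1
    return counts


if __name__ == "__main__":
    for mac, count in count_ignored(sys.stdin).items():
        print(f"{mac}: {count} ignored DHCPDISCOVER")
```

A MAC that shows up here while its node is supposed to be deploying is one whose requests dnsmasq is dropping.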
