Pike - SR-IOV instance - not enough hosts available [closed]
Hi
I have installed OpenStack Pike on CentOS 7 (Linux networkingnode 3.10.0-862.6.3.el7.x86_64). I can create instances, networks, routers, volumes and so on without any issue.
The one exception: if I try to create an instance with an SR-IOV port, I get the error "There are not enough hosts available".
I dedicated my Mellanox Technologies MT27520 Family [ConnectX-3 Pro] NIC to SR-IOV.
This is how I configured the controller and compute node:
1- Enable SR-IOV in BIOS
2- Add the options intel_iommu=on iommu=pt to the kernel command line:
/etc/sysconfig/grub:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.lvm.lv=centos_computenode/root rd.lvm.lv=centos_computenode/swap rhgb quiet intel_iommu=on iommu=pt"
GRUB_DISABLE_RECOVERY="true"
and then
[root@computenode ~]# dracut --regenerate-all --force
and reboot the server
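A quick check after the reboot, as a minimal sketch: it confirms that both options actually reached the running kernel. On CentOS 7 an edit to /etc/sysconfig/grub normally takes effect only once grub.cfg is regenerated (for example with grub2-mkconfig -o /boot/grub2/grub.cfg on a BIOS-booted system); dracut only rebuilds the initramfs. The helper name has_kernel_param is hypothetical.

```shell
# has_kernel_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# FILE defaults to /proc/cmdline; the optional second argument is an
# assumption added here so the check can be tried against a sample file.
has_kernel_param() {
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Report on the two options from step 2.
for p in intel_iommu=on iommu=pt; do
    if has_kernel_param "$p"; then
        echo "$p: present"
    else
        echo "$p: missing"
    fi
done
```

If either option is reported as missing, the IOMMU settings from step 2 are not active on the running kernel.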
3- NIC driver installation:
[root@computenode ~]# lspci | grep Mellanox
04:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
[root@computenode ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.4 (Maipo)
Download MLNX_OFED_LINUX-4.4-1.0.0.0-rhel7.5-x86_64.tar, then extract it and run the installer:
tar -xvf MLNX_OFED_LINUX-4.4-1.0.0.0-rhel7.5-x86_64.tar
./mlnxofedinstall
[root@computenode]# modprobe -rv ib_isert rpcrdma ib_srpt
rmmod ib_isert
rmmod iscsi_target_mod
rmmod rpcrdma
rmmod ib_srpt
[root@computenode]# /etc/init.d/openibd restart
Unloading HCA driver: [ OK ]
Loading HCA driver and Access Layer: [ OK ]
[root@computenode MLNX_OFED_LINUX-4.4-1.0.0.0-rhel7.5-x86_64]#
reboot the server
[root@computenode ~]# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
[root@computenode ~]# mst status
MST modules:
------------
MST PCI module loaded
MST PCI configuration module loaded
MST devices:
------------
/dev/mst/mt4103_pciconf0 - PCI configuration cycles access.
domain:bus:dev.fn=0000:04:00.0 addr.reg=88 data.reg=92
Chip revision is: 00
/dev/mst/mt4103_pci_cr0 - PCI direct access.
domain:bus:dev.fn=0000:04:00.0 bar=0x96400000 size=0x100000
Chip revision is: 00
[root@computenode ~]# mlxconfig -d /dev/mst/mt4103_pciconf0 q
Device #1:
----------
Device type: ConnectX3Pro
Device: /dev/mst/mt4103_pciconf0
Configurations: Next Boot
SRIOV_EN True(1)
NUM_OF_VFS 8
LOG_BAR_SIZE 3
BOOT_OPTION_ROM_EN_P1 False(0)
BOOT_VLAN_EN_P1 False(0)
BOOT_RETRY_CNT_P1 0
LEGACY_BOOT_PROTOCOL_P1 None(0)
BOOT_VLAN_P1 1
BOOT_OPTION_ROM_EN_P2 False(0)
BOOT_VLAN_EN_P2 False(0)
BOOT_RETRY_CNT_P2 0
LEGACY_BOOT_PROTOCOL_P2 None(0)
BOOT_VLAN_P2 1
IP_VER_P1 IPv4(0)
IP_VER_P2 IPv4(0)
CQ_TIMESTAMP True(1)
[root@computenode ~]# ibstat
CA 'mlx4_0'
CA type: MT4103
Number of ports: 2
Firmware version: 2.42.5000
Hardware version: 0
Node GUID: 0xec0d9a0300e78930
System image GUID: 0xec0d9a0300e78930
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x04010000
Port GUID: 0xee0d9afffee78930
Link layer: Ethernet
Port 2:
State: Down
Physical state: Disabled
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x04010000
Port GUID: 0xee0d9afffee78931
Link layer: Ethernet
Create (or edit) /etc/modprobe.d/mlx4_core.conf:
options mlx4_core num_vfs=8 port_type_array=2,2 probe_vf=0
Restart the driver:
/etc/init.d/openibd restart
Check that the VFs ...
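One possible way to run such a check, sketched under assumptions rather than taken from the original post: ConnectX-3 VFs usually show up in lspci with "Virtual Function" in the device name, so counting those lines and comparing the result with num_vfs=8 is a reasonable first test. The helper name count_vfs is hypothetical.

```shell
# count_vfs: read lspci output on stdin and print how many lines
# describe a Mellanox Virtual Function (prints 0 if there are none).
count_vfs() {
    grep -i 'Mellanox' | grep -ci 'Virtual Function' || true
}

# Compare against the num_vfs=8 set in mlx4_core.conf.
if command -v lspci >/dev/null 2>&1; then
    found=$(lspci | count_vfs)
    echo "Mellanox VFs visible: $found (expected 8)"
fi
```

A count of 0 would suggest that the driver did not create the VFs, in which case Nova has no SR-IOV devices to offer.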
Your scenario: the scheduler picks a host, and the VM launch on that host fails. The scheduler then goes through another cycle to find a host. In this second cycle, the retry filter ensures that the failing host is not selected a second time. If no other host passes the filters, the request ends with "There are not enough hosts available".
You may get clues from the failing host's nova-compute log.
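As a starting point for that log check, a minimal sketch: it prints the most recent ERROR lines from a log file. The path /var/log/nova/nova-compute.log is the usual default for packaged installs on CentOS but is an assumption here, and the helper name recent_errors is hypothetical.

```shell
# recent_errors FILE [COUNT]
# Print the last COUNT (default 20) lines of FILE that contain " ERROR ".
# A missing or unreadable file simply produces no output.
recent_errors() {
    grep -F ' ERROR ' "$1" 2>/dev/null | tail -n "${2:-20}"
}

# Assumed default log location on the compute node.
recent_errors /var/log/nova/nova-compute.log 20
```

Run it on the compute node right after a failed launch, so the relevant errors are among the last lines.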