PCI Passthrough Issues
Hey All,
I'm having trouble passing PCI cards (K40m and Xeon Phi cards) through to my guest VMs. I started with this guide (http://docs.openstack.org/admin-guide/compute-pci-passthrough.html), but have since searched widely (including many questions here, none of which helped) for a solution.
The error is as follows:
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] libvirtError: internal error: process exited while connecting to monitor: 2016-12-23T08:33:00.963769Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to open /dev/vfio/vfio: Operation not permitted
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] 2016-12-23T08:33:00.963819Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to setup container for group 53
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] 2016-12-23T08:33:00.963831Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to get group 53
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] 2016-12-23T08:33:00.963851Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: Device initialization failed
System Configuration
The board these devices are plugged into is a Supermicro X10DRG-Q with two Intel E5-2620 v3 CPUs. The host operating systems I have tried are Ubuntu 16.04 and CentOS 7. I have enabled every option that could pertain to IOMMU and virtualization.
I'm using MAAS to provision the bare-metal machines and Juju to configure them. There are no other issues to report.
Hypervisor is KVM.
IOMMU parameters are being passed to the kernel:
root@steady-vervet:/var/log/nova# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.4.0-57-generic root=UUID=5418ff63-06b7-4d50-a1bb-1e9af7e85d5c ro iommu=pt intel_iommu=on
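For anyone who wants to check this on several hosts, a minimal Python sketch that pulls the IOMMU-related parameters out of a kernel command line might look like this. The function name iommu_flags is made up for illustration, and the string is the /proc/cmdline output shown above.

```python
def iommu_flags(cmdline):
    """Return the IOMMU-related parameters found on a kernel command line."""
    flags = {}
    for token in cmdline.split():
        # Split each token at the first "=", e.g. "intel_iommu=on".
        key, _, value = token.partition("=")
        if key in ("iommu", "intel_iommu", "amd_iommu"):
            flags[key] = value
    return flags


# Command line copied from the /proc/cmdline output in this post.
cmdline = (
    "BOOT_IMAGE=/boot/vmlinuz-4.4.0-57-generic "
    "root=UUID=5418ff63-06b7-4d50-a1bb-1e9af7e85d5c ro iommu=pt intel_iommu=on"
)
flags = iommu_flags(cmdline)
print(flags)
```

On a live host the same function could be fed the contents of /proc/cmdline instead of the hard-coded string.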
IOMMU groups are posted later, as are the addresses of the devices I want to forward.
What I have tried
Both the controller and the compute nodes are aware of the cards, and on the compute node they are also whitelisted. The relevant entries from the respective nova.conf files:
Controller:
pci_alias = { "vendor_id":"10de", "product_id":"1023", "device_type":"type-PCI", "name":"K40m" }
Compute:
pci_passthrough_whitelist = { "vendor_id":"10de", "product_id":"1023" }
pci_alias = { "vendor_id":"10de", "product_id":"1023", "device_type":"type-PCI", "name":"K40m" }
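One quick sanity check on entries like these is that the whitelist and the alias name the same vendor and product IDs. The following is only a sketch of that comparison in Python: ids_match is a hypothetical helper, and the two JSON strings are copied from the compute node lines above.

```python
import json

# Values copied from the compute node's nova.conf lines in this post.
whitelist_raw = '{ "vendor_id":"10de", "product_id":"1023" }'
alias_raw = (
    '{ "vendor_id":"10de", "product_id":"1023", '
    '"device_type":"type-PCI", "name":"K40m" }'
)


def ids_match(whitelist_json, alias_json):
    """Return True when both entries carry the same vendor/product pair."""
    whitelist = json.loads(whitelist_json)
    alias = json.loads(alias_json)
    for key in ("vendor_id", "product_id"):
        if key not in whitelist or key not in alias:
            return False
        # Hex IDs are compared case-insensitively.
        if whitelist[key].lower() != alias[key].lower():
            return False
    return True


print(ids_match(whitelist_raw, alias_raw))
```

A mismatch here would not explain the VFIO permission error, but it rules out one easy configuration mistake.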
In reality, I've tried more than this, but it has made no difference to the outcome.
Any advice?
System Logs
root@steady-vervet:/var/log/nova# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:ff:08.0
/sys/kernel/iommu_groups/0/devices/0000:ff:08.2
/sys/kernel/iommu_groups/0/devices/0000:ff:08.3
/sys/kernel/iommu_groups/1/devices/0000:ff:0b.0
/sys/kernel/iommu_groups/1/devices/0000:ff:0b.1
/sys/kernel/iommu_groups/1/devices/0000:ff:0b.2
/sys/kernel/iommu_groups/2/devices/0000:ff:0c.0
/sys/kernel/iommu_groups/2/devices/0000:ff:0c.1
/sys ...
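A listing like this is easier to read once the devices are grouped by IOMMU group number. The sketch below does that in Python for the lines visible above. group_devices is a hypothetical helper, and the sample only contains the entries shown in this post, not the truncated remainder.

```python
import re
from collections import defaultdict

# Matches paths such as /sys/kernel/iommu_groups/0/devices/0000:ff:08.0
GROUP_RE = re.compile(
    r"/sys/kernel/iommu_groups/(\d+)/devices/([0-9a-fA-F:.]+)$"
)


def group_devices(paths):
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups = defaultdict(list)
    for path in paths:
        match = GROUP_RE.search(path.strip())
        if match:  # lines that are not group entries are skipped
            groups[int(match.group(1))].append(match.group(2))
    return dict(groups)


# Only the entries visible in the listing above.
sample = [
    "/sys/kernel/iommu_groups/0/devices/0000:ff:08.0",
    "/sys/kernel/iommu_groups/0/devices/0000:ff:08.2",
    "/sys/kernel/iommu_groups/0/devices/0000:ff:08.3",
    "/sys/kernel/iommu_groups/1/devices/0000:ff:0b.0",
    "/sys/kernel/iommu_groups/1/devices/0000:ff:0b.1",
    "/sys/kernel/iommu_groups/1/devices/0000:ff:0b.2",
    "/sys/kernel/iommu_groups/2/devices/0000:ff:0c.0",
    "/sys/kernel/iommu_groups/2/devices/0000:ff:0c.1",
]
groups = group_devices(sample)
for number in sorted(groups):
    print(number, groups[number])
```

Run against the full find output, the same function would show which devices share group 53 with the card at 83:00.0 named in the error.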
It sounds like "vfio: failed to open /dev/vfio/vfio: Operation not permitted" is the thing to address. Have you modified or checked qemu.conf? To me it sounds like the user/group and cgroup_device_acl settings in qemu.conf. Changing the permissions on /dev/vfio/vfio to something like 0666 might also be a possible approach.
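As a rough way to inspect the qemu.conf settings mentioned here, a small Python sketch can extract user, group and cgroup_device_acl and report whether /dev/vfio/vfio is listed. The parser is deliberately simplistic, parse_qemu_conf is a hypothetical helper, and SAMPLE_CONF is an invented example, not the poster's actual file.

```python
import re


def parse_qemu_conf(text):
    """Extract user, group and cgroup_device_acl from qemu.conf text."""
    # Ignore commented-out lines so disabled defaults are not picked up.
    body = "\n".join(
        line for line in text.splitlines()
        if not line.lstrip().startswith("#")
    )
    settings = {}
    for key in ("user", "group"):
        match = re.search(r'^\s*%s\s*=\s*"([^"]*)"' % key, body, re.MULTILINE)
        if match:
            settings[key] = match.group(1)
    # The ACL is a bracketed list of quoted paths, possibly over many lines.
    match = re.search(
        r"^\s*cgroup_device_acl\s*=\s*\[(.*?)\]", body,
        re.MULTILINE | re.DOTALL,
    )
    if match:
        settings["cgroup_device_acl"] = re.findall(r'"([^"]*)"', match.group(1))
    return settings


# Hypothetical file contents, for illustration only.
SAMPLE_CONF = '''
#user = "libvirt-qemu"
user = "root"
group = "root"
cgroup_device_acl = [
    "/dev/null", "/dev/full", "/dev/zero",
    "/dev/random", "/dev/urandom",
    "/dev/ptmx", "/dev/kvm",
    "/dev/vfio/vfio"
]
'''
conf = parse_qemu_conf(SAMPLE_CONF)
print(conf.get("user"), conf.get("group"))
print("/dev/vfio/vfio" in conf.get("cgroup_device_acl", []))
```

If the ACL is set explicitly and does not include /dev/vfio/vfio, that would be one candidate explanation worth ruling out for the "Operation not permitted" message.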
I agree that this is the thing to address. I have not made any additional modifications to qemu.conf, and I have edited the original post to show its contents. I have tried all kinds of permission combinations on /dev/vfio/vfio with no success (also included in the post).