
PCI Passthrough Issues

asked 2016-12-23 03:32:59 -0600 by SeanWallace (updated 2016-12-23 12:48:13 -0600)

Hey All,

I'm having some issues trying to pass PCI cards (K40m's and Xeon Phi's) through to my guest VMs. I started with the official guide at http://docs.openstack.org/admin-guide/compute-pci-passthrough.html, but have since been all over the internet (including many questions here that have not helped) trying to find a solution.

The issue is as such:

2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] libvirtError: internal error: process exited while connecting to monitor: 2016-12-23T08:33:00.963769Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to open /dev/vfio/vfio: Operation not permitted
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] 2016-12-23T08:33:00.963819Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to setup container for group 53
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] 2016-12-23T08:33:00.963831Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to get group 53
2016-12-23 08:33:01.063 3013 ERROR nova.compute.manager [instance: b9cfb26d-b37a-470e-95be-a981b0dafacd] 2016-12-23T08:33:00.963851Z qemu-system-x86_64: -device vfio-pci,host=83:00.0,id=hostdev0,bus=pci.0,addr=0x5: Device initialization failed

System Configuration

The board these devices are plugged into is a Supermicro X10DRG-Q with two Intel E5-2620 v3s. The host operating systems I have tried are Ubuntu 16.04 and CentOS 7. I have enabled every BIOS option related to IOMMU and virtualization.

I'm using MAAS to provision the metal and Juju to configure them. There are no other issues to report.

Hypervisor is KVM.

IOMMU commands are being passed to the kernel:

root@steady-vervet:/var/log/nova# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-4.4.0-57-generic root=UUID=5418ff63-06b7-4d50-a1bb-1e9af7e85d5c ro iommu=pt intel_iommu=on
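
As a quick sanity check that the IOMMU is actually active (and not just requested on the command line), something along these lines works on an Intel host; the exact dmesg strings vary by kernel version:

dmesg | grep -i -e DMAR -e IOMMU    # expect lines such as "DMAR: IOMMU enabled"
ls /sys/kernel/iommu_groups/        # should list numbered groups when the IOMMU is active

On Ubuntu the kernel parameters are typically added via GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub followed by update-grub; on CentOS 7, grub2-mkconfig -o /boot/grub2/grub.cfg.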

IOMMU groups are posted later as are the addresses of the devices I want to forward.

What I have tried

Both the controller and the compute nodes are aware of the cards, and the compute node whitelists them (respective nova.conf entries):

Controller:

pci_alias = { "vendor_id":"10de", "product_id":"1023", "device_type":"type-PCI", "name":"K40m" }

Compute:

pci_passthrough_whitelist = { "vendor_id":"10de", "product_id":"1023" }
pci_alias = { "vendor_id":"10de", "product_id":"1023", "device_type":"type-PCI", "name":"K40m" }
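
For completeness, the alias only takes effect when a flavor requests it; a minimal sketch of the flavor side, assuming a flavor named m1.k40 (the flavor name is hypothetical, the "K40m" alias matches the nova.conf above):

openstack flavor create --ram 8192 --disk 80 --vcpus 4 m1.k40
openstack flavor set m1.k40 --property "pci_passthrough:alias"="K40m:1"

Instances booted from that flavor should then be scheduled onto a host with a free K40m and have it attached via vfio-pci.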

In reality, I've tried more than this, but it has made no impact on the outcome.

Any advice?

System Logs

root@steady-vervet:/var/log/nova# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:ff:08.0
/sys/kernel/iommu_groups/0/devices/0000:ff:08.2
/sys/kernel/iommu_groups/0/devices/0000:ff:08.3
/sys/kernel/iommu_groups/1/devices/0000:ff:0b.0
/sys/kernel/iommu_groups/1/devices/0000:ff:0b.1
/sys/kernel/iommu_groups/1/devices/0000:ff:0b.2
/sys/kernel/iommu_groups/2/devices/0000:ff:0c.0
/sys/kernel/iommu_groups/2/devices/0000:ff:0c.1
/sys ...
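To confirm which group a given card ended up in (for example the K40m at 83:00.0 from the error log), the group number can be read straight from sysfs; a quick check along these lines, where the group should match the 53 in the failure message:

readlink /sys/bus/pci/devices/0000:83:00.0/iommu_group
# -> .../kernel/iommu_groups/53
ls /sys/kernel/iommu_groups/53/devices/
# every device in the group must be bound to vfio-pci (or be a PCIe bridge) for passthrough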

Comments

Sounds like "vfio: failed to open /dev/vfio/vfio: Operation not permitted" is the thing to address. Have you modified/checked qemu.conf? To me it sounds like the user/group settings and cgroup_device_acl in qemu.conf. Changing the permissions on /dev/vfio/vfio to something like 0666 might also be a possible approach.

volenbovsky ( 2016-12-23 07:22:11 -0600 )

I would agree with you on the thing to address. I have not made any additional modifications to qemu.conf; I have edited the original post to show its contents. I have tried all kinds of permission combinations on /dev/vfio/vfio with no success (also included in the post).

SeanWallace ( 2016-12-23 12:43:01 -0600 )

1 answer


answered 2016-12-29 01:02:16 -0600 by SeanWallace (updated 2016-12-29 22:38:47 -0600)

I managed to answer my own question. As usual, the solution was simple but not obvious. The biggest clue was libvirt complaining about issues accessing /dev/vfio/vfio. It turns out this wasn't about filesystem permissions at all, but about libvirt's cgroup device whitelist.

I made the following changes to my /etc/libvirt/qemu.conf:

cgroup_device_acl = [
   "/dev/null", "/dev/full", "/dev/zero",
   "/dev/random", "/dev/urandom",
   "/dev/ptmx", "/dev/kvm", "/dev/kqemu",
   "/dev/rtc", "/dev/hpet", "/dev/net/tun",
   "/dev/vfio/vfio", "/dev/vfio/43", "/dev/vfio/53"
]

And I was able to pass my cards into my VMs. On some of my hosts it was necessary to add all of the corresponding vfio group devices to this list, and on others it was not. The one thing that was absolutely necessary to pass any PCI card into a VM was the /dev/vfio/vfio entry.
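
Note that qemu.conf is only read when libvirt starts, so the daemon has to be restarted for the new cgroup_device_acl to apply; a minimal sketch, assuming the default service names on the two distros I used:

systemctl restart libvirt-bin   # Ubuntu 16.04
systemctl restart libvirtd      # CentOS 7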

Hopefully this saves somebody a few days of head-banging!

Edit: I decided to tear the whole setup down and start over because of the mess I'd made of configuration files all over my hosts, and I've found that I don't need to add any of the vfio groups to that list, just /dev/vfio/vfio. Simpler is better!

