After @wendell did another live stream on the whole GPU pass-through topic, I decided to finally give it a shot on my Intel system. Since my hardware (4790K, Z97) supports VT-d and IOMMU is working, I thought there shouldn’t be an issue if I followed the Level1Techs tutorial and adjusted it to my hardware.
I managed to get IOMMU working after setting the kernel parameters in GRUB
GRUB_CMDLINE_LINUX="[...] iommu=1 intel_iommu=on rd.driver.pre=vfio-pci"
and I was able to identify the grouping of my PCI devices. My Nvidia GTX 770 video card is in a separate IOMMU group together with the PCIe x16 controller that it is seated in.
IOMMU group 1
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK104 [GeForce GTX 770] [10de:1184] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
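For anyone wanting to reproduce a listing like the one above, here is a minimal sketch that walks sysfs and prints each device address per IOMMU group. It assumes the usual /sys/kernel/iommu_groups layout; the function name and the optional root argument are my own additions (the argument only exists so the function can be pointed at a test tree).

```shell
#!/bin/sh
# Print "IOMMU group <n>: <pci address>" for every device found in sysfs.
# $1 (optional, hypothetical parameter): alternative root directory.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # If the glob matched nothing, the IOMMU is probably not enabled.
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"      # drop the trailing "/devices/<addr>"
        printf 'IOMMU group %s: %s\n' "${group##*/}" "${dev##*/}"
    done
}
```

Calling `list_iommu_groups` with no argument reads the live system; feeding each address to `lspci -nns <addr>` should then give the vendor:device IDs as well.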
Since I remember Wendell once mentioning that all devices in an IOMMU group have to be passed through, I wrote down all three IDs for the next steps.
I added the options line to my /etc/modprobe.d/vfio.conf file as follows:
options vfio-pci ids=8086:0c01,10de:1184,10de:0e0a
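To avoid copying the IDs by hand, something like the following could build the ids= value from `lspci -nn` output. This is only a sketch: the function name is made up, and it assumes the [vendor:device] pair is printed in lowercase hex as in the listing above. It collects every ID it is given and makes no judgement about which devices should be in the list.

```shell
#!/bin/sh
# Read "lspci -nn" lines on stdin and print the [vvvv:dddd] IDs as one
# comma-separated string. Class codes like [0300] have no colon, so they
# are not matched.
vfio_ids() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | sed 's/^\[//; s/\]$//' \
        | paste -sd, -
}
```

Possible usage, with the device addresses from my group: `{ lspci -nns 00:01.0; lspci -nns 01:00; } | vfio_ids`.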
To write the changes to the initramfs image I ran dracut as per the tutorial:
dracut -f --kver `uname -r`
which ran without any error message. (I also updated GRUB with grub2-mkconfig.)
Now, after a reboot, the nouveau driver is still loaded for the Nvidia card. If I blacklist it in a separate modprobe.d file, the boot process freezes during the systemd messages.
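A quick way to see which driver actually holds a device is the driver symlink in sysfs (`lspci -k` shows the same thing). Here is a rough sketch; the function name and the optional second argument are my own, the latter only so it can be tried against a fake directory tree.

```shell
#!/bin/sh
# Print the name of the kernel driver bound to a PCI device, or "none".
# $1: full PCI address, e.g. 0000:01:00.0
# $2 (optional, hypothetical parameter): alternative sysfs devices directory.
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

On my system `bound_driver 0000:01:00.0` would be the call for the GTX 770; after a successful setup it should report vfio-pci rather than nouveau.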
My assumption is that the vfio-pci driver couldn’t be found or loaded, which is why the system falls back to the nouveau driver and why it can’t continue booting if I blacklist that one. However, the vfio-pci module does exist:
# modinfo vfio-pci
filename: /lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/pci/vfio-pci.ko.xz
description: VFIO PCI - User Level meta-driver
author: Alex Williamson <[email protected]>
license: GPL v2
version: 0.2
srcversion: DB8F55EC2187EC83F7E71EA
depends: vfio,irqbypass,vfio_virqfd
intree: Y
name: vfio_pci
vermagic: 4.13.9-200.fc26.x86_64 SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: ids:Initial PCI IDs to add to the vfio driver, format is "vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]" and multiple comma separated entries can be specified (string)
parm: nointxmask:Disable support for PCI 2.3 style INTx masking. If this resolves problems for specific devices, report lspci -vvvxxx to [email protected] so the device can be fixed automatically via the broken_intx_masking flag. (bool)
parm: disable_vga:Disable VGA resource access through vfio-pci (bool)
parm: disable_idle_d3:Disable using the PCI D3 low power state for idle, unused devices (bool)
I’d also like to add that I am using Fedora’s built-in full-disk encryption (LVM on LUKS, unencrypted /boot), but I don’t think that should be an issue here.
Does someone know what could be wrong here or how I can go about debugging this?
EDIT: I went through the journalctl output and found the following line, which doesn’t make sense to me currently:
dracut-pre-udev[385]: modprobe: FATAL: Module vfio-pci not found in directory /lib/modules/4.13.9-200.fc26.x86_64
Because when I run find in the mentioned directory, the corresponding module is there:
# find /lib/modules/4.13.9-200.fc26.x86_64 | grep vfio
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/vfio_iommu_type1.ko.xz
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/vfio_virqfd.ko.xz
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/pci
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/pci/vfio-pci.ko.xz
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/vfio.ko.xz
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/mdev
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/mdev/vfio_mdev.ko.xz
/lib/modules/4.13.9-200.fc26.x86_64/kernel/drivers/vfio/mdev/mdev.ko.xz
It seems like that line from journalctl was logged before the root partition was unlocked and mounted, but shouldn’t dracut have copied the module into the initramfs image on the boot partition, so that it can be loaded before the LUKS container is unlocked?
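Whether a module really ended up in the image can be checked from the `lsinitrd` listing instead of guessing. A small sketch, assuming the listing ends each line with the file path; the function name is hypothetical.

```shell
#!/bin/sh
# Read an lsinitrd listing on stdin and succeed if it contains
# <module>.ko, optionally with a compression suffix such as .xz.
# Note: the name must match the file name (vfio-pci, not vfio_pci).
initrd_has_module() {
    grep -Eq "/${1}\.ko(\.[a-z0-9]+)?\$"
}
```

Usage would be along the lines of `lsinitrd | initrd_has_module vfio-pci && echo present || echo missing`.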
EDIT2: After running lsinitrd | less I noticed that the vfio-pci kernel module was indeed missing from the initramfs image, so I ran dracut again, this time adding --add-drivers vfio-pci to the command. I verified the result by running lsinitrd | less again, and now the vfio directories listed above are included in the image too.
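Since --add-drivers only applies to that one dracut run, the same setting can probably be made permanent with a drop-in under /etc/dracut.conf.d/. The sketch below just prints such a line; the function name and the file name vfio.conf are my choice, and listing the three helper modules next to vfio-pci is an assumption on my part (dracut may well pull them in as dependencies anyway).

```shell
#!/bin/sh
# Print a dracut.conf.d line that adds the vfio modules to every image.
# The spaces inside the quotes are part of dracut's add_drivers+= syntax.
dracut_vfio_conf() {
    printf 'add_drivers+=" %s "\n' "vfio vfio_iommu_type1 vfio_virqfd vfio-pci"
}
```

Something like `dracut_vfio_conf | sudo tee /etc/dracut.conf.d/vfio.conf` followed by the dracut -f command from above should then rebuild the image with the modules included.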
However, when I reboot with the card installed, the system freezes at a black screen after I enter the password at the LUKS prompt. journalctl shows that vfio-pci now claims the device IDs, but something is still not working right.