Fedora 33: Ultimate VFIO Guide for 2020/2021 [WIP]

Should be fine, was a typo

I am not seeing "enabled" when I run this command after completing these steps. If I remove the last grep, I get the following output (shortened):

[    2.164463] iommu: Default domain type: Translated 
[    2.380366] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    2.380410] pci 0000:00:01.0: Adding to iommu group 0
[    2.380419] pci 0000:00:01.1: Adding to iommu group 1
...
[    2.381188] pci 0000:0e:00.3: Adding to iommu group 33
[    2.381202] pci 0000:0e:00.4: Adding to iommu group 34
[    2.384780] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    2.385949] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
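
A check that does not depend on the exact dmesg wording, offered as a minimal sketch: when the IOMMU is active, the kernel exposes one numbered directory per group under /sys/kernel/iommu_groups, matching the "Adding to iommu group N" lines above. The function name count_iommu_groups and its optional directory argument are hypothetical, added only so the check can be pointed at a different path.

```shell
# Count IOMMU groups by counting the numbered directories in sysfs.
# $1: directory to inspect (hypothetical override; defaults to the real sysfs path)
count_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local count=0
    local entry
    for entry in "$root"/*/; do
        # With no matches the glob stays literal, so confirm it is a real directory.
        if [ -d "$entry" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Running count_iommu_groups with no argument on the host should print a number greater than zero if the IOMMU is in use; 0 would suggest it is still disabled in firmware or on the kernel command line.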

Did you resolve this? I am currently having the same issue: I don’t see any mention of IOMMU being enabled. Also, when I run sudo lsinitrd | grep vfio after running dracut -fv, etc/modprobe.d/vfio.conf is missing, although all the other files are listed. Finally, after rebooting and running lspci -nnv, I see that my devices are not using the vfio-pci driver. I’d be curious to know how you resolved your issues. Thanks.

I followed this guide to customize my initramfs, since I use dracut on Arch Linux, and found these issues:

The module-setup.sh needs to be located under the new custom module directory, in your example /usr/lib/dracut/modules.d/20vfio/module-setup.sh (or /usr/lib/dracut/modules.d/30vfio/module-setup.sh if you stick with priority 30).

The dd_dracutmodules+=" vfio " should be add_dracutmodules+=" vfio ".
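
Since a one-letter slip like this makes dracut skip the module without any error, a quick check of the config file can help. The following is only a sketch: the function name conf_adds_dracut_module is hypothetical, and the config path in the usage note is an assumption, so use whichever file the guide had you create.

```shell
# Succeed if a dracut config file adds the given module through add_dracutmodules.
# $1: path to the config file
# $2: module name (defaults to vfio)
conf_adds_dracut_module() {
    local file="$1"
    local module="${2:-vfio}"
    # The line must start with add_dracutmodules+= and name the module as a
    # separate word, delimited by spaces or double quotes.
    grep -Eq "^[[:space:]]*add_dracutmodules\+=.*[\" ]${module}[\" ]" "$file"
}
```

For example, conf_adds_dracut_module /etc/dracut.conf.d/vfio.conf && echo ok (the path is assumed) would print ok only when the corrected spelling is present.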

When running the dracut command, check the info log output (-v is not required) for a line showing that the custom module is included:

[..]
dracut: *** Including module: vfio ***
[..]

To verify after a reboot, check which kernel driver each GPU is using:

sudo lspci -v -d 10de:1e84 | grep -E '(VGA|driver)'
0c:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] (rev a1) (prog-if 00 [VGA controller])
        Kernel driver in use: vfio_pci
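
The same information can be read straight from sysfs, which is handy in scripts. This is a minimal sketch: the function name pci_driver_in_use and its optional second argument are hypothetical, and the address 0000:0c:00.0 in the usage note assumes the card above sits in PCI domain 0000.

```shell
# Print the name of the kernel driver bound to a PCI device, read from sysfs.
# $1: full PCI address, for example 0000:0c:00.0
# $2: sysfs devices directory (hypothetical override; defaults to the real path)
pci_driver_in_use() {
    local addr="$1"
    local base="${2:-/sys/bus/pci/devices}"
    local link="$base/$addr/driver"
    if [ -L "$link" ]; then
        # "driver" is a symlink; the last component of its target is the driver name.
        basename "$(readlink "$link")"
    else
        # No symlink means no driver is currently bound to the device.
        echo "none"
    fi
}
```

If the passthrough setup worked, pci_driver_in_use 0000:0c:00.0 would be expected to print vfio-pci rather than nvidia, nouveau, or amdgpu.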

Hmm, does this Navi/Vega reset fix work for Polaris or no? I’ve got a reference RX 580 I’d like to use with a VFIO setup, but this GPU is real funky about passthrough.

Vendor-reset has worked for my RX 460 for months now, with the caveat that I also needed to disable PCIe power management in the BIOS and in systemd-boot, because that was crashing the card into an unfixable state.

Could you elaborate on the systemd-boot issue?

Proxmox (my host), when installed on an EFI partition, ignores GRUB config options and uses systemd-boot instead: Host Bootloader - Proxmox VE

Other distros likely still use GRUB only.

Thanks for the detail. I had been toying with the idea of using Proxmox at home as well, so this is good to know.

Thank you!!! Your corrections got it working for me.

People are reaching out to me about Q35 vs i440, and some are having trouble with Q35.

Is anyone able to spot check Q35 vs i440 compatibility across QEMU versions? I know it flip-flops depending on which QEMU version you have.

Having some trouble with this command:
dmesg | grep -i -e IOMMU | grep enabled

I get no output at all. However if I run ‘dmesg | grep -e AMD-Vi’ I get this output:

[    0.542866] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.542882] AMD-Vi: Lazy IO/TLB flushing enabled
[    0.543769] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.543770] AMD-Vi: Extended features (0xf77ef22294ada): PPR NX GT IA GA PC GA_vAPIC
[    0.543774] AMD-Vi: Interrupt remapping enabled
[    0.543775] AMD-Vi: Virtual APIC enabled

So is IOMMU working or not? Fedora 33 on an OEM Dell X370 motherboard with a Ryzen 7 1700.

Kernel options line in /etc/default/grub is as follows:
GRUB_CMDLINE_LINUX="rhgb quiet amd_iommu=on iommu=pt rd.driver.pre=vfio-pci vfio-pci ids=1002:687f,1002:aaf8 video=vesafb:off video=efifb:off rd.driver.blacklist=nouveau,amdgpu,snd_intel_hda modprobe.blacklist=nouveau,amdgpu,snd_intel_hda resume=UUID=3ddd486c-7dac-4691-9096-8e2bc929c5ac"

Too much?
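
One detail in the line above may be worth checking, offered as an observation rather than a confirmed diagnosis: module parameters on the kernel command line use the form module.parameter=value, so the kernel would look for vfio-pci.ids=... as a single token, while the line as posted has a space between vfio-pci and ids=. That may only be a slip made while typing the post. The sketch below pulls one parameter out of a command line so that the running kernel's view can be compared with the GRUB file; the function name cmdline_param and its optional second argument are hypothetical.

```shell
# Print the value of a "name=value" token from a kernel command line.
# $1: parameter name, for example vfio-pci.ids
# $2: command line text (hypothetical override; defaults to /proc/cmdline)
# Returns non-zero if no such token is present.
cmdline_param() {
    local name="$1"
    local line="${2:-$(cat /proc/cmdline)}"
    # Put each space-separated token on its own line, then print the value of
    # the first token that starts with exactly "name=".
    printf '%s\n' "$line" | tr ' ' '\n' | awk -v n="$name" '
        index($0, n "=") == 1 { print substr($0, length(n) + 2); found = 1; exit }
        END { exit !found }
    '
}
```

After a reboot, cmdline_param vfio-pci.ids should print the device ID list if the option reached the kernel as a single token; no output and a non-zero exit status would mean it did not.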

Since I didn’t find another (more up to date) thread I’m expecting this to be it…

My System Info:

OS: Fedora Linux 38 (Workstation Edition)
KERNEL: 6.4.6-200.fc38.x86_64
CPU: AMD Ryzen Threadripper 3960X 24-Core
GPU: AMD Radeon RX 6800 XT
GPU: NVIDIA GeForce GTX 1050 Ti
RAM: 256 GB
DE: Cinnamon (xorg)

Generally, when I try to free one of my two GPUs (the AMD card), either by command or with the described dracut module, the following happens:

  1. the GPU stays dark/off (this might be by design or a side effect, I can’t tell from the articles)
  2. Fedora shows the login page on the secondary GPU, even when I was logged in before
  3. the Fedora login page refuses the login … although it looks more like it accepts it and the DE crashes right after

Does anyone have an idea what the problem could be? Or is this just an unsupported use case?

AMD GPUs are plagued by reset issues under VFIO passthrough because AMD has shown little interest in supporting this use case, so they are best avoided for it.