Weird VFIO issue on 20.04

Hi, I’m on Pop!_OS 20.04 (based on Ubuntu) and I’m seeing some strange VFIO behavior.
I’m passing an NVIDIA GeForce RTX 2080 through to a Windows 10 guest, and that part works great.

However, when I stop the VM, the kernel driver in use for the GPU changes to ‘nouveau’ and does NOT stay on ‘vfio-pci’.
I tried blacklisting ‘nouveau’, running update-initramfs and rebooting, but that made the card disappear completely from the list of PCIe devices.
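
To pin down exactly when the driver flips, it may help to read the binding straight from sysfs before starting the VM and again after stopping it. The following is only a minimal sketch: the function name current_driver and the optional sysfs-root argument are my own additions (the latter exists purely so the function can be tried against a fake directory tree), and the bus address is the one from the script in this post.

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI bus address, e.g. 0000:0f:00.0
# $2: sysfs root, defaults to /sys (hypothetical parameter, added for testing)
current_driver() {
    dev=$1
    root=${2:-/sys}
    link="$root/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink into bus/pci/drivers/<name>.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

Running current_driver 0000:0f:00.0 at boot, while the VM is up, and after shutdown should show at which step ‘nouveau’ takes over.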

$ cat /etc/initramfs-tools/scripts/init-top/bind_vfio.sh
#!/bin/sh
PREREQS=""
DEVS="0000:0f:00.0 0000:0f:00.1 0000:0f:00.2 0000:0f:00.3"
for DEV in $DEVS; do
    echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
done
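
For comparison, here is a slightly more defensive sketch of the same init-top script. It assumes the usual initramfs-tools convention that a script is first called with the argument "prereqs" and must print its prerequisites and exit, and it skips devices that are not present instead of failing on the redirect. The bind_vfio function name and its sysfs-root argument are illustrative additions, not anything required by initramfs-tools.

```shell
#!/bin/sh
# Sketch of /etc/initramfs-tools/scripts/init-top/bind_vfio.sh
PREREQ=""
prereqs() { echo "$PREREQ"; }
case "${1:-}" in
    prereqs) prereqs; exit 0 ;;
esac

# Bus addresses copied from the post above.
DEVS="0000:0f:00.0 0000:0f:00.1 0000:0f:00.2 0000:0f:00.3"

# Write "vfio-pci" into driver_override for every listed device that exists.
# $1: sysfs root, defaults to /sys (hypothetical parameter, added for testing)
bind_vfio() {
    root=${1:-/sys}
    for dev in $DEVS; do
        override="$root/bus/pci/devices/$dev/driver_override"
        if [ -e "$override" ]; then
            echo "vfio-pci" > "$override"
        else
            echo "bind_vfio: $dev not found, skipping" >&2
        fi
    done
}

# In the real script, finish with something like:
#   bind_vfio
#   modprobe -i vfio-pci
```

The script has to be executable, and as far as I know the vfio-pci module also needs to be listed in /etc/initramfs-tools/modules so that it is actually inside the initramfs; run update-initramfs -u after either change.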

Also, if the virtual machine has been shut down and the AV amplifier (which the VM’s display is output to) is still switched on, starting the VM again crashes or hangs the host.

The guest VM uses the NVIDIA GeForce RTX 2080 on PCIe slot #1.
The host uses an AMD Radeon RX 5700 XT on PCIe slot #2.
In the BIOS, I’ve set slot #2 as the GPU to boot from.

Motherboard is a Gigabyte Aorus Ultra X570 with latest F12e BIOS.
Any ideas? Thank you.

Hey there! It seems the way you’re binding the card to vfio-pci differs from mine.

I added the following to “GRUB_CMDLINE_LINUX_DEFAULT” in /etc/default/grub: "vfio-pci.ids=10de:1004,10de:0e1a rd.driver.pre=vfio-pci". The first parameter lists the vendor:device IDs of the GPU and of the GPU’s sound device, so that vfio-pci claims them; the second asks the initramfs to load the vfio-pci driver early, before the regular GPU driver can grab the card.

::EDIT:: remember to run update-grub after modifying it.

Here’s the section on Isolating_the_GPU in archwiki

Hello, I’m using Pop!_OS which uses systemd-boot, not GRUB.
rd.driver.pre=vfio-pci doesn’t make any difference I’m afraid.
I’m doing the passthrough by PCI bus address (not by device ID).
Also tried to upgrade to kernel 5.6, but still the same problem.

If you don’t need the NVIDIA driver on the host, just blacklist all of the NVIDIA modules so they never load. I’m also on Pop!_OS 20.04, and vfio-pci doesn’t bind at boot here either, but since the NVIDIA drivers are blacklisted nothing else takes the card.

How did you blacklist all of the NVIDIA modules, please?

I tried blacklisting ‘nouveau’, running update-initramfs and rebooting, but strangely enough, that made the card disappear completely from the list of available PCIe devices.

FYI, I’m using PCI bus addresses for passthrough (not Device ID).
Cheers.

I am using the same method. At first I was on Pop!_OS 19 and it was working perfectly fine. Then I did an update and upgrade to 20.04, and my passthrough GPUs disappeared from the list. I fixed that issue with QEMU hook scripts that did a PCI rescan on VM start, which made the card appear again. After a while I did a clean reinstall of 20.04; the cards then showed up in lspci, but vfio wouldn’t bind. So I just added the 4 modules related to NVIDIA to the blacklist. vfio-pci doesn’t bind at boot, but libvirt/KVM does the binding by itself when the VM starts anyway. When you run lspci -knn, it lists the 4 modules the card uses. Blacklist those.
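
One possible way to turn that lspci output into blacklist entries is sketched below. It only parses the "Kernel modules:" lines that lspci -k prints; the function name blacklist_lines and the file name in the usage note are made up for illustration. Please review the result before installing it: the card’s audio and USB functions typically list generic modules such as snd_hda_intel, and blacklisting those would likely break the host’s own audio or USB.

```shell
#!/bin/sh
# Read `lspci -k` output on stdin and print one "blacklist <module>" line
# for each module named on the "Kernel modules:" lines, without duplicates.
blacklist_lines() {
    sed -n 's/^[[:space:]]*Kernel modules:[[:space:]]*//p' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        sort -u |
        sed '/^$/d; s/^/blacklist /'
}

# Possible usage for the GPU function only (file name is hypothetical):
#   lspci -k -s 0f:00.0 | blacklist_lines | sudo tee /etc/modprobe.d/blacklist-gpu.conf
#   sudo update-initramfs -u
```

Restricting the query to function .0 keeps the shared modules of the other functions out of the list; any additional NVIDIA-specific modules can then be added by hand.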

FYI, I’m using PCI bus addresses for passthrough (not Device ID).

Could you explain why? Have you tried using the Device ID’s?
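
If you want to try the device-ID route, the IDs can be read from sysfs for the bus addresses you already have. This is a rough sketch; vfio_ids_param is a hypothetical helper name, and the explicit sysfs-root argument is there only so it can be tested against a fake tree. Keep in mind that binding by ID affects every card with that ID, which is the usual reason for preferring bus addresses when two identical cards are installed; with an NVIDIA guest card and an AMD host card that should not be a problem.

```shell
#!/bin/sh
# Print a "vfio-pci.ids=vvvv:dddd,..." kernel parameter for the given devices.
# $1: sysfs root (use /sys on a real system)
# remaining arguments: PCI bus addresses such as 0000:0f:00.0
vfio_ids_param() {
    root=$1; shift
    ids=""
    for dev in "$@"; do
        dir="$root/bus/pci/devices/$dev"
        # Skip addresses that do not exist on this machine.
        [ -r "$dir/vendor" ] && [ -r "$dir/device" ] || continue
        # sysfs stores the values as e.g. 0x10de; strip the 0x prefix.
        vendor=$(sed 's/^0x//' "$dir/vendor")
        device=$(sed 's/^0x//' "$dir/device")
        ids="${ids:+$ids,}$vendor:$device"
    done
    echo "vfio-pci.ids=$ids"
}

# Possible usage with the addresses from the first post:
#   vfio_ids_param /sys 0000:0f:00.0 0000:0f:00.1 0000:0f:00.2 0000:0f:00.3
```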

Could you try setting the kernel parameters in your loader entry file instead, and disable the script you are running to override the driver for your devices?
According to the ArchWiki, the entries live in “esp/loader/entries/youros.conf”.
Archwiki Systemd-boot adding loaders

In the options line you can add whatever kernel parameters you want. Here you would add the following (using your own PCI IDs):

vfio-pci.ids=10de:1004,10de:0e1a
rd.driver.pre=vfio-pci
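
Note that both parameters go on the same options line, separated by spaces, not on lines of their own. A small sketch of appending them with a script follows; add_kernel_params is a made-up helper, and the entry path in the usage note is the placeholder from above, not a real file name. Two caveats, as far as I know: rd.driver.pre is a dracut option, so it may simply be ignored by an initramfs-tools based system, and on Pop!_OS the entries are, I believe, managed by kernelstub, so manual edits may get overwritten on kernel updates.

```shell
#!/bin/sh
# Append kernel parameters to the "options" line of a systemd-boot entry file.
# $1: path to the entry file; remaining arguments: parameters to append.
# A parameter that already appears in the file is skipped.
# Assumes parameters contain no '|', '&' or '\' (they would confuse sed).
add_kernel_params() {
    entry=$1; shift
    for param in "$@"; do
        if grep -qF -- "$param" "$entry"; then
            continue
        fi
        tmpfile=$(mktemp)
        # Append to the end of the line that starts with "options".
        sed "/^options[[:space:]]/ s|\$| $param|" "$entry" > "$tmpfile" &&
            cat "$tmpfile" > "$entry"
        rm -f "$tmpfile"
    done
}

# Possible usage (path is a placeholder, run as root):
#   add_kernel_params esp/loader/entries/youros.conf vfio-pci.ids=10de:1004,10de:0e1a
```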