GPU passthrough on PopOS 18.04, Need help with VFIO driver override

My steps so far

sudo nano /boot/efi/loader/entries/Pop_OS-current.conf

append ‘intel_iommu=on’ to options line

bootctl update

reboot
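
After the reboot, it may be worth confirming that the option actually reached the running kernel before moving on. A minimal sketch, assuming a standard /proc/cmdline; the helper name has_kernel_param is made up for illustration, and its second argument exists only so the check can be tried on a sample string:

```shell
#!/bin/bash
# has_kernel_param PARAM [CMDLINE]  (hypothetical helper, not a system tool)
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline.
has_kernel_param() {
  local param=$1
  local cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
  # Pad with spaces so only whole, space-separated words match.
  case " $cmdline " in
    *" $param "*) return 0 ;;
    *) return 1 ;;
  esac
}

if has_kernel_param intel_iommu=on; then
  echo "intel_iommu=on is active"
else
  echo "intel_iommu=on is missing from the running kernel's command line"
fi
```

If the option is missing, the edit to the loader entry probably did not take effect, and nothing further down will work until it does.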

nano ls-iommu.sh

#!/bin/bash
for d in /sys/kernel/iommu_groups/*/devices/*; do
  n=${d#*/iommu_groups/*}; n=${n%%/*}
  printf 'IOMMU Group %s ' "$n"
  lspci -nns "${d##*/}"
done

chmod +x ./ls-iommu.sh

./ls-iommu.sh

Groups show, iommu is working.

sudo nano /etc/initramfs-tools/modules

softdep nouveau pre: vfio vfio_pci

vfio
vfio_iommu_type1
vfio_virqfd
options vfio_pci ids=10de:10f0,10de:1b81
vfio_pci ids=10de:10f0,10de:1b81
vfio_pci
nouveau

sudo nano /etc/modprobe.d/nouveaugpu.conf

softdep nouveau pre: vfio vfio_pci

sudo nano /etc/modprobe.d/vfio_pci.conf

options vfio_pci ids=10de:10f0,10de:1b81

sudo update-initramfs -u

reboot

lspci -nnv | grep vfio

No results. Doing the same with nouveau shows that the GPU is still using the wrong driver.
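
A more direct way to see which driver owns a device is to read its driver symlink in sysfs instead of grepping lspci. A minimal sketch; driver_of is a hypothetical helper, the two addresses are placeholders to replace with the ones ls-iommu.sh prints (prefixed with the 0000: PCI domain), and the optional second argument only exists so the function can be pointed at a test directory:

```shell
#!/bin/bash
# driver_of ADDR [SYSFS_PCI_DEVICES_DIR]  (hypothetical helper)
# Prints the name of the driver bound to a PCI device, or "none".
driver_of() {
  local addr=$1 root=${2:-/sys/bus/pci/devices}
  local link="$root/$addr/driver"
  if [ -L "$link" ]; then
    # The symlink points at .../bus/pci/drivers/<driver name>.
    basename "$(readlink "$link")"
  else
    echo none
  fi
}

# Placeholder addresses; substitute the GPU and its audio function.
for addr in 0000:01:00.0 0000:01:00.1; do
  printf '%s -> %s\n' "$addr" "$(driver_of "$addr")"
done
```

For a working setup, both functions of the card would be expected to report vfio-pci here.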

Any tips on what or where to look next?

Thanks!

If you don’t have other NVIDIA GPUs being driven by the nouveau driver, try blacklisting it; it might be binding first.

You can try it temporarily to see if that’s the issue. At the GRUB boot screen, add the following to your linux boot options and test.

nouveau.blacklist=yes

If that works, you can make it permanent with the following change.

Edit (or create, if needed) /etc/modprobe.d/blacklist.conf

blacklist nouveau

Pop!_OS uses systemd-boot by default. Would I append the same blacklist to its boot options?

I didn’t realize they didn’t use GRUB. The GRUB method won’t work, but the blacklisting method still should. If there’s a way to temporarily blacklist modules with systemd-boot, I’m not aware of it.

The blacklist.conf file accepts comments. You can just comment that line out or delete the file in order to boot with the module again.
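
To double-check that an active (uncommented) blacklist line is really in place, something like the following could be used. It is only a sketch: is_blacklisted is a made-up helper, and it looks at /etc/modprobe.d alone, not at the other directories modprobe reads. As far as I know, on initramfs-tools systems the modprobe.d files are also copied into the initramfs, so rebuilding it with update-initramfs -u after a change is probably needed before the blacklist applies early in boot.

```shell
#!/bin/bash
# is_blacklisted MODULE [CONF_DIR]  (hypothetical helper)
# Succeeds if an uncommented "blacklist MODULE" line exists in CONF_DIR/*.conf.
# Assumption: only CONF_DIR (default /etc/modprobe.d) is searched.
is_blacklisted() {
  local module=$1 dir=${2:-/etc/modprobe.d}
  grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$dir"/*.conf
}

if is_blacklisted nouveau; then
  echo "nouveau is blacklisted in /etc/modprobe.d"
else
  echo "no active blacklist entry for nouveau in /etc/modprobe.d"
fi
```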

I tried blacklisting, however it did not seem to make a difference and appeared to be ignored. Ultimately I’d prefer to do this without blacklisting as I intend to also pass through a Dell H200 HBA and a generic USB 3 card.

What I currently have in place for trying to enable vfio

/etc/modules

vfio
vfio_iommu_type1
vfio_pci ids=10de:1b81,10de:10f0
vfio_virqfd
vhost-net

/etc/modprobe.d/nouveau.conf

softdep nouveau pre: vfio-pci

/etc/modprobe.d/vfio.conf

options vfio-pci ids=10de:1b81,10de:10f0

cat /etc/initramfs-tools/modules

vfio
vfio_iommu_type1
vfio_pci ids=10de:1b81,10de:10f0
vfio_virqfd
vhost-net

With this setup the graphics and audio devices still use their original drivers and not vfio-pci.
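
One thing that might be worth ruling out, offered as a guess rather than a diagnosis: options lines in /etc/modprobe.d only apply to loadable modules. If the running kernel has vfio-pci compiled in, the IDs would have to be given on the kernel command line instead (vfio-pci.ids=10de:1b81,10de:10f0). A minimal sketch that reports how a module is shipped, assuming the usual modules.builtin and modules.dep files under /lib/modules; module_kind is a hypothetical helper:

```shell
#!/bin/bash
# module_kind NAME [MODULE_DIR]  (hypothetical helper)
# Prints "builtin", "loadable" or "missing" for a kernel module name.
# MODULE_DIR defaults to /lib/modules/<running kernel>.
module_kind() {
  local name=$1 moddir=${2:-/lib/modules/$(uname -r)}
  # File names may use '-' where the module name uses '_', so accept both.
  local pattern="/${name//[-_]/[-_]}\\.ko"
  if grep -Eq "${pattern}\$" "$moddir/modules.builtin" 2>/dev/null; then
    echo builtin
  elif grep -Eq "${pattern}(\\.[a-z0-9]+)?:" "$moddir/modules.dep" 2>/dev/null; then
    echo loadable
  else
    echo missing
  fi
}

for m in vfio vfio_pci nouveau; do
  printf '%s: %s\n' "$m" "$(module_kind "$m")"
done
```

If vfio_pci reports loadable, the modprobe.d approach should be applicable and the problem lies elsewhere.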

I was able to set this up on Ubuntu 18.04, but silly me did not document the steps I took to get it working, so I’m basically starting over from the beginning. The biggest hurdle I had at first was getting IOMMU to work, because I kept applying the GRUB config even though Pop!_OS uses systemd-boot. I thought that once I got that corrected this would be easier, but I’m finding that it is not as trivial as I’d hoped.

Thanks in advance for anyone’s help!

iommu groups

IOMMU Group 15 04:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)
IOMMU Group 1 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1)
IOMMU Group 1 01:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
IOMMU Group 1 02:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)

lspci -nnv

04:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03) (prog-if 30 [XHCI])
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at df400000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP104 [GeForce GTX 1070] [1462:3302]
	Flags: bus master, fast devsel, latency 0, IRQ 142
	Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Memory at d0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at e000 [size=128]
	Expansion ROM at df000000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
	Kernel modules: nvidiafb, nouveau

01:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP104 High Definition Audio Controller [1462:3302]
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at df080000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

02:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
	Subsystem: Dell 6Gbps SAS HBA Adapter [1028:1f1c]
	Flags: bus master, fast devsel, latency 0, IRQ 17
	I/O ports at d000 [size=256]
	Memory at df240000 (64-bit, non-prefetchable) [size=64K]
	Memory at df200000 (64-bit, non-prefetchable) [size=256K]
	Expansion ROM at df100000 [disabled] [size=1M]
	Capabilities: <access denied>
	Kernel driver in use: mpt3sas
	Kernel modules: mpt3sas

Your SCSI controller is in the same IOMMU group as the GPU, which as far as I know isn’t good; you’ll need to run a kernel with the ACS patch to try to separate it. You should also be blacklisting nouveau. Blacklisting it has nothing to do with passing through an HBA or USB card later on.
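
To make the grouping problem easy to spot, the output of the ls-iommu.sh script from the first post can be piped through a small filter that lists only the groups holding more than one device. A sketch, assuming the "IOMMU Group N address ..." line format that script prints; shared_groups is a made-up name, and the sample lines below are abbreviated from the listing in this thread:

```shell
#!/bin/bash
# shared_groups  (hypothetical helper)
# Reads "IOMMU Group N ADDR ..." lines on stdin and prints every group
# that contains more than one device, with the member addresses.
shared_groups() {
  awk '$1 == "IOMMU" && $2 == "Group" {
         count[$3]++
         members[$3] = members[$3] " " $4
       }
       END {
         for (g in count)
           if (count[g] > 1) print "group " g ":" members[g]
       }' | sort
}

# Real use would be:  ./ls-iommu.sh | shared_groups
shared_groups <<'EOF'
IOMMU Group 15 04:00.0 USB controller [0c03]
IOMMU Group 1 01:00.0 VGA compatible controller [0300]
IOMMU Group 1 01:00.1 Audio device [0403]
IOMMU Group 1 02:00.0 Serial Attached SCSI controller [0107]
EOF
# prints: group 1: 01:00.0 01:00.1 02:00.0
```

Everything listed for a group has to go to the same VM (or stay on the host together) unless the group can be split.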

If you’re using systemd-boot I believe the configs should be in /boot/loader/entries, edit the conf file to add the options you want to the options line.

My problem isn’t with blacklisting nouveau. Rather, two of the other devices share a driver with another component that is not being passed through. Is it possible to blacklist a driver and still use a similar device on the host machine?

What driver is being shared and with what devices? Sorry I’m really confused about what you said because that has not been shown thus far.

When I run lspci, it appears that my onboard audio shares a driver with the Nvidia GPU audio, and my onboard USB controllers share a driver with the USB card that I need to pass through to my VM.

00:00.0 Host bridge [0600]: Intel Corporation Skylake Host Bridge/DRAM Registers [8086:191f] (rev 07)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [1462:7998]
	Flags: bus master, fast devsel, latency 0
	Capabilities: <access denied>
	Kernel driver in use: skl_uncore

00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller (x16) [8086:1901] (rev 07) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 0000e000-0000efff
	Memory behind bridge: de000000-df0fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000d1ffffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:01.1 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller (x8) [8086:1905] (rev 07) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff
	Memory behind bridge: df100000-df2fffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:1912] (rev 06) (prog-if 00 [VGA controller])
	Subsystem: Micro-Star International Co., Ltd. [MSI] HD Graphics 530 [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 151
	Memory at dd000000 (64-bit, non-prefetchable) [size=16M]
	Memory at b0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at f000 [size=64]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915

00:08.0 System peripheral [0880]: Intel Corporation Skylake Gaussian Mixture Model [8086:1911]
	Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model [1462:7998]
	Flags: fast devsel, IRQ 11
	Memory at df652000 (64-bit, non-prefetchable) [disabled] [size=4K]
	Capabilities: <access denied>

00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31) (prog-if 30 [XHCI])
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H USB 3.0 xHCI Controller [1462:7998]
	Flags: bus master, medium devsel, latency 0, IRQ 122
	Memory at df630000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H Thermal subsystem [1462:7998]
	Flags: fast devsel, IRQ 18
	Memory at df651000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: intel_pch_thermal
	Kernel modules: intel_pch_thermal

00:15.0 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Serial IO I2C Controller #0 [8086:a160] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H Serial IO I2C Controller [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at df650000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci

00:15.1 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Serial IO I2C Controller #1 [8086:a161] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H Serial IO I2C Controller [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at df64f000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci

00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H CSME HECI [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 153
	Memory at df64e000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: mei_me
	Kernel modules: mei_me

00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31) (prog-if 01 [AHCI 1.0])
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H SATA controller [AHCI mode] [1462:7998]
	Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 139
	Memory at df648000 (32-bit, non-prefetchable) [size=8K]
	Memory at df64d000 (32-bit, non-prefetchable) [size=256]
	I/O ports at f090 [size=8]
	I/O ports at f080 [size=4]
	I/O ports at f060 [size=32]
	Memory at df64c000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: <access denied>
	Kernel driver in use: ahci
	Kernel modules: ahci

00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	Memory behind bridge: df500000-df5fffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1c.2 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #3 [8086:a112] (rev f1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
	Memory behind bridge: df400000-df4fffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1d.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #9 [8086:a118] (rev f1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: df300000-df3fffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1e.0 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Serial IO UART #0 [8086:a127] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H Serial IO UART [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 20
	Memory at df64b000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci

00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a149] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H LPC Controller [1462:7998]
	Flags: bus master, medium devsel, latency 0

00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H PMC [1462:7998]
	Flags: bus master, fast devsel, latency 0
	Memory at df644000 (32-bit, non-prefetchable) [size=16K]

00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H HD Audio [1462:f998]
	Flags: bus master, fast devsel, latency 32, IRQ 154
	Memory at df640000 (64-bit, non-prefetchable) [size=16K]
	Memory at df620000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Sunrise Point-H SMBus [1462:7998]
	Flags: medium devsel, IRQ 11
	Memory at df64a000 (64-bit, non-prefetchable) [size=256]
	I/O ports at f040 [size=32]
	Kernel modules: i2c_i801

00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Ethernet Connection (2) I219-V [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 140
	Memory at df600000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP104 [GeForce GTX 1070] [1462:3302]
	Flags: bus master, fast devsel, latency 0, IRQ 152
	Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Memory at d0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at e000 [size=128]
	Expansion ROM at df000000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
	Kernel modules: nvidiafb, nouveau

01:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP104 High Definition Audio Controller [1462:3302]
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at df080000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

02:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
	Subsystem: Dell 6Gbps SAS HBA Adapter [1028:1f1c]
	Flags: bus master, fast devsel, latency 0, IRQ 17
	I/O ports at d000 [size=256]
	Memory at df240000 (64-bit, non-prefetchable) [size=64K]
	Memory at df200000 (64-bit, non-prefetchable) [size=256K]
	Expansion ROM at df100000 [disabled] [size=1M]
	Capabilities: <access denied>
	Kernel driver in use: mpt3sas
	Kernel modules: mpt3sas

03:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller [1b21:1242] (prog-if 30 [XHCI])
	Subsystem: Micro-Star International Co., Ltd. [MSI] ASM1142 USB 3.1 Host Controller [1462:7998]
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at df500000 (64-bit, non-prefetchable) [size=32K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

04:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03) (prog-if 30 [XHCI])
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at df400000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

05:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 [144d:a802] (rev 01) (prog-if 02 [NVM Express])
	Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 [144d:a801]
	Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
	Memory at df310000 (64-bit, non-prefetchable) [size=16K]
	I/O ports at c000 [size=256]
	Expansion ROM at df300000 [disabled] [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: nvme
	Kernel modules: nvme

I don’t think it quite works like that. I’d have to reboot my system to confirm what it shows prior to the first start of my VM, but I’m pretty sure that’s okay: libvirt will unbind the device and then bind it to vfio when you start the VM. For example, here is one of my USB controllers that I pass through, while the VM is off:

09:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM2142 USB 3.1 Host Controller [1b21:2142] (prog-if 30 [XHCI])
	Subsystem: ASUSTeK Computer Inc. ASM2142 USB 3.1 Host Controller [1043:8732]
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at df600000 (64-bit, non-prefetchable) [size=32K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

And then once started

09:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM2142 USB 3.1 Host Controller [1b21:2142] (prog-if 30 [XHCI])
	Subsystem: ASUSTeK Computer Inc. ASM2142 USB 3.1 Host Controller [1043:8732]
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at df600000 (64-bit, non-prefetchable) [size=32K]
	Capabilities: <access denied>
	Kernel driver in use: vfio-pci
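
The same per-device switch can be done by hand through sysfs, which is why a driver shared with host devices is not a problem: only the one PCI address is moved. This is a rough sketch of the manual route using the kernel's driver_override file, not a description of what libvirt does internally. rebind_to_vfio is a hypothetical helper, it has to run as root with the vfio-pci module already loaded, and the optional second argument exists only for trying it against a test directory:

```shell
#!/bin/bash
# rebind_to_vfio ADDR [SYSFS_PCI_ROOT]  (hypothetical helper)
# ADDR is a full PCI address such as 0000:04:00.0.
rebind_to_vfio() {
  local addr=$1 root=${2:-/sys/bus/pci}
  local dev="$root/devices/$addr"
  [ -d "$dev" ] || { echo "no such device: $addr" >&2; return 1; }
  # Tell the PCI core that only vfio-pci may claim this device.
  echo vfio-pci > "$dev/driver_override" || return 1
  # Detach whichever driver currently owns it (xhci_hcd, nouveau, ...).
  if [ -e "$dev/driver/unbind" ]; then
    echo "$addr" > "$dev/driver/unbind" || return 1
  fi
  # Ask the PCI core to probe the device again; the override selects vfio-pci.
  echo "$addr" > "$root/drivers_probe"
}

# Example (as root):
#   rebind_to_vfio 0000:04:00.0
```

Other devices using xhci_hcd or snd_hda_intel on the host keep their driver, since the override is written per device.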

Again, I’d have to reboot to confirm the state on first boot, but my Nvidia card stays on vfio-pci whether the VM is booted or not; in that case nouveau was blacklisted and vfio grabbed the device(s) at boot based on the config.

Feb 17 12:01:32 archlinux kernel: VFIO - User Level meta-driver version: 0.3
Feb 17 12:01:32 archlinux kernel: vfio-pci 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Feb 17 12:01:32 archlinux systemd-modules-load[216]: Inserted module 'vfio'
Feb 17 12:01:32 archlinux kernel: vfio_pci: add [10de:1b06[ffff:ffff]] class 0x000000/00000000
Feb 17 12:01:32 archlinux systemd-modules-load[216]: Inserted module 'vfio_pci'
Feb 17 12:01:32 archlinux kernel: vfio_pci: add [10de:10ef[ffff:ffff]] class 0x000000/00000000
Feb 17 12:01:34 archlinux kernel: vfio-pci 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none

PopOS may have a different approach for this, I’m not familiar with it, but it should be relatively the same.

I finally got this to work, thanks to all who helped.

What I ended up doing

Applied the ACS patch to the kernel (not really sure if it’s actually working though; my IOMMU groups did not seem to change at all).

Removed the blacklist for nouveau, as it was causing the GPU to not load at all and be absent from the dmesg log. After booting with the patched kernel and with the blacklist removed, the GPU now loads vfio-pci.

One thing to note: my first test VM had only the GPU selected, and I got an error that everything from the group needed to be passed through. Not a problem for me, because the other item was the HBA that I intended to pass through anyway, but this also makes me suspect that the ACS patch isn’t fully working.

I’ll clean up my configs in the next day or so and see if I can nail down a repeatable sequence and try to contribute a tutorial for the forums here to help anyone trying to tackle passthrough on Pop!_OS

Thanks again everyone!

You need to pass pcie_acs_override=downstream or pcie_acs_override=multifunction as a boot parameter to enable the ACS patch.

I think you need to do that with kernelstub, but I don’t have experience with systemd boot.
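
For what it’s worth, here is a sketch of what the kernelstub route might look like. It assumes kernelstub’s --add-options flag behaves as I remember it, so check kernelstub --help first. The helper (a made-up name) only prints the command for review and does not change anything itself:

```shell
#!/bin/bash
# acs_override_command MODE  (hypothetical helper)
# Prints, without running it, a kernelstub command that would add the ACS
# override parameter. Assumption: kernelstub supports --add-options.
acs_override_command() {
  local mode=$1
  case "$mode" in
    downstream|multifunction) ;;
    *) echo "expected 'downstream' or 'multifunction', got '$mode'" >&2
       return 1 ;;
  esac
  printf 'sudo kernelstub --add-options "pcie_acs_override=%s"\n' "$mode"
}

acs_override_command downstream
# prints: sudo kernelstub --add-options "pcie_acs_override=downstream"
```

After running the printed command and rebooting, the parameter should show up in /proc/cmdline; it only has an effect on a kernel that actually carries the ACS patch.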
