VFIO in 2019 -- Pop!_OS How-To (General Guide though) [DRAFT]

Hm, I can’t really tell if there is any log output, since it looks like I’m too dumb to find out :wink: I changed the boot flag loglevel to 7, but there is no output from the script at all.

BUT: I have no idea what I changed, but the /sys/bus/pci/devices/$DEV/driver_override files now contain the vfio-pci value (before, it was always (null)). Unfortunately lspci -nnv still shows amdgpu as the kernel driver :-/

Edit: okay, it seems I found the issue … the IOMMU groups are the problem. Both GPUs and the third PCI controller are in the same group. I just installed the ACS Override Patch, and now vfio-pci is loaded as the kernel driver.
Since the ACS patch is not the best solution, I’m wondering whether there is anything else I can do?
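
For anyone who wants to check their own grouping before reaching for the ACS patch, here is a minimal sketch that walks sysfs and prints each PCI device with its IOMMU group and the driver currently bound to it. The function name `list_iommu_groups` and its optional root argument are made up for this example (the argument only exists so the function can be pointed at a test directory); /sys/kernel/iommu_groups is the standard sysfs location.

```shell
#!/bin/sh
# List every PCI device per IOMMU group, with the driver currently bound.
# The optional argument is hypothetical test scaffolding; by default the
# real sysfs tree is read.
list_iommu_groups() {
    groups_root="${1:-/sys/kernel/iommu_groups}"
    for dev_path in "$groups_root"/*/devices/*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$dev_path" ] || continue
        dev="${dev_path##*/}"                # e.g. 0000:28:00.0
        group_dir="${dev_path%/devices/*}"
        group="${group_dir##*/}"             # e.g. 14
        if [ -e "$dev_path/driver" ]; then
            # "driver" is a symlink to the bound driver's directory.
            driver="$(basename "$(readlink -f "$dev_path/driver")")"
        else
            driver="(none)"
        fi
        printf 'group %s  %s  driver=%s\n' "$group" "$dev" "$driver"
    done
}

list_iommu_groups
```

Devices that share a group number generally have to be handed to the guest together (or left on the host together), which is why two GPUs in one group is a problem without something like the ACS patch.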

Alright @wendell, sorry for making you wait 5 months but I’m a man of my word.

VFIO GPU Pass-through w/ Looking Glass KVM on Ubuntu 19.04

If you think there’s anything worth copying off here (steps/images) to use for your guide, you’re welcome to it. The only thing I ask: I guarantee there are little imperfections (misinformation, poor elaboration, etc.) in mine, so when you spot things that need fixing, do let me know so I can make the appropriate corrections.

:slight_smile:

Is this still a draft?

It’s mostly fine, actually. I have updated it as new info has come in. :smiley:

Hello everyone,

I’m trying to use this guide on POP OS 20.04 and I’m running into an issue getting vfio_iommu_type1 and vfio_pci to load properly.

I added the relevant lines to /etc/initramfs-tools/modules and used update-initramfs but I’m still getting the same output:

scripts/init-top/bind_vfio.sh
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio/mdev
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio/mdev/mdev.ko
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio/mdev/vfio_mdev.ko

Those two modules are not being found, and my GPU is not loading with the vfio-pci kernel driver.

In trying to research this problem, I think it may be occurring because vfio is no longer a separate module; it is now built into the 5.4 and newer kernels by default. I’m not sure what to do if that is the case, though.

Any help on this problem would be appreciated, and if there is a better place to post asking for help than this thread, please let me know.

Thanks.

I can help ya a little bit here.

In short, I think your listing here is going to be the new normal with the upgrade.

In 20.04 the vfio kernel drivers were changed from dynamically loadable kernel modules to being built statically into the kernel.

So the process needs to change a bit: keep the script that writes the driver_override files, and drop the trailing modprobe.

My script looks like this:

# my script contents
cat /etc/initramfs-tools/scripts/init-top/bind_vfio.sh

#!/bin/sh

PREREQS=""
# PCI addresses of the devices to hand over to vfio-pci
DEVS="0000:28:00.0 0000:28:00.1 0000:28:00.2 0000:28:00.3"
for DEV in $DEVS; do
    echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
done

modprobe -i vfio-pci

Drop this modprobe line, since the module is built into the kernel in 20.04.


Also, you will NOT see some of the vfio-pci entries in the initramfs listing. Those are in the kernel now. You can see this with this command…

The missing entries are now in…

cat /lib/modules/$(uname -r)/modules.builtin | grep -i vfio
kernel/drivers/vfio/vfio.ko
kernel/drivers/vfio/vfio_virqfd.ko
kernel/drivers/vfio/vfio_iommu_type1.ko
kernel/drivers/vfio/pci/vfio-pci.ko

and the rest are in the lsinitramfs output

lsinitramfs /boot/initrd.img-$(uname -r) | grep -i vfio

scripts/init-top/bind_vfio.sh
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio/mdev
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio/mdev/mdev.ko
usr/lib/modules/5.4.0-7626-generic/kernel/drivers/vfio/mdev/vfio_mdev.ko
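
If you want a script to decide for itself whether the modprobe is still needed, the modules.builtin lookup above can be wrapped in a small function. This is only a sketch: `is_builtin` is a name I made up, the optional second argument exists purely so the function can be tried against a sample file, and I have not checked whether modules.builtin is available inside every initramfs.

```shell
#!/bin/sh
# Return 0 if the named module is compiled into the kernel, judging by
# modules.builtin. The optional second argument (hypothetical, for testing)
# overrides the file that is read.
is_builtin() {
    module="$1"
    builtin_file="${2:-/lib/modules/$(uname -r)/modules.builtin}"
    [ -r "$builtin_file" ] || return 1
    # Entries look like kernel/drivers/vfio/pci/vfio-pci.ko. "-" and "_" are
    # interchangeable in module names, so normalise both sides first.
    want="$(printf '%s' "$module" | tr '-' '_')"
    sed 's|.*/||; s|\.ko$||' "$builtin_file" | tr '-' '_' | grep -qxF "$want"
}

if is_builtin vfio-pci; then
    echo "vfio-pci is built in: no modprobe needed"
else
    echo "vfio-pci is not listed as built in: keep the modprobe line"
fi
```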


Lastly, there is a bug somewhere in the new VFIO setup: when I run ‘lspci -vnn’, the line showing the binding to the vfio-pci driver is missing altogether. It’s just not in the output, but it also doesn’t show that the device is bound to (in my case) the nvidia driver, as it normally would.

I spoke with the guys over at the VFIO Discord server and they noted it should still work, just without D3 Power something something, but I haven’t quite fired up a VM yet, as I am waiting a few days to see if more info shows up.

I did open a bug on the Pop GitHub about this issue. Please note I am not certain what the right answer is exactly, but I think this is on the right track. I also noticed a few others having trouble with this on Reddit, etc.

Hopefully this helps a bit to get this sorted with the new pop version.
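
Since the lspci output can't be trusted here, one way to cross-check is to read the binding straight from sysfs. A rough sketch follows; `show_binding` and its optional second argument are names I invented for illustration (the argument only lets you point it at a fake directory), and the addresses are the ones from my script above, so substitute your own.

```shell
#!/bin/sh
# Print the driver bound to a PCI device and its driver_override value by
# reading sysfs directly instead of parsing lspci output.
# The optional second argument is hypothetical test scaffolding.
show_binding() {
    dev="$1"
    pci_root="${2:-/sys/bus/pci/devices}"
    dev_dir="$pci_root/$dev"
    if [ ! -d "$dev_dir" ]; then
        echo "$dev: no such device" >&2
        return 1
    fi
    if [ -e "$dev_dir/driver" ]; then
        # "driver" is a symlink to the bound driver's directory.
        driver="$(basename "$(readlink -f "$dev_dir/driver")")"
    else
        driver="(none)"
    fi
    override="(unreadable)"
    if [ -r "$dev_dir/driver_override" ]; then
        override="$(cat "$dev_dir/driver_override")"
    fi
    printf '%s driver=%s driver_override=%s\n' "$dev" "$driver" "$override"
}

# Addresses from the bind_vfio.sh script in this post; replace with yours.
for addr in 0000:28:00.0 0000:28:00.1 0000:28:00.2 0000:28:00.3; do
    show_binding "$addr" || true
done
```

If the driver column reads vfio-pci, the device is bound regardless of what lspci prints.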

Thank you for your detailed reply, this is exactly what I’m looking for.

It seems like there were 2 intersecting issues conspiring against me, because I saw the exact same thing in the ‘lspci -vnn’ output that you did, but I had assumed that the driver wasn’t binding properly, not that there was a problem with the output.

Would you be able to link me the github issue that you’re using so that I can follow it? I’m hoping to follow along so that I’ll be able to get it working once this is fixed.

For sure… here: on github, find /pop-os/pop/issues/968

If you see any progress, could you post back here? I think it would be good for others coming behind us.

Also feel free to add your own experience to the issue. This site wouldn’t let me paste the link directly, but I think you can decipher the right URL from the above.

btblueskies > excellent POST!
Same ‘problem’ here with VFIO not reporting on Pop!_OS 20.04.
PCIe passthrough is still working though: successfully VFIOed an RTX 2080 GPU as well as a SoNNeT Allegro Pro USB card :sweat_smile:

It’s interesting to find out that VFIO is actually working for you, Tchuyev. It’s good that this seems to be only a reporting issue.

Thank you for linking me to the issue btblueskies, I will share my experience as well on the ticket and hopefully we can get this resolved soon.

Hey guys,
I’d like to ask if there’s an update regarding the Pop!_OS 20.04 issues with passthrough?

It still isn’t working for me.

Same here! Whenever I isolate the Nvidia card and reboot, I just get a black screen. I’m positive I’m only isolating the Nvidia card and that it’s in its own IOMMU group.

Weirder still, if I boot with both cards enabled and then disable the Nvidia screen or unplug the cable, the system becomes very sluggish. Plug the monitor back in or enable it in settings, and the system runs fine.

An interesting new problem has developed on my otherwise working setup. Windows updated to 1909 on its own, and now, seemingly at random, if I walk away for a period of time the VM will go to sleep (pause). If I try to resume it, it just spits out an error message:

Error unpausing domain: internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 66, in newfn
    ret = fn(self, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1435, in resume
    self._backend.resume()
  File "/usr/lib/python3/dist-packages/libvirt.py", line 2012, in resume
    if ret == -1: raise libvirtError ('virDomainResume() failed', dom=self)
libvirt.libvirtError: internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required

I tried disabling everything I can find that pertains to sleep in Windows but the VM keeps pausing. Anybody else run into this?

I’m having issues with huge pages right now: they’re causing my RAM usage to sit at 19 GB without even starting the VM. How would I fix this? I’m using Pop!_OS 20.04.
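
A quick way to see how much of that is the huge page reservation itself: statically reserved huge pages count as used memory whether or not a VM is running. The sketch below multiplies HugePages_Total by Hugepagesize from /proc/meminfo; `hugepage_reserved_mib` is a made-up helper name, and its optional argument is only there so it can be tried on a sample file.

```shell
#!/bin/sh
# Report how much RAM is reserved for huge pages, in MiB, from a
# meminfo-style file. The optional argument (hypothetical, for testing)
# defaults to /proc/meminfo.
hugepage_reserved_mib() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        /^HugePages_Total:/ { total = $2 }     # number of reserved pages
        /^Hugepagesize:/    { size_kb = $2 }   # size of one page in kB
        END { printf "%d\n", total * size_kb / 1024 }
    ' "$meminfo"
}

echo "Reserved for huge pages: $(hugepage_reserved_mib) MiB"

# To release the reservation until the next boot (as root):
#   sysctl vm.nr_hugepages=0
```

If the number roughly matches the "missing" RAM, the reservation is doing what it was told, and the choice is between a smaller reservation and allocating the pages only when the VM starts.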

Any chance we can get the OP updated to include info on how to work with kernel 5.4 since vfio is built in?

Information on how to add options to systemd-boot would be helpful as well.

I did find some of that information here. That helped me get amd_iommu=on added correctly, and I think my gpu is isolated properly.

But passing through my pci-e USB card is not working. I get a “Failed to set iommu for container: Operation not permitted.” error. It mentions the iommu group for my usb card, so I know that is the issue.

I updated AppArmor per this post. But it didn’t seem to help.

syslog does show this error:

Jul 29 17:00:55 monolith kernel: [  169.269934] vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform

So I tried to enable allow_unsafe_interrupts=1 the same way I added amd_iommu=on, but syslog is still logging the same error message after a reboot.

Edit:

Per this Proxmox forum post I was able to add the vfio_iommu_type1.allow_unsafe_interrupts=1 option. I just needed the module name in front, then add it the same way you do amd_iommu=on.
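
The reason the prefix matters, as far as I understand it: when a module is built into the kernel, its parameters are passed on the kernel command line as module.parameter=value. A small sketch to confirm that the options actually made it onto the command line after a reboot; `has_kernel_param` is a name I made up, and its optional second argument is only there for trying it on a sample file.

```shell
#!/bin/sh
# Return 0 if the exact parameter appears as a whole word on the kernel
# command line. The optional second argument (hypothetical, for testing)
# defaults to /proc/cmdline.
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # One word per line, then an exact whole-line, fixed-string match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

for p in amd_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1; do
    if has_kernel_param "$p"; then
        echo "present: $p"
    else
        echo "missing: $p"
    fi
done

# Once the option has taken effect, the kernel should also report it here:
#   cat /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
```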

Now I think my usb card is being passed through, but when I switch monitor inputs to the Windows gpu, I don’t get any output from it. Syslog does show the external hdd I have connected to the usb card being added and removed when I turn on the vm, and/or turn it off.

Hmm…

Edit 2:
So, it seems that part of the no-display issue may have just been Windows not showing anything without some kind of input from a keyboard or mouse. I added a QXL display, and after waiting a good while to be sure it booted, I hit a key on my keyboard, and the login screen showed up in the QXL window. If I switch to the GPU input on my monitor, I can see Windows there as well.

Now if only my USB card was working again…

Can one pass through USB sound card dynamically? For example, I’d prefer not to have two sound cards - but when switching between the VM & host, I’d like to be able to re-assign the sound card. Otherwise, to have audio sometimes working in either, I’d have to have two sound cards and a mixer - unless someone has another idea :slight_smile:

I have a Gigabyte Z390 AORUS Master. My secondary 1080Ti and NVMe controller both have their own IOMMU group, but there is only one entry for USB.

IOMMU Group 0 00:00.0 Host bridge [0600]: Intel Corporation 8th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] [8086:3e30] (rev 0a)
IOMMU Group 10 00:1c.5 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #6 [8086:a33d] (rev f0)
IOMMU Group 11 00:1d.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 [8086:a330] (rev f0)
IOMMU Group 12 00:1f.0 ISA bridge [0601]: Intel Corporation Z390 Chipset LPC/eSPI Controller [8086:a305] (rev 10)
IOMMU Group 12 00:1f.3 Audio device [0403]: Intel Corporation Cannon Lake PCH cAVS [8086:a348] (rev 10)
IOMMU Group 12 00:1f.4 SMBus [0c05]: Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323] (rev 10)
IOMMU Group 12 00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller [8086:a324] (rev 10)
IOMMU Group 12 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (7) I219-V [8086:15bc] (rev 10)
IOMMU Group 13 04:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E12 NVMe Controller [1987:5012] (rev 01)
IOMMU Group 14 05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
IOMMU Group 14 05:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
IOMMU Group 15 07:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 01)
IOMMU Group 16 08:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
IOMMU Group 1 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 0a)
IOMMU Group 1 00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 0a)
IOMMU Group 1 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
IOMMU Group 1 01:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
IOMMU Group 1 02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] [1000:0064] (rev 02)
IOMMU Group 2 00:12.0 Signal processing controller [1180]: Intel Corporation Cannon Lake PCH Thermal Controller [8086:a379] (rev 10)
IOMMU Group 3 00:14.0 USB controller [0c03]: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d] (rev 10)
IOMMU Group 3 00:14.2 RAM memory [0500]: Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f] (rev 10)
IOMMU Group 4 00:16.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH HECI Controller [8086:a360] (rev 10)
IOMMU Group 5 00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10)
IOMMU Group 6 00:1b.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #17 [8086:a340] (rev f0)
IOMMU Group 7 00:1b.4 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 [8086:a32c] (rev f0)
IOMMU Group 8 00:1b.6 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #23 [8086:a32e] (rev f0)
IOMMU Group 9 00:1c.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #1 [8086:a338] (rev f0)

This means I’ll need a separate USB controller to pass through keyboard and mouse, right? I have one PCIe x1 slot available. Suggestions?

edit Some possibly useful links for people using Ubuntu 20.04 or Mint 20
https://mathiashueber.com/pci-passthrough-ubuntu-2004-virtual-machine/ (posted above by @jerrac)
https://heiko-sieger.info/running-windows-10-on-linux-using-kvm-with-vga-passthrough/

edit2 One of the guides above indicates that it’s not necessary to add the NVMe id to the VFIO script. Is there no performance benefit to doing so?

Well… I got as far as seeing ‘Press any key to boot from CD or DVD…’ from the Windows 10 2004 ISO, but it doesn’t respond to keypresses. Tried passing through the USB keyboard and mouse individually as well as passing through the entire xHCI USB controller. I have a second USB controller on the way. It’s the one recommended here.

edit Got the Windows installation running. I’m guessing that adding VirtIO Keyboard is what did the trick. I had just proceeded to the Looking Glass instructions since I knew I was going to want it eventually.

Trying to get Looking Glass working, but Windows 10 complains about the signature for the IVSHMEM driver.

edit Disabled Secure Boot after finding this post and I’m up and running.

Passing a separate USB controller to the guest and using a USB switch to swap the USB DAC between the two seems like the easiest solution.
