VFIO in 2019 -- Pop!_OS How-To (General Guide though) [DRAFT]

sure as long as I can update/repost here? I had a volunteer that was going to do some screengrabs and update the todos, but they disappeared haha.

That’s fine. I won’t be able to write about Looking Glass, though, until I find the proper solution for the security error. The not-recommended workaround was to change:
security_driver = "selinux" to security_driver = "none"
in
/etc/libvirt/qemu.conf

The specific Looking Glass setup guide I followed was this. It did work but I don’t want to write something that isn’t recommended so I need to solve this first. Once I do I’ll have all the information I need.

Several posts here have the workaround for that. It’s just adding the user and the devices in /dev/ to the user’s security context. Basically, the problem is just that normal users don’t have permission to manipulate devices in /dev.
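For the permissions half of that, here is a minimal sketch of one common approach: have systemd-tmpfiles create the Looking Glass shared-memory file with the right owner at boot. The path /dev/shm/looking-glass, the kvm group, and the function name lg_tmpfiles_rule are assumptions for illustration, not something confirmed in this thread, and this does not touch SELinux or AppArmor policy.

```shell
#!/bin/sh
# Print a systemd-tmpfiles rule that creates the shared-memory file with
# read/write access for one user and one group.
# Assumptions: the file lives at /dev/shm/looking-glass and the group
# defaults to "kvm"; adjust both to match your own VM configuration.
lg_tmpfiles_rule() {
    user="$1"
    group="${2:-kvm}"
    printf 'f /dev/shm/looking-glass 0660 %s %s -\n' "$user" "$group"
}

# Example (as root, with a hypothetical user "alice"):
#   lg_tmpfiles_rule alice > /etc/tmpfiles.d/10-looking-glass.conf
#   systemd-tmpfiles --create /etc/tmpfiles.d/10-looking-glass.conf
```

If the security error persists after the file has the right owner, the remaining problem is most likely in the security driver's policy rather than in plain file permissions.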

I’ll investigate that when I find the time. It’s going to be quite a while before I post the guide. School and work have eaten most of my free time. I hardly have any for myself right now.

Thanks for the post. I managed to get Windows 10 Pro working on my Pop!_OS 19.04. When selecting the NVMe drive, I installed nvme-cli using apt to work out which drive corresponds to which /dev/nvmeXnY. Also, when creating the VM in the GUI I forgot to change the hypervisor settings from BIOS to UEFI Secure Boot, so I had to delete the VM and try again.
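With nvme-cli installed, nvme list prints that mapping directly. As a rough alternative that needs no extra package, the sketch below reads the model and serial strings from sysfs. The function name list_nvme and its optional directory argument are made up for illustration, and it assumes the kernel exposes device/model and device/serial under each /sys/block/nvme* entry.

```shell
#!/bin/sh
# List every NVMe namespace with the model and serial reported by sysfs.
# The optional argument is a hypothetical override of the sysfs block
# directory so the function can be tried against a test tree.
list_nvme() {
    blockdir="${1:-/sys/block}"
    for ns in "$blockdir"/nvme*n*; do
        # Skip the unexpanded pattern and entries without a model file.
        [ -e "$ns/device/model" ] || continue
        # sysfs pads these strings with spaces, so trim the tail.
        model=$(sed 's/[[:space:]]*$//' "$ns/device/model")
        serial=$(sed 's/[[:space:]]*$//' "$ns/device/serial" 2>/dev/null)
        printf '/dev/%s\t%s\t%s\n' "${ns##*/}" "$model" "$serial"
    done
}

# Example:
#   list_nvme
```

Matching on the serial number is the safer choice when two drives of the same model are installed.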

Well Wendell, you are a fucking genius! Thank you so much for this. I’ve been waiting for this to happen for most of 2019. I’ve tried it on my PC and it worked. Even with the ACS patch. Of course I had to hide KVM from the VM by adding stuff to the xml file so I could install the Nvidia drivers. Here’s what I added:
On the top line: <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
And at the end of the file above </domain>:

<qemu:commandline>
<qemu:arg value='-cpu'/>
<qemu:arg value='host,hv_time,kvm=off,hv_vendor_id=null,-hypervisor'/>
<qemu:env name='QEMU_AUDIO_DRV' value='pa'/>
</qemu:commandline>

Kudos to you. I’m doing this in Debian 10. Now to P2V my host’s Windows 10 to a VM. You help fight against the Microsoft spies! Thank you.

Uh oh. Things are about to get harder… virt-manager is now EOL/Deprecated, so it might not even make it into 20.04 LTS.

So, @Wendell, the guide needs to be updated with something like the web server based Cockpit.

So as a total noob to this stuff I’ve got one question before I dive into this:

If I am using a laptop with integrated graphics and a 1070, and I pass the 1070 through to the Windows VM, will I be able to use the 1070 in Linux when the Windows VM is not open or active? Or will I have to rebind the 1070 to Linux every time I want to use it in Linux?

Thanks!

I usually set up two boot menu entries and do it that way. You can release/bind things dynamically, but the nv driver is often unforgiving.
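For reference, dynamic release/bind goes through the kernel's sysfs PCI interface. This is a minimal sketch, assuming the standard driver/unbind, driver_override and drivers_probe files. The function name rebind_pci and the SYSFS_PCI variable are made up for illustration, and a GPU driver may still refuse to let go cleanly, which is why the two-boot-entry approach is the safer default.

```shell
#!/bin/sh
# Move one PCI device to a different driver at runtime.
# Usage (as root): rebind_pci 0000:01:00.0 vfio-pci
# SYSFS_PCI is a hypothetical override, useful only for dry runs.
rebind_pci() {
    addr="$1"
    driver="$2"
    base="${SYSFS_PCI:-/sys/bus/pci}"
    dev="$base/devices/$addr"
    [ -d "$dev" ] || { echo "no such device: $addr" >&2; return 1; }
    # Detach from whatever driver currently owns the device, if any.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi
    # Pin the next driver, then ask the PCI core to probe the device again.
    echo "$driver" > "$dev/driver_override"
    echo "$addr" > "$base/drivers_probe"
}
```

Both functions of the card (video and HDMI audio) need the same treatment, and nothing on the host may be using the GPU when it is unbound.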

So the best bet is to set up two boot menu entries and manually rebind the GPU on boot?

I have a quick question - hopefully it’s okay to post it here.

I’ve set up a VM with GPU passthrough. On boot, my guest GPU and its HDMI audio component are both grabbed by vfio-pci. Now, completely ignoring the VM side entirely: while I’m using my Linux host alone, I’ve noticed that there is a lot of hot air coming out of the back of my guest GPU (5700 XT) - while my host GPU (RX 550) is totally cool. The host is definitely using the RX 550, and the 5700 XT is definitely using vfio-pci.

Is this normal? I’m a little bit concerned that it might be running at too high a temperature or drawing too much power. Surely if the card is bound to vfio-pci and the VM isn’t even running (and hasn’t even been started on the current boot!), then the card should be totally idle and cool?

If I run sensors, it only shows the temperature of my CPU and RX 550. Presumably it doesn’t show the 5700 XT since it’s not using the amdgpu driver, but I don’t know. This means I can’t actually check what temperature it’s running at. I just know there’s a lot of hot air coming out of it.

Any advice? Thanks.

I need to ask. This is a bit on topic and a bit off topic, but this seems like the best place to ask. I’m looking to pass through not a GPU but an LSI HBA card, and I’d like to know if the same vfio-pci script can be used to free the HBA from the host. I can’t blacklist the driver because there are three of these cards in the server, and the other two are needed on the host.

If not I’ll try and find a solution somewhere.

Also @wendell I have not forgotten about our agreement. I’m currently starting my own version of this guide. You will be free to pull whatever information you find useful in mine when it’s done.

Hi,

thanks for this guide, @wendell!

I would like to use my PopOS 19.10 as a host for Win10 and macOS. My HW Specs are:

  • Intel i8700
  • Gigabyte z370 AORUS Gaming 7
  • Sapphire AMD RX580
  • MSI AMD RX570
  • PopOS 19.10

So, the 570 and the 580 do have the same id:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev e7)
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
02:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
02:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]

I would like to stick with systemd, so I set:

sudo kernelstub -a 'intel_iommu=on'
sudo kernelstub -a 'iommu=pt'
sudo kernelstub -a 'iommu=1'

and ran sudo update-initramfs -u. Then I followed this guide, added the bind_vfio script and set the right permissions on it:

/etc/initramfs-tools/scripts/init-top/bind_vfio.sh:

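#!/bin/sh
PREREQS=""
# PCI addresses of the GPU and its HDMI audio function
DEVS="0000:01:00.0 0000:01:00.1"
for DEV in $DEVS; do
    # Mark the device so that only vfio-pci may claim it
    echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
done

# Load vfio-pci so it binds the devices marked above
modprobe -i vfio-pci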

After that, I added the vfio-pci to the initramfs-tools/modules (only the line vfio-pci, no softdep or anything else).
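One thing worth checking in a setup like this: initramfs-tools boot scripts are expected to answer a prereqs query before doing any real work, and a script without that guard may run its body at unexpected times. Below is a sketch of the same logic with the guard added. The function wrapper bind_vfio and the SYSFS_PCI override are assumptions added so the logic can be tried outside the initramfs; they are not part of the original guide.

```shell
#!/bin/sh
# Sketch of an init-top script body that follows the initramfs-tools
# convention of replying to "prereqs" before touching anything.
PREREQ=""
DEVS="0000:01:00.0 0000:01:00.1"   # addresses taken from the post above

bind_vfio() {
    # The tooling calls each script with "prereqs" to work out ordering;
    # print the (empty) list and stop so nothing else happens.
    if [ "$1" = "prereqs" ]; then
        echo "$PREREQ"
        return 0
    fi
    for dev in $DEVS; do
        # SYSFS_PCI is a hypothetical override used only for dry runs.
        override="${SYSFS_PCI:-/sys/bus/pci}/devices/$dev/driver_override"
        if [ -e "$override" ]; then
            echo "vfio-pci" > "$override"
        else
            echo "bind_vfio: $dev not found" >&2
        fi
    done
}

# In the installed script the last lines would be:
#   bind_vfio "$@"
#   modprobe -i vfio-pci
```

The not-found message also helps confirm that the addresses in DEVS match what the kernel sees that early in boot.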

$ lsinitramfs /boot/initrd.img-5.3.0-7625-generic |grep vfio

scripts/init-top/bind_vfio.sh
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/mdev
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/mdev/mdev.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/mdev/vfio_mdev.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/pci
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/pci/vfio-pci.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/vfio.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/vfio_iommu_type1.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/vfio_virqfd.ko

After all this, lspci -nnv still outputs:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev e7) (prog-if 00 [VGA controller])
Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 570/580 [1da2:e366]
Flags: bus master, fast devsel, latency 0, IRQ 151
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, prefetchable) [size=2M]
I/O ports at e000 [size=256]
Memory at efe00000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities:
Kernel driver in use: amdgpu
Kernel modules: amdgpu

01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
Subsystem: Sapphire Technology Limited Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1da2:aaf0]
Flags: bus master, fast devsel, latency 0, IRQ 155
Memory at efe60000 (64-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

So it seems the kernel driver will not change to vfio-pci. Am I missing something? Would really appreciate your help!

Edit: I’m not sure if this is important with passing the pci addresses, but both GPUs are in the same IOMMU Group:

IOMMU Group 1 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
IOMMU Group 1 00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 07)
IOMMU Group 1 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev e7)
IOMMU Group 1 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
IOMMU Group 1 02:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
IOMMU Group 1 02:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
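A listing like the one above can be produced by walking /sys/kernel/iommu_groups. Here is a minimal sketch; the function name list_iommu_groups and the IOMMU_ROOT override are made up for illustration. It matters for passthrough because every device in a group (bridges aside) generally has to be handed to vfio-pci together, so two GPUs sharing one group cannot normally be split between host and guest.

```shell
#!/bin/sh
# Print "IOMMU Group N <pci-address>" for every grouped device.
# IOMMU_ROOT is a hypothetical override for trying this on a test tree.
list_iommu_groups() {
    root="${IOMMU_ROOT:-/sys/kernel/iommu_groups}"
    for d in "$root"/*/devices/*; do
        # Skip the unexpanded pattern when no groups exist.
        [ -e "$d" ] || continue
        addr="${d##*/}"            # last path component, e.g. 0000:01:00.0
        group="${d%/devices/*}"    # strip "/devices/<addr>"
        group="${group##*/}"       # keep only the group number
        printf 'IOMMU Group %s %s\n' "$group" "$addr"
    done
}

# For descriptions like those in the post, each address can be passed to:
#   lspci -nns "$addr"
```

An empty result usually means the IOMMU is not enabled in firmware or on the kernel command line.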

Thanks!

Run lsinitramfs and confirm that the initial ramdisk contains the vfio driver. If it does, then the script is either not running or not running early enough.

Check logs and see if you see output from the script

Thanks for your reply!

lsinitramfs looks like this:

lsinitramfs /boot/initrd.img-5.3.0-7625-generic |grep vfio
scripts/init-top/bind_vfio.sh
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/mdev
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/mdev/mdev.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/mdev/vfio_mdev.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/pci
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/pci/vfio-pci.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/vfio.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/vfio_iommu_type1.ko
usr/lib/modules/5.3.0-7625-generic/kernel/drivers/vfio/vfio_virqfd.ko

Unlike in your output, the script is listed at the top and not at the bottom. Does this have an impact? If so, do you know if there is a way to change the order?

Edit: looks like there is no output in any log :frowning: What am I missing that keeps the script from being executed?

Put an echo hello world in the script to see if it runs.

Check that the device paths are what you expect, e.g.

/sys/bus/pci/devices/$DEV/driver_override

except that $DEV is the address of your GPU.
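A plain echo that early in boot can be easy to miss, so one option is to write the marker to the kernel log, where it shows up in dmesg afterwards. This is only a sketch, assuming /dev/kmsg is writable at the point where the script runs. The function name log_kmsg, the bind_vfio tag, and the KMSG override are made up for illustration.

```shell
#!/bin/sh
# Write a tagged line to the kernel log so it can be found after boot.
# KMSG is a hypothetical override; the real target is /dev/kmsg.
log_kmsg() {
    echo "bind_vfio: $*" > "${KMSG:-/dev/kmsg}"
}

# Inside the boot script:
#   log_kmsg "hello world"
# After boot:
#   dmesg | grep bind_vfio
```

If the tag never appears in dmesg, the script is probably not being executed at all, which points at permissions or at how the initramfs was built rather than at the device paths.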

Hm, I can’t really tell if there is any log output, since it looks like I’m too dumb to find out :wink: I changed the boot flag loglevel to 7 but there is no output from the script at all.

BUT: I have no idea what I changed, but the /sys/bus/pci/devices/$DEV/driver_override files now contain vfio-pci (before, they always contained (null)). Unfortunately lspci -nnv still reports amdgpu as the kernel driver :-/

Edit: okay, it seems I found the issue: the IOMMU groups are causing problems. Both GPUs and the third PCI controller are in the same group. I just installed the ACS Override Patch and now vfio-pci is loaded as the kernel driver.
Since the ACS patch is not the best solution, I’m wondering whether there is anything else I can do?
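For a quick check that does not involve reading through lspci output, the bound driver can be read from the device's driver symlink in sysfs. A minimal sketch; the function name bound_driver and the SYSFS_PCI override are made up for illustration.

```shell
#!/bin/sh
# Print the name of the driver bound to a PCI device, or "none".
# SYSFS_PCI is a hypothetical override for trying this on a test tree.
bound_driver() {
    link="${SYSFS_PCI:-/sys/bus/pci}/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; keep its last part.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example:
#   bound_driver 0000:01:00.0
#   bound_driver 0000:01:00.1
```

Comparing this with the contents of driver_override shows whether the override was set but the probe happened too late, or whether the override was never written.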

Alright @wendell, sorry for making you wait 5 months but I’m a man of my word.

VFIO GPU Pass-though w/ Looking Glass KVM on Ubuntu 19.04

If you think there’s anything worth copying from it (steps/images) for your guide, you’re welcome to it. The only thing I ask: I guarantee there are little imperfections (misinformation, poor elaboration, etc.) in mine, so when you spot things that need fixing, do let me know so I can make the appropriate corrections.

:slight_smile:

Is this still a draft?

It’s mostly fine, actually. I have updated it as new info has come in. :smiley:
