VFIO/Passthrough in 2023 - Call to Arms

I don't have anything to actively contribute to this, but I want to add to the pool of people interested in setting up VFIO gaming ('nix host, Windows guest), because (1) it's something really cool, and (2) I have enough to deal with in Windows 10's data gathering as it is; I don't want to have to deal with whatever Windows 11 has baked into it.

Here is my setup.

CPU: AMD Ryzen 9 7950X 16-Core Processor
RAM: 128GB@6000MT/s (Kingston FURY Beast DDR5-5600 EXPO 2x32GB, Kingston FURY Beast DDR5-6000 EXPO 2x32GB)
GPU: ZOTAC GAMING GeForce RTX 4070 Ti Trinity (12GB)
Main Board: MSI MPG X670E CARBON WIFI (MS-7D70)
Storage: 2x Intel SSD 670p M.2 NVMe 1TB, WD 8TB SATA(WD80EDAZ), WD 14TB Ultrastar DC HC530 SATA
Network: 10Gb Intel x540-T2 (SR-IOV Enabled)
PSU: Corsair RM750x

The host is mainly used as a NAS server backed by ZFS. On top of the ZFS storage, it hosts a couple of LXD containers and VMs. The RTX 4070 Ti is passed through to one of the VMs. Everything is backed up via ZFS. I don't use mirrors, but I do have backups: if anything fails, I just recover from the backup. Not a big deal for me.

Host: Ubuntu Server 23.04
Guest:

  1. Windows 10 Pro for gaming and working.
  2. Ubuntu container, as Plex Server.
  3. Ubuntu container, as Unifi Controller.
  4. Ubuntu container, as Web Server.

VM performance.

I appended my scripts to this post for anyone interested in VFIO with Radeon 7000 series cards or virtualization with the 7950X3D.

You would get better PCIe throughput to the GPU if you were running a Q35 system. NVIDIA drivers change some parameters based on the reported PCIe bus width and speed, but i440FX predates PCIe, so the GPU shows up as a plain PCI device, which has no concept of bus width or speed.

There is an entire discovery thread on here from a few years ago where this was proven, and the feature to pass the bus parameters to the guest was added to QEMU.

Edit: Increasing VFIO VGA Performance - #58 by gnif
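For anyone curious what link parameters the host actually sees for the card (the values the guest can inherit on a Q35 machine), they can be read straight from sysfs. A quick sketch using standard PCI sysfs attributes:

```shell
# Print the PCIe link speed/width each PCI device currently reports
for dev in /sys/bus/pci/devices/*; do
    [ -r "$dev/current_link_speed" ] || continue
    printf '%s: %s, x%s\n' "${dev##*/}" \
        "$(cat "$dev/current_link_speed")" \
        "$(cat "$dev/current_link_width")"
done
```

Comparing the same attributes inside the guest shows what the VM was told; on i440FX there is nothing to compare, since the device appears as plain PCI.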

Good point. The software stack has a long history going all the way back to my Z87 Haswell 4770K system; I've upgraded the hardware over the years. The Windows VM was created a long time ago, too. Back then, I just passed through the Intel iGPU so I could get display output directly. Yes, i440FX is outdated now. If I were doing it today, I would choose Q35.

Even so, I don't notice any PCIe bandwidth issue, and Resizable BAR is enabled.

This is awesome. I just got into this after being a long-time VMware user. I have VFIO mostly working on an i9, with an NVIDIA 2080 as the main card and an NVIDIA 1050 for GPU passthrough. I'm building it basically for Adobe products and the Office 365 suite of apps, including the need for Microsoft Teams. My only issue at the moment: once I shut down the VM, I cannot start it again and have the GPU come up; it requires a full reboot of host and guest. It works once, but I do not dare stop the VM to add a device, because then a reboot is needed. Other than the "use it once" issue, it seems to be working well on Garuda Linux. I just wish I could shut down the VM when I do not need it and start it again without a reboot.

I had the same issue when doing passthrough with a GTX 1070. The workaround I used was removing the device from the PCI bus via sysfs, then triggering a rescan to reattach it. I had to do this before every boot of the VM. I think I found the method on the Arch Wiki. Maybe it will work for you?

The script I used:

reattach-vfio-devices
#!/bin/sh

# Edit these to point to the appropriate addresses on your system
gpuAddress='0000:09:00.0'
gpuAudioAddress='0000:09:00.1'

devices="$gpuAddress $gpuAudioAddress"

for dev in $devices; do
    devpath="/sys/bus/pci/devices/$dev"
    if test -d "$devpath"; then
        printf 'Removing device: %s ... ' "$dev"
        echo 1 | sudo tee "$devpath/remove" >/dev/null
        sleep 2
        echo "done"
    else
        echo "error: $dev not found in device tree" >&2
    fi
done

echo 'Rescanning PCI bus'
echo 1 | sudo tee /sys/bus/pci/rescan

If removing the "GPU audio device" stalls and throws errors in the kernel log, try reattaching just the main GPU device.

Awesome, thank you, I will try this. I am not passing through the audio part of the NVIDIA 1050. Even though it shows as using the VFIO driver, when the audio part of the card is added to the VM, it breaks Looking Glass. I am fine without it if I can just stop the need for rebooting the whole setup. I will try this tonight if possible. Thank you so much.

Care to share how it breaks it? It should not be possible.

There might be a bug involving vfio-pci, Linux kernel 6.x, and the Ryzen 7000 iGPU.
The issue: when you load the vfio-pci module for the dGPU, it trashes the frame buffer of the iGPU, so there is no more video output from the iGPU after vfio-pci is loaded.
It doesn't happen with Linux kernel 5.x, and it doesn't happen with other GPUs either. The simple way to fix it is to put in another GPU (e.g. a cheap R7 240) for the host instead of using the Ryzen 7000 iGPU.
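For reference, the usual way to keep the host driver off the dGPU entirely is an early vfio-pci bind via modprobe configuration. This is just a sketch; the vendor:device IDs below are placeholders (find yours with `lspci -nn`), and the `softdep` line should name whichever host driver would otherwise claim the card:

```text
# /etc/modprobe.d/vfio.conf  (example -- IDs are placeholders)
options vfio-pci ids=10de:2782,10de:22bc
softdep nvidia pre: vfio-pci
softdep amdgpu pre: vfio-pci
```

After editing, rebuild the initramfs so the options take effect at boot.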

I've been rocking a Ryzen 9 5900X with 64 GB of RAM, a 6800 XT for the host, and a 3090 for passthrough. I wanted to say thank you to all those who have contributed and helped make this possible for mere mortals. Huge shoutout to developers like gnif. Looking Glass is amazing; I would happily pay for it, and I look forward to supporting the project further.

I recently fell into the k8s rabbit hole again and found KubeVirt... then I said, "hmm, this seems complex but also cool, let's mess around."

I took a look at Rancher Labs' Harvester a year or so ago and found it interesting, but I decided to stick with the tried-and-true Proxmox, and I couldn't be happier with my two Proxmox nodes. Still, I wanted to try Harvester to see if I could really manage VMs with Kubernetes and do things like VFIO with it. So, for science, I bought a Minisforum HX99G and got to tinkering.

It's pretty amazing, especially if you use Rancher to deploy k8s clusters onto Harvester as a "cloud target". As far as VFIO goes, they make adding PCIe devices incredibly easy and simple. What I haven't figured out is why my VM fails to boot when I attach the 6600M.

I'm sure it's user error, and adding k8s on top of everything adds another layer of complexity. All that being said, VFIO is awesome, and I have enjoyed the past couple of years of learning and tinkering and look forward to more! Thanks, everyone!!

Sure, if I can. Basically, running with just the GPU, life is fine; everything works as it should for the most part. Once I add the NVIDIA audio portion of the card and start the VM up, I get a broken-glass icon in the upper-right section of the screen. Looking Glass still shows the login window, but the keyboard and mouse no longer work. If I remove the audio section of the card, it works again.

Both the GPU and audio are excluded and show VFIO as their driver if I list them with lspci -nnv (I think that was the command I used; I am at work right now and cannot access my home PC). As long as I do not add the NVIDIA audio, life is pretty good. Again, though, once I add it, I get what looks like a broken-glass icon in the upper right of the screen and I can't log in using Looking Glass.
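To double-check the binding without relying on lspci, the driver symlinks in sysfs show the same thing. A small sketch:

```shell
# Print each PCI address and the kernel driver currently bound to it
for dev in /sys/bus/pci/devices/*; do
    [ -e "$dev" ] || continue
    drv='(none)'
    [ -L "$dev/driver" ] && drv=$(basename "$(readlink "$dev/driver")")
    printf '%s  %s\n' "${dev##*/}" "$drv"
done
```

Both the GPU function and its audio function (e.g. 09:00.0 and 09:00.1) should report vfio-pci before the VM starts.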

That icon is to let you know that the LG host application can't capture anything and/or start, and you have fallen back to SPICE video. In this mode, input only works if you press Scroll Lock to capture it; this is a failure-recovery feature.

You need to check why the GPU is not working; have a look in Device Manager.


Thank you, I will check that tonight and see what's up. Very odd. So for now I just do not add the audio from the NVIDIA card; haven't really needed it anyway. ;-) Thanks again, I will check this. I got Microsoft Teams working; the audio is off from the video by a second, but tolerable. Works really well other than that.

When I add the audio portion of the 1050 GPU, I get the dreaded Code 43 on the video side, but the audio device installs fine. Once I remove the audio part of the card, it goes back to loading fine.
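One thing commonly recommended for a Code 43 that only appears when the audio function is added: both functions of the card usually sit in the same IOMMU group and are passed through together. A small sketch to list the groups on the host:

```shell
# List each IOMMU group and the devices it contains
for d in /sys/kernel/iommu_groups/*/devices/*; do
    [ -e "$d" ] || continue
    grp=${d%/devices/*}
    printf 'group %s: %s\n' "${grp##*/}" "${d##*/}"
done
```

If the GPU and its audio function share a group with unrelated devices, that can also complicate passthrough.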

Well, I have a new desktop... and so far the VFIO passthrough is a fail...

AMD R7 7700
b650e taichi motherboard
RX 6900 XT (host)

I'm trying to pass through the iGPU from the APU (integrated graphics passthrough), but I get a Code 43 error. This is an all-AMD system. I'm sure discrete GPU passthrough would be fine, but I don't want to add another GPU when I already have one built into the CPU.

I haven't tried anything with GVT-d, but I don't even know if that is possible on an AMD APU. It works on my Intel laptop.

No glory here, darn it. But thank you for the script and suggestion.

I've drilled down this rabbit hole a bit more in recent days and found a couple of surprises, so here's a quick update for folks interested in this topic.

AVIC has two parts: the so-called "SVM AVIC" (related to CPU/software interrupts) and "IOMMU AVIC" (interrupts from devices).

Now the bizarreness: Zen 2 and Zen 3 do not have "IOMMU AVIC" support in hardware (!!). AMD has remained silent about it for years. Zen 1 and Zen+ have both "SVM AVIC" and "IOMMU AVIC" support in hardware.

I'm not sure about Zen 4, and I'm curious about it. I wonder what people get from 'dmesg' in Linux. Here is an example output from Zen 2:

$ dmesg|grep AMD-Vi
[    0.151457] AMD-Vi: Using global IVHD EFR:0x0, EFR2:0x0
[    0.730724] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.731501] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.731502] AMD-Vi: Extended features (0x58f77ef22294a5a, 0x0): PPR NX GT IA PC GA_vAPIC
[    0.731505] AMD-Vi: Interrupt remapping enabled

It's from a Ryzen, but I expect EPYC is the same (I look forward to hearing more surprises).

Software-wise, it also requires support from the guest OS. Windows 10/11 seems fine with "SVM AVIC", but it's not clear about "IOMMU AVIC". Linux seems fine with both. macOS is not okay with either, unsurprisingly.

My machine spends most of its uptime in a macOS VM, so it seems I've been chasing thin air. LOL
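Incidentally, whether KVM actually enabled the SVM-AVIC half can be read from the kvm_amd module parameter (note that avic has defaulted to off on many kernels and may need kvm_amd.avic=1 to enable). A sketch:

```shell
# Report the kvm_amd AVIC module parameter (Y/1 = enabled)
p=/sys/module/kvm_amd/parameters/avic
if [ -r "$p" ]; then
    printf 'kvm_amd avic: %s\n' "$(cat "$p")"
else
    echo 'kvm_amd not loaded (or not an AMD system)'
fi
```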

I have set mitigations=off in the GRUB (kernel) boot parameters, and I think I gained some performance while running VFIO and my Windows VM. What are the implications of this? Do VMs benefit as well, or am I making things up?

I use the PC as workstation only, there are no processes running which belong to other users.
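For context on what mitigations=off actually disables, the kernel exposes a per-vulnerability status on the host. A quick sketch to list it:

```shell
# Show the kernel's mitigation status for each known CPU vulnerability
for f in /sys/devices/system/cpu/vulnerabilities/*; do
    [ -r "$f" ] || continue
    printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
done
```

With mitigations=off, most entries read "Vulnerable". The guest OS applies its own mitigations independently of the host's setting, so the two are separate knobs.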

I've been running with "mitigations off" too. For years now...
