VFIO/Passthrough in 2023 - Call to Arms

My first post here, but I'm a pretty long-time VFIO user. I built a Team Red system early this year but hadn't gotten VFIO working until recently. I was previously using an Intel Skylake CPU and an RX 580, which is where I got most of my VFIO experience.

Might also be worth mentioning that I mostly use VFIO to mess around with other Linux distributions, so it's possible I'm missing some Windows nuances. Of particular note, my interest in getting VFIO working again was renewed because of the Plasma 6 beta, and managing 4+ distros in a multi-boot scenario is annoying.

I built the following system:
AMD 7950x
ASUS B650 PRIME PLUS
32GB memory
PowerColor RX 7900XT
Fresco Logic FL1100 USB 3.0 Host Controller (PCIe USB expansion card)
Gentoo Linux as the host OS, 6.6.5-gentoo-dist (sys-kernel/gentoo-kernel), OpenRC

I haven’t tried to get resizable bar working yet, so I have Above 4G Decoding Enabled and Resizable BAR disabled in my BIOS. Otherwise my BIOS setup was pretty standard (SVM, IOMMU, etc). I have an iGPU on my 7950x so I switch that to my primary display when I want to boot in VFIO mode.
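For reference, the IOMMU-related kernel parameters that typically go along with those BIOS settings look something like this. This is a sketch, not my exact cmdline, and the vfio-pci IDs are placeholders (get real ones from `lspci -nn`):

```shell
# /etc/default/grub - common AMD VFIO kernel parameters (sketch).
# amd_iommu is usually on by default; iommu=pt enables passthrough mode.
# The vfio-pci IDs below are placeholders, not this card's real IDs.
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt vfio-pci.ids=1002:aaaa,1002:bbbb"
```

Remember to regenerate your grub config (`grub-mkconfig -o /boot/grub/grub.cfg` on Gentoo) after editing.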

These are the key adjustments I made that seemed important to getting it working and that weren't exactly straightforward:

  • I've never had luck with the q35 chipset on the old Skylake system, and that seems to be the case on this system as well. I tried q35 initially for a long time but got no results until I switched to i440FX. The guest would initialize but the hardware wouldn't.
  • Disable ROM BAR on the Graphics/VGA portion. libvirt threw errors when trying to start a VM that it couldn’t get a ROM for 03:00.0
  • My 7900XT has four "components" and I couldn't get things working until all of those components were bound to vfio-pci. The graphics and audio portions were easy to bind, but two other components were getting bound to i2c_designware_pci and xhci_pci. I made some kernel config adjustments so those drivers are compiled as modules instead of built-ins, so vfio-pci could bind to those components without my having to resort to shell-scripting shenanigans in my initramfs.
  • Some guests seem like they may be sensitive to the defined CPU topology. I was having horrible micro-freezes in an Arch Linux guest that seemed to go away after defining the topology more accurately.
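For anyone hitting the same binding problem: once those drivers are built as modules, one way (a sketch, not my exact config) to keep them off the card is a modprobe.d file with softdeps, so vfio-pci claims the devices first. The IDs below are placeholders for the card's four functions; substitute the real vendor:device pairs from `lspci -nn`:

```shell
# /etc/modprobe.d/vfio.conf - sketch; replace the placeholder IDs with the
# vendor:device pairs of all four functions of your card (lspci -nn).
options vfio-pci ids=1002:aaaa,1002:bbbb,1002:cccc,1002:dddd

# Make vfio-pci load before the drivers that would otherwise grab the card.
softdep amdgpu pre: vfio-pci
softdep snd_hda_intel pre: vfio-pci
softdep i2c_designware_pci pre: vfio-pci
softdep xhci_pci pre: vfio-pci
```

If your initramfs loads these modules, the same options and softdeps need to be present there as well.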

Related to the ROM BAR error, this was what I saw in the libvirt logs when I had that option enabled:

2023-12-07T18:28:33.359928Z qemu-system-x86_64: vfio-pci: Cannot read device rom at 0000:03:00.0
Device option ROM contents are probably invalid (check dmesg).
Skip option ROM probe with rombar=0, or load from file with romfile=
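If you're editing the libvirt XML directly rather than toggling it in virt-manager, disabling the ROM BAR is, as far as I can tell, just a `<rom>` element on the hostdev. The address below is an example matching the 03:00.0 in the error, not necessarily yours:

```xml
<hostdev mode="subsystem" type="pci" managed="yes">
  <source>
    <address domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
  </source>
  <!-- equivalent of qemu's rombar=0 -->
  <rom bar="off"/>
</hostdev>
```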

Related to the CPU topology thing, this is what I added to my xml that seems to have been the trick:

  <cpu mode="host-passthrough" check="none" migratable="on">
    <topology sockets="1" dies="1" cores="10" threads="2"/>
    <feature policy="require" name="topoext"/>
  </cpu>

I’m using 10c/20t of my 16c/32t which is how I got those numbers.

Not sure if it was the topoext feature or the topology definition itself. I think by default libvirt was treating each "thread" as its own CPU, as if I had 20x 1c/1t CPUs.
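Related to that, pinning the guest vCPUs to specific host threads is supposed to help with micro-stutter too. A minimal sketch (the host CPU numbers here are placeholders; check `lscpu -e` for your core/thread pairs, since SMT siblings aren't numbered the same way on every system):

```xml
<vcpu placement="static">20</vcpu>
<cputune>
  <!-- pin each vCPU to a host thread; the cpuset values are placeholders -->
  <vcpupin vcpu="0" cpuset="0"/>
  <vcpupin vcpu="1" cpuset="16"/>
  <vcpupin vcpu="2" cpuset="1"/>
  <vcpupin vcpu="3" cpuset="17"/>
  <!-- ... one vcpupin per vCPU, paired so SMT siblings stay together ... -->
</cputune>
```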

Related to the components bullet: I found that libvirt logs at least a portion of the output when starting a VM to /var/log/libvirt/qemu/[VM name].log. Sometimes the errors displayed in virt-manager aren't particularly useful, but for example I saw this error in one of those log files:

2023-02-27T22:18:41.369723Z qemu-system-x86_64: vfio: Cannot reset device 0000:03:00.0, depends on group 15 which is not owned.

That's what helped me determine that I actually needed to worry about the stuff getting bound to i2c_designware_pci and xhci_pci, even though they were in separate IOMMU groups from my graphics and audio. Of course, "group 15" refers to the IOMMU group.
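For anyone who needs to check which devices share a group, the usual loop is something like this (harmless to run; it prints nothing if no IOMMU groups are exposed). The function name is mine, just so it can take a test directory:

```shell
#!/bin/sh
# list_iommu_groups [SYSFS_ROOT] - print "IOMMU group N: ADDRESS" for every
# PCI device; prints nothing if no groups are exposed (IOMMU off/unsupported).
list_iommu_groups() {
  root=${1:-/sys/kernel/iommu_groups}
  for d in "$root"/*/devices/*; do
    [ -e "$d" ] || continue            # glob matched nothing
    g=${d%/devices/*}                  # ".../iommu_groups/15"
    printf 'IOMMU group %s: %s\n' "${g##*/}" "${d##*/}"
  done | sort -V
}

list_iommu_groups
```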

Finished my ASRock B650 Live Mixer build with an AMD 7950X3D, 96GB RAM, AMD RX 6600, and GeForce 3080 Ti, running Proxmox. Initial impressions are a bit mixed.

Unfortunately I couldn't pass through any onboard USB controller besides group 24 (one row of USB 2.0 ports and another of USB 3.0, plenty for me); anything else will instantly reboot the host. It doesn't help that 80% of the internal devices have no device descriptor at all :confused: Devices in other ports can still be accessed and grabbed via USB port passthrough. So I had to forgo the 10GbE NIC and added a Fresco Logic USB 3.0 card instead. The mechanical HDDs in my storage PC limit the transfer speed anyway, so 2.5GbE is still plenty. Boot-up takes a long time, one to two minutes total after enabling IOMMU and the second GPU. A lot of that time is spent booting the host OS; I'll dig through dmesg to find out if anything is stalling. It's a fresh install with nearly no modification.
UEFI display is a bit buggy: set to the iGPU, it will display a blinking cursor unless the screen is switched off and on, but the GRUB bootloader and login prompt show normally without intervention.

In the VM, latency is reduced and the system feels less congested during data transfers. This is especially noticeable in VR.

Proxmox Build

I opted for a Proxmox build as I found it easier to pass my single GPU. I found most guides for KVM/QEMU hooks to be Nvidia-only or dated.

Specs

Category  Part
GPU       XFX Speedster SWFT 210 RX 6600
CPU       R9 7950X
Mobo      X670E Taichi
RAM       G.Skill Trident Z5 Neo RGB (2x32GB) DDR5-6000 CL30
PSU       EVGA SuperNOVA 850 G7

Software

  • Proxmox 8.1.3
    Using Proxmox as a host, I passed the GPU to a Windows 11 virtual machine. I also had success passing the GPU to a MacOS Sonoma virtual machine.
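For context, passing the GPU in Proxmox boils down to a hostpci line in the VM config. The VM ID, PCI address, and flags below are illustrative, not copied from this build:

```shell
# /etc/pve/qemu-server/101.conf - sketch; 0000:03:00 is a placeholder address.
# Passing "03:00" without a function number passes all functions of the device.
hostpci0: 0000:03:00,pcie=1,x-vga=1
```

Note that pcie=1 requires the q35 machine type, and x-vga=1 marks the card as the VM's primary display.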

Image of PCI configuration for the Windows 11 VM.

Motherboard

  • Disable resizable bar
  • Use internal GPU
  • Running the RAM speed on auto, which sets it to 4800; I have not experimented with changing this.

What Does Not Work

iGPU passthrough at the moment causes the system to crash.

Resources

Subbed! Can you share them? :smiley:

I’m getting really annoying crashes passing through my new 4090 to a Windows gaming VM. No reset bug though, which is a big upgrade from my Radeon card.
B650I Aorus Ultra, 96GB RAM@6400 (stable for 48 hours of memtest86), 7950X, running NixOS. Guest receives CCD1 and 64GB memory, and of course the GPU.

What happens is, after between fifteen minutes and 24 hours, the system will just reset. Screen blanks, fans max out, then the computer POSTs normally. journalctl is empty, and dmesg --follow and journalctl --follow over ssh also report no errors within the last couple minutes. Nothing particularly odd appears to happen in either host or guest before a reset, nor does there seem to be a pattern in what I do (it's happened overnight when the computer was just idling, mid-game in Starfield and Baldur's Gate 3 a few times, when I've been watching video streams, and in Solidworks).
I’d read that booting with pci=noaer could help for situations such as these, but it hasn’t. System is otherwise stable for weeks at a time if I don’t run the VM, including activities like gaming (on Linux). But if I boot the Windows guest, the computer eventually crashes. The computer won’t reset if the guest isn’t running, if the guest shuts down it seems to be just as stable as if it had never been started in the first place.

Is this the only RAM stability test you have done? If yes, first thing I’d try is stock RAM settings, because

  1. Memtest is not the most stressful test. The RAM OC people seem to use Karhu, HCI MemTest, etc.
  2. RAM that runs stable on its own may not be stable anymore when a 4090 is dumping 500W of heat into the case.
  3. I myself experienced worse stability in VMs than on native Linux (including while gaming) on AM5, which I could solve by going back to stock settings.

I have tried running the memory at stock speed, it still crashes.

WHEA errors in Windows? Is the CPU at stock too?

I googled this and think I can figure out how to check it. Can I do so in the VM, or do I need to install and run Windows on bare hardware?
CPU is at stock, but stock for this motherboard means PBO on. I can lock the frequency to something definitely safe, or set a positive curve optimiser, and check if it crashes, but is the CPU likely to be the issue when Linux runs stable on the same hardware for weeks at a time? Uptime is only limited by when I feel like an update and reboot is in order. I never get random crashes like this; when I get kernel panics, it's always because I just did something stupid. It's only when I pass the GPU through to the VM that I get these crashes. Without the VM it seems entirely stable, as does the non-passthrough version of this same VM.

You should do it in the VM. WHEA shows hardware/kernel/etc. errors in Windows (similar to dmesg, I guess). So it's a way to check whether Windows detected an issue at or before the reset.

I’d give it a try with CPU completely stock. I feel like VMs can be less stable than bare metal, even if the bare metal has been stress tested. At least that has been my anecdotal experience. I never had the time to scientifically test this though.

Put a big box fan pointed at the side of your computer with the side panel off? I have had wild issues with memory running at around 50°C with no apparent cause except "well, maybe one memory chip is a bit warm".

Nope, nothing. It's the same on Linux: the screen just instantly goes black without warning. Logs don't save anything useful to disk, and monitoring with --follow over ssh also doesn't manage to grab anything.
CPU at stock settings.

Memory sits around 55 degrees with a 40mm fan strapped on with rubber bands and the panels on. Without the fan it shoots up to 70, which is supposed to be fine, but scares me. The sticks are rated for 95 degrees, which just seems insane. Even 55 is a lot, though, I agree. I'll take the cover off and see if that helps; memory should sit around 30 degrees then.

@Susanna I am having the exact same issue in both Windows 10 and 11. I had it two years ago, when I first installed a 6900XT, and it was solved by using two cables for PCIe power instead of one with a splitter.

So, when I replaced it with an RTX 4080, I had to use the 12VHPWR cable directly from my PSU. Yet it happens again. Believing it was the PSU, I replaced it with a better-quality, larger one (1000W → 1200W). But unfortunately it keeps happening.

What I did find out, though, is when it happens, not the root cause.

VM settings:

    <feature policy="disable" name="svm"/>
    <feature policy="require" name="hypervisor"/>

Windows Hyper-V installed but cannot run.

This way it never happens. I have been playing for months, and it is rock stable.

VM settings:

    <feature policy="require" name="svm"/>
    <feature policy="disable" name="hypervisor"/>

Windows Hyper-V installed and running.
This is when it happens, even when playing or idle. Although I am not sure about "idle", as Windows tends to do lots of stuff in the background without asking.

I am using Hyper-V to bypass certain anti-cheats (e.g. PUBG/BattlEye), and it works fine. So when I want to play PUBG, I switch my settings, play, and then shut the VM down. When I am playing anything else, I use the "safe" settings and it is rock solid.
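Since switching means editing the XML each time, a small hypothetical helper (the function name and paths are mine, not from my actual setup) could flip a feature policy in a dumped domain XML before re-defining it:

```shell
#!/bin/sh
# set_feature FILE NAME POLICY - hypothetical helper: rewrite the policy of
# one <feature> element (e.g. svm, hypervisor) in a libvirt domain XML dump.
set_feature() {
  sed -i "s|<feature policy=\"[a-z]*\" name=\"$2\"|<feature policy=\"$3\" name=\"$2\"|" "$1"
}

# Usage sketch (VM name and file path are examples):
#   virsh dumpxml win11 > /tmp/win11.xml
#   set_feature /tmp/win11.xml hypervisor disable
#   set_feature /tmp/win11.xml svm require
#   virsh define /tmp/win11.xml
```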

Let me know if that works for you.

Here is my full XML in case you want to read it.


Is there any way to measure and log memory temps, either in Linux or Windows, to check when it resets, what was the temp?

I don’t think this is the cause, since I am not even using EXPO, so they shouldn’t be hot, but I just want to exclude it as a possible cause.

I also had a conversation with a friend of mine who builds systems for a large OEM, and he told me that memory is a possible cause. They didn't measure temps, however; they just replace kits until the resetting stops. I cannot do that though, I only have one DDR5 kit…

"Sometimes" HWiNFO64 on Windows. Not sure about Linux.

Not working in the VM… I don’t think memory temps are the issue anyway, but would like to be 100% sure.

My guess is something in Hyper-V is accessing “forbidden” parts of the CPU and some sort of watchdog shuts the system down. I might be completely wrong though.

I want this in my life. I need this. It's the only thing preventing me from switching to Linux for good, because I can't stand dual booting.

If it's A-OKAY with the mods, I'm offering a 100 USD bounty to anyone who can help me get this set up on my machine (we will set up a Discord call). If not, then you guys can edit out / remove this paragraph.

i9-10850K @ 3.60GHz
64GB RAM
RX 6900 XT

I've followed so many OLD and useless guides (partially complete, or with details intentionally left out) that I don't know what's going on anymore.

My best attempt was on Pop!_OS, where I managed to get everything set up and installed, but when I ran the VM my screen went black and nothing happened. Couldn't figure out what to do next.

My GPU is "sharing" with the PCIe lane it's connected to. I'm not exactly sure what that means. Maybe it means it's sharing the connection with the M.2 drive that's right under the GPU and below the PCIe slot? I don't know.

Also, if you guys decide to do a GoFundMe, I'll pledge 800 USD to the cause. That's how bad I want this.

Posting here in pure desperation so mods… be gentle.

PS: I understand some people don’t want the bounty for themselves so I can offer to donate to a project / charity of their choosing.

First things first…

It is much easier to use a cheap second GPU for your host and pass through your 6900XT to the VM. You need a monitor with two inputs, so host GPU => Input 1 and 6900XT => Input 2.

Then, you just follow this guide to make it work.

This is also a great guide, much simpler and works with 6900xt: GitHub - thecmdrunner/vfio-gpu-configs: KVM VFIO configuration for my system with 2 GPUs that runs Linux and Windows at the same time..

PS. Use fedora if you can :slight_smile:


Just noticed the thread and thought to throw in my experience.

I've been running GPU passthrough for about 12 years now, on an almost daily basis for photo editing etc. My first system was an Intel i7-3930K system with an Nvidia Quadro 2000 GPU running Windows 7. The host OS was Linux Mint (an Ubuntu variant) running a Xen hypervisor. The tutorial I wrote back then is on the Linux Mint forum and became quite popular.

Sometime in 2015 I switched to the KVM hypervisor, which seemed to offer broader support for graphics cards. Again I wrote a tutorial ("Running Windows 10 on Linux using KVM with VGA Passthrough") that's still being read, although it's quite old. It uses a shell script to run the qemu-system-x86_64 command.

In 2020 - after using my Intel 3930K CPU for 8 years - I finally switched to an AMD 3900X platform, which I’m using now. At first I totally disliked Virtual Machine Manager and virsh/libvirt, but by then (2020) the virt-manager GUI had become kinda usable. It’s been improved since, though it does have its quirks. I wrote yet another tutorial based on my experiences - “Creating a Windows 10 kvm VM on the AMD Ryzen 9 3900X using VGA Passthrough”.

Today my host is running Manjaro, but in the past I also used Pop!_OS (only for a short time) and Linux Mint. I've also successfully run a Windows VM with GPU passthrough on an Intel i3 CPU, though I wouldn't recommend that. My other PCs run Linux Mint.

In retrospect, Linux Mint with the Xen hypervisor was the most solid VM. Back then I solved the audio issues with a USB audio stick that I passed through to Windows (using PCI passthrough). Problems are there to be solved.

Next in line for stability is Linux Mint with KVM and the qemu start script, followed by Linux Mint with libvirt/virt-manager, and finally Manjaro with libvirt/virt-manager. Like Arch Linux, Manjaro is a rolling distribution and offers bleeding-edge packages. Don't expect the "stable" branch to be even close to as stable as Ubuntu LTS (or Linux Mint). While it can be annoying at times, I'm usually able to fix things or to find someone who has narrowed down the bug and found a workaround or solution.

The reason I run Manjaro is so that I can experience (or suffer) from the latest and greatest software releases and post about issues when I run into them. Still, Manjaro is enjoyable and has a good user forum, perhaps second to the Arch Linux forum.

Here's a list of the passthrough GPUs I've used or tried:

  • Nvidia Quadro 2000, a "professional" GPU, so the Nvidia driver back then would support it running in a VM (domU in Xen slang)
  • AMD Radeon 6450
  • Nvidia GTX 970
  • Nvidia GTX 1060
  • Nvidia RTX 2070 Super

I always use a dual-GPU setup, though I did a single GPU passthrough on the Intel i3, just for fun.

I tried both BIOS and UEFI passthroughs. Today, of course, it’s UEFI.

I’ve written a lot about the convenience of using vfio / GPU passthrough to run Windows and other OSes. Performance is top notch, at least for my purposes. I do heavy photo editing and sometimes 4k or 8k video editing in a VM. This is CPU and GPU intensive stuff, with a lot of I/O too.

While more and more hardware is capable of running vfio, it hasn’t necessarily gotten easier. We have more choices and tools today, like virtiofs to access a host drive, or different block or scsi storage drivers. But CPU designs have become more complex, too. Performance tweaking can be challenging with these new CPU designs. Luckily a simple configuration will most likely be all one needs to get good VM performance.
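Since virtiofs came up: sharing a host directory with a libvirt guest is, as I understand it, roughly this in the domain XML (the host path and mount tag below are examples):

```xml
<!-- virtiofs needs shared memory backing on the domain -->
<memoryBacking>
  <source type="memfd"/>
  <access mode="shared"/>
</memoryBacking>

<!-- inside <devices>: export a host directory to the guest -->
<filesystem type="mount" accessmode="passthrough">
  <driver type="virtiofs"/>
  <source dir="/home/user/share"/>   <!-- example host path -->
  <target dir="hostshare"/>          <!-- mount tag seen by the guest -->
</filesystem>
```

A Linux guest then mounts it with something like `mount -t virtiofs hostshare /mnt`; Windows guests need the virtiofs driver from virtio-win.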
