VFIO PCI Passthrough "works", but the VM is unusably slow

It’s another one of these posts!

I’m trying to get a Windows VM working under Arch with VFIO. I’ve run into the issue I’ve seen reported all over the place, where the VM takes forever to boot and then runs incredibly slowly (we’re talking 1 frame every 5 seconds), but nothing I’ve tried has had any real impact. After some experimentation I assume this has at least something to do with the amount of memory I’m trying to give it (originally 16GB, now down to 8GB), but none of the fixes I’ve found online resolve it.

Here’s my system:

Motherboard: Asus PRIME X399-A
CPU: AMD Threadripper 2950X
Memory: 64GB DDR4 (4x16GB)
GPU 1 (in use by Arch): Nvidia GTX 1060
GPU 2 (passed into guest): Nvidia RTX 2080

Here’s a gist with everything I could think that would be useful: https://gist.github.com/mcmillion/7cee49cdb9f27449c546accf822120fe

Here’s where I’m at:

  • AMD virtualization settings are enabled on the motherboard
  • I’ve also enabled ACS, IOMMU+IVRS, and set Memory Interleave to Channel as per https://tripleback.net/post/chasingvfioperformance/
  • AMD IOMMU is enabled in kernel options, as well as allow_unsafe_interrupts
  • The 2080’s VGA, Audio, USB, and Serial Bus are all in a single group and passthrough to the VM fine (I get video output when I start it)
  • I’ve pinned the CPUs based on the diagram from running lstopo -l as per https://bytee.net/blog/amd-threadripper-kvm-windows-gpu-passthrough
  • I’ve set up hugepages and tried both the default 2048kB size and the 1GB size
  • I’ve confirmed that info kvm shows KVM is running in the guest
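
For anyone following along, the pinning and hugepages bullets above usually end up as libvirt domain XML roughly like the fragment below. This is only a minimal sketch, not the config from the gist: the vCPU count, the host CPU numbers (8/24 and 9/25 as sibling pairs), the emulator CPUs, and the 1GB page size are all assumptions for illustration, and the real sibling pairs have to be read off the topology diagram for your own machine.

```xml
<domain type='kvm'>
  <!-- Other required elements (name, os, devices, ...) omitted for brevity -->
  <memory unit='GiB'>8</memory>
  <memoryBacking>
    <!-- Back guest RAM with hugepages; remove the <page> element to use the default size -->
    <hugepages>
      <page size='1' unit='G'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <!-- Hypothetical host CPU numbers: each pair is assumed to be two threads of one core -->
    <vcpupin vcpu='0' cpuset='8'/>
    <vcpupin vcpu='1' cpuset='24'/>
    <vcpupin vcpu='2' cpuset='9'/>
    <vcpupin vcpu='3' cpuset='25'/>
    <!-- Keep QEMU's emulator threads off the pinned cores (CPUs 0-1 are an assumption) -->
    <emulatorpin cpuset='0-1'/>
  </cputune>
  <cpu mode='host-passthrough'>
    <!-- Guest topology matching the pinning above: 2 cores with 2 threads each -->
    <topology sockets='1' cores='2' threads='2'/>
  </cpu>
</domain>
```

On a Threadripper it is probably also worth checking that the pinned cores and the hugepages come from the same NUMA node, since this sketch does not enforce that.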

This is my first time using QEMU/KVM and my first time attempting a VFIO setup, so I’m sure I’m missing something really basic.

Any help is greatly appreciated!

can you give us your VM’s XML config file?

As I just posted in another thread, this may be your problem too. I was getting three different outcomes before I found this on the Arch wiki: an unusably slow VM, BSODs, or black screens.

QEMU 4.0: Unable to load graphics drivers/BSOD after driver install using Q35

Starting with QEMU 4.0, the q35 machine type changes the default kernel_irqchip from off to split, which breaks some guest devices, such as NVIDIA graphics (the driver fails to load / black screen / code 43). Switch to full KVM mode instead with <ioapic driver='kvm'/> under libvirt’s <features> tag, or with kernel_irqchip=on in the -machine QEMU argument.

XML should look like so:

  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <vendor_id state="on" value="whatever"/>
    </hyperv>
    <kvm>
      <hidden state="on"/>
    </kvm>
    <vmport state="off"/>
    <ioapic driver="kvm"/>
  </features>

You may want to try turning SME off in your UEFI, or booting with the kernel flag mem_encrypt=off, as some UEFI implementations cause problems.

Could you paste the performance window from Win10?

I have experienced this and similar problems due to poor CPU performance, even on a really fast CPU. I managed to isolate it once I found the source of the load: KDE’s Plasma was taking down one CPU thread completely, and the VM kept switching back and forth from this thread.

I run Manjaro with the linux-hardened kernel on an FX 8370 with an RX 570 passed through to the guest. One of the best tweaks for latency was to use real-time schedulers on the pinned CPU cores. I also reserve the RAM in GRUB at boot time to prevent memory fragmentation.
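
In libvirt terms, a real-time scheduler on pinned cores can be expressed with a vcpusched element inside cputune. The fragment below is a rough sketch only: the four vCPUs, the host CPU numbers 4-7, and the FIFO priority of 1 are assumptions for illustration, not values taken from my actual config.

```xml
<domain type='kvm'>
  <!-- Other required elements omitted for brevity -->
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <!-- Hypothetical pinning: guest vCPUs 0-3 on host CPUs 4-7 -->
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <!-- Run all four vCPU threads under the FIFO real-time policy at priority 1 -->
    <vcpusched vcpus='0-3' scheduler='fifo' priority='1'/>
  </cputune>
</domain>
```

Real-time vCPU threads can starve ordinary host tasks that land on the same cores, so this tends to work best when the host is kept off those cores.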

I’ve posted my libvirt XML here. As well as the GPU, I pass through whole disks, with a small SSD for the Windows OS. I also pass through a whole USB3 controller and the motherboard’s built-in audio (and use a cheap discrete sound card for the Linux host).
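
To give a rough idea of the shape without the full file, whole-disk and PCI controller passthrough generally look something like the fragment below. The disk path and the PCI address are placeholders I made up for illustration, and the cache and bus settings are just one reasonable choice, so treat it as a sketch rather than my exact config.

```xml
<devices>
  <!-- Whole disk handed to the guest as a raw block device; the by-id path is a placeholder -->
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source dev='/dev/disk/by-id/EXAMPLE-SSD'/>
    <target dev='sda' bus='sata'/>
  </disk>
  <!-- USB3 controller passed through as a PCI device; the address is hypothetical -->
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </source>
  </hostdev>
</devices>
```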

I also use some libvirt hooks to set the Linux frequency-governor to performance with cpupower when the Windows 10 vm starts & switch it back to ondemand when it stops.

The performance is good enough to play GTA5 & the Division online / Prey / Hitman / Dishonored 2 / Doom all on high / ultra @ 1080p.