Windows stability issues with hardware GPU passthrough

I have been setting up GPU passthrough to a Windows VM. I have the hardware passthrough working, but I am having stability issues with Windows. Every game I have tried will either freeze, crash, or cause a BSOD.
My system specs are:
Host OS: Arch Linux, Kernel 5.2.2
CPU: Intel Xeon E5 2640
RAM: 32 GB
Mobo: ASUS X79 WS
Host GPU: RX 580
Passthrough GPU: R9 280X

VM XML config:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>win10</name>
  <uuid>696ec8b4-ade1-487b-a3ed-c6cea9793a64</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-4.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
    <boot dev='cdrom'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
    <vmport state='off'/>
    <ioapic driver='kvm'/>
  </features>
  <cpu mode='host-passthrough' check='partial'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sda'/>
      <target dev='sdb' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdb'/>
      <target dev='sdc' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:51:31:ad'/>
      <source network='default'/>
      <model type='e1000e'/>
      <link state='up'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <sound model='ich9'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1b' function='0x0'/>
    </sound>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x19ff'/>
        <product id='0x0238'/>
      </source>
      <address type='usb' bus='0' port='4'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x040b'/>
        <product id='0x2000'/>
      </source>
      <address type='usb' bus='0' port='5'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='3'/>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:env name='QEMU_AUDIO_DRV' value='pa'/>
    <qemu:env name='QEMU_PA_SERVER' value='/run/user/1000/pulse/native'/>
  </qemu:commandline>
</domain>

I have added a modprobe file (/etc/modprobe.d/kvm.conf) with

options kvm ignore_msrs=1

but this did not solve my game crashing issues.
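
A modprobe option only takes effect once the kvm module has been reloaded (or the host rebooted), so it can be worth confirming the live value before ruling it out. The following is a minimal sketch, assuming the standard sysfs location /sys/module/kvm/parameters/ignore_msrs; the function name is made up for illustration.

```python
from pathlib import Path

# Standard sysfs location for a loaded module's parameters. It is a
# function argument below so a different path can be supplied.
DEFAULT_PARAM = Path("/sys/module/kvm/parameters/ignore_msrs")


def ignore_msrs_enabled(param_path=DEFAULT_PARAM):
    """Return True/False for the live setting, or None if it cannot be read
    (for example because the kvm module is not loaded)."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        return None
    # Boolean module parameters are normally shown as "Y" or "N";
    # "1" is accepted as well to be lenient.
    return value in ("Y", "1")


if __name__ == "__main__":
    print("kvm ignore_msrs enabled:", ignore_msrs_enabled())
```

If this prints False even though /etc/modprobe.d/kvm.conf contains the option, the module was most likely loaded before the file was written.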

Any errors before the crash or in the event viewer? Was it when the sound was supposed to kick in perhaps? No sound? Sound but then gets stuck in an endless loop?

It might be PulseAudio that isn’t set up right. I’m just shooting in the dark, since I’m not sure of the details. One quick check is to back up the config, then remove virtual hardware one device at a time until the problem goes away or you confirm it’s not a driver/virtual hardware problem.
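
To make the "remove one device at a time" approach a bit more systematic, something like the following could list the candidates from a saved copy of the domain XML. This is only a sketch: the function name is hypothetical, and it assumes the XML has the same layout as the config posted above (a top-level devices element).

```python
import xml.etree.ElementTree as ET


def list_virtual_devices(domain_xml):
    """Return (tag, type-or-model) pairs for each entry under <devices>."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        return []
    found = []
    for dev in devices:
        # The emulator path is not a removable device, so skip it.
        if dev.tag == "emulator":
            continue
        detail = dev.get("type") or dev.get("model") or ""
        found.append((dev.tag, detail))
    return found


if __name__ == "__main__":
    sample = (
        "<domain type='kvm'><devices>"
        "<emulator>/usr/bin/qemu-system-x86_64</emulator>"
        "<sound model='ich9'/>"
        "<memballoon model='virtio'/>"
        "</devices></domain>"
    )
    for tag, detail in list_virtual_devices(sample):
        print(tag, detail)
```

A backup to feed into it can be taken with virsh dumpxml win10 > win10.xml before changing anything.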

No errors that I have seen. Most games will run, but only for a few minutes before crashing. Audio works fine until the crash; then it either gets stuck in an endless loop or just stops.

Roll back to the latest 5.1 kernel; I’m seeing issues with 5.2 as well. At first it was either a BSOD or the game would crash with an error. Now, on 5.2.4, I just get driver hangs at random intervals, anywhere from a few minutes to close to an hour. No issues on 5.1.18.

/facepalm. I had heard about there being a regression in 5.2 and it didn’t even click when I read the OP. My bad. Incidentally, that is also one of the reasons I decided to hold off on getting a Ryzen 3000: I figured fixes/features for it would land in 5.2 first, which would then be a problem for VFIO.

Just rolled back to 5.1 and it seems to have fixed my issue. Thanks for all the help!

The patches should be backported to other stable kernels as well, but I’m sure it will be fixed. I actually couldn’t find much information about it specifically, but I didn’t look too hard. I just saw it mentioned on a Phoronix article comment page. I mostly had issues inside the VM, but I did have a display driver crash while compiling something, so back to 5.1 for now.

Give 5.2.5 a shot and see if it’s any better. I’m building linux-ck 5.2.5 right now to try it out, but someone said it worked for them here.

EDIT: So far so good for me, game has been running for over an hour and it hasn’t crashed but I was mostly AFK and just letting it run.

EDIT2: Seems resolved to me with 5.2.5: no crashes or hangs, solid gameplay.
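
For anyone who wants to check a host quickly, here is a rough sketch that compares a kernel release string against the versions mentioned in this thread. The range is an assumption pieced together from the reports above (crashes on 5.2.2 and 5.2.4, fine on 5.1.18 and 5.2.5), not an official list of affected kernels, and the function names are made up.

```python
import platform
import re

# Assumption based only on reports in this thread: 5.2.0 up to (but not
# including) 5.2.5 is treated as the problematic range.
FIRST_BAD = (5, 2, 0)
FIRST_GOOD = (5, 2, 5)


def parse_release(release):
    """Turn a string like '5.2.2-arch1-1-ARCH' into (5, 2, 2)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def in_reported_bad_range(release):
    """True if the release falls in the range reported as unstable here."""
    return FIRST_BAD <= parse_release(release) < FIRST_GOOD


if __name__ == "__main__":
    running = platform.release()
    print(running, "in reported bad range:", in_reported_bad_range(running))
```

Distribution kernels may carry backported fixes or extra patches, so treat the result as a hint rather than a verdict.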

Oh, that is good news. Thanks for the update!

Now if they get the Zen2 thing patched the only issue left is Navi support? I think…

Noob question, but do you need 2 GPUs to make this work? One for the host and one to pass through to the guest?

Yes, sir.

We have a guy here named gnif who is working on this stuff. So if you have any other questions, he’s your guy

Thanks. I have a spare 1050Ti with a host running KVM if anyone needs me to do any testing.