**SOLVED** Performance issues in some games with KVM

I recently set up Windows to run under Linux following the amazing guide posted here. I have played a few games and everything was great: Doom Eternal gives me within 5% of bare metal (over 150 FPS). Then I tried Metro Exodus. That game runs at between 0 and 30 FPS and is basically a stuttery slideshow, even in the menus.
I’ve tried playing around with CPU pinning, with some success. It seems that once I specified the IOThread, performance doubled. However, it is still unplayable, with dips to 0 and spikes to 50 FPS.
All game files are stored on an NVMe drive that is passed through completely to the guest. I’ve set up Looking Glass as well, but I see the same stutters when using a physical display connected to the guest GPU.
I’ve tried searching, but the info I’ve found is above my ability to understand at this point. I’m not sure how to improve from here.
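
For the CPU pinning part, it helps to know which host threads share a physical core, because a guest topology with `threads="2"` presents consecutive vCPUs as siblings. A minimal sketch, assuming the standard Linux sysfs layout; the function name `list_smt_siblings` is only illustrative:

```bash
#!/bin/bash
# Print each physical core's SMT sibling set once, for example "6,18".
# The sysfs root is a parameter only so the function can be tried
# against a copy of the directory tree.
list_smt_siblings() {
    local root="${1:-/sys/devices/system/cpu}"
    # Every logical CPU lists its whole sibling set, so each set appears
    # once per thread; sort on the first CPU number and drop duplicates.
    cat "$root"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null \
        | LC_ALL=C sort -t, -k1,1n -u
}
```

Calling `list_smt_siblings` with no argument reads the live host. Each output line is one core, so a pair of guest sibling vCPUs would normally be pinned to the two numbers on the same line. The exact pairs depend on the host and should be checked rather than assumed.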

Side note: the mobo does not support x8/x8, and I’ve been unable to get the main GPU to pass through successfully (I get stuck at the mobo splash screen when I try). This means the guest GPU is running at x4. However, this is Gen 4 x4, and since I can push 150 FPS in other games, and that is what this GPU normally does in this game, I would assume that is not the issue. I do have another mobo coming that will support x8/x8.
-----The Specs------
Ryzen 5900X 12C 24T
32GB Ram
Host GPU - 3080
Guest GPU - 3080ti
Asus Tuf Pro X570 (soon to be replaced with Asus Pro WS X570-ACE)
Target resolution and quality - 1440p 144hz, High/Ultra
—Host Info—
Ubuntu 21.04
Linux ubu 5.11.0-31-generic #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
—Guest Info—
Windows 10 21H1
6 C 12T
16GB Ram
—Virt Manager Config—

<domain type="kvm">
  <name>win10</name>
  <uuid>76c29c2e-5d4b-4938-8d66-00449a60b6b1</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">15624192</memory>
  <currentMemory unit="KiB">15624192</currentMemory>
  <vcpu placement="static">12</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu="0" cpuset="6"/>
    <vcpupin vcpu="1" cpuset="7"/>
    <vcpupin vcpu="2" cpuset="8"/>
    <vcpupin vcpu="3" cpuset="9"/>
    <vcpupin vcpu="4" cpuset="10"/>
    <vcpupin vcpu="5" cpuset="11"/>
    <vcpupin vcpu="6" cpuset="18"/>
    <vcpupin vcpu="7" cpuset="19"/>
    <vcpupin vcpu="8" cpuset="20"/>
    <vcpupin vcpu="9" cpuset="21"/>
    <vcpupin vcpu="10" cpuset="22"/>
    <vcpupin vcpu="11" cpuset="23"/>
    <emulatorpin cpuset="12-17"/>
    <iothreadpin iothread="1" cpuset="0-3"/>
  </cputune>
  <os>
    <type arch="x86_64" machine="pc-q35-5.2">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/ubu10_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state="on"/>
      <vapic state="on"/>
      <spinlocks state="on" retries="8191"/>
      <vpindex state="on"/>
      <synic state="on"/>
      <stimer state="on"/>
      <reset state="on"/>
      <vendor_id state="on" value="1234567890ab"/>
      <frequencies state="on"/>
    </hyperv>
  </features>
  <cpu mode="custom" match="exact" check="partial">
    <model fallback="allow">EPYC</model>
    <topology sockets="1" dies="1" cores="6" threads="2"/>
    <feature policy="require" name="topoext"/>
  </cpu>
  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="directsync" io="native"/>
      <source dev="/dev/md0"/>
      <backingStore/>
      <target dev="sdb" bus="sata"/>
      <boot order="1"/>
      <address type="drive" controller="0" bus="0" target="0" unit="1"/>
    </disk>
    <controller type="usb" index="0" model="qemu-xhci">
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x10"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x11"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0x12"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0x13"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="5" port="0x14"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x4"/>
    </controller>
    <controller type="pci" index="6" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="6" port="0x15"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x5"/>
    </controller>
    <controller type="pci" index="7" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="7" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="8" model="pcie-to-pci-bridge">
      <model name="pcie-pci-bridge"/>
      <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
    </controller>
    <controller type="pci" index="9" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="9" port="0x9"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
    </controller>
    <controller type="pci" index="10" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="10" port="0xa"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x2"/>
    </controller>
    <controller type="virtio-serial" index="0">
      <address type="pci" domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
    </controller>
    <interface type="bridge">
      <mac address="aa:aa:aa:aa:aa:aa"/>
      <source bridge="virbr0"/>
      <model type="virtio-net-pci"/>
      <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
    </interface>
    <serial type="pty">
      <target type="isa-serial" port="0">
        <model name="isa-serial"/>
      </target>
    </serial>
    <console type="pty">
      <target type="serial" port="0"/>
    </console>
    <channel type="spicevmc">
      <target type="virtio" name="com.redhat.spice.0"/>
      <address type="virtio-serial" controller="0" bus="0" port="1"/>
    </channel>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="ps2"/>
    <input type="keyboard" bus="virtio">
      <address type="pci" domain="0x0000" bus="0x09" slot="0x00" function="0x0"/>
    </input>
    <graphics type="spice" autoport="yes">
      <listen type="address"/>
      <image compression="off"/>
    </graphics>
    <sound model="ich9">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1b" function="0x0"/>
    </sound>
    <video>
      <model type="none"/>
    </video>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x05" slot="0x00" function="0x1"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
    </hostdev>
    <redirdev bus="usb" type="spicevmc">
      <address type="usb" bus="0" port="2"/>
    </redirdev>
    <redirdev bus="usb" type="spicevmc">
      <address type="usb" bus="0" port="3"/>
    </redirdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </memballoon>
    <shmem name="looking-glass">
      <model type="ivshmem-plain"/>
      <size unit="M">64</size>
      <address type="pci" domain="0x0000" bus="0x08" slot="0x01" function="0x0"/>
    </shmem>
  </devices>
</domain>

> cat /proc/meminfo | grep Huge
AnonHugePages:  15607808 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    4096
HugePages_Free:     4096
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         8388608 kB
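
As a quick sanity check on the numbers above: the host reports 4096 huge pages of 2048 kB (8 GiB in total), while the guest is configured with 15624192 KiB of memory. A rough sketch of the arithmetic, with `pages_needed` as a made-up helper name:

```bash
#!/bin/bash
# Number of huge pages needed to back a guest, rounding up.
# Both arguments are in KiB.
pages_needed() {
    local guest_kib="$1" page_kib="$2"
    echo $(( (guest_kib + page_kib - 1) / page_kib ))
}

# Guest <memory> value from the config and Hugepagesize from meminfo.
pages_needed 15624192 2048
```

This prints 7629, so the 4096 pages reserved here would not be enough to back the whole guest with 2 MiB pages.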

Scripts associated with KVM

bind_vfio.sh
#!/bin/sh
PREREQS=""
DEVS="0000:05:00.0 0000:05:00.1"
for DEV in $DEVS;
  do echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
done

modprobe -i vfio-pci

vmup.bash
#!/bin/bash
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "performance" > $file; done
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

vmdown.bash
#!/bin/bash
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "schedutil" > $file; done
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

rebuild_kvm.bash
#!/bin/bash
# Steps to rebuild the disk.
# Confirm the array doesn't already exist. /dev/md0 is a block device,
# so test with -b; -f only matches regular files and would never see it.
if [ ! -b /dev/md0 ]; then
    sudo modprobe loop
    sudo modprobe linear
    LOOP1=$(sudo losetup -f)
    sudo losetup "${LOOP1}" /opt/kvm/efi1
    LOOP2=$(sudo losetup -f)
    sudo losetup "${LOOP2}" /opt/kvm/efi2
    sudo mdadm --build --verbose /dev/md0 --chunk=512 --level=linear --raid-devices=3 "${LOOP1}" /dev/nvme1n1p4 "${LOOP2}"
    exit 0
fi

—Guides I’ve used—
Main setup guide - VFIO in 2019 – Pop!_OS How-To (General Guide though)
Optimization guide - Performance optimizations for gaming on virtual machines
Guide to Boot physical Windows install - Boot volume KVM
(I wanted to use my existing dual boot windows install. With that in mind, I followed the guide listed above to just boot that install.)

I have done everything listed in the guides above except: enable HugePages, enable MSI interrupts, and adjust any cache settings inside KVM.
Again, if any additional info is needed or helpful, please let me know. Thank you for your time and knowledge.

------------Edit1----
I noticed something off with the reported clock speeds inside the VM. Only core 0 is hitting its normal speed of 3.7 GHz. All other cores are reporting under 1 GHz for the most part. I’ve attached 2 pics that were taken a couple of seconds after minimizing the game. Also, not all of the cores are showing in Task Manager or HWiNFO.
The first is from Doom. You can see good utilization of the GPU and CPU.
2 seconds after leaving doom eternal

The second is from Metro. The CPU and GPU are barely being used. This time Metro didn’t go above 10 FPS.
1 second after leaving game

If those reported clock speeds are correct, that would most likely be all or most of the issue. I have PBO enabled now, but I’m going to set a manual OC and see if that helps.

------------Edit 2-----------
I’ve been able to get the VM to display the correct number of cores/threads by setting the multiplier to 42 (4.2 GHz), but that did not help performance. The reported clock speeds of all cores other than one are still below 1 GHz. I imagine this is just reporting incorrectly, as I ran Cinebench and it scores about half of the expected score, which makes sense since I’ve only passed through half the CPU. I’ve tested the disk speeds via CrystalDiskMark, the CPU via Cinebench (both single- and multi-core), and the GPU via games, and they are all as expected when compared to bare metal. I can’t find any bottleneck in the system, and so far this is the only game I’ve found that shows this behavior.
----------------------Solution-------------------------------
The issue was with using the physical Windows install and drive. After removing that from the setup, it’s working great. I was never able to get core isolation working correctly, but after finding this issue it was not needed. I don’t really think the additional settings under “cputune” are needed either. The performance was great without them, but hey.

I still use the noted scripts to remove devices from the host when the VM starts and return them when it stops. These scripts also create/destroy huge pages and adjust the CPU governor. Again, I don’t really think this is needed but I had already done the work.
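
For reference, libvirt can run this kind of start/stop logic itself through a hook script at `/etc/libvirt/hooks/qemu`, which it calls with the guest name and an operation such as `prepare` or `release`. The following is a minimal sketch, not the scripts used here: the page count and the overridable file paths are assumptions for illustration, and device rebinding is left out.

```bash
#!/bin/bash
# Hypothetical /etc/libvirt/hooks/qemu sketch. libvirt invokes the hook as:
#   qemu <guest_name> <operation> <sub-operation> <extra>
# GUEST, PAGES and the two path variables are illustrative defaults.
GUEST="${GUEST:-Windows10}"
PAGES="${PAGES:-7629}"
NR_HUGEPAGES_FILE="${NR_HUGEPAGES_FILE:-/proc/sys/vm/nr_hugepages}"
GOVERNOR_GLOB="${GOVERNOR_GLOB:-/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor}"

# Write the given governor name to every writable scaling_governor file.
set_governor() {
    local f
    for f in $GOVERNOR_GLOB; do
        [ -w "$f" ] && echo "$1" > "$f"
    done
    return 0
}

# React only to the configured guest; ignore every other guest and operation.
handle_vm_event() {
    local guest="$1" op="$2"
    [ "$guest" = "$GUEST" ] || return 0
    case "$op" in
        prepare)
            # Before the guest starts: reserve huge pages, raise the governor.
            echo "$PAGES" > "$NR_HUGEPAGES_FILE"
            set_governor performance
            ;;
        release)
            # After the guest is gone: free all huge pages, restore the governor.
            echo 0 > "$NR_HUGEPAGES_FILE"
            set_governor schedutil
            ;;
    esac
}

handle_vm_event "${1:-}" "${2:-}"
```

Note that the `release` branch sets the huge page count to 0, which also drops any pages reserved at boot, so a host that reserves pages for other uses would want a different value there.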

–Hardware passed to the guest
12 vCPUs (6 cores x 2 threads each)
16 GB of memory (huge pages enabled)
3080ti
–Other
I’m using Scream for audio when not gaming, and just passing through a headset when gaming.
Looking Glass for video (still testing whether this impacts gaming)

—Virt Manager Config

<domain type="kvm">
  <name>Windows10</name>
  <uuid>99703d6f-909d-445d-b574-1fd80b2c066d</uuid>
  <description>Windows 10 for gaming</description>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">15624192</memory>
  <currentMemory unit="KiB">15624192</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement="static">12</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu="0" cpuset="6"/>
    <vcpupin vcpu="1" cpuset="18"/>
    <vcpupin vcpu="2" cpuset="7"/>
    <vcpupin vcpu="3" cpuset="19"/>
    <vcpupin vcpu="4" cpuset="8"/>
    <vcpupin vcpu="5" cpuset="20"/>
    <vcpupin vcpu="6" cpuset="9"/>
    <vcpupin vcpu="7" cpuset="21"/>
    <vcpupin vcpu="8" cpuset="10"/>
    <vcpupin vcpu="9" cpuset="22"/>
    <vcpupin vcpu="10" cpuset="11"/>
    <vcpupin vcpu="11" cpuset="23"/>
    <emulatorpin cpuset="5"/>
    <iothreadpin iothread="1" cpuset="17"/>
    <vcpusched vcpus="0" scheduler="rr" priority="1"/>
    <vcpusched vcpus="1" scheduler="rr" priority="1"/>
    <vcpusched vcpus="2" scheduler="rr" priority="1"/>
    <vcpusched vcpus="3" scheduler="rr" priority="1"/>
    <vcpusched vcpus="4" scheduler="rr" priority="1"/>
    <vcpusched vcpus="5" scheduler="rr" priority="1"/>
    <vcpusched vcpus="6" scheduler="rr" priority="1"/>
    <vcpusched vcpus="7" scheduler="rr" priority="1"/>
    <vcpusched vcpus="8" scheduler="rr" priority="1"/>
    <vcpusched vcpus="9" scheduler="rr" priority="1"/>
    <vcpusched vcpus="10" scheduler="rr" priority="1"/>
    <vcpusched vcpus="11" scheduler="rr" priority="1"/>
    <iothreadsched iothreads="1" scheduler="fifo" priority="98"/>
  </cputune>
  <os>
    <type arch="x86_64" machine="pc-q35-5.2">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/Windows10_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state="on"/>
      <vapic state="on"/>
      <spinlocks state="on" retries="8191"/>
    </hyperv>
    <vmport state="off"/>
  </features>
  <cpu mode="host-passthrough" check="partial" migratable="on">
    <topology sockets="1" dies="1" cores="6" threads="2"/>
  </cpu>
  <clock offset="localtime">
    <timer name="hpet" present="yes"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/home/pop/Documents/personal/Windows10.qcow2"/>
      <target dev="vda" bus="virtio"/>
      <boot order="1"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>
    <controller type="usb" index="0" model="qemu-xhci" ports="15">
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x10"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x11"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0x12"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0x13"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="5" port="0x14"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x4"/>
    </controller>
    <controller type="pci" index="6" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="6" port="0x15"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x5"/>
    </controller>
    <controller type="pci" index="7" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="7" port="0x16"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x6"/>
    </controller>
    <controller type="pci" index="8" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="8" port="0x17"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x7"/>
    </controller>
    <controller type="pci" index="9" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="9" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="10" model="pcie-to-pci-bridge">
      <model name="pcie-pci-bridge"/>
      <address type="pci" domain="0x0000" bus="0x09" slot="0x00" function="0x0"/>
    </controller>
    <controller type="pci" index="11" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="11" port="0x9"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
    </controller>
    <controller type="virtio-serial" index="0">
      <address type="pci" domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
    </controller>
    <interface type="bridge">
      <mac address="52:54:00:f3:87:c2"/>
      <source bridge="virbr0"/>
      <model type="virtio"/>
      <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
    </interface>
    <serial type="pty">
      <target type="isa-serial" port="0">
        <model name="isa-serial"/>
      </target>
    </serial>
    <console type="pty">
      <target type="serial" port="0"/>
    </console>
    <channel type="spicevmc">
      <target type="virtio" name="com.redhat.spice.0"/>
      <address type="virtio-serial" controller="0" bus="0" port="1"/>
    </channel>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="virtio">
      <address type="pci" domain="0x0000" bus="0x0b" slot="0x00" function="0x0"/>
    </input>
    <input type="keyboard" bus="ps2"/>
    <graphics type="spice" autoport="yes">
      <listen type="address"/>
      <image compression="off"/>
    </graphics>
    <sound model="ich9">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1b" function="0x0"/>
    </sound>
    <video>
      <model type="none"/>
    </video>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x0a" slot="0x00" function="0x1"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x08" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="usb" managed="yes">
      <source>
        <vendor id="0x046d"/>
        <product id="0xc33f"/>
      </source>
      <address type="usb" bus="0" port="1"/>
    </hostdev>
    <hostdev mode="subsystem" type="usb" managed="yes">
      <source>
        <vendor id="0x046d"/>
        <product id="0xc539"/>
      </source>
      <address type="usb" bus="0" port="4"/>
    </hostdev>
    <hostdev mode="subsystem" type="usb" managed="yes">
      <source>
        <vendor id="0x0b0e"/>
        <product id="0x2465"/>
      </source>
      <address type="usb" bus="0" port="2"/>
    </hostdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
    </memballoon>
    <shmem name="looking-glass">
      <model type="ivshmem-plain"/>
      <size unit="M">64</size>
      <address type="pci" domain="0x0000" bus="0x0a" slot="0x01" function="0x0"/>
    </shmem>
  </devices>
</domain>

Thanks to everyone who helped. In my case, the solution was to just use a normal qcow2 disk image rather than trying to boot my physical windows install.

It’s a long shot, but what happens if you look at your various Task Manager tabs and then aggressively and quickly move the window about your desktop?
Make sure Looking Glass is running and connected to its client when you do this.
If you observe a noticeable spike in resource usage during this, then I might have an idea as to what’s going on here.

If I understood you correctly, I was to move the Task Manager window around inside the Windows VM while ensuring Looking Glass was connected. If so, the screenshot below shows the result after about 10 seconds of this. The CPU and GPU hit about 40% while I moved that window as fast as I could, but as soon as I stopped they both dropped back to under 5%. I’m not sure what this looks like on bare metal, though. I also tried closing the LG agent on Windows, ensuring the service was stopped, and running the game again. This results in the same behavior as before.

Here are a few benchmarks

The screenshot below is on Bare Metal
In both of these, the GPU was the 3080TI in the second x16 slot running at PCI-E Gen 4 x4. You can confirm this from the Cinebench screenshots. I’m not sure why Windows uses this as it’s not in the primary slot unless it just uses whichever is more powerful.
C: is PCI-E Gen 3 drive
G: is PCI-E Gen 4, also this is the games drive

The screenshot below is the VM with Looking Glass running
C: is the physical windows install wrapped with EFI and passed through per the guide linked above.
G: was passed through via PCIe passthrough

I did notice that if I change the CPU type, the performance drops drastically: it went from an average of 60 FPS to 5. This is what changed in the config:

From -
  <cpu mode="custom" match="exact" check="partial">
    <model fallback="allow">EPYC</model>
    <topology sockets="1" dies="1" cores="6" threads="2"/>
    <feature policy="require" name="topoext"/>
  </cpu>
To -
  <cpu mode="host-passthrough" check="none" migratable="on">
    <topology sockets="1" dies="1" cores="6" threads="2"/>
    <cache mode="passthrough"/>
    <feature policy="require" name="topoext"/>
  </cpu>

How are you getting sound from the VM? For me, the only way I could play games without stutters (which would also reduce the framerate in some titles and outright crash others) was to pass through a USB controller and use an external sound card … I could play Fortnite/Fall Guys without any issues, while Apex Legends would stutter and die within minutes …

Have you tried shielding the VM processes from normal OS processes, and maybe pinning the interrupts and the cores you are assigning to the VM?
If not, this may help in achieving that:

The sound is via Scream. I’ve set it up to send the sound to virbr0, and the host grabs it there. I removed a lot of the “tweaks” present in the config, and I see more stable performance. I’m still only seeing about 30% utilization on the GPU while gaming, with the CPU at around 60%.

I’ve also played around with different CPU types (host-passthrough, host-model, EPYC). Of the ones I’ve tested, the only one that delivers decent performance in this game is the built-in EPYC-Rome.
This is the new version of the config.

<domain type="kvm">
  <name>win10</name>
  <uuid>76c29c2e-5d4b-4938-8d66-00449a60b6b1</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">15624192</memory>
  <currentMemory unit="KiB">15624192</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement="static">12</vcpu>
  <os>
    <type arch="x86_64" machine="pc-q35-5.2">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/ubu10_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode="custom" match="exact" check="none">
    <model fallback="allow">EPYC-Rome</model>
    <topology sockets="1" dies="1" cores="6" threads="2"/>
  </cpu>
  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="directsync" io="native"/>
      <source dev="/dev/md0"/>
      <backingStore/>
      <target dev="sdb" bus="sata"/>
      <boot order="1"/>
      <address type="drive" controller="0" bus="0" target="0" unit="1"/>
    </disk>
    <controller type="usb" index="0" model="qemu-xhci">
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x10"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x11"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0x12"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0x13"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="5" port="0x14"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x4"/>
    </controller>
    <controller type="pci" index="6" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="6" port="0x15"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x5"/>
    </controller>
    <controller type="pci" index="7" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="7" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="8" model="pcie-to-pci-bridge">
      <model name="pcie-pci-bridge"/>
      <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
    </controller>
    <controller type="pci" index="9" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="9" port="0x9"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
    </controller>
    <controller type="pci" index="10" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="10" port="0xa"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x2"/>
    </controller>
    <controller type="pci" index="11" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="11" port="0xb"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x3"/>
    </controller>
    <controller type="virtio-serial" index="0">
      <address type="pci" domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
    </controller>
    <controller type="scsi" index="0" model="lsilogic">
      <address type="pci" domain="0x0000" bus="0x08" slot="0x02" function="0x0"/>
    </controller>
    <interface type="bridge">
      <mac address="52:54:00:41:c8:cf"/>
      <source bridge="virbr0"/>
      <model type="virtio-net-pci"/>
      <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
    </interface>
    <serial type="pty">
      <target type="isa-serial" port="0">
        <model name="isa-serial"/>
      </target>
    </serial>
    <console type="pty">
      <target type="serial" port="0"/>
    </console>
    <channel type="spicevmc">
      <target type="virtio" name="com.redhat.spice.0"/>
      <address type="virtio-serial" controller="0" bus="0" port="1"/>
    </channel>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="ps2"/>
    <input type="keyboard" bus="virtio">
      <address type="pci" domain="0x0000" bus="0x09" slot="0x00" function="0x0"/>
    </input>
    <graphics type="spice" autoport="yes">
      <listen type="address"/>
      <image compression="off"/>
    </graphics>
    <sound model="ich9">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1b" function="0x0"/>
    </sound>
    <video>
      <model type="none"/>
    </video>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x05" slot="0x00" function="0x1"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="usb" managed="yes">
      <source>
        <vendor id="0x046d"/>
        <product id="0xc539"/>
      </source>
      <address type="usb" bus="0" port="4"/>
    </hostdev>
    <redirdev bus="usb" type="spicevmc">
      <address type="usb" bus="0" port="2"/>
    </redirdev>
    <redirdev bus="usb" type="spicevmc">
      <address type="usb" bus="0" port="3"/>
    </redirdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </memballoon>
    <shmem name="looking-glass">
      <model type="ivshmem-plain"/>
      <size unit="M">64</size>
      <address type="pci" domain="0x0000" bus="0x08" slot="0x01" function="0x0"/>
    </shmem>
  </devices>
</domain>

I will try disabling the sound setup and Looking Glass again for testing.

CPU pinning makes the guest use certain CPUs only. That does not mean the guest has those CPUs/cores to itself; the host can still schedule its own work on them. I had the same issue and fixed it with CPU isolation, which completely removes the host’s access to certain CPU cores so that only your guest uses them. A script starts the isolation when the VM starts, and when you shut down it reverts the process so the cores are available to the host again.
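One way to see the difference between pinning and isolation is to look at which host processes are still allowed to run on the guest's cores. The sketch below is only an illustration: the function names `expand_cpulist`, `cpu_in_list` and `tasks_allowed_on` are made up for this example, and it relies on the `Cpus_allowed_list` field that Linux exposes in `/proc/<pid>/status`.

```shell
# Expand a CPU list such as "6-11,18-23" into space-separated CPU numbers.
expand_cpulist() {
    out=""
    old_ifs=$IFS
    IFS=','
    for part in $1; do
        case $part in
            *-*) i=${part%-*}; end=${part#*-} ;;
            *)   i=$part; end=$part ;;
        esac
        while [ "$i" -le "$end" ]; do
            out="$out $i"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    echo "${out# }"
}

# Succeed if CPU number $1 is part of CPU list $2.
cpu_in_list() {
    for cpu in $(expand_cpulist "$2"); do
        [ "$cpu" -eq "$1" ] && return 0
    done
    return 1
}

# Print "PID<TAB>allowed CPUs<TAB>name" for every process the kernel
# may still schedule on CPU number $1.
tasks_allowed_on() {
    for status in /proc/[0-9]*/status; do
        allowed=$(awk '/^Cpus_allowed_list:/ {print $2}' "$status" 2>/dev/null)
        [ -n "$allowed" ] || continue
        if cpu_in_list "$1" "$allowed"; then
            pid=${status#/proc/}
            pid=${pid%/status}
            printf '%s\t%s\t%s\n' "$pid" "$allowed" "$(cat "/proc/$pid/comm" 2>/dev/null)"
        fi
    done
}
```

Running `tasks_allowed_on 6` on the host while the VM is up lists the processes that are still permitted on CPU 6. With pinning alone, most host processes would be expected to show up; with working isolation the list should shrink to mostly per-CPU kernel threads, which cannot be moved.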

Take a look at this thread for @anon86748826’s solution, which worked for me:

Thanks for the info. I will poke around that link as well. I went through and disabled a bunch of Windows services and it is better. I’m still not at the level I’ve seen in other programs, so I know there’s more to gain. CPU pinning/isolation seemed like it should help, so your post makes sense as the missing piece.

I did successfully get GPU passthrough working on my Ryzen 7 1700. But for whatever reason, performance was roughly two-thirds to three-quarters of native. I decided it was better (and usually more performant) to simply run games in WINE or Proton instead.

I never did get to the bottom of why performance was apparently so bad for me in KVM. It could have been an issue with CPU pinning, as some in this thread have stated. But I had already wasted enough time on the project.

FYI: Metro Exodus has a Linux native client (it also works great through Proton/Wine if for whatever reason you can’t get the Linux version working).

I haven’t had to use my GPU passthrough for any gaming since early January this year (when I played Cyberpunk in a VM), due to how great the performance of Proton/Wine is getting.

As for sound, I usually just pass through my main sound card, since if I’m gaming in the VM I don’t really care about host sounds. Linux can just grab control of it automagically when you power down your VM.

I’ve also found that with my 5900X very little optimization is required to get good performance in VMs (less is more; just having the correct settings works better than any “tweaks”).

The only tweak I have to the defaults is making sure that the topology is set correctly.

image
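For reference, one way to check what "correct" means on a given host is to read the SMT sibling layout from sysfs. This is a sketch, not necessarily what the poster did: `sibling_groups` is a hypothetical helper name, and its optional argument exists only so the function can be pointed at a different directory.

```shell
# Print each distinct group of SMT sibling threads, one group per line.
# $1 is the sysfs CPU directory; the default is the standard Linux path.
sibling_groups() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    cat "$cpu_dir"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null | sort -u
}

sibling_groups
```

If the groups come out as pairs such as `0,12`, the host runs two threads per core. A 6-core, 12-thread guest would then typically be declared with 2 threads per core in the VM's CPU topology settings, with each pair of sibling vCPUs pinned to one host pair.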

My 5900X had severe issues with frame drops and stuttering without the tweaks I posted above. It still has a few in certain games, but overall the performance is good, so I haven’t bothered to experiment more.

I haven’t been able to set up the pinning yet, but in response to the posts about just gaming on Linux: I would like to do that, but I recently tried to play Horizon Zero Dawn. According to ProtonDB, this is a well-supported game that “Works Great” for most people. Not so much for me: it won’t even launch. I tried all the suggestions in the ProtonDB thread for this game with no success. I also tried playing Half-Life. That game would launch, but the performance was not good enough, with a lot of frame drops and hitches. Those two experiences made me decide to wait for the situation to improve; I don’t get much time to game, and when I do I really want to spend it enjoying the game. I’m going to try the CPU pinning and post back. I will also look into using Proton/Wine more.

I installed vfio-isolate via GitHub and later pip3, then followed this comment to allow it via AppArmor. However, I’m still getting the error below when I try to call it. If I put the script in /etc/libvirt/hooks/qemu, start the VM, and run a stress test, the VM grinds to a halt; it won’t even do simple things like open the Start menu in under 30 seconds. I’ve confirmed my system has cgroups v2 and modified the script accordingly. As noted in the thread linked above, I did not have systemd.unified_cgroup_hierarchy explicitly set in my grub file. I added it and set it to 0, with no change to the error. I’ve also tried disabling AppArmor completely, although I’m not sure it is fully disabled.

cat /etc/default/grub

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=2
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=on acpi_enforce_resources=lax iommu=pt apm=off kvm.ignore_msrs=1 systemd.unified_cgroup_hierarchy=0"
GRUB_CMDLINE_LINUX=""

--removed all comments
----------------
mount | grep cgroup

tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755,inode64)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
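
Worth noting about the mount output above: it looks like a hybrid layout, with the v1 controllers (including `cpuset`) mounted under `/sys/fs/cgroup` and a separate cgroup2 tree at `/sys/fs/cgroup/unified`. That is what `systemd.unified_cgroup_hierarchy=0` normally produces, and it may explain the `FileNotFoundError` that vfio-isolate reports, since a controller bound to a v1 hierarchy cannot be enabled in the v2 tree. A small check, as a sketch (the function names `classify_cgroup` and `cgroup_mode` are made up here; `stat -f -c %T` is the GNU coreutils way to print a filesystem type):

```shell
# Map a filesystem type ($1) and whether a "unified" subdirectory
# exists ($2, "yes" or "no") to a cgroup layout name.
classify_cgroup() {
    case $1 in
        cgroup2fs) echo "unified" ;;
        tmpfs)
            if [ "$2" = "yes" ]; then echo "hybrid"; else echo "legacy"; fi ;;
        *) echo "unknown" ;;
    esac
}

# Inspect the running system; $1 overrides the cgroup mount point.
cgroup_mode() {
    root=${1:-/sys/fs/cgroup}
    fstype=$(stat -f -c %T "$root" 2>/dev/null)
    if [ -d "$root/unified" ]; then has_unified=yes; else has_unified=no; fi
    classify_cgroup "$fstype" "$has_unified"
}

cgroup_mode
```

On a system mounted like the one above this should report `hybrid`. On systemd systems, setting the kernel parameter to 1 instead of 0 is what selects the pure v2 (`unified`) layout.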

--------
cat /etc/libvirt/hooks/qemu (This is executable)

#!/bin/bash
exec 1>>/home/pop/qemu_hook.log 2>&1

# CPUs left to the host while the VM runs
HCPUS=0-5,12-17
# CPUs reserved for the VM
MCPUS=6-11,18-23

UNDOFILE=/var/run/libvirt/qemu/vfio-isolate-undo.bin

disable_isolation () {
    vfio-isolate \
        restore $UNDOFILE

    taskset -pc 0-23 2  # kthreadd reset
}

enable_isolation () {
    vfio-isolate \
        -u $UNDOFILE \
        drop-caches \
        cpuset-modify --cpus C$HCPUS /system.slice \
        cpuset-modify --cpus C$HCPUS /user.slice \
        compact-memory \
        irq-affinity mask C$MCPUS

    taskset -pc $HCPUS 2  # kthreadd only on host cores
}

# libvirt passes the guest name as $1 and the operation as $2
case "$2" in
"prepare")
    enable_isolation
    echo "running Prepare" >> /home/pop/qemu_hook.log
    ;;
"started")
    ;;
"release")
    disable_isolation
    echo "Running release" >> /home/pop/qemu_hook.log
    ;;
esac
----------
This is the output of that logfile
cat /home/pop/qemu_hook.log

FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/vfio-isolate", line 8, in <module>
    sys.exit(run_cli())
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cli.py", line 200, in run_cli
    executor.run()
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cli.py", line 191, in run
    for undo in e.action.record_undo(e.params):
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/action/cpuset_modify.py", line 39, in record_undo
    cpus=cpu_set.get_cpus(),
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cpuset.py", line 69, in get_cpus
    return self.impl.get_cpus(self)
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cpuset.py", line 232, in get_cpus
    CGroupV2.ensure_cpuset_controller_enabled(cpuset)
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cpuset.py", line 228, in ensure_cpuset_controller_enabled
    CGroupV2.enable_controller(cpuset, "cpuset")
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cpuset.py", line 223, in enable_controller
    f.write(f"{prefix}{controller}")
FileNotFoundError: [Errno 2] No such file or directory
pid 2's current affinity list: 0-23
pid 2's new affinity list: 0-5,12-17
running Prepare
Traceback (most recent call last):
  File "/usr/local/bin/vfio-isolate", line 8, in <module>
    sys.exit(run_cli())
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cli.py", line 199, in run_cli
    cli(standalone_mode=False, obj=executor)
  File "/usr/lib/python3/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1289, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/vfio_isolate/cli.py", line 171, in restore
    with open(undo_file, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/var/run/libvirt/qemu/vfio-isolate-undo.bin'
pid 2's current affinity list: 0-5,12-17
pid 2's new affinity list: 0-23
Running release
-------
sudo aa-status

apparmor module is loaded.
13 profiles are loaded.
13 profiles are in enforce mode.
   /snap/snapd/12704/usr/lib/snapd/snap-confine
   /snap/snapd/12704/usr/lib/snapd/snap-confine//mount-namespace-capture-helper
   /snap/snapd/12883/usr/lib/snapd/snap-confine
   /snap/snapd/12883/usr/lib/snapd/snap-confine//mount-namespace-capture-helper
   snap-update-ns.bitwarden
   snap-update-ns.snap-store
   snap-update-ns.spotify
   snap.bitwarden.bitwarden
   snap.snap-store.hook.configure
   snap.snap-store.snap-store
   snap.snap-store.ubuntu-software
   snap.snap-store.ubuntu-software-local-file
   snap.spotify.spotify
0 profiles are in complain mode.
0 profiles are in kill mode.
0 profiles are in unconfined mode.
1 processes have profiles defined.
1 processes are in enforce mode.
   /snap/snap-store/547/usr/bin/snap-store (2240) snap.snap-store.ubuntu-software
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
0 processes are in mixed mode.
0 processes are in kill mode.

--------
systemctl status apparmor.service 
● apparmor.service - Load AppArmor profiles
     Loaded: loaded (/lib/systemd/system/apparmor.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:apparmor(7)
             https://gitlab.com/apparmor/apparmor/wikis/home/

I wanted to update this for anyone else who may have the same issue. I finally decided to start from scratch with a new install, using neither the physical drive nor the method outlined in my first post. On a fresh install of Windows, gaming worked great, as you’d expect. I then installed Scream to allow audio passthrough. This also worked great, with no noticeable impact on performance. From there, I set up Looking Glass. With Looking Glass and Scream together (Scream was using the network, not shared memory, as this is preferred in their documentation), there was noticeable hitching in the gameplay. This was totally different from what I saw before: I was able to hit the expected FPS, but there would be frame drops. I finally settled on passing my headset through to eliminate this issue and using Looking Glass for video. This final setup is working perfectly. It seems the biggest issue was something to do with the Windows install, either the way I was loading it or something else. That process works great for everything except gaming. I will update the first post with the final configuration. Thanks to everyone who helped for their time and knowledge.

What resolution are you playing at?

On dual-channel systems, I’ve noticed memory bandwidth limitations at high resolutions. My primary display is 3840x1600 and I can only eke out ~72fps on indie games; as the game’s CPU intensity creeps up, the Looking Glass performance creeps down. It probably doesn’t help that I’ve got a 5700G, so the second GPU is internal and using system memory for output, but that’s the ITX life.

I’m running dual channel 3600MHz CL16 on my 5700G, but with quad channel 3200 CL14 on my old 1950x, the same display didn’t seem to have bandwidth limitations.
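
To put rough numbers on the bandwidth concern, here is a back-of-the-envelope calculation. It assumes 4 bytes per pixel (a 32-bit format; the actual capture format may differ) and counts a single copy of each frame, whereas a captured frame is usually written and read more than once on its way to the screen. `frame_bandwidth_mb` is a name invented for this example, and 2560x1440 is assumed for "1440p".

```shell
# Bandwidth in MB/s (10^6 bytes) for one copy of every frame.
# Arguments: width height fps [bytes_per_pixel, default 4]
frame_bandwidth_mb() {
    bpp=${4:-4}
    echo $(( $1 * $2 * bpp * $3 / 1000000 ))
}

frame_bandwidth_mb 3840 1600 72    # the 3840x1600 display at ~72 fps
frame_bandwidth_mb 2560 1440 144   # assumed 2560x1440 at 144 fps
```

This prints 1769 and 2123, i.e. roughly 1.8 GB/s and 2.1 GB/s per copy. For comparison, dual-channel DDR4-3600 has a theoretical peak of 3600 MT/s × 8 bytes × 2 channels = 57.6 GB/s, so whether this becomes a bottleneck depends on how many times each frame is copied and what else is competing for memory.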


I am using 1440p. So far everything has been stable since I moved to a disk image instead of mounting a real Windows install in the way mentioned above.