RX 6900 XT vfio --> win10 low performance

Hello guys, I have been using VFIO passthrough for a long time. I’ve got 99.2% of bare-metal performance with these cards:
RX 580 , RX 590
RTX 2070 Super - RTX 2080 Super - RTX 3060 - RTX 3090
GT 710 - GT 1030

using a Ryzen 9 5900X

However, I cannot get my Win10 guest to work with the RX 6900 XT at full performance: it stutters and DPC latency is high.
I couldn’t figure it out with the RX 5700 XT either, even when I used PCIe bridges to recreate the host's lspci -tv layout in the guest. It did not work. However, both cards are (were) working fine in a macOS guest (just tested with some games and Unreal Engine 5 Early Access).

I am on Ubuntu 18.04.6, kernel 5.4.0-91.
Win10 is 20H2.

I have also optimized the virtual PCIe bridges so that the virtio disks are on one lane, the Ethernet devices are on another, and the GPU is on a totally separate lane. KVM enlightenments and the vendor ID are also used. The CCX cores are passed through but not pinned. It was working flawlessly with NVIDIA GPUs.

Any pointers here? Has any of you succeeded with 6900 XT passthrough to Win10?

Double-check that the interrupt (IRQ) number shown in Device Manager is a negative one. That means message-signaled interrupts (MSI) are on, which is good.

Anything in dmesg about hugepages? I’d recommend enabling hugepages.
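
For reference, a minimal sketch of what hugepage backing can look like in a libvirt domain, using the 24576000 KiB guest size from the XML posted in this thread. The explicit 2 MiB page size is an assumption for illustration; a bare `<hugepages/>` element lets libvirt use the host default instead.

```xml
<!-- Sketch only: explicit 2 MiB hugepage backing for a 24576000 KiB guest. -->
<!-- 24576000 KiB / 2048 KiB per page = 12000 pages. The host has to reserve
     at least that many beforehand, for example via the vm.nr_hugepages sysctl. -->
<memory unit='KiB'>24576000</memory>
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
```

If the host has not reserved enough pages, the guest will typically fail to start rather than silently fall back, so the reservation is worth checking first.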

Post the libvirt XML or QEMU script of the VM in question.

<domain type='kvm' id='2'>
  <name>win10-gaming</name>
  <uuid>16b5f1f5-bd3d-767a-38e1-d7b18693327f</uuid>
  <memory unit='KiB'>24576000</memory>
  <currentMemory unit='KiB'>24576000</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static' current='8'>16</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <bios>
      <entry name='vendor'>CLOCKWORK</entry>
      <entry name='version'>1.02</entry>
      <entry name='date'>10/10/2020</entry>
      <entry name='release'>1.0.0</entry>
    </bios>
    <system>
      <entry name='manufacturer'>Rocksolid</entry>
      <entry name='product'>RCKSLD-WS</entry>
      <entry name='version'>1.0</entry>
      <entry name='serial'>BBB740JSWIGV</entry>
      <entry name='uuid'>16b5f1f5-bd3d-767a-38e1-d7b18693327f</entry>
      <entry name='sku'>22DC3DDA-7029-133B-7717-24614E630F5C</entry>
    </system>
    <baseBoard>
      <entry name='manufacturer'>CLOCKWORK</entry>
      <entry name='product'>R539NVR39VM</entry>
      <entry name='version'>0B92312 Pro</entry>
      <entry name='serial'>1092956J3U4QMRGUAA2B</entry>
    </baseBoard>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-q35-2.11'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win-virtio-2_VARS.fd</nvram>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='KVM Hv'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='16' threads='1'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/media/gediz/VMs/win10-20H2-vrtio-updated.qcow2'/>
      <backingStore/>
      <target dev='vdc' bus='virtio'/>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/media/gediz/VMs/NTFS_DISK.raw'/>
      <backingStore/>
      <target dev='vdd' bus='virtio'/>
      <alias name='virtio-disk3'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <model name='i82801b11-bridge'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x8'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x9'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xa'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xb'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xc'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0xa'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0xb'/>
      <alias name='pci.8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='9' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='9'/>
      <alias name='pci.9'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0xc'/>
      <alias name='pci.10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='usb' index='0' model='nec-xhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:e0:b9:bc'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </interface>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x2f' slot='0x00' function='0x3'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x2d' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev1'/>
      <rom file='/usr/share/vgabios/RTX3060.rom'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x2d' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
    </hostdev>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+0:+0</label>
    <imagelabel>+0:+0</imagelabel>
  </seclabel>
</domain>

I am using an RTX 3060 with a Ryzen 7 3700X atm. I have handed my research computer with the 6900 XT back to the company it belongs to.

However, the XML is the same: put the ROM file of the 6900 XT in place of the 3060's and increase the cores to 12 instead of 8. Other than that, everything is the same.

Ok - it is SOLVED - achieved pass-through of 6500 XT

History:
So far every GT, GTX and RTX model has worked fine with Ubuntu 18.04, kernel 5.4.0-97 and Win10 20H2 (including the GT 710, GT 1030 and RTX 3090, all of them) with the vendor ID set. I have also managed dual-Xeon macOS KVM setups and all tiers of Ryzen CPUs. RX 400 and 500 series cards were working fine too, and RDNA and RDNA2 cards were working right up to the point of an AMD driver update!

Symptoms:
After the driver update, RDNA and RDNA2 cards started to work sluggishly. Meaning:
High latency from the dxgkrnl, HDAudBus, Wdf01000 and storport drivers in LatencyMon.
It gave nearly half the frame rate. I started pinning CPUs for consistent cache usage (never needed before). There was an overall increase in fluency; however, whenever Win10 started a DirectX application, the stuttering began.
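
For readers who have not pinned before, a minimal sketch of CPU pinning in libvirt follows. The host CPU numbers are hypothetical and are not taken from my machine; the real core and cache (CCX) layout has to be read from the host first, for example with `lscpu -e`.

```xml
<!-- Sketch only: pin 8 guest vCPUs one-to-one onto host CPUs 4 to 11 and
     keep QEMU's emulator threads on host CPUs 0 and 1.
     All cpuset values are hypothetical placeholders. -->
<vcpu placement='static'>8</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <vcpupin vcpu='2' cpuset='6'/>
  <vcpupin vcpu='3' cpuset='7'/>
  <vcpupin vcpu='4' cpuset='8'/>
  <vcpupin vcpu='5' cpuset='9'/>
  <vcpupin vcpu='6' cpuset='10'/>
  <vcpupin vcpu='7' cpuset='11'/>
  <emulatorpin cpuset='0-1'/>
</cputune>
```

The idea is to keep each vCPU on a fixed host core so that the guest's threads keep hitting the same L3 cache, which is the "consistent cache usage" mentioned above.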

Unsuccessfully tried:

  • Putting the audio function on the same slot but a different function
  • Putting the card on a separate bus
  • Omitting the sound device of the graphics card
  • Optimizing timers, HPET etc …
  • CPU pinning and isolating (I didn't think this would work, just practice)
  • Replicating the ioh3420 PCIe root port and PCIe upstream/downstream switch layout (as with Vega) in libvirt, on QEMU 2.11 and on QEMU 6.1 (compiled from source)
  • SMT enable/disable, PSS enable (MSI X570 PRO - A)
  • Easing RAM timings (I have 16x4 GB 2666 Kingston CL19 modules; maybe they are sluggish, but I am using hugepages with 12 GB of RAM)
  • Adjusting the number of NUMA nodes and their split from the BIOS (NSS4, I guess)
  • Onboard sound card passthrough, omitting the sound device
  • Editing the Radeon driver so the card shows up as another card
  • Could not use another BIOS: TechPowerUp does not have an RX 6500 XT BIOS dumped
  • pcie_aspm=0 or off, whatever the GRUB kernel parameter is
  • pcie_resources=lax

Then I REALIZED THAT GPU-Z was showing the bus as PCI. After that I tried lots and lots more, which I won't bother you guys with.

In the end I got itchy and said to myself: a huge number of these operations are kernel dependent, so why bother? I installed Arch Linux and everything worked out of the box. It even comes with host-passthrough ticked by default …

Versions:
Linux ggArch 5.16.14-arch1-1 #1 SMP PREEMPT Fri, 11 Mar 2022 17:40:36 +0000 x86_64 GNU/Linux
libvirtd (libvirt) 8.1.0
QEMU emulator version 6.2.0
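
One thing the newer stack changes, offered here as an assumption because nothing in this thread confirms what actually fixed it: the XML posted earlier uses the `pc-q35-2.11` machine type, while QEMU 6.2 ships newer q35 machine types. Since QEMU 4.0, the q35 machine types let `pcie-root-port` advertise a full-speed PCIe link to the guest instead of a minimal one, which may be related to GPU-Z reporting the bus as PCI. A sketch of the corresponding line in the `<os>` section:

```xml
<!-- Sketch only: same <os> type line as in the posted XML, but with a newer
     q35 machine type. pc-q35-6.2 is the versioned type shipped with QEMU 6.2.
     Whether this alone cures the stutter is not confirmed by this thread. -->
<os>
  <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
</os>
```

A guest definition created fresh on the new host would normally pick up a current machine type by default, so this only matters when an old XML is carried over.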

Generally I hate to update, because the servers and production systems I build for customers are tested for months and stabilized. When I update, a zillion things can go wrong (and they do), so I refrain from updating software systems unless I must.

But I am probably gonna switch from Ubuntu 18.04 to Arch Linux.
