G’day everyone, so as of about two weeks ago I upgraded my GTX 1070 with an RX 6700XT Nitro+, as I had read the 6000 series had the reset issues resolved. Unfortunately I haven’t been able to successfully use the GPU after I shut down a VM with it passed through until I restart my PC.
The output of dmesg | grep -i vfio
after shutting the VM down shows this:
[ 1.079915] VFIO - User Level meta-driver version: 0.3
[ 1.087855] vfio-pci 0000:10:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 1.100448] vfio_pci: add [1002:73df[ffffffff:ffffffff]] class 0x000000/00000000
[ 1.112375] vfio_pci: add [1002:ab28[ffffffff:ffffffff]] class 0x000000/00000000
[ 68.337019] vfio-pci 0000:10:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[ 68.337026] vfio-pci 0000:10:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
[ 68.337030] vfio-pci 0000:10:00.0: vfio_ecap_init: hiding ecap 0x26@0x410
[ 68.337032] vfio-pci 0000:10:00.0: vfio_ecap_init: hiding ecap 0x27@0x440
[ 71.558017] vfio-pci 0000:10:00.0: No more image in the PCI ROM
[ 71.558040] vfio-pci 0000:10:00.0: No more image in the PCI ROM
[ 118.152883] vfio-pci 0000:10:00.0: refused to change power state from D0 to D3hot
[ 118.164981] vfio-pci 0000:10:00.1: refused to change power state from D0 to D3hot
…which as far as I can tell basically means the GPU couldn’t reset properly. When I attempt to passthrough the GPU a third time, the dreaded reset error appears and stops me from trying to boot my VM with the GPU at all:
Failed to reset PCI device: internal error: Unknown PCI header type '127' for device '0000:10:00.0'
My IOMMU groups are pretty much fine without ACS, as shown here:
IOMMU Group 0:
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 1:
00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 2:
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 3:
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 4:
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 5:
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 6:
00:05.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 7:
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 8:
00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU Group 9:
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 10:
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU Group 11:
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 12:
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0 [1022:1440]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1 [1022:1441]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2 [1022:1442]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3 [1022:1443]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4 [1022:1444]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5 [1022:1445]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6 [1022:1446]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7 [1022:1447]
IOMMU Group 13:
01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream [1022:57ad]
IOMMU Group 14:
02:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge [1022:57a3]
IOMMU Group 15:
02:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge [1022:57a3]
IOMMU Group 16:
02:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge [1022:57a3]
IOMMU Group 17:
02:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge [1022:57a4]
0b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
0b:00.1 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
0b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
IOMMU Group 18:
02:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge [1022:57a4]
0c:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 19:
02:0a.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge [1022:57a4]
0d:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 20:
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] [1002:67ff] (rev ff)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
IOMMU Group 21:
04:00.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch [1b21:1184]
IOMMU Group 22:
05:01.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch [1b21:1184]
IOMMU Group 23:
05:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch [1b21:1184]
IOMMU Group 24:
05:05.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch [1b21:1184]
08:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU Group 25:
05:07.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1184e 4-Port PCIe x1 Gen2 Packet Switch [1b21:1184]
IOMMU Group 26:
0a:00.0 Network controller [0280]: Intel Corporation Wi-Fi 6 AX200 [8086:2723] (rev 1a)
IOMMU Group 27:
0e:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c1)
IOMMU Group 28:
0f:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
IOMMU Group 29:
10:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT / 6800M] [1002:73df] (rev c1)
IOMMU Group 30:
10:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT] [1002:ab28]
IOMMU Group 31:
11:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function [1022:148a]
IOMMU Group 32:
12:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
IOMMU Group 33:
12:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
IOMMU Group 34:
12:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
IOMMU Group 35:
12:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller [1022:1487]
My VM’s XML (if of any use for this problem):
<domain xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0" type="kvm">
<name>win10-NOHOOK-AMD</name>
<uuid>d09d5700-72da-4b55-83c8-7f8187bbfae9</uuid>
<metadata>
<libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
<libosinfo:os id="http://microsoft.com/win/10"/>
</libosinfo:libosinfo>
</metadata>
<memory unit="KiB">16777216</memory>
<currentMemory unit="KiB">16777216</currentMemory>
<vcpu placement="static">8</vcpu>
<os>
<type arch="x86_64" machine="pc-q35-6.2">hvm</type>
<loader readonly="yes" type="pflash">/usr/share/edk2-ovmf/x64/OVMF_CODE.fd</loader>
<nvram>/var/lib/libvirt/qemu/nvram/win10-NOHOOK-AMD_VARS.fd</nvram>
<bootmenu enable="yes"/>
</os>
<features>
<acpi/>
<apic/>
<hyperv>
<relaxed state="on"/>
<vapic state="on"/>
<spinlocks state="on" retries="8191"/>
<vendor_id state="on" value="WhyAMD"/>
</hyperv>
<kvm>
<hidden state="on"/>
</kvm>
<vmport state="off"/>
<ioapic driver="kvm"/>
</features>
<cpu mode="host-model" check="partial"/>
<clock offset="localtime">
<timer name="rtc" tickpolicy="catchup"/>
<timer name="pit" tickpolicy="delay"/>
<timer name="hpet" present="no"/>
<timer name="hypervclock" present="yes"/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<pm>
<suspend-to-mem enabled="no"/>
<suspend-to-disk enabled="no"/>
</pm>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type="file" device="disk">
<driver name="qemu" type="qcow2"/>
<source file="/var/lib/libvirt/images/win10.qcow2"/>
<target dev="vda" bus="virtio"/>
<boot order="1"/>
<address type="pci" domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
</disk>
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none" io="native"/>
<source dev="/dev/sdb1"/>
<target dev="vdb" bus="virtio"/>
<address type="pci" domain="0x0000" bus="0x09" slot="0x00" function="0x0"/>
</disk>
<disk type="file" device="cdrom">
<driver name="qemu" type="raw"/>
<source file="/home/jeames8kin/Downloads/virtio-win-0.1.208.iso"/>
<target dev="sdb" bus="sata"/>
<readonly/>
<boot order="3"/>
<address type="drive" controller="0" bus="0" target="0" unit="1"/>
</disk>
<controller type="usb" index="0" model="qemu-xhci" ports="15">
<address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
</controller>
<controller type="sata" index="0">
<address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
</controller>
<controller type="pci" index="0" model="pcie-root"/>
<controller type="pci" index="1" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="1" port="0x10"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0" multifunction="on"/>
</controller>
<controller type="pci" index="2" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="2" port="0x11"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x1"/>
</controller>
<controller type="pci" index="3" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="3" port="0x12"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x2"/>
</controller>
<controller type="pci" index="4" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="4" port="0x13"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x3"/>
</controller>
<controller type="pci" index="5" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="5" port="0x14"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x4"/>
</controller>
<controller type="pci" index="6" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="6" port="0x15"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x5"/>
</controller>
<controller type="pci" index="7" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="7" port="0x8"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
</controller>
<controller type="pci" index="8" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="8" port="0x9"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
</controller>
<controller type="pci" index="9" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="9" port="0xa"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x2"/>
</controller>
<controller type="pci" index="10" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="10" port="0xb"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x3"/>
</controller>
<controller type="pci" index="11" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="11" port="0xc"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x4"/>
</controller>
<controller type="pci" index="12" model="pcie-to-pci-bridge">
<model name="pcie-pci-bridge"/>
<address type="pci" domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
</controller>
<controller type="pci" index="13" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="13" port="0xd"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x5"/>
</controller>
<interface type="network">
<mac address="52:54:00:df:95:66"/>
<source network="default"/>
<model type="virtio"/>
<link state="up"/>
<address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
</interface>
<input type="mouse" bus="virtio">
<address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</input>
<input type="keyboard" bus="virtio">
<address type="pci" domain="0x0000" bus="0x08" slot="0x00" function="0x0"/>
</input>
<input type="mouse" bus="ps2"/>
<input type="keyboard" bus="ps2"/>
<sound model="ich9">
<address type="pci" domain="0x0000" bus="0x00" slot="0x1b" function="0x0"/>
</sound>
<audio id="1" type="spice"/>
<hostdev mode="subsystem" type="pci" managed="yes">
<source>
<address domain="0x0000" bus="0x10" slot="0x00" function="0x0"/>
</source>
<rom file="/usr/share/vgabios/Sapphire_RX6700XT.rom"/>
<address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
</hostdev>
<hostdev mode="subsystem" type="pci" managed="yes">
<source>
<address domain="0x0000" bus="0x10" slot="0x00" function="0x1"/>
</source>
<address type="pci" domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
</hostdev>
<memballoon model="virtio">
<address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
</memballoon>
</devices>
<qemu:commandline>
<qemu:arg value="-object"/>
<qemu:arg value="input-linux,id=mouse1,evdev=/dev/input/by-id/usb-Logitech_G502_HERO_Gaming_Mouse_176139713634-event-mouse"/>
<qemu:arg value="-object"/>
<qemu:arg value="input-linux,id=kbd1,evdev=/dev/input/by-id/usb-Keychron_Keychron_K10-event-kbd"/>
</qemu:commandline>
</domain>
When searching through the Level1Techs forums I came across a post where two scripts (Linux Host, Windows Guest, GPU passthrough reinitialization fix) could help with the reset issue, and thought I’d give it a shot. However, when testing the scripts on startup/shutdown and manually running them before shutting down, it was still unable to be reset properly and gave the same errors as before.
Before I shut the VM down, the GPU works completely fine; everything in the VM works fine with no issues, right until I shut the VM down.
I’m at a bit of a loss here, mainly because I believe this GPU flat out can’t reset itself, no matter what I try. Others seem to have success resetting an RX 6700XT, but I’ve not found anything particular to the card I’ve got, so I thought I’d come here to see if anyone else has had success with the same card.
My PC’s specs are as follows:
Ryzen 5 5600X
ASRock X570 Steel Legend
Sapphire RX 6700XT Nitro+ (guest GPU)
Sapphire RX 550 Pulse (host GPU)
48GB of RAM (odd I know, works fine)
Guest in top slot, host in bottom slot
Thanks in advance.
Edit: I made a post to the VFIO subreddit (https://www.reddit.com/r/VFIO/comments/rqyas3/rx_6700xt_appearing_to_not_reset_correctly_after/) if any information provided there could be of use.