Hello all, I finally managed to get my hands on a 6800XT, and I was stoked since the reset bug was supposed to be fixed. It does indeed reset properly when I switch to a different OS VM or just reboot the VM. However, it seems it still can't reset after a GPU crash, which means I have to reboot the entire host, so I may as well still be on a 5700XT. Sometimes when playing games like Dyson Sphere Program I just get a random GPU crash, at which point the host dmesg gives me the following:
[74544.471599] vfio-pci 0000:23:00.1: can't change power state from D3hot to D0 (config space inaccessible)
[74544.471900] vfio-pci 0000:23:00.1: vfio_bar_restore: reset recovery - restoring BARs
[74544.552306] vfio-pci 0000:23:00.0: vfio_bar_restore: reset recovery - restoring BARs
[74544.556718] vfio-pci 0000:23:00.1: can't change power state from D3hot to D0 (config space inaccessible)
[74546.821513] pcieport 0000:22:00.0: not ready 1023ms after bus reset; waiting
[74547.941476] pcieport 0000:22:00.0: not ready 2047ms after bus reset; waiting
[74550.031439] pcieport 0000:22:00.0: not ready 4095ms after bus reset; waiting
[74554.181357] pcieport 0000:22:00.0: not ready 8191ms after bus reset; waiting
[74562.501197] pcieport 0000:22:00.0: not ready 16383ms after bus reset; waiting
[74579.140870] pcieport 0000:22:00.0: not ready 32767ms after bus reset; waiting
[74612.430202] pcieport 0000:22:00.0: not ready 65535ms after bus reset; giving up
[74612.563987] vfio-pci 0000:23:00.1: can't change power state from D3hot to D0 (config space inaccessible)
[74614.810155] pcieport 0000:22:00.0: not ready 1023ms after bus reset; waiting
[74615.870132] pcieport 0000:22:00.0: not ready 2047ms after bus reset; waiting
[74617.950095] pcieport 0000:22:00.0: not ready 4095ms after bus reset; waiting
[74622.110018] pcieport 0000:22:00.0: not ready 8191ms after bus reset; waiting
[74630.339855] pcieport 0000:22:00.0: not ready 16383ms after bus reset; waiting
[74646.979526] pcieport 0000:22:00.0: not ready 32767ms after bus reset; waiting
[74680.258872] pcieport 0000:22:00.0: not ready 65535ms after bus reset; giving up
[74680.261208] vfio-pci 0000:23:00.0: can't change power state from D0 to D3hot (config space inaccessible)
[74681.445191] vfio-pci 0000:23:00.1: can't change power state from D3hot to D0 (config space inaccessible)
It seems the card just drops off the bus entirely and can't be reset. Is there anything I can do to attempt a recovery, or am I just doomed to restarting my server whenever the driver wants to crash? I wish GPUs just had a physical reset on them, lol. PCIe supports hot swap (well, if anyone cared to actually implement it), so a hard reset would be fine with VFIO.
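
For reference, the only software-side recovery I know to try is a remove and rescan through sysfs. Here is a rough sketch, assuming the standard Linux sysfs PCI files (/sys/bus/pci/devices/<address>/remove and /sys/bus/pci/rescan) and the device addresses from my log above. The helper names are made up for illustration, it needs root for the real writes, and I have no evidence this helps once config space is inaccessible, so it defaults to a dry run that only prints what it would do.

```python
from pathlib import Path

# Standard sysfs location for PCI devices on Linux.
SYSFS_PCI = Path("/sys/bus/pci")


def remove_path(addr: str) -> Path:
    """sysfs file that detaches one PCI function (hypothetical helper name)."""
    return SYSFS_PCI / "devices" / addr / "remove"


def rescan_path() -> Path:
    """sysfs file that asks the kernel to re-enumerate the PCI bus."""
    return SYSFS_PCI / "rescan"


def remove_and_rescan(addrs, dry_run=True):
    """Remove each function, then rescan the bus.

    Returns the list of sysfs files touched, in order.
    With dry_run=True nothing is written; the steps are only printed.
    """
    steps = [remove_path(a) for a in addrs] + [rescan_path()]
    for path in steps:
        if dry_run:
            print(f"would write 1 to {path}")
        else:
            # Requires root; raises if the device is already gone from sysfs.
            path.write_text("1")
    return steps


if __name__ == "__main__":
    # Audio function first, then the GPU itself (addresses from the log).
    remove_and_rescan(["0000:23:00.1", "0000:23:00.0"])
```

Calling remove_and_rescan([...], dry_run=False) as root would do the actual writes. Whether the card comes back after the rescan is exactly what I am unsure about.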