It seems to me that Wendell needs to retract that video where he cheered about the 6000 series not having the bug. There are too many reports now from people who seem to know what they’re doing (all three of us have other working cards but can’t get the 6700XT specifically to work).
Hope Wendell is watching this forum, or somebody brings it to his attention.
Mario Limonciello (4):
drm/amd: smu7: downgrade voltage error to info
drm/amd: Check if ASPM is enabled from PCIe subsystem
drm/amd: Refactor `amdgpu_aspm` to be evaluated per device
drm/amd: Use amdgpu_device_should_use_aspm on navi umd pstate switching
I may look into compiling a custom kernel for Proxmox but if anyone can try it out and report if they’ve had any luck, please do…
Hey, I encountered the exact same bug so you can probably extend your list.
I switched from a Vega 56 Sapphire Pulse to a 6700XT PowerColor Red Devil. Like you, I don’t know what to try, especially because most people simply say that there’s no reset bug on the AMD 6000 generation anymore. Hopefully there either already is some sort of fix or there will be one in the near future.
Chiming in here to add that I hit the exact same bug when I try to power cycle a VM that is using my Radeon RX 5600XT.
The first boot up works well, then the first poweroff throws these entries:
[13003.863261] vfio-pci 0000:43:00.1: can't change power state from D0 to D3hot (config space inaccessible)
[13003.863323] vfio-pci 0000:43:00.0: can't change power state from D0 to D3hot (config space inaccessible)
The VM shuts down successfully, and Proxmox continues running.
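For anyone who wants to check whether their host logs the same failure, here is a minimal Python sketch that pulls the affected PCI functions out of dmesg text. The function name `stuck_functions` is made up for illustration, and the sample text is just the two log lines quoted above; on a real host you would feed it the output of dmesg instead.

```python
import re

# Matches kernel lines like the two quoted above, e.g.
# "vfio-pci 0000:43:00.1: can't change power state from D0 to D3hot ..."
# The "." in "can.t" tolerates either a straight or a curly apostrophe.
POWER_STATE_ERROR = re.compile(
    r"vfio-pci ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"can.t change power state from (\S+) to (\S+)"
)

def stuck_functions(dmesg_text):
    """Return the sorted, de-duplicated PCI addresses that failed a power-state change."""
    return sorted({m.group(1) for m in POWER_STATE_ERROR.finditer(dmesg_text)})

# Sample input: the two lines from the post above.
SAMPLE = """\
[13003.863261] vfio-pci 0000:43:00.1: can't change power state from D0 to D3hot (config space inaccessible)
[13003.863323] vfio-pci 0000:43:00.0: can't change power state from D0 to D3hot (config space inaccessible)
"""

print(stuck_functions(SAMPLE))  # ['0000:43:00.0', '0000:43:00.1']
```

If both the VGA function (.0) and the audio function (.1) of the card show up, that matches what is described in this post.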
If I try to power up the VM again, I get the following dmesg stream until my Proxmox host goes unresponsive:
This may not be directly applicable because you’re on a 6700XT, but it might seed some ideas…
I fixed my instance of this issue on my Radeon RX 5600XT with the help of the GitHub project gnif/vendor-reset. (I can’t post links, likely because my account is too new.)
I ran apt install -y dkms pve-headers and then followed the instructions in the README, and it all worked for me.
I used the latest version of the project from the master branch (master is currently pointing to commit 7d43285 from 2021-10-19), and it worked flawlessly for me.
The “released” version (v0.1.0 from 2021-01-23) did not work for me.
I now have no issues power cycling the VM which receives the passthrough Radeon RX 5600XT GPU.
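A quick sanity check after the DKMS install is to confirm that the module is actually loaded on the host. The sketch below is only an illustration: it assumes the module shows up in /proc/modules as `vendor_reset` (the kernel normally lists hyphens in module names as underscores), and `module_loaded` is a hypothetical helper name, not something the project provides.

```python
def module_loaded(proc_modules_text, name="vendor_reset"):
    """Return True if `name` is the first field of any line in /proc/modules content.

    Assumption: the vendor-reset module is listed as "vendor_reset".
    """
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )

# Example with made-up /proc/modules content; on a real host you would pass
# open("/proc/modules").read() instead.
example = "vendor_reset 114688 0 - Live 0x0000000000000000 (OE)\n"
print(module_loaded(example))  # True
```

If this reports False after a reboot, the module probably is not being loaded at boot, which is worth ruling out before concluding that vendor-reset does not help with a given card.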
As far as I know, vendor-reset only works for older cards. I used it for my Vega 56 before upgrading to a 6700XT, and it worked. I have already tested the 6700XT with and without vendor-reset, but it had no impact at all. If you want more information about my situation, you can check this Reddit thread, where I explained everything in a bit more detail. FYI, I’m actually using Arch instead of Proxmox: https://www.reddit.com/r/VFIO/comments/tcxhkw/6700xt_no_ouput_after_reboot/
I would be really grateful if you could actually help me.
Hey, I’m using a B550 Aorus Pro from Gigabyte.
The monitor that is connected to the guest GPU is completely blank until I start a VM, even at boot time. Any other monitor that is connected to my host GPU obviously shows some stuff.
I’m also using a Gigabyte motherboard, the X570S Aero G.
I am thinking this may be related to the motherboard and not the GPU. As others have pointed out, BIOS settings can be important but no matter what I tried it has failed.
When I boot up I see “EFI stub: loaded initrd from command line option” but nothing else after that. I see it on both screens (I have the 6700XT alongside an RX550, each with a screen connected).
If anyone is still following this thread, I have a new development. An X.org developer on Phoronix claimed that the 6000 series supports SBR (secondary bus reset) and that it should be possible to properly reset 6700XTs using that method.
I have tried (unsuccessfully so far), but perhaps someone else will be luckier and get it to work. If you want to give it a go, I’ve posted instructions on this Reddit thread to reach a wider audience. I’d appreciate feedback from people who try it on whether it resolved the issue for them.
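Before trying a bus reset, it may help to see which reset methods the kernel lists for the card at all. As far as I know, kernels from 5.15 onward expose a per-device `reset_method` file in sysfs, with `bus` being the entry that corresponds to SBR. The sketch below only reads that file; the helper names are hypothetical, and the address 0000:43:00.0 is borrowed from the log lines earlier in the thread as a placeholder for your own card.

```python
from pathlib import Path

def reset_methods(pci_addr, sysfs_root="/sys"):
    """Return the reset methods listed for a PCI function, in the kernel's order.

    Returns an empty list if the attribute cannot be read (for example an
    older kernel without reset_method, or no device at that address).
    """
    path = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr / "reset_method"
    try:
        return path.read_text().split()
    except OSError:
        return []

def supports_bus_reset(pci_addr, sysfs_root="/sys"):
    """True if "bus" (secondary bus reset) is among the listed methods."""
    return "bus" in reset_methods(pci_addr, sysfs_root)

# Placeholder address; substitute the address of your own GPU.
print(reset_methods("0000:43:00.0"))
print(supports_bus_reset("0000:43:00.0"))
```

This does not trigger a reset; it only reports what the kernel says is available. An empty list or a missing `bus` entry would suggest that the kernel does not consider SBR usable for that function.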
@Mechanical Do you by any chance have a post in forums (reddit / level1techs) where you discussed your issue with the Asus Dual RX 6700XT along with any steps you took to debug/fix it? Something like this thread, but that you created to discuss your own case?
In short: my configuration was 100% correct. I found two cards [EDIT: two 6000 series cards, plus my older RX550 and RX480 with vendor-reset patch] that “just work” as opposed to my Gigabyte Aorus Elite 6700XT which refused to work no matter what.
Clearly there’s something vendor/board-specific that causes this. Please refer to my Reddit post for a list of reported working/non-working cards before you purchase.
I personally highly recommend OCUK for a no-fuss experience if you are in the UK.
I had a similar issue with my Gigabyte RX580 in the second slot of an Asus Strix X570-F. I’ve created a repo describing how I managed to fix it: Viktor Koteski / proxmox-gpu-reset · GitLab