6700XT - reset bug?

I have a 6700 XT. I followed this obscure post about flashing a specific VBIOS onto the card, and it worked! amdgpu: GPU reset for Radeon RX 6700XT completely broken (#2709) · Issues · drm / amd · GitLab

THIS IS HUGE!

Never in my life did I think I’d be able to get my Asus RX 6700 XT Dual OC to reset properly. In fact, I had previously tried flashing VBIOSes from other so-called “working” 6700 XT cards, but nothing worked. I am legit flabbergasted that this specific card’s VBIOS from 2022 worked.

Let me just say this again:

IT WORKS

I want to kiss these guys. Every single one of them.

I can’t say I have used this for long, as I’m posting this immediately after trying it: I used the VM for a few minutes, rebooted it 5 times and tested some stuff. So keep this in mind. Things may go haywire, but so far so good. Will report back in a week.

Any thoughts on this “solution”, @gnif @wendell? It really seems like the GPU vendor screwed up the VBIOS or something, since flashing a different one actually “solves” the issue. Or am I talking out of my ass?

As for a guide how to do it:

First off, you need to check your card’s memory vendor, and you need amdvbflash. I used version 4.71.
You should be able to find your memory vendor in GPU-Z on Windows while the card is still running its stock VBIOS. Pass the card through to your Windows guest as usual and check GPU-Z.

The memory vendor and chip need to be one of the following:

12288 MB, GDDR6, Hynix H56G42AS8DX014
12288 MB, GDDR6, Micron MT61K512M32C
12288 MB, GDDR6, Samsung K4ZAF325BM

If it isn’t, you’re taking a gigantic risk flashing anything.

Next up is dumping the VBIOS. You can also do that through GPU-Z, but I’d do it through Linux as well to make sure the dumps match. You need to turn off VFIO anyway to flash a VBIOS: comment out (with #) the line that binds vfio-pci to the GPU at boot, for example in your file under /etc/modprobe.d, then rebuild the initramfs with something like mkinitcpio -P and reboot. The GPU needs to be claimed by the amdgpu module.
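
To double-check which driver actually owns the card after the reboot, something like the sketch below should do. It relies on the sysfs "driver" symlink that the kernel creates for a bound PCI device; the function name bound_driver and the address 0000:0b:00.0 are made up for illustration, so substitute your own.

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: the device's sysfs directory, e.g. /sys/bus/pci/devices/0000:0b:00.0
#     (hypothetical address; use the one lspci shows for your GPU).
bound_driver() {
    dev_dir="$1"
    if [ -L "$dev_dir/driver" ]; then
        # "driver" is a symlink pointing at the driver's own sysfs directory,
        # so its last path component is the driver name.
        basename "$(readlink "$dev_dir/driver")"
    else
        echo "none"
    fi
}
```

Running bound_driver /sys/bus/pci/devices/0000:0b:00.0 should print amdgpu before you flash, and vfio-pci once you have re-enabled the binding afterwards.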

The command for dumping the BIOS using amdvbflash is:

# ./amdvbflash -s 0 MyOriginalVBIOS.rom

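You can also just dump it through sysfs. Reading the rom file has to be enabled first by writing 1 to it, and the output needs to be redirected into a file (the file name here is just an example):

# echo 1 > /sys/bus/pci/devices/0000:0X:00.0/rom
# cat /sys/bus/pci/devices/0000:0X:00.0/rom > MyOriginalVBIOS_sysfs.rom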

Where X is the hexadecimal digit of the PCI bus your GPU sits on. You can find it using lspci -vvv. You probably know yours already.
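
If you don’t know the address, a small filter over lspci output can pull it out. This is only a sketch: the function name amd_vga_addresses is made up, it assumes the output format of lspci -D -nn (full domain:bus:device.function address first, numeric [vendor:device] IDs in brackets), and 1002 is AMD’s PCI vendor ID.

```shell
# Reduce `lspci -D -nn` output (read on stdin) to the PCI addresses of
# AMD VGA devices. 1002 is AMD's PCI vendor ID.
amd_vga_addresses() {
    awk '/VGA compatible controller/ && /\[1002:[0-9a-f]+\]/ { print $1 }'
}

# Usage on a real system (not executed here):
#   lspci -D -nn | amd_vga_addresses
```

If you have more than one AMD card (or an AMD iGPU), this prints several addresses, so you still have to work out which one is the passthrough card.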
Dump the VBIOS several times and make sure the checksums are identical, so that you don’t end up sitting there with a corrupted VBIOS backup. Then put a copy on another machine or in the cloud so you have it around.
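
Comparing the dumps by hand gets old fast, so here is a rough sketch of the check using sha256sum from coreutils. The function name dumps_match is hypothetical, and the file names in the usage note are just examples.

```shell
# Succeed only if all given files are non-empty and share one SHA-256 digest.
# Arguments: two or more dump files.
dumps_match() {
    [ "$#" -ge 2 ] || { echo "need at least two dumps" >&2; return 2; }
    for f in "$@"; do
        # An empty dump is useless as a backup, so treat it as an error.
        [ -s "$f" ] || { echo "missing or empty: $f" >&2; return 2; }
    done
    sums="$(sha256sum -- "$@")" || return 2
    # Count the distinct digests; exactly one means every dump is identical.
    distinct="$(printf '%s\n' "$sums" | awk '{ print $1 }' | sort -u | wc -l)"
    [ "$distinct" -eq 1 ]
}
```

For example: dumps_match dump1.rom dump2.rom dump3.rom && echo "backups agree". Matching dumps only tell you the reads were consistent, not that the image is a good one, so comparing against the GPU-Z dump as well doesn’t hurt.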

As everyone points out in every guide about these kinds of things, but it needs to be said anyway:

**Flashing any VBIOS, especially cross-vendor like this, will 100% destroy your warranty and you might sit there with a brick.**

Use caution and remember it is your own fault if it never boots up ever again. You should have dual-BIOS possibilities on your card to be safer. Single-BIOS is very dangerous. You have been warned.

So to flash the thing, get the working VBIOS from here: VGA Bios Collection: Sapphire RX 6700 XT 12 GB | TechPowerUp
The MD5 for this should be bbcf8fd1e226609094cd2283b3ea2259
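
Before flashing, it is worth checking the download against that MD5. A minimal sketch with md5sum from coreutils follows; the function name md5_matches is made up, and 249630.rom is simply the file name used in the flash command in this guide.

```shell
# Succeed only if the file's MD5 digest equals the expected hex string.
# $1: path to the ROM file, $2: expected MD5 in lowercase hex.
md5_matches() {
    actual="$(md5sum -- "$1" 2>/dev/null | awk '{ print $1 }')"
    # An empty digest means md5sum could not read the file: never a match.
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Usage (not executed here):
#   md5_matches ./249630.rom bbcf8fd1e226609094cd2283b3ea2259 && echo "ROM OK"
```

If this does not print "ROM OK", download the file again rather than flashing it.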

Next up you need to flash the thing.
The -p 0 here means card 0, make sure you’re using the correct card and not your host’s card!!!

# ./amdvbflash -i # Check that 0 is the card you want to flash first!
# ./amdvbflash -fs -fp -fv -p 0 ./249630.rom # This is the actual flashing

That will essentially flash and change the IDs of the card to match the ROM.

You can verify that the ROM was correctly flashed by using:

# ./amdvbflash -v 0 ./249630.rom

If everything looks OK, shut down the system completely, cold boot and pray things work out. If it now boots into Linux without any issues, re-enable the vfio-pci binding so it grabs the card on the next boot. Don’t forget to rebuild the initramfs (mkinitcpio on Arch).

Lastly, use your original VBIOS and pass that to the guest. That way the card will use the correct clocks for your card and make it run like it used to.

I would NOT recommend passing the Sapphire RX 6700 XT VBIOS to the guest; use your original VBIOS so the card runs correctly.

Doing so is standard fare. In the libvirt domain XML (virsh edit), inside the GPU’s hostdev element:

<rom file="/Path/To/Your/Original_VBIOS.rom"/>

Enjoy your resetting 6700 XT.

Hopefully that was helpful for someone, but holy damn I did not expect a “solution” for this to actually pop up. Thanks to everyone involved, I’ll now go test this further.

Same here. I just got this card and was disappointed to see that it had the reset bug, which is indeed a VBIOS issue. It was a nice “holiday treat” to be able to get this card working this way.

I’m the author of the amdgpu reset issue referenced above. I think the most likely scenario is that both AMD and the vendors are at fault: AMD for not supporting FLR and instead inventing this fragile hack, which is very sensitive to the pairing of a specific chip with the currently loaded firmware and its state, and the vendors for shipping misbehaving VBIOSes that have plagued users for years with black-screen issues regardless of OS or driver stack. It would be hilarious if the problem turned out to be that at some point vendors were soldering new chip revisions alongside old ones, with each revision needing its own VBIOS to function properly, while the whole batch shipped with the same VBIOS regardless.

As for using this VBIOS long term: I have been daily-driving it (dubbed NAVI22XTLH) on my Sapphire Nitro+ for ~1.5 months without any issues. If anything, the card has become rock solid after the reflash. No more random crashes or black screens, and even ROCm with PyTorch now runs for hours without the GPU freezing or producing random half-precision errors. As with the stock VBIOS, though, I’m undervolting it to 1125 mV with a 2424 MHz GPU clock and 950 MHz memory, which draws around 140 W, down from 210 W. So it runs much cooler, is less power hungry and loses only about 5% performance. At stock settings my unit had a ~30 °C delta between GPU temperature and hotspot on some workloads under full load, and I don’t want the hotspot over 85 °C anyway.

THANK YOU SO MUCH!

This appears to have fixed my XFX 6700 XT. Your guide worked perfectly. The only issue I had was that a newer version of amdvbflash complained with an “SSID mismatched” error, but using 4.71 worked.

Thank you so much! I was having a similar issue with my RX 6600 (Asus Dual): I could reboot the VM and it looked like it worked, but as soon as I opened a game the entire PC would hang for a few seconds and then reboot (X570 Gaming X motherboard, F39 BIOS). But resetting it that way works perfectly.