Graphics corruption with amdgpu driver (Fury & Vega)

I’ve been running into some odd issues with the amdgpu driver lately. Sometimes when using Proton with a Vulkan or DX12 game, or recording video with OBS, the game will turn into a rainbow-colored puddle, like so:

Usually, switching to a console and then restarting the display manager will get me back to the desktop. So far the only ways I’ve found to consistently avoid this issue are 1) using DX11 (if available) or 2) not putting the system under load and skipping recording with OBS.

Does anyone know what might be causing this? It’s happened on two separate systems, one with an R9 Fury and the other with a Vega 56.

What are the idle temps for the card? I’ve had that happen when the card is stuck in 0 rpm mode or isn’t ramping up the fan speed on demand.

Interesting. Can you log your temps, memory clock, core clock, and voltages? Let’s see what might be going on. I’ve had this happen on both AMD and NVIDIA cards when the thermal paste starts drying out. Also, does setting a custom fan curve remove the issue?

R9 Fury and the other with a Vega 56.

Looks like a temperature or memory issue, but on both cards?

Check that the High Precision Event Timer (HPET) is enabled in the BIOS/UEFI.
If not, enable it and give it another go.

Setting a fan curve didn’t seem to resolve the problem. I can probably work out a script to dump lmsensors output, but I suspect that temps aren’t the issue.
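
For anyone who wants to capture the temps, clocks, and voltage asked about earlier without parsing lmsensors output, here is a rough Python sketch that reads the values amdgpu exposes through sysfs. It assumes the GPU is card0 and that the first hwmon directory under it belongs to the card; the function names (read_int, active_clock, sample, log) are made up for this example. Sensors that a given card doesn't expose simply come back empty.

```python
import csv
import glob
import os
import sys
import time

# Assumption: the GPU is card0; adjust if the system has more than one card.
DEVICE_DIR = "/sys/class/drm/card0/device"


def read_int(path):
    """Return the integer stored in a sysfs file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def active_clock(path):
    """Return the active level from a pp_dpm_* file (the line ending in '*')."""
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line.endswith("*"):
                    # Lines look like "1: 608Mhz *"
                    return line.split(":", 1)[1].rstrip("*").strip()
    except (OSError, IndexError):
        pass
    return None


def sample(device_dir=DEVICE_DIR):
    """Take one reading of the sensors exposed under device_dir."""
    hwmon_dirs = sorted(glob.glob(os.path.join(device_dir, "hwmon", "hwmon*")))
    # Fall back to a path with no sensor files so every read returns None.
    hwmon = hwmon_dirs[0] if hwmon_dirs else os.path.join(device_dir, "hwmon")
    temp = read_int(os.path.join(hwmon, "temp1_input"))  # millidegrees C
    return {
        "temp_c": None if temp is None else temp / 1000,
        "fan_rpm": read_int(os.path.join(hwmon, "fan1_input")),
        "vddgfx_mv": read_int(os.path.join(hwmon, "in0_input")),
        "sclk": active_clock(os.path.join(device_dir, "pp_dpm_sclk")),
        "mclk": active_clock(os.path.join(device_dir, "pp_dpm_mclk")),
    }


def log(count, interval=1.0, out=sys.stdout, device_dir=DEVICE_DIR):
    """Write `count` CSV rows, one every `interval` seconds."""
    fields = ["time"] + list(sample(device_dir))
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for _ in range(count):
        row = {"time": time.strftime("%H:%M:%S")}
        row.update(sample(device_dir))
        writer.writerow(row)
        out.flush()
        time.sleep(interval)
```

Calling log(600) from a terminal before starting the game would record about ten minutes of readings; the last rows before the corruption shows up are the interesting ones.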

Idle temps on the Vega are 35–40 °C (as reported by lmsensors). That’s really not too bad. So far I haven’t been able to get MangoHud working to monitor temps in-game.

Another data point: the graphics corruption in one game (Baldur’s Gate 3) is a known issue when using the Vulkan API, and DX11 is recommended instead (per protondb.com). That was on the Vega machine. I’ll try a few more games and see what happens with it.

Yeah, that seems fine. What if you manually select the Vulkan ICD with:

VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json

You can put that in the Steam launch options for a game, followed by %command%. Does it help?
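
Before forcing an ICD it may be worth checking which manifests are actually installed, since the variable mostly matters when more than one driver is present (for example RADV next to AMDVLK). A small Python sketch, assuming the manifests live in the same directory as the path above; list_icds is a name invented for this example:

```python
import glob
import json
import os

# Assumption: manifests live in the directory used in the path quoted above.
ICD_DIR = "/usr/share/vulkan/icd.d"


def list_icds(icd_dir=ICD_DIR):
    """Map each ICD manifest path to the driver library it points at."""
    found = {}
    for path in sorted(glob.glob(os.path.join(icd_dir, "*.json"))):
        try:
            with open(path) as f:
                manifest = json.load(f)
            found[path] = manifest["ICD"]["library_path"]
        except (OSError, ValueError, KeyError, TypeError):
            found[path] = None  # unreadable, or not an ICD manifest
    return found


# Example use:
#   for path, lib in list_icds().items():
#       print(path, "->", lib)
```

If radeon_icd.x86_64.json is the only AMD entry listed, the loader is probably already picking it, and setting the variable is unlikely to change anything.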

As I said before, you may need HPET (High Precision Event Timer) enabled in the BIOS/UEFI.
If it’s disabled, MangoHud will throw errors.

When HPET is off and a game crashes, it will do one of a few things:

It freezes with a banded colour screen and looping audio.

It freezes with a screen buffer overflow. That’s your image: the wrong part of VRAM is being displayed on screen, resulting in garbage output.
This is often just a soft crash, where you can close the game and restart the drivers.

And finally, random BSODs with apparently random causes. None of them appear connected, but they all come down to HPET: things that depend on it just fail randomly.

While I’m talking from Windows experience, a lot of Linux distros also use HPET, and for some reason it still ships in a default-off state on some boards.
So it’s worth checking at least :)
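
On the Linux side, one way to check without rebooting into the firmware setup is to look at the clock sources the kernel reports in sysfs. A quick Python sketch; clocksource_status is just a name picked for this example, and a missing hpet entry only suggests that it is disabled or not offered, it doesn’t prove it:

```python
import os

# Standard sysfs location for the kernel's clocksource information.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def clocksource_status(base=CLOCKSOURCE_DIR):
    """Report which clock sources the kernel offers and which one is in use."""

    def read(name):
        # Each file holds a space-separated list of names on one line.
        try:
            with open(os.path.join(base, name)) as f:
                return f.read().split()
        except OSError:
            return []

    available = read("available_clocksource")
    current = read("current_clocksource")
    return {
        "available": available,
        "current": current[0] if current else None,
        "hpet_available": "hpet" in available,
    }


# Example use:
#   print(clocksource_status())
```

Note that hpet showing up as available doesn’t mean it is the source in use; the current entry shows which one the kernel picked.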
