RX 5700 XT crashing

Certain games are causing my card to crash. When it happens, all of my monitors go into power-saving mode. Audio still works and the system is still responsive (if I’m in Mumble I can keep talking to people), but the card just dies, and the only way to get video back is to reboot. This happens when running Unigine Superposition, playing NieR:Automata, playing ARK: Survival Evolved, and probably anything else that causes high load. It happens under kernels 5.3, 5.4, and 5.5. Syslog doesn’t provide any useful information: there is no amdgpu output at all, which I find unusual. One other thing to note: if I have sensors running and writing to a file, all of the temperature and power readings go to N/A when this happens. My guess is that either the card has a physical problem or the driver is dying on me, but I don’t know how to troubleshoot this since syslog is quiet. Any ideas? My system details are below.

OS: Debian sid
Kernel: 5.5.2
Mesa: 19.3.3
Card: Sapphire Nitro+ RX 5700 XT

You’re seeing a GPU hang. This could have many causes, but the most likely one is that the driver sends a ‘bad’ shader program to the GPU, which makes it hang. Some LLVM versions have known issues of this kind.

What you can do:

  • Install LLVM 9.0.1 and see if the problem is still there (both 9.0.0 and 10 RC1 have issues)
  • Try using ACO by setting the RADV_PERFTEST=aco environment variable
  • Upgrade to a newer Mesa release
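
If you want to try ACO for a single game or benchmark without changing your whole session, here is a minimal sketch in Python. The helper names (env_with_aco, run_with_aco) are made up for illustration; the only thing taken from the list above is the RADV_PERFTEST=aco variable. It assumes RADV_PERFTEST is a comma-separated list of flags, so it appends aco instead of overwriting anything you already set.

```python
import os
import subprocess
import sys


def env_with_aco(base_env=None):
    """Return a copy of the environment with 'aco' added to RADV_PERFTEST.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Keep any flags that are already set and drop empty entries.
    flags = [f for f in env.get("RADV_PERFTEST", "").split(",") if f]
    if "aco" not in flags:
        flags.append("aco")
    env["RADV_PERFTEST"] = ",".join(flags)
    return env


def run_with_aco(command):
    """Run `command` (a list of arguments) with ACO requested; return its exit code."""
    return subprocess.run(command, env=env_with_aco()).returncode


if __name__ == "__main__":
    # Usage: python run_with_aco.py <game-or-benchmark> [args...]
    if len(sys.argv) > 1:
        sys.exit(run_with_aco(sys.argv[1:]))
```

Setting the variable only for the launched process makes it easy to compare a run with ACO against a run without it. Note that it only affects Vulkan applications using RADV.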

Let us know how it goes.


So, a few updates on this that I sort of forgot to post. First, I actually do have logs from these crashes: REISUB did not sync my filesystem, so syslog was not fully written. I guess I didn’t wait long enough after the S. Second, I have an open bug with those logs and other information about this problem: https://gitlab.freedesktop.org/drm/amd/issues/1047. To address your points: I’ve been using LLVM 9.0.1, specifically 9.0.1-8 from the Debian repositories. ACO does not help. Upgrading to Mesa 20.0-rc3 fixes OpenGL but not Vulkan, which still dies with both LLVM and ACO.
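
For anyone else losing log data to an unsynced filesystem: if you log your own readings (sensor output, for example) to a file while reproducing the hang, a rough sketch like the one below forces each line to disk as it is written. The function name append_durably and the timestamp format are my own choices for illustration, and this only covers a file you write yourself, not syslog.

```python
import os
import time


def append_durably(path, line):
    """Append one timestamped line to `path` and force it to disk before returning.

    Hypothetical helper for illustration only.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {line}\n")
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write the data to disk
```

Calling fsync on every line is slow, but for a reading taken once a second or so that should not matter, and the last lines before the hang are the ones you care about.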
