RX5700XT Crashes while Gaming

Any help is VERY appreciated. Sorry for the long post. I wanted to be as detailed as I possibly could.

My RX 5700XT crashes on both Linux and Windows at default clocks, undervolted, underclocked, power-limited, and clock-limited. I could be doing something wrong since I’ve never messed with this stuff before, but it crashes and I’m annoyed.

I even rebuilt the whole system around Dec 2020 before RMAing the card in Jan. It would crash around then, but I haven’t really gamed a lot because of school, and I was playing more retro games that didn’t peg the GPU at 99%. I haven’t tested the RMA’d refurbished card with any other hardware.

Hardware:

  • CPU: Ryzen 7 5800X
  • MOBO: Gigabyte X570 AORUS
  • GPU: MSI RX 5700XT Gaming X
  • RAM: Corsair Vengeance 32GB (4x8GB), before the rebuild was 2x8GB.
  • PSU: Seasonic FOCUS GX-850

Solutions I’ve tried:

  • Re-seating GPU
  • Changing voltage, clock speed, fan curves, P-states. I could be doing this wrong though. I also couldn’t find any proper way to test stability without paying like $30 for a test suite.
  • Bypassing surge protector ??? just trying everything
  • Changing PCIE 4.0 to PCIE 3.0 in the BIOS

Nothing worked.

I’m trying to find a way to test stability and adjust P-states for an RX 5700XT (perhaps a voltage chart or some way of knowing how to test the changes, etc).

How it crashes (I think)
To be honest, it could be completely random, but I will list the games that I have noticed crashes in (since it doesn’t crash in all). It generally would crash after about 1-2 hours of gameplay, but sometimes it can crash only 15-30 minutes in.

Crashes experienced in:

  • Phantasy Star Online 2 New Genesis (GPU utilization at 99% all the time at P3-state)
    Max P3-state was tested at 1800MHz and at the default 2055MHz, with different voltages
  • Valorant (crashed even with FPS limited; didn’t do testing at the time)
  • Risk of Rain 2 (Linux, never tested yet)
  • Sea of Thieves (99% utilization on Linux; crashed several times, no testing of different settings)
  • Halo Infinite (99% utilization on Windows; crashed several times, no testing of different settings; it hit P3)
  • VRChat on Windows sometimes.
  • Bloons TD 6 (doesn’t use much GPU but would crash when getting to higher levels with a lot of particles, which could be when it is peaking)

The thing is that it didn’t crash in VR a lot of the time. I’m not sure why.

Now this isn’t a concrete benchmark or anything, but I ran a GPU-Z log on Halo Infinite just sitting at the main menu, where it was pegged at 99% utilization at max P3-state. It crashes where the clock randomly just stops at 2055MHz. This is using the AMD Radeon default settings for the MSI RX5700XT Gaming X. The max P-state clock is normally 2055MHz or so. Like I’ve stated before, I’ve tried downclocking this and it still results in crashes with different voltages like 1200, 1100, and 1000 mV.

Oh, also, on my Linux box I set the clock speed to limit at 1800MHz and it seemed to make games last longer but still crashed in games like Bloons, Risk of Rain 2, and Sea of Thieves. (Games that use a lot of processing power?).

Sorry if I get in trouble for including a link (it didn’t let me), but I feel it’s relevant for diagnosing. Feel free to not use it. The crash is at lines 625-637 (or rows, if you convert it to a CSV and open it in Excel).
pastebin . com/w5DERWnf
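
For anyone who wants to look through a GPU-Z sensor log like the one linked above without Excel, here is a minimal Python sketch. It assumes the log is comma-separated with a header row, and that the clock column is called "GPU Clock [MHz]" (that name is an assumption, so check the header of your own log). The sample rows are made up for illustration and are not taken from the pastebin. It finds the longest stretch where the clock value does not change, which is how the crash shows up in the description above.

```python
import csv
import io

# Made-up sample rows, only to show the expected shape of the log.
SAMPLE_LOG = """Date , GPU Clock [MHz] , GPU Load [%]
2021-12-20 10:00:00 , 1987.0 , 99
2021-12-20 10:00:01 , 2011.0 , 99
2021-12-20 10:00:02 , 2055.0 , 99
2021-12-20 10:00:03 , 2055.0 , 99
2021-12-20 10:00:04 , 2055.0 , 99
"""


def load_column(log_text, column):
    """Return the numeric values of one named column, skipping unparsable rows."""
    reader = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(reader)]
    index = header.index(column)  # raises ValueError if the column is missing
    values = []
    for row in reader:
        if len(row) <= index:
            continue  # blank or truncated row
        try:
            values.append(float(row[index].strip()))
        except ValueError:
            continue  # e.g. a repeated header line or a "-" placeholder
    return values


def longest_flat_run(values):
    """Return (start_index, length, value) of the longest run of identical values."""
    best = (0, 0, None)
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            length = i - start
            if length > best[1]:
                best = (start, length, values[start])
            start = i
    return best


if __name__ == "__main__":
    clocks = load_column(SAMPLE_LOG, "GPU Clock [MHz]")
    print(longest_flat_run(clocks))
```

The start index counts data rows from zero, so add 2 to get the line number in the text file (one for the header, one for zero-based counting).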

What kind of BSOD code do u get under windows when it crashes? That usually gives an indication

Try gaming with only one ram stick at a time and work through them, that way see if it continues to crash

How’s the psu? U have a spare? Run through ram sticks first, then run checks on the psu. psu testers are cheap

Refurb card? Some shops send already broken or semi broken stuff to people. It happens, sometimes a bit often. If you’ve CHECKED ram properly and the psu, it’s almost certainly the graphics card that is borked

@0comment
I don’t get a BSOD crash and never have gotten a full crash until I started changing voltages for undervolting. It’s always a driver crash both on Linux and Windows. Haven’t tested the RAM, since it happened before without it, but I will try it out (never hurts right). I know it’s not the PSU since I used a brand new one when I rebuilt and tried my old one just in case.

I’ll get back on the RAM test. The problem is it’s hard to cause the crash in benchmarking utilities. The best way that I could get a normalized crash was just by leaving Halo Infinite open on the home screen lol, or PSO2 open.

The problem is that if I contact them for an RMA again (I did once before, around Dec 1, I think), I feel like I’ll run into the same issue :/. I also didn’t want to pay $40 for shipping again.

Don’t use benchmarking utilities or memtest whatever. Pull ram sticks out, add one at a time and work through them.

Check the psu, if u have a spare or something. Do it anyway, it could still cause the kind of errors that you’re describing.

If u end up RMA’ing it, send them a notice, email or whatever of what happened and how. Ask them kindly to send a card that isn’t refurbed. Ask for a sealed box, if need be. It’s basic consumer rights u have, so use them :slightly_smiling_face: :upside_down_face:

For trying to get the GPU to crash or test its stability, is there a utility for that? I tried Furmark but it limited the GPU to 1438MHz. Or should I just do my Halo Infinite test and leave it open lol.

Loads of utilities available for that, but they may throttle the card so that it doesn’t crash. So take your pick, it’s fine.

If the shop wants $40, then ask to get that $40 back if the card IS FAULTY; if it isn’t, you can cover the payment yourself.

For testing the PSU, I did a test in June and crashes still happened with the PSU swap. I can redo it, but I’d rather not, since it’s a lot of cables.

For RAM, I’ve tried a RX 480 and 1070 on the same system and had no crashes with that. I think the 1070 has similar power requirements, so I think that would mean both the PSU and RAM are fine, right? I’ll still be testing the RAM just to make sure.

The reasoning for the tests is because crashes are not consistent, and I could be misled by a game not crashing. Anyways, I’ll get back to you on the RAM since it won’t take long to do.

Why do I keep mentioning ram? Because it’s the peskiest of components, hard to diagnose but very easy to solve, that’s why. Also, it’s usually a common problem that people can’t fix, or don’t know how to fix, themselves. Perhaps because people generally think GPU > PSU > RAM in that order, when it’s more likely RAM > PSU > GPU.

From what you’re describing it does sound like it’s the gpu that’s the problem. Also the 5000 series are known to cause issues, many people have had problems with them. There is also that.

To add some more. The 5000 series does not have the stability that the rx 400 + 500 series offer. Also the 6000 series have fixed those 5000 series issues it seems :slightly_smiling_face:

@anon94072931
So I’ve just run for about 1.5hrs, and I have not had any crashes with 1 RAM stick. If it is the RAM, should I just test each RAM stick individually? I have two pairs. One is DDR4 3200MHz with 16-18-18-36 and the other is DDR4 3000MHz with 15-17-17-35.

CMK16GX4M2B3200C16 2x8GB
CMK16GX4M2B3000C15 2x8GB
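
As a quick aside on comparing the two kits: timings are given in clock cycles, so they only compare directly once converted to nanoseconds. A minimal Python sketch of that arithmetic, using the speeds and timings quoted above (the function name is just something made up for this example):

```python
def timing_ns(data_rate_mts, cycles):
    """Convert a memory timing from clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the real clock in MHz is half
    the advertised data rate, and one cycle lasts 2000 / data_rate ns.
    """
    return cycles * 2000 / data_rate_mts


# Speeds and primary timings as quoted in the post above.
KITS = {
    "CMK16GX4M2B3200C16": (3200, (16, 18, 18, 36)),
    "CMK16GX4M2B3000C15": (3000, (15, 17, 17, 35)),
}

if __name__ == "__main__":
    for name, (rate, timings) in KITS.items():
        in_ns = ", ".join(f"{timing_ns(rate, t):.2f}" for t in timings)
        print(f"{name}: {in_ns} ns")
```

Both kits come out at 10 ns for the first timing (CAS latency), so neither is really "faster" there. The catch with mixing them is that the board runs all four sticks at one speed and one set of timings, so at least one pair ends up outside the profile it was rated for.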

Again, these crashes are sometimes random and don’t happen within that interval, so I could have been lucky and just didn’t get a crash. I used the default Radeon settings.

edit: funny thing is. the ram is the only thing I carried over from the old build with the GPU.

oh I should also note that the average clock rate with 1 RAM stick was 2001 versus 2030 with all RAM.

Yes. That way is more secure in testing vs several ram sticks at a time :+1:

I’ve tested each individual RAM stick for at least 60 to 90 minutes each, and I have yet to have a crash on any of them. If it is crashing due to RAM, I think it might be because I am mixing two different kits (one 3000MHz CL15 and one 3200MHz CL16).
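
To put a rough number on the "maybe I just got lucky" worry, here is a small Python sketch. It assumes crashes arrive at random with a constant average rate (an exponential model, which is a simplification) and takes 1.5 hours as the average time to a crash, the midpoint of the "about 1-2 hours" estimate from the first post. Both are assumptions, not measurements.

```python
import math

MEAN_HOURS = 1.5  # assumption: midpoint of "about 1-2 hours" between crashes


def p_no_crash(test_hours, mean_hours_between_crashes):
    """Chance that a still-faulty setup survives a test of this length by luck."""
    return math.exp(-test_hours / mean_hours_between_crashes)


def hours_needed(confidence, mean_hours_between_crashes):
    """Crash-free hours needed so a faulty setup would have crashed with this confidence."""
    return -mean_hours_between_crashes * math.log(1.0 - confidence)


if __name__ == "__main__":
    for hours in (1.0, 1.5, 2.5):
        luck = p_no_crash(hours, MEAN_HOURS)
        print(f"{hours} h crash-free: {luck:.0%} chance it was just luck")
    print(f"95% confidence needs about {hours_needed(0.95, MEAN_HOURS):.1f} h")
```

Under those assumptions a single clean 90-minute session happens by luck a bit over a third of the time, and it takes roughly 4.5 crash-free hours per configuration to be 95% sure, so a 60 to 90 minute run is fairly weak evidence on its own.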

I’m gonna run on the two new DDR4 3200MHz CL16 sticks for two days’ worth of gaming and see if it crashes.

@lordnature, as soon as you mentioned you were running two other kits, I said to myself that was the problem. I would be surprised if changing your RAM sticks didn’t fix the problem.

So far I have had no crashes since removing the two extra kits. Also an XMP profile was enabled on all four kits last time. I might be able to run both kits if I did some custom timings. Not sure how I would work that out though. The best case would probably be to flip both kits and get a new 32GB DDR4 kit. Going to run it for two more days just in case it happens to not be the issue, but it looks like that is the issue.

Also after further reading, the 5000 series is VERY sensitive to RAM timings being even the smallest bit unstable, so this could be why the 1070 had no issue when I tested it. I’m guessing the RX 480 would have bottlenecked the RAM so it wasn’t running at max speed (or it could have been a heat issue).

Well, this could be possible, however it highly depends on the kits in question.
If you are kinda a novice user in regards to memory timings and tweaking,
then I wouldn’t really recommend it.

I would suggest it might be better to just buy a 2x 16GB CL16 3600MHz kit,
if you need 32GB of memory.
Something like Crucial Ballistix, those are generally reasonably priced.
And work fairly well with Ryzen from what I have read.

@MisteryAngel @anon94072931 @Shadowbane
Just ran into another driver crash after 2.5hrs of gaming today. This is with just the one kit in with the correct XMP profile. I’m going to try disabling the XMP profile and see if I still run into crashes. Not sure what’s causing it :/.

It’s really hard to diagnose the problem when I don’t have a repeatable way of just running a stress test for like 3 hours. I have to play games to test it.

edit: could it possibly be the RAM overheating since it’s next to a CPU cooler? I mean I have adequate cooling and generally everything runs around 60°C, but I’m just taking guesses. I’ll try moving the RAM to the other paired slots too.

No, memory overheating issues are not very likely.

Also, in regards to the gpu, maybe you could try to run it at stock voltages with no undervolting.

Currently running on stock. I did the undervolting, etc to try to stop it from crashing, but it never worked. Temperatures aren’t an issue from what I see (60°C regularly with hot spot at 100°C)

I have a 5800x + 5700xt system and play Bloons TD6, so I can look into this when I get home.

Thing is: I also had random crashes for a long time, also thought it was graphics and it turned out to be the motherboard only just not being okay at 2800MHz RAM clock. (CMK16GX4M2B3200C16 on ASRock X370 Gaming K4)

Take a spare case fan and some string and hang it where the side panel would be pointing towards the RAM :man_shrugging:

I am surprised running your ram at the same speed and timing did not fix your crashes. Have you tried using graphics driver remover software and reinstalling the graphics card’s drivers? I have heard AMD’s 5000 series graphics cards have many bad batches that need to be returned to the manufacturer to be fixed. So before returning your graphics card or considering purchasing a new AMD 6000 series graphics card, I would try removing the graphics card driver and then reinstalling it.

Yea, hopefully this is the case because I don’t want to RMA the GPU :/.

While I haven’t tried DDU recently, I tried it back in July, and it didn’t fix any issues. Also tried reinstalling Windows.

I also use Linux where I had the same crashes (driver crashes to green screen).
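
On the Linux side, one thing that can help pin down when a driver crash happened is scanning the kernel log afterwards (for example the saved output of `dmesg` or `journalctl -k`). Below is a minimal Python sketch, not a definitive tool: the two sample lines only illustrate the general shape that amdgpu timeout and reset messages commonly take, with made-up timestamps and sequence numbers, and the exact wording varies between kernel versions, so treat the pattern as an assumption to adjust.

```python
import re

# Illustrative lines only; real messages differ between kernel versions.
SAMPLE_KERNEL_LOG = """\
[ 5123.456789] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1001, emitted seq=1003
[ 5123.460000] amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
[ 5124.000000] usb 1-2: new high-speed USB device number 5 using xhci_hcd
"""

# Assumption: a line is interesting if it mentions amdgpu and a timeout or reset.
ERROR_PATTERN = re.compile(r"amdgpu.*(timeout|GPU reset)", re.IGNORECASE)
TIMESTAMP_PATTERN = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_gpu_errors(log_text):
    """Return (seconds_since_boot, line) for each line that looks like a GPU hang.

    seconds_since_boot is None when the line has no dmesg-style timestamp.
    """
    hits = []
    for line in log_text.splitlines():
        if not ERROR_PATTERN.search(line):
            continue
        stamp = TIMESTAMP_PATTERN.match(line)
        seconds = float(stamp.group(1)) if stamp else None
        hits.append((seconds, line))
    return hits


if __name__ == "__main__":
    for seconds, line in find_gpu_errors(SAMPLE_KERNEL_LOG):
        print(seconds, line)
```

Noting how far into a session each hit lands gives a time-to-crash figure for every test run, which is easier to compare across RAM or clock settings than "it crashed at some point".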

edit: also I couldn’t even swallow the pill of buying a new GPU when they are like 2x the MSRP right now due to inflation and shortages.