Intermittent hardware crashes on Ryzen 7000 series

I picked up a Micro Center 7700X bundle in March. I was able to build it, get it to POST, update all firmware, and install Windows 11 Pro successfully. It runs great 99% of the time, but I have been chasing a persistent issue ever since I built it.

At random it will crash hard to what I describe as a black screen, because the monitor loses connection. The power light on the case remains illuminated, but the motherboard displays a solid yellow DRAM error light. Fans continue to spin, but it is clear the machine has experienced some kind of hardware issue and is no longer in the OS. Holding the power button on the case will not kill power to the machine; I must switch it off at the PSU. After restarting, the computer boots and runs normally. I run sfc /scannow afterwards, and it often, but not always, finds errors and corrects them.

The problem happened most often when I was running my RAM at 6000 MT/s, and since dropping it to 4800 MT/s I think it happens less often. Sometimes I go a month without a crash, but sometimes I get two in one day. Most of the time it crashes while idling overnight, but it has crashed twice during a game. I have run HCI MemTest to 400% with no errors. I have tried many changes in my BIOS to resolve this, but I am at the end of my Google and troubleshooting abilities.

Has anyone ever experienced anything like this? I might just have to suck it up and start RMAing components, starting with the motherboard, but I have been trying so hard not to. Any insight or recommendations are much appreciated.

Build:
AMD Ryzen 7 7700X
MSI PRO B650-P WIFI
G.Skill Flare X5 32GB DDR5-6000
Intel Optane 905P 960GB (OS)
Samsung 990 PRO 2TB (Games)
MSI GeForce GTX 1080 Duke
Fractal Torrent Compact
Thermalright Phantom Spirit 120 SE
Thermaltake Toughpower GF A3 - TT Premium Edition 850 W

Welcome!

I had exactly the same symptoms on a 5800X system. After very painful troubleshooting and disk corruption, I narrowed it down to memory corruption. The interesting thing was that it always passed memtest with no errors during troubleshooting. What finally clued me in to memory was trying to clone my SSD while using the system (I thought the SSD had to be failing at the time): an image checksum error came back.

For me, the cause of all the crashes and memory corruption was the CPU rather than a DIMM. Still, I’d suggest trying to run on a single DIMM and monitoring for the issue. If the crashes keep happening, try the other DIMM and different memory slots, since it is more common for a stick of memory to go bad than the CPU’s memory controller.
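If you want to look for that kind of checksum mismatch on purpose, here is a rough sketch in Python. It hashes the same large file several times and checks that the digests agree. The function names and the file path are made up for illustration, and this is not what my cloning tool did internally. A mismatch on a file that has not changed points to corruption somewhere between storage, memory, and CPU; matching digests do not prove the system is healthy.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a large image need not fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_hashes_agree(path: Path, passes: int = 5) -> bool:
    """Return True if every pass over the same file yields the same digest."""
    digests = {sha256_of(path) for _ in range(passes)}
    return len(digests) == 1


if __name__ == "__main__":
    # Hypothetical path: point this at any large file that is not being modified.
    image = Path("disk_image.bin")
    if image.exists():
        print("digests agree" if repeated_hashes_agree(image) else "digest mismatch")
```

A bigger file and more passes give an intermittent fault more chances to show up, but it is still only a spot check.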

Thank you for your timely and helpful reply. I have removed one RAM stick and will run it like this for a while. Here’s hoping for the best.

Make sure you’re running the latest UEFI version for your motherboard. Current versions are far more stable with memory, and new releases tend to come out about once a month. If you need to update, make sure to load the UEFI defaults afterwards.

Are you running two or four memory modules?

The free version of HCI’s MemTest is extremely limited and not stressful on the system. I am a big fan of the MemTest Pro version, as that can saturate a system and its memory in one go, but the free version is capped to a single ~3GB thread, and a 400% test run isn’t sufficient for how little it covers. For free memory testing, Memtest86+ (https://www.memtest.org/) is probably more useful.
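To put the ~3GB cap in perspective, here is a back-of-the-envelope calculation in Python. It uses only figures from this thread (a 32GB kit and roughly 3GB per free instance). The function names are made up, and it assumes several instances can be run side by side, so treat the result as an upper bound: the OS needs some of that RAM too.

```python
import math

TOTAL_RAM_GB = 32      # kit size from the build list
PER_INSTANCE_GB = 3    # approximate free-version cap quoted above


def coverage_fraction(instances: int) -> float:
    """Upper bound on the share of installed RAM the given instances can touch."""
    return min(1.0, instances * PER_INSTANCE_GB / TOTAL_RAM_GB)


def instances_for_full_coverage() -> int:
    """Smallest number of instances whose combined cap reaches the installed RAM."""
    return math.ceil(TOTAL_RAM_GB / PER_INSTANCE_GB)


if __name__ == "__main__":
    print(f"one instance covers at most {coverage_fraction(1):.0%} of RAM")
    print(f"instances needed to span all RAM: {instances_for_full_coverage()}")
```

So a single instance run to 400% is retesting the same small slice four times rather than testing the whole kit.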

I’d recommend Prime95 in torture test mode. Kick off a Blend run and monitor the temps for a bit; if none of the threads show an error, let it run overnight. It loads both the CPU and the memory, so it can be very good for tracking down CPU and/or memory instability. It doesn’t hurt to also use OCCT’s free version to run an artifact scan on the GPU, but I recommend doing that separately from Prime95.

Update: the system ran fine on one RAM stick until today, when I got another black screen with the DRAM error. This time, cutting power and turning it back on did not recover the system. It showed a solid yellow DRAM light and a solid red CPU light. I pulled the graphics card and all non-essential cables, removed the installed RAM stick, and put the other stick (the one I had pulled out earlier) in a different slot. This allowed the machine to POST and get back into Windows. I will update again if this configuration crashes.
