Ryzen crashing while idle

Yep, I adjusted those voltages (and RAM, but that is separate) and set the RAM to 3733 with tight timings. It’s been rock solid ever since I did that, on both Windows and Linux. Sometimes it was left compiling for hours on 24 threads with no problems whatsoever.

I RMA’ed my 3700X and received a new one just before Christmas. While things looked promising initially, the computer crashed again shortly after logging into Windows. I haven’t had any problems with Linux yet. MSI has released a beta BIOS update for my motherboard. I may try that in the meantime.

Nvm, the beta bios is far too buggy right now.

I do have to wonder about my power supply, though. I’ve only had it for a few years, but it is an older model. It’s an Antec HCP 1000W Platinum. Would it being old matter? Also, would it matter if the CPU cable is plugged into the 12v1 slot as opposed to the 12v2, 12v3, or 12v4? I really know next to nothing about power supplies.

Almost certainly the psu. I had an old antec with the racing stripe and it was just unusable even with first gen ryzen.

Wonderful. Well, hopefully that’ll be an easy enough fix.

Alright, I replaced the psu and it happened again. Motherboard maybe? I could also try reinstalling the os. Beyond that, idk what to do.

idle current still set for typical not low?

and what kernel version?

The idle current is set to auto and the kernel version is 5.9.11-3

set to typical not auto, auto is low on almost all boards.

what psu is the new psu, and date of manufacture?

Ok. It’s a RM850x 2018. The model number is CP-9020180-NA.

EDIT: Just crashed again after setting the idle current to typical.

hmmm disconcerting
try disabling c states?

I’ve tried it before, but it didn’t seem to make much of a difference. That WAS before the psu replacement though, so I’ll try again.
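For anyone repeating the C-state test on Linux, here is a minimal sketch for checking which idle states the kernel still has enabled. It assumes the standard cpuidle sysfs layout (`/sys/devices/system/cpu/cpuN/cpuidle/stateM/name` and `.../disable`); the function name `list_idle_states` is made up for this example. Note that it only reflects the kernel's cpuidle view, so a change made in the BIOS or through an MSR tweak may not show up here.

```python
from pathlib import Path


def list_idle_states(cpu_dir):
    """Return (name, disabled) pairs for each cpuidle state under cpu_dir.

    cpu_dir is a per-CPU sysfs directory such as
    /sys/devices/system/cpu/cpu0 (assumed layout, see lead-in).
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    if not cpuidle.is_dir():
        # No cpuidle directory: the driver exposes no idle states here.
        return []
    state_dirs = sorted(
        cpuidle.glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 < state10
    )
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        # The "disable" file holds 0 when the state is allowed, non-zero otherwise.
        disabled = (state / "disable").read_text().strip() != "0"
        states.append((name, disabled))
    return states


if __name__ == "__main__":
    for name, disabled in list_idle_states("/sys/devices/system/cpu/cpu0"):
        print(f"{name}: {'disabled' if disabled else 'enabled'}")
```

If the deeper states still show as enabled after a reboot, the setting probably did not stick at the kernel level, which is worth ruling out before drawing conclusions from a crash.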

I’ve been trying everything I can find in other topics across Bugzilla, Reddit, etc., but nothing has worked so far except disabling SMT. Right now that is a 100% issue-free configuration. I ran it for 4 days without a single reboot, then ran another 1-2 days after enabling SMT, and the issue came back. I switched SMT back off and it has now been running for 2 days again without a single reboot, so I guess that narrows it down…
UPD: as I was writing this, I got my reboot… so I guess there is no configuration where it works now. I don’t know what those 6 days over Christmas were; I guess I got lucky?
After some conversation with AMD support, they asked me to open an RMA, but I don’t think a new CPU will fix it, because there are a bunch of new people just getting their 5000 series with this exact problem.
Fedora is testing the 5.10 kernel next week, but as far as I can tell it doesn’t bring any improvement for this.
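When comparing runs with SMT on and off, it helps to confirm what the running kernel actually sees rather than trusting the BIOS setting alone. A minimal sketch, assuming the kernel exposes the SMT control files under `/sys/devices/system/cpu/smt` (reasonably recent kernels do); `smt_status` is a name invented for this example:

```python
from pathlib import Path


def smt_status(smt_dir="/sys/devices/system/cpu/smt"):
    """Return {'control': str, 'active': bool}, or None if the files are missing.

    'control' is the kernel's SMT setting (for example "on" or "off");
    'active' is True when sibling threads are currently online.
    """
    base = Path(smt_dir)
    try:
        control = (base / "control").read_text().strip()
        active = (base / "active").read_text().strip() == "1"
    except OSError:
        # Older kernel or non-Linux system: nothing to report.
        return None
    return {"control": control, "active": active}


if __name__ == "__main__":
    print(smt_status())
```

Logging this alongside each crash makes it easier to tell afterwards which configuration a given reboot really happened under.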

I think I’ll try swapping CPUs between my machine and my wife’s. They are almost identical except for the GPU (NVIDIA RTX 3070 in hers, RX 5700 XT in mine) and the CPU (5800X in hers, 5900X in mine). Hers has been running stable from day 1 (I built both machines on the same day), so that would be a great test to isolate the problem.

@wendell Alright, just crashed again. Had c-states disabled in both the bios and through zenstates.py.

> Dec 29 12:23:21 Compy-3700X kernel: mce: [Hardware Error]: Machine check events logged
> Dec 29 12:23:21 Compy-3700X kernel: mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 5: bea0000000000108
> Dec 29 12:23:21 Compy-3700X kernel: mce: [Hardware Error]: TSC 0 ADDR 7fed6dbdea40 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
> Dec 29 12:23:21 Compy-3700X kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1609262595 SOCKET 0 APIC 2 microcode 8701021
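To make the log lines above a bit easier to compare between crashes, here is a rough sketch that pulls the CPU, bank, and status value out of a "Machine Check" line and decodes the top flag bits of the status register. The bit positions used (63 down to 57) are the architectural x86 MCi_STATUS flags as I understand them; the function names are hypothetical, and the model-specific meaning of the low 16-bit error code is deliberately not interpreted here, since that needs AMD's processor documentation.

```python
import re
from datetime import datetime, timezone

# Architectural MCi_STATUS flag bits (assumed common to x86 machine-check banks).
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register contents are valid
    58: "ADDRV",  # ADDR register contents are valid
    57: "PCC",    # processor context may be corrupt
}

LINE_RE = re.compile(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-fA-F]+)")


def parse_mce_line(line):
    """Return (cpu, bank, status) from a kernel 'Machine Check' line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), int(match.group(3), 16)


def decode_status(status):
    """Return (list of set flag names, low 16-bit error code)."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return flags, status & 0xFFFF


def mce_time_utc(epoch_seconds):
    """Convert the TIME field (seconds since the Unix epoch) to a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    line = ("Dec 29 12:23:21 Compy-3700X kernel: mce: [Hardware Error]: "
            "CPU 1: Machine Check: 0 Bank 5: bea0000000000108")
    cpu, bank, status = parse_mce_line(line)
    flags, code = decode_status(status)
    print(f"cpu={cpu} bank={bank} flags={flags} code={code:#06x}")
    print(mce_time_utc(1609262595).isoformat())
```

For the status value in this log, the sketch reports VAL, UC, EN, MISCV, ADDRV and PCC set with error code 0x0108, so under the assumptions above this was logged as an uncorrected error. The TIME field converts to 2020-12-29 17:23:15 UTC, which lines up with the 12:23 local timestamp if the machine is on UTC-5.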

@agurenko What motherboard do you have? Also, is the RAM in your machine and your wife’s on the Ryzen compatibility list?

We’re using the MSI X570 Tomahawk WiFi with the latest (beta) BIOS - 7C84v151. Our memory is technically not on that list, for 2 reasons:

  1. There is no Ryzen 5000 compatibility in this list at all :)
  2. A module almost identical to ours is listed, but not quite the same: ours is G.Skill F4-3600C16D-32GTZRC and the list has F4-3600C16D-32GTZNC. No idea what the difference between RC and NC is.

That makes sense. It’s also probably not a memory incompatibility if you both have the same CPU and memory, but hers works and yours doesn’t.

Well, she has a 5800X and I have a 5900X, but given that most people (I guess) don’t have this problem, it’s either a combination of factors or a bad CPU sample.

BTW, can you try SMT off for a few days to see if that helps in any way?

This is going to sound weird but… turn XMP off? And let’s see… leave C-states off as well.

Look for gear down mode in your memory settings and disable that.

If that’s stable, let’s re-enable only XMP.

@wendell I disabled xmp, but I don’t see anything resembling a ‘gear down mode’. Doesn’t mean it’s not there, I just can’t find it.

@agurenko I’ll try that if disabling xmp fails.