3600x random reboots

Hi everyone.

I finally built a new PC and I haven’t been able to use it in over a month now… I’ve been getting random reboots once or twice a week, as if someone pulled the plug, and then it restarts. No warnings, no logs, no nothing - only the generic lost power message in Event Viewer. All temperatures are well within normal limits. I found I could reproduce this restart within seconds by running the AIDA64 cache stress test.

Asus Rog Strix x570 gaming-e
Ryzen 3600x
GSkill TridentZ neo 3200, cl 16, 4x8gb
Rog Strix RTX 2070 Super Gaming Advanced
Seasonic Focus gx-1000 (gold 1000w)

I’ve tried two different power supplies, both work fine in my old (current) system, 3 different video cards, 2 different drives, several clean, bare-bones Windows 10 installs - updates, drivers and nothing else.

I’ve tried manual memory timings, DOCP, and even running single modules at 2133. I ran 4 passes of memtest86 every night for a week, I ran the AIDA64 memory test for 10 hours straight - no errors or problems.

I just received a new motherboard from Asus yesterday and the problem is still ongoing. The only things left I can’t 100% rule out are the RAM or the CPU, but it seems unlikely that all 4 modules would be bad, especially if they aren’t producing errors. It has to be the CPU, right? I’ve opened an RMA with AMD, but in the meantime I’d be grateful for any thoughts or suggestions.
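To make the process of elimination above explicit, here is a minimal Python sketch. The component statuses are taken from this post; the status labels and the `remaining_suspects` helper are made up for illustration and are not part of any diagnostic tool.

```python
# Status labels (hypothetical, for illustration only).
SWAPPED = "swapped"          # replaced with a different unit, fault persisted
TESTED_ONLY = "tested only"  # passed tests, but never replaced
UNTOUCHED = "untouched"      # neither replaced nor isolated by a test

# What the post reports for each part of the build.
evidence = {
    "power supply": SWAPPED,     # two PSUs tried
    "video card": SWAPPED,       # three cards tried
    "drive": SWAPPED,            # two drives tried
    "Windows install": SWAPPED,  # several clean installs
    "motherboard": SWAPPED,      # replacement board from Asus
    "RAM": TESTED_ONLY,          # memtest86 and AIDA64 memory test, no errors
    "CPU": UNTOUCHED,
}


def remaining_suspects(evidence):
    """Return parts not ruled out by a swap, most suspicious first."""
    order = {UNTOUCHED: 0, TESTED_ONLY: 1}
    suspects = [(order[status], part)
                for part, status in evidence.items()
                if status in order]
    return [part for _, part in sorted(suspects)]


print(remaining_suspects(evidence))
```

Under these assumptions the CPU comes out ahead of the RAM, which matches the reasoning in the post: the RAM passed every test, while the CPU is the one part that was never swapped.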

What are the temps like?

Hello, please search the forums, as this is something others have dealt with.

The cpu idles around 30-35c, gaming (before it reboots) it’s around 55-65c, and under heavy load like handbrake or stress tests it usually maxes out around 85c. I don’t have any specifics on board temps, but nothing struck me as unusual. It’s funny, I could run OCCT for hours, but that Aida64 cache test would cause a reboot in as little as 2 seconds.
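Since Event Viewer only shows the generic lost-power entry, a small heartbeat log is one way to record how long each workload survived before a reboot. This is a rough Python sketch rather than a diagnostic tool: the log path, the workload labels, and all function names are assumptions made for illustration.

```python
import os
import time
from datetime import datetime, timezone
from pathlib import Path


def write_heartbeat(path, label):
    """Append one timestamped line and push it to disk before returning."""
    line = f"{datetime.now(timezone.utc).isoformat()}\t{label}\n"
    with open(path, "a", encoding="utf-8") as f:
        f.write(line)
        f.flush()
        # Ask the OS to write the data out, so the line is more likely
        # to survive a sudden power loss.
        os.fsync(f.fileno())
    return line


def last_heartbeat(path):
    """Return (timestamp, label) of the last parseable line, or None."""
    p = Path(path)
    if not p.exists():
        return None
    for raw in reversed(p.read_text(encoding="utf-8").splitlines()):
        stamp, sep, label = raw.partition("\t")
        if not sep:
            continue  # skip a line that was cut off mid-write
        try:
            return datetime.fromisoformat(stamp), label
        except ValueError:
            continue
    return None


def run_heartbeat(path, label, interval_s=1.0, beats=10):
    """Write `beats` heartbeats, `interval_s` seconds apart."""
    for _ in range(beats):
        write_heartbeat(path, label)
        time.sleep(interval_s)
```

Starting `run_heartbeat("heartbeat.log", "aida64-cache", beats=100000)` just before a stress test, and calling `last_heartbeat("heartbeat.log")` after the machine comes back up, gives the last moment the system was known to be alive under that workload.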

What is your RAM voltage? Check your power profiles.

RAM voltage was the default at 1.35v. I tried running it as high as 1.4v and it made no difference. Even with a single stick running at 2133, the PC would still reboot.

Power profiles are AMD balanced and performance. I tried both, neither made any difference.

I would test the RAM a single stick at a time, and check for PSU issues, shorts, or bad cables. How about BIOS updates? Look for shorts: case buttons etc., bare wires, or even just damaged cabling. Sometimes it’s even a poorly seated connection. Check the UEFI for temperature shut-off settings.

If you are doing a cache test and not a memory test, then you are testing the CPU cache. The memory test has to do with the FCLK or fabric clock (Infinity Fabric) and the memory clock of your RAM. You might have cache out-of-range errors causing your machine to halt.

Good old memtest.

BIOS is updated? Just a couple suggestions:

  • Disconnect peripherals except mouse, keyboard and monitor.
  • Disable any power saving or sleep modes in the OS.
  • Disable C-states and power management / efficiency stuff in UEFI.
  • If that doesn’t help, pull CPU and check for bent pins.

@thevillageidiot I have exhaustively tested my memory, all 4 at once, in pairs, and as single modules. The asus tech and gskill tech both said it was unlikely to be the problem. I’ve tested 2 different power supplies, both of which work flawlessly in my current machine. Cables are fine, no shorts and it’s just as unstable in a testbench. I’m running the most current bios.

@xaix1999 I used the memory test to help rule out memory as a problem. The cache test ended up being the fastest way to reproduce this, though prime95 and gaming would also cause this to happen within a few minutes.

@noenken Yes, the BIOS is updated to the most current version. It’s still happening with bare minimum components on the test bench. I’ve also tried disabling sleep and power stuff in Windows and the UEFI. The CPU looks good, no bent pins or signs of damage.

The last things I haven’t tried are disabling SMT/hyperthreading and/or underclocking the CPU. If either of these helped, that should further point to a CPU problem, right?

Yes and no.
The whole unit is probably not bad, but it sounds like your cache or an instruction accelerator is not behaving, since Prime95 will flood your cache and use different instructions.

Given all the above troubleshooting, my best guess is that it’s a motherboard issue, specifically bad VRMs.
My second best guess is that it’s a bad CPU.
Have you tried slightly increasing vcore? What about Precision Boost and PBO settings, are any of those enabled? If I remember correctly PBO is a feature that AMD supports but somehow also voids warranty :thinking:
I would reset the BIOS and disable all CPU auto-OC functionality if present. Then, if it doesn’t stabilize, try to slightly increase vcore. I wouldn’t exceed 1.5v with a decent cooler or 1.4v with the box cooler.

I dunno if this helps, but I was dealing with the same issue earlier today with a cheapo Gigabyte A320M-S2H. Resets at random, everything else was tested.

Problem was the BIOS update. And a firmware update for the Gigabyte 560. For the 320m, these had to be done in a specific order:

That I missed. I don’t see a note on your mobo’s product page, but you might try to roll back the BIOS and see what happens, then update again. My problem was only fixed after updating both the 560’s firmware and the BIOS on the mobo.

Best of luck and hope it helps

@Hossam I did wonder about VRMs. I spent a lot of time with different settings, though nothing seemed to help. While not impossible, I find it really hard to believe two boards could both have faulty VRMs, especially since they are fairly robust. I didn’t manually set my vcore but I did try a few offsets both +/-. I disabled PBO and performance enhancer, no help sadly.

@Ramiel The replacement board came back with bios 1408, which was the most recent, though just a couple days ago 1409 was released. Unfortunately, it hasn’t helped.

Thanks everyone. My cpu is snugly packed into a box and going back to AMD tomorrow - I’ll update if/when I receive a replacement.

Sorry, I must have missed that you changed the motherboard. Yes, I agree: two motherboards having the same issue would mean it’s a design issue in the VRM, which is very unlikely since it is a 95W CPU without overclocking.

What is your current SoC voltage?
You can try manually setting a higher SoC voltage; this might improve stability.

Set the SoC voltage to about 1.2V and see if that improves anything.
Note: don’t set the SoC voltage higher than 1.2V.

Hi everyone,

Good news… My replacement cpu arrived yesterday, and I stress tested overnight - without issue. I’m going to keep testing for the next few days, but I think I can call this fixed :smile:

Thank you to everyone who reached out to help!
