I finally built a new PC and I haven’t been able to use it in over a month now… I’ve been getting random reboots once or twice a week, as if someone pulled the plug and the machine then restarted. No warnings, no logs, nothing - only the generic lost-power message in Event Viewer. All temperatures are well within normal limits. I found I could reproduce the restart within seconds by running the AIDA64 cache stress test.
I’ve tried two different power supplies (both work fine in my old, current system), 3 different video cards, 2 different drives, and several clean, bare-bones Windows 10 installs with updates, drivers and nothing else.
I’ve tried manual memory timings, DOCP, and even running single modules at 2133. I ran 4 passes of MemTest86 every night for a week, and I ran the AIDA64 memory test for 10 hours straight - no errors or problems.
I just received a new motherboard from Asus yesterday and the problem is still ongoing. The only things left I can’t 100% rule out are the RAM and the CPU, but it seems unlikely that all 4 modules would be bad, especially if they aren’t producing errors. It has to be the CPU, right? I’ve opened an RMA with AMD, but in the meantime I’d be grateful for any thoughts or suggestions.
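For anyone following the elimination logic, here is a rough sketch of it in Python. The part names and the helper function are made up for illustration; the only facts it uses are the swaps listed above. It assumes a part is ruled out once it has been swapped and the reboots continued, which ignores anything the RAM tests add.

```python
# Hypothetical part labels; the swaps themselves are the ones described in this post.
ALL_PARTS = {"psu", "gpu", "drive", "os_install", "motherboard", "ram", "cpu"}

# Parts that were replaced with a different unit while the reboots kept happening.
swapped_without_fix = {"psu", "gpu", "drive", "os_install", "motherboard"}

def remaining_suspects(all_parts, swapped):
    """Return the parts that were never swapped, so substitution has not ruled them out."""
    return sorted(all_parts - swapped)

suspects = remaining_suspects(ALL_PARTS, swapped_without_fix)
print(suspects)  # ['cpu', 'ram']
```

The memory tests don't remove the RAM from the set here since it was never swapped for a different kit, but they do make it the less likely of the two.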
The CPU idles around 30-35°C, while gaming (before it reboots) it’s around 55-65°C, and under heavy load like HandBrake or stress tests it usually maxes out around 85°C. I don’t have any specifics on board temps, but nothing struck me as unusual. It’s funny, I could run OCCT for hours, but that AIDA64 cache test would cause a reboot in as little as 2 seconds.
I would test the RAM a single stick at a time, and check for PSU issues, shorts or bad cables. How about BIOS updates? Look for shorts: case buttons, etc., bare wires or even just damaged cabling. Sometimes it’s even a poorly seated connection. Check the UEFI for temperature shut-off settings.
If you are doing a cache test and not a memory test, then you are testing the CPU cache. The memory test has to do with the FCLK, or fabric clock (Infinity Fabric), and the memory clock of your RAM. You might have cache out-of-range errors causing your machine to halt.
@thevillageidiot I have exhaustively tested my memory: all 4 at once, in pairs, and as single modules. The Asus tech and the G.Skill tech both said it was unlikely to be the problem. I’ve tested 2 different power supplies, both of which work flawlessly in my current machine. Cables are fine, there are no shorts, and it’s just as unstable on a test bench. I’m running the most current BIOS.
@xaix1999 I used the memory test to help rule out memory as a problem. The cache test ended up being the fastest way to reproduce this, though Prime95 and gaming would also cause it to happen within a few minutes.
@noenken Yes, the BIOS is updated to the most current version. Running bare minimum components on the test bench, it’s still happening. I’ve also tried disabling sleep and power-management options in Windows and the UEFI. The CPU looks good, no bent pins or signs of damage.
The last things I haven’t tried are disabling SMT/hyperthreading and/or underclocking the CPU. If either of these helped, that should further point to a CPU problem, right?
Given all the above troubleshooting, my best guess is that it’s a motherboard issue, specifically bad VRMs.
My second-best guess is that it’s a bad CPU.
Have you tried slightly increasing vcore? What about Precision Boost and PBO settings, are any of those enabled? If I remember correctly, PBO is a feature that AMD supports but that somehow also voids the warranty.
I would reset the BIOS and disable all CPU auto-OC functionality if present. Then, if it doesn’t stabilize, try slightly increasing vcore. I wouldn’t exceed 1.5 V with a decent cooler or 1.4 V with the box cooler.
That I missed. I don’t see a note on your mobo’s product page, but you might try rolling back the BIOS to see what happens, then updating again. My problem was only fixed after updating both the 560’s firmware and the BIOS on the mobo.
@Hossam I did wonder about VRMs. I spent a lot of time with different settings, though nothing seemed to help. While it’s not impossible, I find it really hard to believe two boards could both have faulty VRMs, especially since they are fairly robust. I didn’t manually set my vcore, but I did try a few offsets, both positive and negative. I disabled PBO and Performance Enhancer, no help sadly.
@Ramiel The replacement board came with BIOS 1408, which was the most recent at the time, though 1409 was released just a couple of days ago. Unfortunately, it hasn’t helped.
Thanks everyone. My CPU is snugly packed into a box and going back to AMD tomorrow - I’ll update if/when I receive a replacement.
Sorry, I must have missed that you changed the motherboard. Yes, I agree: two motherboards having the same issue would mean it’s a design issue in the VRM, which is very unlikely since it is a 95 W CPU without overclocking.