I’m hoping someone can help me out with a new Ryzen build. I’ve built a half-dozen or so PCs in the past, but it’s been about 5 years since, so I feel a little green again.
I have the following new build (presented as Newegg Wish List), and although it is running fine, and went through 6 passes of memtest86 successfully, there appears to be some instability when running an OCCT stress test.
What happens is when I run either the CPU : OCCT or CPU : Linpack tests, it gets partway through and then the computer just shuts down. It seems to be occurring when the CPU temperature reads 65°C (in HWMonitor), but that could just be a coincidence. I’ve read that the 1700X reports a temperature of 20° hotter than it actually is, so I think that a 45°C CPU temp shouldn’t be causing an issue.
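For anyone following along, the offset arithmetic is trivial but easy to get backwards. A minimal sketch, assuming the 20° offset mentioned above applies and that the monitoring tool is showing tCtl rather than tDie (the function and constant names are made up for illustration):

```python
# Assumption: a fixed +20 °C tCtl offset, as described for the 1700X in this thread.
TCTL_OFFSET_C = 20.0


def tdie_from_tctl(tctl_c: float, offset_c: float = TCTL_OFFSET_C) -> float:
    """Estimate the die temperature from a reported tCtl reading."""
    return tctl_c - offset_c


print(tdie_from_tctl(65.0))  # 45.0
```

If the tool already applies the offset and shows tDie, no subtraction is needed and 65°C would be the real reading.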
I’m running everything at stock, no OC. I’ve also updated the motherboard to the latest firmware version.
I know of Ryzen systems rebooting due to instability issues, but if it’s fully powering down that usually indicates a different type of fault or something tripping the PSU’s overcurrent protection.
Something you can try is disabling the C6 state in the BIOS and retesting. But that shouldn’t be it.
Also avoid HWMonitor and get HWiNFO64: https://www.hwinfo.com/download.php It has much better and more reliable thermal and aux info readouts. Most hardware monitoring software by now has also had offset adjustments added for tCtl and tDie, tDie being the real temperature.
Also keep an eye on the reported current and VRM power while testing. You’re going to want to enable the logging feature in HWiNFO so you have a log right up until the moment it fails.
Issues like this need data to diagnose. So post everything you can collect.
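Once you have a log, the interesting part is the last few samples before the shutdown. A rough sketch of pulling those out, assuming the log is a plain CSV with a header row. The column names and values below are made up for illustration; real logs will use whatever sensor names your board exposes:

```python
import csv
import io


def tail_of_log(csv_text, columns, n=5):
    """Return the last n rows of a CSV log, keeping only the requested columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{c: row.get(c, "") for c in columns} for row in reader]
    return rows[-n:]


# Hypothetical sample data, not taken from any real system.
SAMPLE = """Time,CPU (Tdie) [C],CPU Core Voltage [V],VRM Temp [C]
12:00:01,41.0,1.350,88
12:00:03,52.5,1.331,97
12:00:05,61.0,1.312,103
"""

for row in tail_of_log(SAMPLE, ["Time", "CPU (Tdie) [C]", "VRM Temp [C]"], n=2):
    print(row)
```

For a real log you would read the file contents into `csv_text` and pick the columns for temperature, voltage, current and VRM readings.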
Everything seems fine, you even got QVL RAM with the FlareX. Nice.
Maybe it can have something to do with:
Perhaps a Ryzen CPU with the segfault and/or other issues and newer BIOSes do not go well together? I RMA’ed mine, because I actually think it was the CPU that broke the BIOS ROM on my first board. Also, its general behaviour in the operating system was abnormal.
RMA’ed the first board. Tried a second board with the same CPU and the system could not boot normally. Tried updating to the newest AGESA/BIOS update, and after a successful update, Dr. Debug basically went crazy.
This shouldn’t do anything like what you’re describing. The segfault issue only seems to affect compiling in Linux, and even then not every time.
I wonder if it isn’t the motherboard, but as @catsay brought up, without more data this would be very hard to diagnose. Perhaps it’s vdroop and the CPU shuts down. I have had this happen to me if the voltage drops under ~1.36v on my 1700X, but that’s running overclocked. Someone who is familiar with this BIOS might be able to give you some pointers as well for what to look out for.
There is one case I know of where a runaway voltage drop can occur on Mem_VTT that might lead to this, but so far I have not seen it happen. Mind you, this issue was on a horribly messed up engineering sample I got from a friend, which I tested against his advice.
But it’s so unique it should manifest itself with a whole host of other malfunctions.
They probably do have other issues, other than only segfault-related ones.
Tried 2 other brand new boards and both behave abnormally (both non-overclocked, on optimized/default settings, with QVL RAM). Tested everything else with 0 errors.
So something is obviously wrong. Doubt it’s the motherboard, could be settings though as AMD and partners would make this kind of mistake on default freaking settings and have users play a lottery BIOS game.
A LOT of people are selling their systems because of these issues. The AMD parts are everywhere nowadays.
Still think it’s a faulty CPU manufacturing issue.
Disable C6 and overclock slightly with a fixed voltage setting.
That’s the main fix right now for a rare issue which leads to reboots: an idle state / power state glitch.
Most of these issues lie in BIOS/AGESA code that has to deal with CPUs of varying quality bins, because of GlobalFoundries’ funky 14nm process, which has since been fixed. But a lot of old stock is still being cleared by vendors/sellers.
Also no need to speculate. Most people aren’t actually selling their brand new systems.
All this however doesn’t mean that this is OP’s issue. Let’s not derail this thread from the issue at hand right now.
Did that already. Disabling C6 did yield more proper boot results, though without actually fixing it. Yeah, it’s exactly that glitch: idle state, Dr. Debug insanity, etc.
And no, the issues are not in AGESA/BIOS. If that were the case, it would be fixed by now, end of story.
Vendors are doing something, yet this should never have happened in the first place. Also, not speculating here; check the European forums. People do not want to mess with these issues and are selling their newly bought systems. Go see for yourself.
Not derailing anything. Doesn’t mean it’s his issue, neither does your own input.
First, until full load your voltages are all over the place. In UEFI look for “Core Enhancement” or “Performance Enhancement” and disable it. This is an auto-overclock feature. In a few days I will have that MB’s bigger brother and could be more help. Look for load line calibration and select a level in the middle, usually 3. I will try to find out if this board has a VRM sensor and if HWiNFO reads it correctly, because it’s way too toasty and could be the reason for the shutdown. Considering it is reading 103°C even at idle, I would assume either no sensor or, damn son!
So this board has neither performance enhancement nor load line calibration. So that is out.
1.35 volts is high, especially for that clock. Set voltage to 1.31875 and clock at 3.8. That should get you turbo speed on all cores. That should be stable, YMMV. If it fails, increase voltage in small steps.
As for airflow, I have the case with 2 120mm fans and the HSF listed (blows upward through the heatsink toward the top of the case). I’ve ordered 2 more 120mm fans for front and top, which should arrive tomorrow. Hopefully the top fan blowing upward will pull out some of the CPU heat.
I will try your suggested voltages in a bit, thanks!
Alright, I tried your suggested settings (except VDDP, which the BIOS doesn’t allow me to change) and as usual the OCCT test crashed almost immediately (within 30 seconds). I adjusted the voltage up in steps of .00625 and each time it crashed until I reached 1.35 again.
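For reference, stepping from 1.31875 up to 1.35 in increments of 0.00625 gives only five bumps before you are back at the original voltage. A quick sketch that lists them, using `Decimal` so the small steps don't pick up float rounding noise (the function name is made up for illustration):

```python
from decimal import Decimal


def voltage_steps(start, stop, step):
    """List voltages from start to stop inclusive, in exact decimal steps."""
    start, stop, step = Decimal(start), Decimal(stop), Decimal(step)
    steps = []
    v = start
    while v <= stop:
        steps.append(v)
        v += step
    return steps


for v in voltage_steps("1.31875", "1.35", "0.00625"):
    print(v)
```

That is six settings in total, counting the starting point.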
This is really depressing. It’s got to be a bad chip or board right… if it can’t pass a stress test at stock settings without poking around in the BIOS?
I would not say bad chip. You can run 1.42V for daily use, however, I would not go above 1.4V.
I would say that the MB’s power delivery is not optimal. It is a premium chip on a very bargain basement MB.
Also consider that synthetic benches are not real world and stress the CPU beyond what you would ever use. I do not use OCCT, so I am unfamiliar.
I would start fresh. Reset BIOS back to original settings. Unplug the PSU when the PC is powered down. After 30 seconds use the clear CMOS jumper. Go to bed for the night. Clear CMOS again. Power up and boot. Do not make any changes in UEFI and try OCCT. If it passes, check what RAM speed is set in UEFI. If it doesn’t… I would say the MB cannot keep up.
@anon25377527 Well, it does boot, it just doesn’t pass stress tests.
@Raziel I understand that it’s not a real-world situation, but I would like to have a stress test pass before committing to a machine. It seems that this build will fail if my CPU ever happens to max out for more than 20 seconds, which is not ideal. I will try your suggestion tonight, thanks!
I just realized that my memory (G.Skill F4-2400C15D-16GFX) is not on the memory QVL for my board, but it is listed on the Pro4 version. I wonder if that’s the issue. I heard that Ryzen favors higher speed RAM, but I thought it would still work, as Newegg listed 2400 as compatible (possibly misleading info there?).
I have a friend who has bought a better board and hasn’t assembled his build yet (still hasn’t bought the CPU), so I might be able to try the CPU on a different board.