Help diagnosing instability w/ PBO2 curve optimizer

Hey all!

I previously posted about my recently built machine being unable to sleep properly on Linux due to issues with EXPO. I’ve since disabled it, and sleep appears to work properly now.

Since then I’ve run into some odd behavior on my installed system. I won’t go into specifics, but I tried looking into it and couldn’t find anything of use, until I started suspecting hardware issues/failure.

Long story short, I had enabled the PBO curve optimizer in my BIOS a while ago and tried all-core negative values of 10, 15, 20, even 30! I didn’t do much else besides run some stress tests and compare the data, and I left it at all-core negative 20.

Today I decided to give memtest86+ a try to see if memory could be the issue. I had previously only run memtest86+ twice, once with stock BIOS settings and once w/ EXPO enabled, with no errors and a full pass both times. Today with EXPO disabled but PBO CO set to -20, I got loads of errors! Particularly during the random number generation tests!

I’m new to PC building and hardware, so I freaked out and went straight to an RMA request without first considering what I had changed. Then I remembered that I’d messed with PBO CO previously, so I went and set it back to zero.

I reran memtest86+ and I am at 0 errors even after 5 full passes and several hours, so the memory appears to be fine and messing with PBO curve optimizer seems to have been the culprit here.

I still think my findings are inconclusive and am really worried about what to do with my memory since I haven’t seen much discussion on PBO curve optimizer causing memory errors on Ryzen 7000/DDR5.

Is it possible that the memory I’m using is actually faulty/bad and that stock settings without a PBO offset simply hide it better? Or maybe I just went too hard on the curve offset and errors are to be expected?

I’m not sure what to make of this, and I don’t even know what to do about my RMA since it certainly seems like user error, but I don’t know if I should write it off just yet. I’d certainly appreciate any thoughts/suggestions/findings that could help me make sense of this.

Edit1: I should add that with a negative 20 offset, I got a pass with a single 16 GB stick, while the two matching sticks together at negative 20 gave me the errors.

CO offset is very tricky. I played a bit with it, but it is too much effort to properly test IMO, as in principle you need to test each core separately under various load conditions. See for example the CoreCycler tool. On a 16-core CPU that’s madness.
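To give a rough idea of what "testing each core separately" means on Linux, here is a minimal sketch. It is only an illustration, not how any existing tool works: `workload` and `check_each_core` are made-up names, the hash chain is far too light to count as a real stress test, and the "cores" are the logical CPUs as the OS numbers them. It assumes Linux, since `os.sched_setaffinity` is not available elsewhere.

```python
import hashlib
import os


def workload(rounds: int = 20000) -> str:
    """Deterministic CPU-bound work: a chain of SHA-256 hashes."""
    digest = b"seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def check_each_core(rounds: int = 20000) -> dict:
    """Pin this process to each allowed logical CPU in turn and
    check that the workload gives the same answer everywhere."""
    original = os.sched_getaffinity(0)  # Linux-only: CPUs we may run on
    # The reference is computed on whatever CPU the scheduler picks,
    # so a mismatch only says "these two runs disagree".
    reference = workload(rounds)
    results = {}
    try:
        for core in sorted(original):
            os.sched_setaffinity(0, {core})  # run only on this CPU
            results[core] = workload(rounds) == reference
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask
    return results


if __name__ == "__main__":
    for core, ok in check_each_core().items():
        print(f"CPU {core}: {'ok' if ok else 'MISMATCH'}")
```

An unstable core will often crash or reboot the machine rather than report a clean mismatch, and it may only misbehave under a specific load, which is why doing this properly takes so long.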

It is certainly possible for memtest to fail with an unstable CPU. It is the CPU, after all, that compares the data read from memory to the expected values.
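As a toy illustration of that point (this is not memtest86+'s actual code, and `pattern_test` is a made-up name), a pattern test boils down to something like this:

```python
import random


def pattern_test(size_words: int, seed: int) -> int:
    """Write a pseudo-random pattern to a buffer, read it back,
    and count the words that do not match the expected value."""
    rng = random.Random(seed)
    # "Write" phase: the CPU generates the pattern and stores it.
    buffer = [rng.getrandbits(64) for _ in range(size_words)]

    # "Verify" phase: restart the generator with the same seed so the
    # CPU recomputes the exact sequence it wrote earlier.
    rng = random.Random(seed)
    errors = 0
    for word in buffer:
        expected = rng.getrandbits(64)  # computed by the CPU
        if word != expected:            # compared by the CPU
            errors += 1
    return errors


if __name__ == "__main__":
    print("errors:", pattern_test(100_000, seed=42))
```

On healthy hardware this returns 0. A nonzero count could mean the stored word was corrupted, but it could just as well mean the CPU miscomputed `expected` or botched the comparison, and from the error count alone you can't tell which.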

PS: With one stick (or without EXPO), the memory bandwidth is obviously lower, which means lower load/temps on the CPU. So it is not that unreasonable that the CPU is unstable with two sticks but not with one.


Thanks! I do think it would be easier to just set the all-core negative offset to something smaller like 10 and then test, or even just disable the curve optimizer offset altogether, since I personally haven’t seen much of a difference on my Ryzen 5 7600.

I am still worried that doing so could mask potential issues with my memory. That said, I left memtest86+ running for a while, and at this point it has done 14 full passes without errors with 0 CO offset, so maybe it’s fine, and I am considering just cancelling the RMA request. (Would it be best to share this info w/ them or just quietly cancel and leave it at that? I did not consider that the CO offset could be the cause when I submitted it.)

CO can’t mask memory defects. CO determines how much voltage you feed the CPU cores for a given frequency, so a negative offset is effectively undervolting the CPU. It does not even change the voltage of the memory controller (that is VSOC). If the memory is fine (stock and EXPO) with no CO, then it’s not a memory issue.

To be sure, also run other stress tests with the CPU at stock and the memory at stock or EXPO. E.g. y-cruncher or Prime95 (large FFTs) can find memory errors faster than memtest in some cases. If those run through, your memory should be fine!


A Ryzen 5 7600 has a pretty low TDP, so with adequate cooling CO would not improve performance; it would just make the CPU run a bit cooler.

But even an all-core CO of e.g. -10 needs to be tested per core! One core may already crash at -5 while another could take -10!
