AM5 Memory Issue? PSU? Other issue? SOLVED-See BLUF for fast fix

BLUF: Manually setting 1.2V for CPU VDDIO/MC and CPU SOC, as suggested by @Janos, did the trick for me. It improved stability without overshooting on voltage, and the system no longer randomly shuts down and forces me to manually cycle power at the PSU.
This is with an ASUS ROG STRIX X670E-E, a Ryzen 9 7900X3D and a Corsair Dominator 64GB two-DIMM 6000 C40 kit at 1.35V.

If you want to read more about the trials and tribulations I went through please feel free to read on :slight_smile:

So, I searched and didn’t find anything like this on L1, so I thought I’d put out my experience so far with AM5, RAM and a Ryzen 9 7900X3D. (I know everyone says I should just have a 7800X3D or 7950X3D, blah blah blah. I have my reasons and logic behind it; I’ll share them later, maybe to start a flame war if I’m bored.)

Here’s what we are working with-
Ryzen 9 7900x3d
ROG STRIX X670E-E GAMING WIFI
Trident Z5 Neo RGB DDR5-6000 CL32-38-38-96 1.40V 64GB (2x32GB) AMD EXPO
CORSAIR AXi Series, AX1200i, 1200 Watt, 80+ Platinum Certified
XFX Speedster ZERO AMD Radeon™ RX 6950XT RGB EKWB Waterblock Limited Edition with 16GB GDDR6
Samsung 980 PRO PCIe® 4.0 NVMe™ SSD 2TB
ASUS XG-C100C 10G Network Adapter Pci-E X4 Card with Single RJ-45 Port

Think that’s a full list of parts for now…unless something else might be relevant…

Obligatory system picture as a proud all-AMD system papa, my first in YEARS… Ooop, this is the AM4 build, but it’s exactly the same as the AM5 one, just with different RAM and only two DIMMs…

So here’s a list of the things I am tackling, as highlights so you can decide if you want to read further… I’ll be as brief as possible.

1.) RX 6950 XT overclock issues (stress tests fine in programs like Furmark, fails to finish the test in 3DMark). Might have been solved along with the memory issue?

2.) Extremely long boot times (2+ min to POST) with POST code 15 when using the approved EXPO 6000 setting that’s on the QVL for the RAM. (It seems to be retraining memory on each boot, which suggests a memory issue to me; it passes MemTest86 with no errors, so the RAM itself is good.) I’ve mitigated this by bumping the frequency down to 5200 and tightening the timings to 30-36-36-96… needs further testing.

3.) PC randomly shutting down… during anything from actively working at the keyboard and surfing the web to hardcore gaming. Stress tests won’t reproduce the issue. I turned on “allow to sleep” to see how the system responds. The shutdown looks just like the PC going to sleep (everything shuts down but the RAM LEDs stay lit), but it won’t respond to input from the keyboard or power button. I have to turn off the PSU manually, then reboot. Since the memory downclock I have not been able to reproduce this issue.

4.) Issues with the ASUS 10G NIC losing connectivity to the internet ONLY. So dumb… this may be resolved by not allowing the PC to “shutdown device to save power” in device settings… no issues so far.

General Impressions-
I’ve done a lot of reading on all of these issues… Some point to the PSU, some to the mobo. I don’t see many references to RAM as the issue, but when I went to rule it out it surprisingly fixed a number of issues…

Windows logs were useless: no Critical, Error or Warning entries of note. The only critical entries were for unscheduled power loss and restart, and I have not had one since downclocking the RAM. I should have also been monitoring hard faults in the RAM, but for a long time I didn’t think that was the issue…

3.) Continued PSU inspection-
This was my first suspect and biggest concern because who wants a system that randomly shuts off?
I’m lucky that my PSU has a built-in monitor provided by Corsair and an internal self-test. I was monitoring power levels and voltages for all the rails and saw nothing out of the ordinary. I was able to turn on logging, so even though the machine had powered down I was able to check the logs, and there were no spikes or drops in any of the voltages when I reviewed them on reboot. There was also a question of an overheating issue, but I manually set the fan curve and this didn’t resolve the issue…
So weird again. I was still suspicious. This was all before I messed with RAM timings. I had never had problems with XMP, so I didn’t think EXPO would be an issue either. Seems I may have been wrong.
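
The kind of rail check described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the readings are assumed to have been copied out of the PSU log into plain dictionaries (this is not Corsair’s actual log format), the `out_of_spec` helper and the sample values are made up, and the ±5% window is the tolerance commonly quoted for the positive ATX rails.

```python
# Nominal voltages for the positive ATX rails and the +/-5% tolerance
# commonly quoted for them. Everything else here is hypothetical.
NOMINAL_RAILS = {"12V": 12.0, "5V": 5.0, "3.3V": 3.3}
TOLERANCE = 0.05


def out_of_spec(samples, tolerance=TOLERANCE):
    """Return (sample_index, rail, volts) for readings outside nominal +/- tolerance."""
    bad = []
    for index, sample in enumerate(samples):
        for rail, volts in sample.items():
            nominal = NOMINAL_RAILS.get(rail)
            if nominal is None:
                continue  # ignore rails this sketch does not know about
            if abs(volts - nominal) > nominal * tolerance:
                bad.append((index, rail, volts))
    return bad


# Made-up readings: the second sample has a sagging 12V rail.
readings = [
    {"12V": 12.05, "5V": 5.02, "3.3V": 3.31},
    {"12V": 11.2, "5V": 5.0, "3.3V": 3.3},
]
print(out_of_spec(readings))
```

An empty result only means nothing was logged outside the window. A dip shorter than the logging interval would not show up, which fits with the logs looking clean even though the machine had powered down.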

1.) Continued GPU overclock testing- So, admittedly I am completely new to AMD GPUs and there was a bit of a learning curve here. I was so used to using auto OC as a jumping-off point and being able to overclock and undervolt using the power curve in MSI Afterburner with Nvidia GPUs.
I started with milder settings I had seen numerous times on multiple trusted websites. These seemed to work fine as OC/undervolt settings in tests like Furmark, OCCT etc.
As soon as I tried to run 3DMark Time Spy, poof… it would either load the test and immediately fail, OR run a few seconds then fail. This really confused me. I’m used to the benchmarks working if the stress tests work, so that it’s only a matter of testing to see if the OC made a noticeable difference to heat or performance.
Even though 3DMark didn’t work, I tried gaming… Games seemed to crash randomly to desktop and I’d get a driver warning or OC reset warning. I did some digging, and some of the games I played at the time (e.g. Watch Dogs Legion) didn’t play well with OC’ed AMD GPUs. I was also getting random driver failures with the PC left at idle: the driver was failing WITHOUT a GPU OC but with the memory overclocked. Since the downclock of the RAM in the last few days I have NOT had the driver fail once.
I have not had time to test the OC of the RX 6950 XT since the downclock of the RAM… More will follow…

2 Likes

Not sure what is going on. My first guess would have been the memory as well, but when Memtest runs fine it becomes a lot harder to diagnose. I would not depend on 3DMark too much, though. I game in a virtual machine, and 3DMark (at least the commonly used Fire Strike) crashes every single time and brings down the entire virtual machine, while everything else, including Prime95, runs perfectly fine.

Apart from the rest, I would revert the overclock on the 6950 for the time being. It only introduces one more possible source of errors. I would also remove the EXPO settings, since both AMD and Intel have been careful to make clear that these profiles are not guaranteed to work and should be seen as an overclock as well. If the computer then runs fine, it would be clear that one of the two is the issue and needs to be addressed.

1 Like

Well, BIOS version 1410 (2023/05/02, newest, in beta) as well as version 1303 (2023/04/27) have been released, and they set a maximum of 1.30V on the CPU SOC voltage.

From my reading, it seems the over-voltage was a big issue that damaged the CPU, the motherboard or both, in some cases in extreme and physical ways as seen HERE

This was an issue with both 7000 and 7000X3D chips. Tom’s Hardware had a few good articles on it.

Since the upgrade to these new BIOS versions I have not had an issue. “Fingers crossed.”

So it seems having a quality power supply, with all of its voltage safeties in place, kept me from damaging my PC. My Corsair AX1200i was doing its job, tripping to save my hardware.

It was NOT due to EXPO (well, sort of; BIOS changes were needed). It was probably not a RAM failure (I still didn’t like the G.Skill kit needing 1.4V to hold its overclock; I am now running the slightly slower Corsair Dominator 64GB 6000 C40 2-DIMM kit at 1.35V). It was NOT due to an overly aggressive Curve Optimizer setting (I’m still running the old levels I tried, without issue). It was not my PSU failing.

Just wow. I remember growing pains and issues with drivers and BIOSes in the past, but wow.

Early adopters beware lol.

3 Likes

That seems to summarize it. Glad it works for you now, though!

2 Likes

Yeah, the overvolting on the SOC is indeed a big issue atm.
Asus did release a beta BIOS to fix the problem, but it did not actually fix it.
So I believe they pulled it again.

1 Like

I have two BIOS versions, both in beta, that are working. One was just released in May.

I have it installed and I’m seeing no spikes in voltage and haven’t had a shutdown yet…

I’m continuing testing :slight_smile:

1 Like

Welp… even with the newest BIOS it just shut down on me.

Going to try some manual voltages. As of now the memory OC is set manually as well.

Any suggestions welcome. I’m keeping an eye on things via HWiNFO… as well as Windows system resources to see the RAM hard faults.

I also may play with an offset. The other thing: I may drop to a lower BIOS version that lets me adjust VDDCR_VDD… That value isn’t adjustable in this version and it still jumps to 1.345V… worrisome?

I also use BIOS version 1410, with a 96GB SK Hynix kit at 6000 MT/s 30-36-36-80.
Your CPU SOC voltage is pretty low. Try 1.2-1.25V and FCLK=UCLK 1:2; that fixed my problems.

1 Like

Sweet, thanks for the tip… I couldn’t get past the powering down. I think I can thank my PSU for saving me. The Corsair AX1200i works wonders with multi-rail monitoring.

I just have to be careful with the ASUS ROG Strix X670E-E… it will adjust beyond the setting.

I just wasn’t sure whether VDDCR_VDD not being adjustable and jumping beyond 1.3V was an issue with the newest BIOS.

Well, I’m kinda at a loss now… things just got weirder and I’m so scared to pull my CPU to physically inspect it and the motherboard…

Here’s what made me cringe this morning when I booted up the PC… happened twice

Also… I’m noticing that after making changes in the BIOS it will hang… consistently… clearing CMOS won’t fix it, I have to do a BIOS re-flash… ugh

I had this message after a BIOS update. Do you use BitLocker?
If not, just press Y and load the default settings for your BIOS.
Then load the EXPO profile and dial down the voltages.

Try 1.2V for CPU VDDIO/MC and CPU SOC. That fixed my issue, together with UCLK 1:2.

1 Like

Yeah, I just hit yes and had to redo my PIN for Windows.

I’m going to try manual voltages again now that I’ve completely reset the BIOS.

I had attempted that before with a manual memory OC, but while I let the PC idle it turned off on its own (the power plan was set to stay on) and wouldn’t turn on via the power button. Had to reset the PSU again.

Just hope nothing is damaged from the previous use of EXPO.

Annnnnnndddddd there is a newer BIOS released as of today from ASUS lol

Difficult to say. I think those who had the problems probably put too much trust in the claims of Buildzoid and other YouTubers that a CPU SOC of 1.45V is not a problem, and then also ran 24/7 burn-ins.
Even with the latest BIOS, 1410, my board also sets far too much voltage for the XMP or DOCP profiles.

1 Like

Yes, I am hoping all is well. I tried to give the platform time to stabilize. I never pushed voltages up ANYWHERE while tuning; I always try lower ones to see if I can get better cooling and keep performance. Hopefully the hardware is OK.

The random shutdowns, where it wouldn’t respond without a manual PSU reset, really concerned me. I’ll keep playing and see how it goes. It’s just been rough because all tests come back fine, and stress tests for half an hour or an hour won’t cause it. I have no way to test conclusively besides time and varied loads. Web browsing can do it, gaming can do it, it’s just so random.

Try JEDEC and BIOS default settings for a few days. If the issue still exists, I would think about a new PSU.

1 Like

Well, so far 1.2V CPU VDDIO/MC and 1.2V CPU SOC (the actual reading is roughly 1.234V on both settings, but nowhere near 1.3V) has worked all day. I haven’t had the chance to do much besides run random stress tests, a bit of web browsing and 15 min of gaming twice, but I have had no issue so far, fingers crossed. That is even with 6000 EXPO II on my dual-DIMM 64GB Corsair Dominator kit, which should be the most stable as it loads ALL the default values for the RAM DIMMs. (I never would have bought that kit, but prices dropped so much, and the G.Skill kit ran at 1.4V and too warm for my comfort.)

Hopefully tomorrow I’ll be able to mess around some more.

I’m just so doubtful it’s the PSU, as there haven’t been any of the issues I would expect, like not turning on. I’ve monitored the voltages directly from the unit and all are solid and clean.

I doubt that this particular issue is related to the PSU.
What is your SOC voltage doing?

1 Like

It was going to 1.35-1.375V with the newest BIOS that’s supposed to limit it to 1.3V… I had logs running via Ryzen Master and HWiNFO. Sadly, either there wasn’t enough time for them to log a spike, or just getting into that zone is enough to trip a protection in the memory, board, CPU or PSU.
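
For anyone who wants to scan their own sensor log for this, here is a minimal Python sketch. It assumes the log was exported as CSV with one row per sample. The column name `CPU SOC [V]`, the `soc_excursions` helper and the sample rows are all made up for illustration, so check the header of your own file before relying on it.

```python
import csv
import io

SOC_LIMIT_V = 1.30  # the cap the newer BIOS versions are supposed to enforce


def soc_excursions(csv_text, column="CPU SOC [V]", limit=SOC_LIMIT_V):
    """Return (row_number, volts) for each logged sample above the limit."""
    hits = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        try:
            volts = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric reading
        if volts > limit:
            hits.append((row_number, volts))
    return hits


# Made-up log excerpt; the column name is an assumption, not a real export.
sample_log = """Time,CPU SOC [V]
10:00:00,1.234
10:00:02,1.350
10:00:04,1.375
10:00:06,1.290
"""
print(soc_excursions(sample_log))
```

A scan like this can only flag what the logger actually sampled. A spike shorter than the polling interval will not appear, which matches the problem of the machine tripping before anything gets logged.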

The PC acts like it’s in sleep mode when it trips. The RAM is lit up and everything else is off; the power button won’t respond, it won’t respond to the keyboard, and I have to manually power down the PSU with the switch and turn it back on.

This is with the BIOS released yesterday (1416). I manually set 1.2V CPU VDDIO/MC and 1.2V CPU SOC (the actual reading is roughly 1.234V on both). So far the RAM is stable, as is the system.

I may try the BIOS the way it comes… it seems once you set the voltage any hope of an auto OC is out the window, which is fine; it’s plenty fast enough and I’d like more stability anyway at this point. :slight_smile:

1 Like

Addendum: It was not setting the voltage that limited the OC on the CPU; it seems it was the EXPO settings. I did manage to eke out an extra 100 MHz, up to 5.75 GHz on the non-cache cores, along with a negative Curve Optimizer setting of -10 for the cache cores and -15 on the regular CCD.

So it seems the big trick was the setting that @Janos recommended, which hit the stability sweet spot. At least for me so far… sure, I could optimize some more and tweak here and there… but I’m gonna enjoy a stable system for a while…

2 Likes

I cannot follow the conversation. Did you set those memory values from Janos’s screenshot manually with his recommended voltage, or what exactly did you do to get it stable?

1 Like