Is my CPU dying? Cache Hierarchy Error

I think it’s Win 11 23H2 according to Windows itself… whatever that means. It is updated through Windows Update, so who knows.

I used this for debloating:

I am definitely not a fan of Win11, but after debloating and removing all the crap, it does seem to run fairly well and even has some performance gains in gaming.

Did my stream PC and at least the Elgato capture card driver is no longer causing random BSODs. Even after 2 years they couldn’t be arsed to fix the win10 driver… their solution was to buy a new card.

About 18 hrs uptime now with that voltage tweak. Will declare it fixed after at least 48 hrs.

Good to hear it seems to be working… :slight_smile:

I thought that last time… then it happened… so looking good so far… but the universe can be a biatch sometimes. :rofl:

Machine has been up for over 48 hrs with no issue.

I don’t wanna jinx it, but just wanna thank all of you in the thread for giving me ways to diagnose the problem and suggestions on the fix.

So far everything is stable with the Vcore at +25 mV.

Hello Talung,

I am replying to your post here on the Level1Techs forum because I am experiencing exactly the same problems you had with your computer, and I want to ask for your help.

I have the following setup:
CPU: 5950x
Mobo: Asus Crosshair Viii Extreme X570E
GPU: Msi Supreme 3090ti
PSU: Corsair Ax1600i
OS: Windows 10 22H2

About two weeks ago my computer began to randomly crash, lock up, or reboot at system idle. At first it happened every once in a while, but the frequency has increased to the point where I can’t use my computer normally anymore.

Just like you, I have also done a TON of troubleshooting: RAM, NVMe, GPU and PSU testing/switching (I was about to change the CPU), reinstalling Windows 10/11, drivers, BIOS tweaking, etc., and also searching the internetz for a solution until I found your post.

Also like you, all this time I have been running my computer/BIOS with default settings, so no overclocking, undervolting, etc., except running my RAM @ EXPO.
And also like you, because of the above I am really not at home with tinkering with voltage settings in the BIOS.

So first of all, I would like to ask whether your computer is still stable with the Vcore at +25 mV?

Secondly, could you please make a screenshot or take a picture of the Vcore number/setting in your BIOS so that I have a correct example of what I need to add/change?
I’m asking because I don’t know the correct voltage number to configure. I can increase the voltage from 0.000625v to 0.45000v, so do I go to 0.02500 or 0.25000?

Hope you can help me with this one because it has been driving me crazy for the past 2 weeks!

Thanks!!

Davidargai

Hi Davidargai,

Yeah, this was a frustrating problem, but I’m glad to report that I have had zero issues with the PC since the voltage change. I never turn the PC off, so it has been fine in idle mode as well as under load.

The voltage increase I used was 0.025. I didn’t try anything lower which I think would have been the right approach and then increase slowly till stable. I just rebooted the PC to take photos of the bios settings for you.


As you can see, it’s nothing special, just a slight voltage offset. I am no expert and don’t even pretend to be one on the internet, so everything you do is at your own risk. This is just what worked for me to solve my issue.
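
For anyone unsure about the units, here is a minimal Python sketch of the arithmetic: +25 mV is 0.025 V, while 0.25 V would be a 250 mV offset. The `offset_ladder` helper and the 6.25 mV step are assumptions for illustration only, not values taken from any particular BIOS; substitute whatever increment your board actually offers.

```python
from decimal import Decimal


def mv_to_volts(millivolts):
    """Convert a millivolt offset to volts exactly (25 mV -> 0.025 V)."""
    return Decimal(str(millivolts)) / Decimal(1000)


def offset_ladder(target_mv, step_mv):
    """Return offsets in volts from one step up to the target, smallest first.

    step_mv is an assumed increment; use the step your BIOS really exposes.
    """
    target = Decimal(str(target_mv))
    step = Decimal(str(step_mv))
    ladder = []
    current = step
    while current < target:
        ladder.append(current / Decimal(1000))
        current += step
    ladder.append(target / Decimal(1000))  # always finish on the target itself
    return ladder


print(mv_to_volts(25))   # the offset used in this thread, in volts
print(mv_to_volts(250))  # ten times larger: what typing 0.25 would mean
for volts in offset_ladder(25, 6.25):
    print("try +", volts, "V and test at idle before going higher")
```

The ladder simply follows the "start lower and increase slowly until stable" approach mentioned above; it says nothing about which value is safe for a given chip.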

Hopefully it helps you resolve yours. I am sure other experts that helped me might be able to chime in and give you some advice.

Good luck!

I have the same issue with the following:

Error
EventID: 18
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0

The details view of this entry contains further information.
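
If you are collecting several of these entries, a small Python sketch like the one below can pull the fields out of the pasted text so they are easier to compare. It only assumes the plain "Key: Value" layout shown above; `parse_whea_event` is a hypothetical helper name, not part of any Windows tool.

```python
def parse_whea_event(text):
    """Collect 'Key: Value' lines from a pasted WHEA event into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon or without anything after it.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


sample = """EventID: 18
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0
"""

event = parse_whea_event(sample)
print(event["Error Type"], "on APIC ID", event["Processor APIC ID"])
# prints: Cache Hierarchy Error on APIC ID 0
```

Comparing the APIC ID across crashes shows whether the errors keep landing on the same logical processor or move around.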

I have the following hardware:
CPU: AMD R9 5950x
Motherboard: Gigabyte X570S Aero G Rev 1.x
RAM: 128GB (32GB x4) G.Skill - F4-3200C16-32GVK
GPU: eVGA Nvidia 3090 FTW3
GPU 2: Gigabyte AMD Radeon RX 9060 XT 16GB
PSU: Seasonic 1300W
OS: Windows 11 25H2

I started experiencing the issue about 1 week after upgrading to Windows 11 and adding a second GPU and Nvme drive.
I also updated the BIOS to Rev F7.
During the install I installed Arch on the new NVMe drive.
I think I changed the XMP setting to profile 1. It was at JEDEC before, as the memory is not on the QVL. (Don’t know what I was thinking.)

Initial Troubleshooting:
I thought it was an issue with memory (since the memory is not on the QVL). This didn’t fix the issue. The system still rebooted on its own.

Today:
I applied the same settings as @talung did. Here is the picture of the settings changed for reference.
The Motherboard has the latest F7 BIOS loaded onto it. For reference this has AMD AGESA ComboV2 1.2.0.F.

The trick to get the Dynamic Vcore (DVID) to be configurable was to set CPU Vcore from Auto to Normal. Before that it is greyed out.
I set the memory SPD voltage to match the XMP profile in case the motherboard didn’t do so automatically (this is probably optional).

Hello Talung,

Thank you very much for the screenshots!
That was exactly what I needed.

Right now I am busy with the testing process.
I am increasing the voltage in increments. The first increment apparently was not enough, because after a while (way longer than the days before) the computer locked up. Now running the next voltage increment.

Also, I noticed that when you stress test the computer for a while, then stop the test and let it idle, you can get the computer to crash/lock up quicker.

I’ll keep you posted about what Vcore voltage finally did the trick.

Thanks again!!

On my end for the issue:

Got a Memory Management BSOD with Win11.
I’ve reverted to the JEDEC 2666 MHz speed and disabled the XMP profile on the memory.
I’ve left all other settings the same. Going to monitor for awhile.

In the past I’ve seen this on core 6 of a 5950X and got that CPU replaced under warranty. Here’s hoping the error on APIC ID 0 is not going to be the same story.

My friend had issues, and it turned out to be his power supply together with the power bar he was using. His power supply was getting a bit old and was causing all sorts of issues.

That’s good news.

Glad my experience could help you. Also, very pleased that this thread has been able to help others as well. Definitely a good feeling.

Keep me posted on your results.
Thanks
Talung

Issue persists for me.
Reset everything to stock.
Just going to warranty the CPU after I purchase a 5900XT that doesn’t have issues with WHEA errors after 6 months of running stock settings.

Hello Talung,

I have been busy poking around with the Vcore voltage settings, increasing it in increments, but alas I cannot get the 5950X to run stable at idle.

The more I raise the Vcore voltage, the more and quicker idle lockups I get. At 0.025 I get almost instant lockups, even while booting.
The issue still persists…

One weird thing though. I found out that there is an automatic overclock feature in the Asus UEFI called TPU 1/2. I have NEVER used this feature before (everything has run stock since 2022), so I thought, heck, let’s give it a try and see what happens. With either TPU 1 (CPU clocked to 4.0 GHz) or TPU 2 (CPU clocked to 4.1 GHz) enabled, the CPU seems to run stable and so far does NOT lock up while running at idle.

Looking at your screenshots, I also noticed that your CPU is running at a higher clock speed (4.2 GHz) than default (3.8 GHz), so have you also used an overclock feature of your motherboard?
Can you make screenshots of all the UEFI settings you are using so that I can compare?

Thnks!!

Also, I do think my 5950X (and your 5800X) has degraded over time, otherwise it would still run fine with the settings I had in my UEFI for a long, long time.

Maybe it’s best to swap the CPU, since it technically is not working as intended anymore. My 5950X of course ran out of its warranty period about 3 months ago, so that’s a bummer… But I do have a 5800X3D with 2 years of warranty left lying around, so that can be the new candidate… I bought the 5800X3D about 4 weeks ago purely at random because I had the opportunity, soooo was it fate? Or did I tempt fate and jinx myself?

Anyways, I am quite disappointed in this/AMD. I have been working in IT since 1990 and have built and used soooo many computers with Intel CPUs, and I have never had this happen before. Now I build my first AMD machine and this happens…

Do you or fellow Level1Techs users have any idea whether this CPU degradation problem also exists in the 7000 and 9000 series?

I think it’s more an Asus problem than anything else. Out of the bunch of AM4 boards I’ve built, it’s the Asus that’s never been entirely stable at stock, despite multiple BIOS updates claiming stability improvements over the years and testing multiple processors. Asus also pushed AMD to start policing BIOS settings more aggressively by frying 7000 series CPUs with 1.3 V Vsoc. For the past few months ASRock’s been getting a lot of attention for an apparently elevated 9800X3D failure rate, though it’s difficult to tell if it’s really elevated or an artifact of social media and press attention.

Tertiarily, it’s likely also an AMD problem. There’s always some background failure rate but a fair bit of circumstantial evidence suggests X3D has an elevated failure rate. There’s also hints safe operating area’s slightly overestimated, leading to BIOS limit guidance that’s a bit too permissive. More in the ASRock 9800X3D thread.

Intel’s degradation issues around Raptor Lake are much worse. Partly because the company knowingly continued shipping parts with fab issues for about a year. Partly because Intel also condoned BIOSes pushing the parts too far and then basically retracted its own guidance twice while trying to put all the blame on mobo vendors. See the cluster of threads here around that.

Also the DDR5 specification defines 1.4 V as the absolute max before damage might occur. A detail that’s emerged in the ASRock 9800X3D examination is running 1.4 V EXPO/XMP likely carries an elevated failure rate. Not surprising given it’s a zero margin config per spec. So G.Skill, Corsair, and most other memory vendors are probably also culpable.

As much as I find many aspects of corporate capitalism frustrating, users’ inclination to prioritize performance over reliability does create a market where pushing margins and minimizing engineering investments is near optimal business practice.

Hmm, I never changed any other settings besides turning on DOCP for the RAM and the +Vcore thing. The 5800X is 3.8 to 4.7 GHz, so maybe with the Vcore it adjusted up?

Not sure tbh. I know I did (to get it fixed while testing) set everything to default and then only turned on DOCP and +vcore. No other changes made.

It could be that your issue is something else. This thread was initially about what I was experiencing, and the fix was aimed at that. Not sure how new your 5950X is, but at the time of doing this, my 5800X was already a few years old.

EDIT: Ah, I see why the CPU says 4.2 GHz: in the first screenshot it has “CPU Core Ratio” at 42.00. So this is part of turning on DOCP to increase the DRAM speed. Nothing that I specifically did. Just part of the BIOS thing… I actually never noticed it till you pointed it out. lol
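
For reference, the 4.2 GHz figure follows directly from that ratio. A minimal Python sketch, assuming the usual 100 MHz base clock (BCLK); `core_clock_ghz` is just an illustrative helper, and a board running a non-default BCLK would give a different result:

```python
def core_clock_ghz(core_ratio, bclk_mhz=100.0):
    """Core clock in GHz = core ratio x base clock (assumed 100 MHz)."""
    return core_ratio * bclk_mhz / 1000.0


print(core_clock_ghz(42.00))  # the "CPU Core Ratio" from the screenshot
```

By the same arithmetic, the 4.0 GHz and 4.1 GHz TPU presets mentioned earlier would correspond to ratios of 40 and 41.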