Hello,
I have been ripping my hair out trying to figure out why my system is unstable for months now.
I don’t know what to do anymore. PLEASE, PLEASE HELP ME.
Please bear with me, I have a lot to unload and tell, hopefully I even remember everything.
Since upgrading to my 3960X on an ASUS Zenith Extreme II (Non Alpha), this system has been constantly unstable. Sometimes I get a few months where it is stable, and then it suddenly becomes unstable again for no plausible reason.
I believe I tried everything in the book to find the cause, but the BSODs just keep coming in absolutely random and non-reproducible ways (while gaming, while not gaming, under load, under no load).
The BSODs all say NMI_HARDWARE_FAILURE. I tried looking it up on the web, but found nothing clear that would point to a specific piece of hardware.
Here is the story of my problems, in order:
I upgraded from an X399 Gigabyte Designare with a 1900X to the above-mentioned combo and took everything else with me: the G.Skill RAM, the PSU and so on. Everything was rock stable for the whole time I had it.
- First, the BIOS kept failing to detect one NVMe SSD, my ADATA SX8200NP, on almost every restart. This got better with a BIOS update.
- After that, installing my OS on another SSD worked and I got everything in working order. It seemed fine, BUT now I got random freezes while doing things and sometimes while doing nothing. It turned out the G.SKILL RAM (G.SKILL 3200 CL14 B-die) was somehow not compatible with this new CPU + MOBO.
After watching a ton of Buildzoid, L1T and various other techtuber videos, I got myself new RAM: some Crucial Ballistix Elite Micron E-die that seemed to be very compatible from all I read and would also hold up better under more heat. Heat was maybe the issue with the other RAM, though it only got to around 55°C.
I made a separate thread for that problem here.
PROBLEM SOLVED? Seemed like it, the system seemed stable now for a few months.
- Next I wanted to install a second OS (Windows) on another SSD, which seemed to start the trouble? The new OS seemed stable at first, but about 3 or maybe 4 weeks in I suddenly got freezes again, and BSODs as well.
All the troubleshooting started again and I couldn’t find a hardware issue, so I thought maybe it was a software issue and tried to narrow it down by going backwards through the programs I use on a daily basis, which I had updated in the process of switching to the new OS. NOTHING. NOTHING helped.
Thinking it might be the OS itself that got corrupted, I switched back to the first OS, which was now the backup OS.
PROBLEM SOLVED? Seemed like it again.
- Just when I thought everything was fine and I could use my hard-earned PC again, today I suddenly got BSODs again, two this time, about an hour apart.
I recently got a new GPU (RTX 3080 Ti FE) which I was happy to have got a decent deal on.
I can run every benchmark and every stress test I could think of (Furmark, Prime95 for 24h, Memtest to 2000%) and they always pass, with no freeze or BSOD. I can’t get the problem to occur while testing under any circumstances.
Temps seem to be within the limits of every component:
- CPU: Max. ~85°C
- RAM: No thermal sensor onboard, so the external sensor tells me about 51°C ambient
- GPU 1: Max ~85°C
- GPU 2: Max ~72°C
- Chipset: Normally about 78°C, Max ~90°C
Things I tried, always only changing ONE thing at a time (This will be long):
- C-States
- Various other BIOS settings in all possible combinations
- BIOS update
- BIOS downgrade
- Replace every hardware component I have on hand (PSU, SSDs, GPUs, Soundcard, Networkcard, RAM)
- RAM timings (This took forever)
- RAM at lower and higher speeds
- Different OS (On different SSDs)
- Re-seat CPU
- Re-paste CPU thermal paste
- Disconnect everything from the MOBO and reconnect it again
- Try different drivers for everything that has alternative drivers available
- CPU on stock OR overclocked
- CPU, RAM, IOD, SOC voltages within the safe values I know about, always checked against very many different sources
- Everything stock in BIOS without any adjustments whatsoever.
- Set everything PCIe to GEN 3, in desperation
- Software downgrade to KNOWN GOOD versions, which I know are stable.
- With OS encryption and without.
I probably tried even more things, which I just can’t remember atm. I currently don’t have enough money to change the motherboard or the CPU, so I can’t replace them to rule both these things out.
I’m not rich and am currently really devastated that this really great system isn’t working properly. I don’t know what to do anymore, I’m at my absolute wits’ end here.
I know I probably shouldn’t do this but I have to ask, @wendell maybe you have heard of weird things happening with this board or CPU?
My whole system (With current settings as far as I remember them):
- CPU: AMD TR 3960X (Undervolted by 0.12V)
- RAM: Crucial Ballistix Elite 32GB 3600 4 Sticks (3600, CL 16-18-18-38)
- GPU 1: RTX 3080 Ti FE (Stock)
- GPU 2: RX580 (Stock)
- Every NVMe and SATA port populated (If you need to know every make and model I will include them)
- Creative SB ZxR
- Broadcom 4-port GbE NIC
- 1200W Silverstone PSU