Help with a mystery Issue no one can explain

I don’t get what you’re saying??? With a dead CPU it’s not possible to strip stuff out and test until another one dies…

It’s a pretty stripped system to begin with… the only extra thing is a GPU

And that’s what I am saying. The GPU is the only thing not replaced so it looks, right now, like it is the problem part. So if you have any other GPU try that for a while and see if it still happens

By stripped I mean 1 hard drive, 1 stick of RAM, just the mouse, keyboard and monitor, not in a case, nothing that isn’t 100% needed to boot. Literally the absolute minimum needed to get it to boot. But I know that is not practical without finding a way to make the problem happen consistently and quickly.

So you’re saying, use a different GPU and strip shit out and wait a year for it to happen again???

Makes zero sense to run a limited system like that…

I’d be monitoring my voltages, or even better hand keying them in and verifying they are what they say. This is easier said than done of course. A completely different board would be my direction of choice.

Yeah I know it does but how else do you track down the problem?

Right now, I am saying use a different GPU.

This, and make sure no screws or anything else in the case is shorting out the motherboard.

Trust me this hasn’t been the issue since day one…

load one of the 5 volt wires with a 10 ohm 5 watt resistor to ground. this will stabilize the output voltages on the rest of the supply, and you can get an accurate reading.

be careful though, the resistor will get hot.
or connect an old spinning rust drive just to power alone (with an insulating pad beneath it)
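
For anyone wondering why that resistor gets hot, the numbers fall straight out of Ohm's law. A quick sketch in Python, assuming the rail sits at its nominal 5 V (the function name is made up for illustration):

```python
def dummy_load(rail_volts, resistor_ohms):
    """Return (current in amps, power in watts) for a resistor across a rail."""
    current = rail_volts / resistor_ohms      # I = V / R
    power = rail_volts ** 2 / resistor_ohms   # P = V^2 / R
    return current, power


current, power = dummy_load(5.0, 10.0)
print(f"{current:.2f} A, {power:.2f} W")  # 0.50 A, 2.50 W
```

So the resistor dissipates about half of its 5 watt rating, which is within spec but still plenty to burn a finger or scorch whatever it is resting on.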

a psu will run when connected to a psu tester but it may not give an accurate reading unless it’s under load

Tested the old and new supply with a tester and both tested fine…

a conundrum for sure. you’ve changed everything except the gpu, and even changing that may not solve it unless all other possibilities are accounted for.
consider outside influences that may cause problems.
uneven power: uneven/unstable power, while it may not seem probable, can indeed wreak havoc on a system. ripple voltages can easily exceed the upper limit for a cpu.
as was stated above, using an oscilloscope to monitor both the input and output of the power supply can clarify any issues with the psu or your home electrical supply.
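
If you do get a scope on the rails, here is a rough sketch of how the readings could be sanity-checked. The ±5% tolerance and the 120 mV / 50 mV peak-to-peak ripple limits are the figures commonly quoted from the ATX power supply design guide, so treat them as assumptions and verify against the spec for your unit. The function name and the sample readings are made up, and treating the whole min-to-max spread as ripple is a simplification.

```python
# Rail name -> (nominal volts, max peak-to-peak ripple in volts).
# Commonly quoted ATX figures; verify against the actual spec.
RAILS = {
    "+12V": (12.0, 0.120),
    "+5V": (5.0, 0.050),
    "+3.3V": (3.3, 0.050),
}
TOLERANCE = 0.05  # +/-5% regulation band (assumed)


def check_rail(name, v_min, v_max):
    """Return a list of problems given the lowest/highest voltage seen on the scope."""
    nominal, max_ripple = RAILS[name]
    problems = []
    if v_min < nominal * (1 - TOLERANCE):
        problems.append(f"{name} sags to {v_min:.2f} V")
    if v_max > nominal * (1 + TOLERANCE):
        problems.append(f"{name} peaks at {v_max:.2f} V")
    if v_max - v_min > max_ripple:
        problems.append(f"{name} ripple is {(v_max - v_min) * 1000:.0f} mV p-p")
    return problems


# Hypothetical readings: in band on average, but too much ripple.
print(check_rail("+12V", 11.90, 12.10))
```

The point of checking ripple separately is that a rail can sit inside the regulation band on a multimeter and still be noisy enough to show up on a scope.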

environment: the factors to consider here are airflow, static electricity, dust, animals, and humidity, all of which can have an effect on sensitive electronics.
dust and animal hair accumulate rapidly in an unfiltered system and clog the cooling fins on heatsinks.
ineffective cooling caused by the above can rapidly kill a cpu. insufficient cooling will do this over time (hence the reasoning for protracted temp monitoring).

static electricity: at the best of times a cooling fan can develop a static charge, but it’s normally absorbed by the system’s grounding. during the winter months, however, using gas or electric heat will desiccate the air and make the static problems many times worse.
this in turn will cause a rapid build up of dust.
and finally, air cleaners (ionizers): ionizing the air around a pc can have dire consequences for the pc (or for any electronics, for that matter).

while it may not be the exact cause of what’s going on, it’s the best i can tell you without being there to see for myself.

I’ve built PCs for 20+ years and am very careful and cautious about dust and keep my PC clean…

No animals, and I have moved into a new house so the power’s not the same… My new braided cables have resistors in them to help smooth out the power, not to mention Gigabyte said there are protections built into the motherboard to protect the CPU voltages…

I’ve gone the whole way around with these suggestions already and they don’t apply to me as I maintain all my systems meticulously.

What gigabyte says is one thing, reality is they have nothing to lose here… you do. Trust but verify.

Lots of people were killing their CPUs on x99 back in the day though anecdotally this was a mostly Asus issue.

You’re free to ignore it because someone at gigabyte told you it’s ‘protected’ but it sure sounds like death by electromigration to me. I bet if you set your LLC up a setting or 2, those dead CPUs would live again temporarily. Could be due to poor voltage sensing or poor bios (gigabyte was known for this) or even just weird voltage scaling. Ultimately the burden of proof falls on you unfortunately.

I can’t test the board like they did when they tested mine… It was on the test bench for 2 weeks in various configs and even being power cycled. I talked to the tech and the manager on the phone discussing it…

Then people keep telling me “CPUs rarely die…” Yeah right… OK

CPUs do rarely die, which is why I’m calling shenanigans on the board being the culprit… Or rather the voltage.

I have a Core 2 Duo machine that’s still ticking away and I personally cooked a 3900X trying to push it just a little too hard. It’s a shitty feeling. I know all too well.

OK so what am I supposed to do??? 2 different PSUs and surge protected power, and on top of that Gigabyte even replaced my board upon request and sent it to me after testing it as well… I know it’s different by the serial number

And I don’t run anything OC’ed so it’s all bone stock power and voltages

And shenanigans on the board being the culprit? What do you mean by that??

If you’ve never overclocked it, I’m assuming you’ve left the voltage on auto this whole time, on both boards. If the voltage was indeed on auto and left to its own devices, then I’m guessing you didn’t verify the voltages at least read safe in software.

If the bios is scaling voltages based on load in a not so sane way then you’d get CPU death over time due to electromigration. This is what ‘kills’ CPUs from overclocking typically because of the increased voltages. Eventually the thin features (measured in nm) become high resistance because you literally move a portion of the ‘wire’ atomically. Over time it will require more and more voltage to run at the same clocks until you must run lower clocks and back the voltage down.
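
To put rough numbers on that, the usual back-of-the-envelope model for electromigration is Black's equation, MTTF = A * J^(-n) * exp(Ea / (k*T)), where J is current density and T is temperature in kelvin. Below is a sketch that compares lifetime against a baseline. The exponent n = 2, the activation energy of 0.8 eV, and the example inputs are all illustrative assumptions, not measured values for any real CPU.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def relative_lifetime(j_ratio, temp_c_base, temp_c_new, n=2.0, ea_ev=0.8):
    """Lifetime relative to baseline under Black's equation.

    j_ratio: new current density divided by baseline current density.
    n and ea_ev are illustrative defaults, not data for a specific chip.
    """
    t_base = temp_c_base + 273.15
    t_new = temp_c_new + 273.15
    thermal = math.exp(ea_ev / BOLTZMANN_EV * (1 / t_new - 1 / t_base))
    return j_ratio ** -n * thermal


# Hypothetical case: 10% more current density and 70 C -> 85 C.
print(f"{relative_lifetime(1.10, 70, 85):.2f}x baseline lifetime")
```

With these made-up inputs the result comes out to roughly a quarter of the baseline lifetime, which is the general idea: a modest bump in voltage and temperature can take years off a chip without ever tripping an obvious alarm.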

I’m positing these are the kinds of shenanigans going on because of the time between failed CPUs.

As for what you should do now? Run a lower clock setting from stock to see if it wants to work (with a locked voltage) essentially underclocking.

Or potentially build a new machine.

Or scream and shout into the phone at a poor gigabyte employee? (I don’t think that will go very far but might help you)

I can say that zen2 has been a frustrating but rewarding experience.

It hasn’t been consistent on time between failures… first was 1.5 years of almost 24/7 use, then almost 1 year of the same, and now it’s been 10 months since I rebuilt it.

I do the same for every PC I ever built in the last 20 years, I NEVER mess with ANY voltages. I’m surrounded by 4-5 machines I built no differently than this one and all are still running like they did day one.

And the proc’s dead so I can’t do much to underclock it…

CPUs aren’t consistent in stability vs voltage. Hence the term silicon lottery.

I’d be happy to hop in the l1 discord and try to talk you through some bios settings sometime soon if you want to try.

Well how do you explain 3 dead ones??? or did I get the trifecta???

I’m pretty good around a bios and nothing was set crazy… let me get it to POST and I will let you know

Like I said. It’s just my hunch that the bios is actually not being safe when set to auto. It’s a common denominator. I’m looking at the ‘facks’ and ‘logick’ of your situation here. Different board, different ram, different PSU, different CPU, same results (more or less). So what didn’t change? GPU, but I’m doubtful… The bios probably didn’t change between boards so in theory the bad behavior could have been identical. It’s odd that you would be the only one experiencing this but maybe you are indeed just that unlucky. I’m leaning towards the simplest solution here… the board is responsible for the CPU getting the juice it needs. I have a grossly oversimplified understanding of how incorrect voltage can cause a CPU to fail and first hand experience with pushing my luck.

If I’m wrong then an underclock wouldn’t matter, but it also won’t hurt anything to try.

As a favorite youtuber of mine likes to say: “it’s already fucked, you can’t fuck it any worse”

Of course if it won’t even post then we’re kinda dead in the water. You probably know the standard procedure given your experience but I’ll regurgitate anyway for the back of the class in case someone else out there is listening in and doesn’t know.

Pull everything non-essential to get into bios, leave only 1 stick of RAM, reset CMOS. If still nothing, pull the RAM and see if it sings you the song of its people (assuming you have a speaker, this will be long repeated beeps). If no song, then it’s safe to assume that CPU is indeed cooked. Newer boards may have that Q-LED and sometimes post code displays, which always seem to display a post code that isn’t in the manual (THANKS MANUFACTURERS).