I had to RMA a dead 3900X, and now the replacement is flaky (L3 ECC machine check exceptions and random reboots) after about a month.
I purchased a cheap Gigabyte Aorus X570 Pro WiFi on eBay to troubleshoot with, and lo and behold, I can get it to be stable with any one of three changes: setting a fixed multiplier of around x35, disabling Global C-states, or setting Power Supply Idle Control to Typical.
I don’t think I have that last power supply setting on my other MB, an ASUS X570 Prime Pro, but it seems to do the trick. It has the side effect of disabling C6, but here’s the weird thing: if I re-enable C6 after boot using zenstates.py, the system is still 100% stable, at least for a couple of days now. I confirmed the C6 states were active by monitoring with ryzen_smu and ryzen_monitor, which I found on GitHub. This is on Linux, BTW.
So it’s stable now, but WWL1TD? Don’t trust the CPU and RMA it again?
And then there’s the question of what the idle current setting is actually doing. According to the manual, it’s simply disabling the package-level C6 state. But if I re-enable C6 with zenstates.py, it reports both package and core C6 as enabled. There must be something else going on with this idle current setting besides C6.
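
For anyone who wants to look at the same bits zenstates.py reports, here is a rough Python sketch. The MSR addresses and bit positions are what I recall from the zenstates.py source and are an assumption here, so check them against your copy of the script before trusting the output. The helper names (read_msr, decode_c6, current_c6_state) are made up for this example. It only reads, it never writes.

```python
import os
import struct

# ASSUMPTION: addresses and bit positions as recalled from zenstates.py;
# verify against the script (or AMD's documentation) before relying on them.
MSR_PKG_C6 = 0xC0010292                          # package C6 enable: bit 32
MSR_CORE_C6 = 0xC0010296                         # core C6 enables: bits 22, 14, 6
PKG_C6_BIT = 1 << 32
CORE_C6_BITS = (1 << 22) | (1 << 14) | (1 << 6)


def read_msr(msr, cpu=0):
    """Read one 64-bit MSR through the Linux msr driver.

    Needs root and the msr kernel module loaded (modprobe msr).
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The MSR address is used as the file offset; reads are 8 bytes wide.
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


def decode_c6(pkg_val, core_val):
    """Turn the two raw MSR values into enabled/disabled flags."""
    return {
        "package_c6": bool(pkg_val & PKG_C6_BIT),
        # Core C6 counts as enabled only if all three bits are set.
        "core_c6": (core_val & CORE_C6_BITS) == CORE_C6_BITS,
    }


def current_c6_state(cpu=0):
    """Read both MSRs on one CPU and decode them."""
    return decode_c6(read_msr(MSR_PKG_C6, cpu), read_msr(MSR_CORE_C6, cpu))
```

Running current_c6_state() as root once with the BIOS setting on its default and once with it on Typical (before and after re-enabling C6) would show whether these enable bits really end up identical. If they do, that would support the idea that the setting changes something outside these two MSRs, though this sketch can't show what.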