I am in the process of rebuilding my NAS with some sweet new EPYC hardware since the lower end stuff is getting relatively affordable and I yearn for PCI-E lanes. Of course, I ran into a massive issue pretty much instantly. I am fairly sure I know what the issue is: a dead motherboard that was sold as new but was actually used and fried by the previous owner. Still, I want to repeat my tale to people more experienced with EPYC hardware than I am, just in case I am being an idiot and overlooking something obvious.
Here are my specs:
CPU: EPYC 7252 8-core
CPU heatsink: Noctua NH-U14S TR4-SP3
Mobo: ASRock EPYCD8-2T
RAM: 4x Micron 32GB 3200MHz DDR4 ECC RDIMM
PSU: Seasonic Prime TX-850
Case: An oldie that I love dearly, a Lian Li PC-A70B
So today I finally had all of the hardware in hand. My plan was to just assemble the bare minimum to get to the IPMI interface, update the firmware and the BIOS, and boot into memtest so I could get that out of the way before the real fun of setting everything back up and implementing the changes and improvements could occur over this upcoming weekend.
I installed the EPYCD8-2T onto the motherboard tray and was careful to only leave in the standoffs that match with the screw holes on the mobo. I installed the CPU and the heatsink, put the mobo tray back into the case, plugged in the mobo power connectors, plugged in the case fans and front panel connections, and dragged my case back over to my desk area to get started with updating the firmware/BIOS and do a test boot.
The IPMI pulled down an address from my DHCP server just fine. I logged in and went to start the updates, except I found that both the BMC firmware and the BIOS were already on the latest versions instead of the original factory defaults. Weird, but I assumed it wasn’t impossible for ASRock to still be producing this board, with newer firmware preloaded, later in the product’s life span.
I went to the KVM and pressed the power button on the case to boot it up, and nothing happened at all. Fans didn’t spin up, no other LEDs turned on, no video output, just nothing whatsoever. I figured I must have wired something up wrong, so I went back and checked all the cables against the manual and everything was definitely correct. I unplugged all of the front panel connections and fans except for the heatsink fan and tried powering it on again from the IPMI. Again, nothing happened at all.
I proceeded to completely disassemble and reassemble this computer 4 times, and every single time the results were exactly the same: IPMI works but the system will not power on under any circumstances. I read all of the sections of the manual pertaining to setting up the hardware, watched a few YT videos about installing EPYC processors, reseated the processor a few times, reseated the RAM and eventually dropped it down to just 1 stick, reseated all the cables, and tested my power supply on my previous mobo/CPU/RAM just in case I somehow fried it during the rebuild (it works fine there). None of it made any difference.
At this point I feel like I’m losing my mind, and the only explanations are that the mobo was fried before I got it or that I am overlooking something, but hours of going over everything made me feel confident I did everything right. I started digging around in the IPMI, and under the system information it shows the current processor and memory as completely different hardware than what I have. AFAIK that information can only be determined by the IPMI after the system has booted at least once, and I can’t get it to boot at all. Between that and the already up to date BIOS/firmware, this makes me think this hardware was definitely used before me. The mobo’s box was a bit frayed at the edges, but it did contain all the original accessories.
The only flaw in my logic that I can think of is that I did not have a torque wrench that goes down to the 1.6 N·m or 14 in-lb that is specified. Hilariously enough, I do own a torque wrench, but it only goes down to 2 N·m. I am fairly good at eyeballing these things though, and I was very careful not to wrench down too hard or leave the contacts too loose. In the multiple times I reseated the processor, I tried varying pressures, but every time it wouldn’t power up at all. I also figured that if I had either an improperly seated processor or RAM, the system would still power up but fail to POST and get stuck showing an error code on the mobo.
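For anyone who wants to sanity check those torque numbers, here is a minimal sketch of the unit conversion. The function names are just ones I made up for illustration, and the only real fact it relies on is the standard conversion factor of roughly 0.113 N·m per in-lb.

```python
# Standard conversion factor: 1 inch-pound-force is about 0.112984829 newton-metres.
IN_LB_TO_NM = 0.112984829


def in_lb_to_nm(in_lb: float) -> float:
    """Convert a torque in inch-pounds to newton-metres."""
    return in_lb * IN_LB_TO_NM


def nm_to_in_lb(nm: float) -> float:
    """Convert a torque in newton-metres to inch-pounds."""
    return nm / IN_LB_TO_NM


if __name__ == "__main__":
    spec_in_lb = 14.0      # socket torque spec quoted above, in in-lb
    wrench_min_nm = 2.0    # lowest setting on the torque wrench I own

    spec_nm = in_lb_to_nm(spec_in_lb)
    overshoot_pct = (wrench_min_nm / spec_nm - 1.0) * 100.0

    print(f"{spec_in_lb:.0f} in-lb is about {spec_nm:.2f} N·m")
    print(f"{wrench_min_nm:.1f} N·m is about {nm_to_in_lb(wrench_min_nm):.1f} in-lb")
    print(f"Wrench minimum exceeds the spec by about {overshoot_pct:.0f}%")
```

So the two spec figures agree with each other after rounding, and my wrench's lowest setting would overshoot the spec by roughly a quarter, which is why I went by feel instead.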
Also, the processor came from eBay, and the original owner didn’t perfectly clean all the thermal paste off. During shipping a tiny bit of thermal paste got onto the contact pads on the bottom of the processor. I cleaned it up very gently with a little isopropyl alcohol on a piece of coffee filter paper to prevent lint. Again, I think if this were an issue the computer would have powered on but failed to boot.
Here’s another juicy detail: I bought the mobo on Amazon through some reseller and it was labelled as new. When the package was delivered, it came in a Newegg box. I figured that they just reused a box that they had on hand. But after I started a return for the mobo through Amazon, the return address it gave me was none other than Newegg’s primary warehouse. Newegg is pretty notorious for reselling open box items as new, so either this Amazon seller is Newegg in disguise or they are some kind of elaborate dropshipper.
Does my suspicion that the board was already fried seem as reasonable to you as well? Am I being an idiot and overlooking something? Did I ruin the hardware somehow like an even bigger idiot? I haven’t sent out the mobo return yet, but it feels like there’s nothing left to test. Thanks for reading all this if you made it this far, happy to take all the help that I can get.