Supermicro M12SWA-TF issues: [Partially Solved: IPMI stopped working], RAM slot DIMMA1 not reading module

Machine config:

  • Supermicro M12SWA-TF
  • AMD Threadripper 5955WX
  • 256GB (8x 32GB) 3200MHz Crucial CT32G4RFD432A DDR4 RDIMM

Started with 4 modules; everything was working and the BMC was web accessible. About two weeks ago the machine was moved and added to a new network.

  • Went into the BIOS and enabled the option to update the IPMI network configuration on reboot
  • My new network gives both the IPMI 1G & 10G NICs IP addresses
  • Unable to bring up the BMC web interface on the (new) IPMI address
  • Did a BIOS update and cleared the CMOS by physically unplugging the power cable and temporarily removing the CMOS battery, but the problem persists.
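
For anyone debugging the same thing, a quick way to tell "BMC is down" apart from "my network path is broken" is to probe the web ports from more than one client (e.g. once on wired, once on Wi-Fi). This is only a rough sketch: the function names are made up for illustration, the address in the usage comment is a placeholder, and it assumes the BMC web interface listens on the usual HTTP/HTTPS ports 80 and 443.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def probe_bmc(host, ports=(80, 443)):
    """Check each port and return a {port: reachable} mapping.

    Ports 80 and 443 are an assumption about where the BMC web UI listens.
    """
    return {port: port_open(host, port) for port in ports}


# Usage (placeholder address, replace with the lease your DHCP server shows):
#   print(probe_bmc("192.0.2.10"))
```

If the ports answer from one client but not from another on the same network, the BMC itself is probably fine and the problem is somewhere in between.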

Additionally, 4 more identical RAM sticks were added yesterday, and the DIMMA1 slot does not seem to be reading its module. Dusted the slots and swapped modules around with no luck.
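
On Linux, one way to confirm which slot the firmware sees as empty is to look at the `dmidecode -t memory` output, where unpopulated slots report `Size: No Module Installed`. Below is a minimal sketch of a parser for that output. The sample text is made up for illustration (not taken from this board), the function names are hypothetical, and the slot labels your board reports may differ.

```python
# Illustrative sample only; real output has many more fields per device.
SAMPLE = """\
Handle 0x0022, DMI type 17, 92 bytes
Memory Device
\tSize: No Module Installed
\tLocator: DIMMA1
\tBank Locator: P0_Node0_Channel0_Dimm0

Handle 0x0023, DMI type 17, 92 bytes
Memory Device
\tSize: 32 GB
\tLocator: DIMMB1
\tBank Locator: P0_Node0_Channel1_Dimm0
"""


def parse_memory_devices(text):
    """Split dmidecode memory output into one dict of fields per Memory Device."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            current = {}
            devices.append(current)
        elif not line:
            # A blank line ends the current device block.
            current = None
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return devices


def empty_slots(text):
    """Return the Locator of every slot that reports no module installed."""
    return [
        dev.get("Locator", "?")
        for dev in parse_memory_devices(text)
        if dev.get("Size") == "No Module Installed"
    ]


# Usage with real data (dmidecode needs root):
#   sudo dmidecode -t memory > mem.txt
#   print(empty_slots(open("mem.txt").read()))
```

If the empty slot stays the same no matter which stick is in it, that points at the slot, socket, or CPU contact rather than a bad module.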

The machine boots and seems to work normally.

At this point it doesn’t seem to be a CPU contact issue, but who knows? Moving a machine up 2 flights of stairs should not cause these issues, right?

Any suggestions?

IPMI Solved:
Turns out this is a weird network issue. For instance, I can access the BMC over Wi-Fi, but not through a wired connection on the same access point. Not entirely sure what’s causing this. I can access most things on my network, including the main router, but I’m having trouble reaching the web interfaces of the switches.

The memory issue persists.

I have one of these, and it’s been nothing but trouble since day one. It’s great when it decides to work.

When mine boots it shows the IPMI grabbing a DHCP lease. After boot the 1G & 10G are hit or miss, so that makes up to three IP addresses. Are you using the correct IP for the IPMI?

I just did a reboot: the system came back up with only the IPMI holding a DHCP lease, and I can log in, but the 1G & 10G are missing. This happens randomly, and I have to unplug the system before trying to boot again.

I had memory issues as well and tried a slew of different sets. Replaced the power supply too. Only after getting some canned air and blowing out the memory sockets did the memory finally work 100%, through many iterations of memtest pro. (I also removed the CPU, cleaned it, then air-blasted the socket.)

I tried Windows on it once. The 10G driver from Supermicro has malware, so I tracked down the driver from the vendor instead.