Dell R620 Drops Connections

G’day all,

I’ve been having this issue with my Dell PowerEdge R620 for quite a while now (years, but kids are eating into my dwindling spare time). I’ve tried pretty much everything I can think of to diagnose the issue, and it’s stumped people on the LTT forums as well as r/homelab.

Some background: I bought this ex-datacentre R620 from eBay a few years ago. It’s equipped with:

  • 2x Intel Xeon E5-2650 v2 (soon to be 2x E5-2697 v2)
  • 112 GB DDR3
  • AlmaLinux 9.5 (Teal Serval) x86_64
  • Linux 5.14.0-503.35.1.el9_5.x86_64
  • Intel I350 NIC (4x 1G), and a random TP-Link single 1G NIC for testing

It runs some Docker containers, as well as Plex and Nextcloud. Temps are good (usually around 40°C), and CPU usage very rarely gets above 50%. It’s honestly had a pretty cruisy life with me, especially since stopping the BOINC projects I had running.

Now that’s out of the way, the problem it’s having, as you probably guessed from the title, is that it will drop connections for pretty much anything. Things like a TeamSpeak server will drop connections and boot everyone, same with game servers. Hosted web UIs or sites will sit there loading until either the connection comes back to life or the browser deems the connection dropped. SSH will freeze stdin or stdout; I can continue to type, and when the connection comes back, everything I typed will pop up in the session. Larger uploads to Nextcloud will fail, as you’d expect with this problem.
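To pin down exactly when the dropouts happen (so they can be lined up against logs), something like the rough sketch below could be left running on another machine on the LAN. It assumes Python 3 and that the server has some TCP port open to probe, e.g. SSH on 22; the function names and the three-failures threshold are made up for illustration.

```python
import socket


def probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refusals and unreachable hosts
        return False


def find_outages(samples, min_failures=3):
    """Given (timestamp, ok) samples in time order, return (start, end)
    pairs for each run of at least min_failures consecutive failures."""
    outages = []
    run = []  # timestamps of the current run of failed probes
    for ts, ok in samples:
        if not ok:
            run.append(ts)
            continue
        if len(run) >= min_failures:
            outages.append((run[0], run[-1]))
        run = []
    # A run of failures may still be open at the end of the samples.
    if len(run) >= min_failures:
        outages.append((run[0], run[-1]))
    return outages
```

Calling probe() once a second, storing (time.time(), result) pairs, and passing them to find_outages() would give a list of dropout windows to compare against the journal on the server.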

I’ve tried different interfaces on the same I350 NIC, I’ve tried the TP-Link NIC, I’ve looked into interrupts, tried running with one CPU, gave tuned a try, disabled/re-enabled irqbalance, etc. I’m sure there are some other things I’ve tried that aren’t coming to mind. The only things I haven’t tried that I think might resolve it or reveal more info are reinstalling the OS (I’m trying to avoid this but I’ll keep it as a last resort), trying a trunk with the new switch I won at an auction (which I’m yet to install in the rack), or changing the BIOS profile from performance-per-watt to performance (or whatever Dell called the standard profile).
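In case it helps anyone suggest what to look at, here’s a rough sketch of how the kernel’s per-interface error and drop counters could be compared before and after a dropout. It assumes Python 3 and the standard /proc/net/dev layout (8 receive columns followed by 8 transmit columns); the function names and the interface name eno1 are placeholders for illustration, not taken from my box.

```python
# Column order of /proc/net/dev: 8 receive counters, then 8 transmit counters.
RX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed")
# Counters that should normally stay flat on a healthy link.
WATCHED = ("rx_errs", "rx_drop", "rx_fifo", "tx_errs", "tx_drop", "tx_carrier")


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: {counter: value}}."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, rest = line.split(":", 1)
        values = [int(v) for v in rest.split()]
        if len(values) < 16:
            continue  # unexpected layout, skip rather than guess
        stats = {"rx_" + f: v for f, v in zip(RX_FIELDS, values[:8])}
        stats.update({"tx_" + f: v for f, v in zip(TX_FIELDS, values[8:16])})
        counters[name.strip()] = stats
    return counters


def read_net_dev(path="/proc/net/dev"):
    """Take a snapshot of the live counters."""
    with open(path) as f:
        return parse_net_dev(f.read())


def changed_counters(before, after, iface):
    """Return the watched counters that moved between two snapshots."""
    return {
        key: after[iface][key] - before[iface][key]
        for key in WATCHED
        if after[iface][key] != before[iface][key]
    }
```

Taking one snapshot with read_net_dev() before a dropout and another after it, then calling changed_counters(before, after, "eno1"), would show whether the kernel counted any errors, drops, or carrier losses in that window.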

Hit me with any questions you have, or things to try to hopefully get this figured out. I’ll buy anyone who figures this out a beer and kebab!

Thanks all!