AMD servers

Hi,
I’m trying to convince my boss to replace our equipment with rented or new servers. Our requirements call for storage servers with 12 NVMe disks each, and I have two options: a single-CPU AMD server or a dual-CPU Intel server. I want to go with the AMD option. I also want to use SuperMicro servers and locate them in Frankfurt, because we need very low latency to AWS via Direct Connect.

When I suggest going with AMD servers because they cost less than the Intel servers, the integrator says that some clients have reported issues with AMD servers when running certain databases.

I’ve searched for known issues affecting our technology stack but couldn’t find any.

Can you suggest where I should look for reports of these issues?

Our technological stack is:
On servers:
RedHat 8 (we are migrating to Ubuntu 24.04)
Kubernetes (Rancher 1.30)
In Kubernetes:
PostgreSQL
ClickHouse
MinIO
Kyuubi
JupyterHub
Airflow

Maybe someone can confirm that the issues with AMD CPUs are a thing of the past and no longer apply.

Thank you!

Keep in mind that when going with a single-socket server, the cost of RAM might become an issue, as you are limited by the number of RAM slots.

I was looking at a 128c/256t CPU to replace a bunch of servers at work. For our use case we would need 8GB per thread, or 2TB of RAM, which would cost between $10k-12k in just RAM for 256GB sticks.
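
For anyone wanting to redo that sizing with their own numbers, here is a rough sketch of the arithmetic. The function name `ram_plan` is made up for illustration, and it deliberately leaves prices out, since those change constantly.

```python
import math

def ram_plan(threads: int, gb_per_thread: int, stick_gb: int) -> tuple[int, int]:
    """Return (total RAM in GB, number of sticks of size stick_gb needed)."""
    total_gb = threads * gb_per_thread
    sticks = math.ceil(total_gb / stick_gb)
    return total_gb, sticks

# Numbers from the post above: 256 threads, 8GB per thread, 256GB sticks.
total_gb, sticks = ram_plan(threads=256, gb_per_thread=8, stick_gb=256)
print(f"{total_gb} GB total ({total_gb / 1024:.0f} TB), {sticks} x 256GB sticks")
```

Swapping in a smaller `stick_gb` shows how quickly you run out of slots on a single-socket board.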

EPYC Genoa on Gigabyte MZ33-AR0 boards with 24 RAM DIMMs
Never had a prebuilt Gigabyte server, but their board is solid enough.

Everything you are running is high bandwidth, low latency, QD1 queries.
You’ll need the RAM channels and PCIe lanes for this workload to be optimized.

12x NVMe drives at x4 each is 48 PCIe lanes;
the rest would go to networking and future expansion.
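
A quick sketch of that lane budget, in case it helps when comparing boards. The 128-lane total is an assumption for a single-socket EPYC platform and the x4 link width is the usual NVMe figure; check the board manual for how many lanes are actually routed to drive bays and slots. `lane_budget` is just an illustrative helper name.

```python
LANES_PER_NVME = 4   # assumption: each NVMe drive uses a x4 link
TOTAL_LANES = 128    # assumption: single-socket EPYC platform; verify per board

def lane_budget(drives: int,
                total_lanes: int = TOTAL_LANES,
                lanes_per_drive: int = LANES_PER_NVME) -> tuple[int, int]:
    """Return (lanes used by drives, lanes left for NICs and expansion)."""
    used = drives * lanes_per_drive
    return used, total_lanes - used

used, left = lane_budget(12)
print(f"drives use {used} lanes, {left} left for networking and expansion")
```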

I run a decent-sized Kubernetes cluster on AMD gear (Rome → Genoa) that mostly runs JupyterHub and RStudio. We used to have hyper-converged Ceph, but we moved that to its own cluster (also EPYC), and we haven’t run into any notable issues, or any issues that we didn’t also have with the older Intel gear…

I would second the above and compare the costs of dual socket to single socket. For nodes with 1TB to 1.5TB of RAM, dual 32C is basically a wash with a single 64C due to RAM cost. The 32C parts also give you a bit more clock speed, if that is something that helps.

I have Supermicro, Gigabyte, and Asus gear. Joys of research computing: you go for cost per flop over all else…

Thank you for your replies.

Our storage servers for MinIO are relatively small in computing power, with 16 cores and 128 GB of RAM. That is enough for our deployment. In this case, we can add more RAM if we need more. Since we want to use a storage node just for storage, I will suggest that my team go with a single AMD CPU and two to four sticks of RAM with 64GB per stick.

For compute nodes, I may need to go with dual-CPU servers, since our compute workloads need close to 7GB of RAM per logical CPU core.
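
To show why the slot count matters at that ratio, here is a rough sketch. Everything except the 7GB figure is an assumption for illustration: the 128-thread node, the list of stick sizes, and the 12 vs. 24 slot counts for single and dual socket.

```python
STICK_SIZES_GB = [16, 32, 64, 96, 128, 256]  # assumption: commonly sold RDIMM sizes

def smallest_stick(total_gb: int, slots: int):
    """Smallest stick size (GB) that reaches total_gb with at most `slots` sticks."""
    for size in STICK_SIZES_GB:
        if size * slots >= total_gb:
            return size
    return None  # no listed stick size is big enough

threads = 128              # hypothetical node with 128 logical cores in total
need_gb = threads * 7      # about 7GB per logical core, as mentioned above
for label, slots in [("single socket", 12), ("dual socket", 24)]:  # assumed slots
    print(f"{label}: {need_gb} GB needs {smallest_stick(need_gb, slots)} GB sticks")
```

With fewer slots, the same total capacity forces larger sticks, which is where the RAM cost difference between single and dual socket comes from.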

Thank you again.
Have a nice day!