SRE workstation build

I want to build a home workstation to run VMs and containers for testing. I need enough memory to fire up 10-50 VMs and virtual network interfaces/bridges on demand. I mostly test functional setups and failure scenarios, no need for high performance.

I’m looking at 100+ GB of RAM; more is better if the price is reasonable.

The system will also work as a 4k desktop.

I will not play games on it. I would love to play in a VM with a passed through GPU, but I have to use Faceit anticheat, which bans anyone playing inside a VM. So I’m stuck with having a separate dedicated “gaming” PC.

For VM storage, I’ll probably be OK with HDD speeds divided by the number of VMs, especially if there is spare memory for filesystem cache. If I ever need faster storage, I’ll add another SSD or two later. I will also keep storage for personal files on this machine.

I plan this (German prices, in €):
Epyc 7303P 430
H12SSL-i board 580
Supermicro SNK-P0064AP4 CPU cooler 80
GT 1030 GPU 100
Fractal Define 7 165
Fractal Ion+ 2 Platinum 860W 180
32 GB DDR4-3200 RDIMM x4 360
NVMe for / 250 GB 60
NVMe for faster VMs 250 GB 60
HDD 2x 4 TB 200
HDD 16 TB 300

total 2515

The biggest question is the platform: old Epyc 7003 vs latest gen desktop vs latest gen server.

An i7/i9 or Ryzen 9 would be 2-3 times faster in benchmarks for the same price. It would run DDR5 in 2-DIMMs-per-channel mode, which is not significantly faster than DDR4. The disadvantage is that Linux has no ECC EDAC support for Intel desktop CPUs at all, and the stable Debian kernel has no EDAC support for Ryzen 7000. It’s not clear to me which kernel version has it. A desktop would be limited to 128 GB of RAM. I could upgrade to 192 GB later, but I would have to replace all the modules.

The latest-gen server platform is more expensive with no clear benefit, except maybe future upgradeability and power efficiency. Entry-level Epyc 9004 CPUs are quite expensive and have too many memory channels, Epyc 8004 doesn’t have boards yet, and affordable Xeons look bad with their limited memory speeds. I’m not considering workstation platforms; they are overkill.

Any feedback?

Is the Epyc 7303P (Zen 3, 16 cores, 2.4-3.4 GHz, 64 MB L3, 130 W) enough for me? I can also upgrade to the 7313P (16 cores, 3.0-3.7 GHz, 128 MB L3, 155 W) for an extra €350, or go even higher.

A server DDR4 platform makes the most sense for you IMO.

  • You can add a lot of RAM if needed vs. a hard limit at 192GB
  • You get more memory bandwidth
  • You can upgrade the CPU down the line to a higher core count if needed (e.g. if enterprises upgrade in a few years and sell off a lot of used Epyc CPUs)
  • If the requirements for the machine change you have a lot of connectivity to add storage/GPUs/NICs etc.
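As a rough illustration of the bandwidth point: theoretical peak DRAM bandwidth is channels × transfer rate × 8 bytes per transfer. Below is a minimal Python sketch. The configurations are assumptions for illustration only (the parts list has four DIMMs, so only four of the eight channels on the Epyc 7003 platform would be populated, and the DDR5 speed actually reached with four DIMMs on a desktop may be lower than the module rating). Real-world throughput will be lower than these peak figures.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal units).

    channels        number of populated memory channels
    mega_transfers  memory speed in MT/s (e.g. 3200 for DDR4-3200)
    bus_bytes       data width of one channel in bytes (64 bit = 8 bytes)
    """
    return channels * mega_transfers * bus_bytes / 1000


# Illustrative configurations; channel counts and speeds are assumptions.
configs = {
    "Epyc 7003, 4 of 8 channels populated, DDR4-3200": (4, 3200),
    "Epyc 7003, all 8 channels populated, DDR4-3200": (8, 3200),
    "AM5 desktop, 2 channels, DDR5-4800": (2, 4800),
}

for name, (channels, speed) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(channels, speed):.1f} GB/s")
```

Filling all eight slots with smaller modules instead of four larger ones would double the theoretical figure for the same capacity.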

If you get a consumer DDR5 platform you’ll be maxing out on the core counts and memory from the get-go. And a server DDR5 platform will cost perhaps 2x as much…

Between those 2 CPUs I’m not sure how that would scale for your uses. The clocks are 25% higher at all-core load, but I don’t know if VM deployments scale with cache size?

I’m also not sure why you are getting two 250GB NVMe drives. And €60 each for those seems steep. You can get a high-quality 1TB NVMe drive for about €100, like a Kingston KC3000, Samsung 980 Pro, etc. If the budget is there, I’d probably look at replacing the 2x4TB HDDs with NVMe storage. It would only be a couple hundred more and would probably make a bigger difference in performance for you than the CPU upgrade, assuming you are constantly reallocating a bunch of VMs.

Perhaps a single NVMe boot drive of 500GB/1TB and a larger 4-8 TB U.2 drive for VM storage?

When I did my master’s thesis work I did something similar with only 16 GB of RAM. Of course, I modeled highly specific embedded systems that took up only around 10-50 MB of space.

Depending on what you want to accomplish, you could get away with a 32GB system running 16 cores / 32 threads just fine. It will not be a concurrent simulation of course, but is concurrency a requirement here?

QEMU allows you to create really small virtual machines that run just fine on 128 MB of memory. With 32 GB, that is 256 devices, which could be controlled with just 4 cores or so.
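The arithmetic behind that figure can be sketched in a few lines of Python. This is a back-of-the-envelope estimate only: the function name and the host reserve values are made up for illustration, and per-VM hypervisor overhead is ignored.

```python
def max_vms(total_ram_gib: float, vm_ram_mib: int, host_reserve_gib: float = 0.0) -> int:
    """Number of VMs of a given size that fit in RAM after a host reserve.

    Ignores per-VM hypervisor overhead, page cache and memory ballooning.
    """
    usable_mib = (total_ram_gib - host_reserve_gib) * 1024
    return max(0, int(usable_mib // vm_ram_mib))


print(max_vms(32, 128))                        # 256, the figure quoted above
print(max_vms(32, 128, host_reserve_gib=4))    # 224 with 4 GiB kept for the host
print(max_vms(128, 2048, host_reserve_gib=8))  # 60 VMs of 2 GiB each in 128 GiB
```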

That said, I’m not saying your ideas here are bad, just that you seem to be overspeccing for your use case. I would go with a Ryzen 9 7900 build personally, but I could definitely be wrong here. :)

Thanks for the feedback!

Yeah, it’s hard to estimate for a non-typical workload. I’m trying to work out whether it would be enough to install and boot 50 copies of headless Linux servers, or a few Windows instances, in reasonable time if disk I/O weren’t a bottleneck. Only experiments can give a good idea.

I haven’t had to think about hardware before; companies usually provide test clusters with servers similar to production.

Right now I only have a laptop from 2013 with an i5-3210M, which struggles to run a single Windows 11 VM on a SATA SSD, and a PC server from 2011 with an AMD Phenom II X4 910e and 32 GB of RAM in the storage room, on which I tested Nagios, Cacti and some databases back in the day.

You are right, I didn’t think this part through. My ideas about storage:

  1. The smallest bootable solid-state drive for the host OS and tools. I want it to be separate so I can replace it independently of data and VM images. Now I think I might put it on a SATA SSD to keep the M.2 slots free for other workloads.
  2. For VMs I’ll look at the options and get a larger NVMe drive with good sustained performance. I will look at your suggestions.
  3. I’ll still need a lot of spinning-disk space with some redundancy for work files, images, backups and data dumps. Maybe the 16 TB one is overkill.

Yeah, embedded systems are nice and small :)

I didn’t specify: I plan to run some distributed databases, container orchestration and infrastructure monitoring. Some of these are written in Java.

It’s hard to define minimum memory requirements for small setups, and many of them can be compacted quite significantly.

For example, for Ceph:

For Kubernetes:

And there are many weird things around, like a Grafana plugin that wants a minimum of 16 GB of memory: Grafana Image Renderer plugin for Grafana | Grafana Labs

I can easily get away with 128 GB, but it’s nice to have more headroom for more scenarios.

The Ryzen 9 7900 is faster than the Epyc 7303P. By going to a server platform I would give up some computing performance for more potential memory capacity.

ECC support is also important. Ryzen does it, but it’s not clear to me which kernel version has full EDAC support.

One of the best AM5 workstation boards currently is the Asus X670E ProArt, and yes that does support ECC. A 7900X or, heck, a 7950X3D should cover your use cases quite nicely. DDR5 ECC is still a bit pricey but not that bad anymore. One drawback is that ECC RAM on AM5 is unregistered.

Here is a quick PCPP list for an AM5 build based on your specs. Mind you, I am showing this only as a comparison point: it is a way to go, not the way, and it may or may not fit your use case or budget better.

I did squeeze a little more for niceties like a 4 display output card with HDMI 2.1, for instance. I also used geizhals.eu for memory pricing:

Anyway, here it is, the system as picked by German PCPP.

PCPartPicker Part List

| Type | Item | Price |
|---|---|---|
| CPU | AMD Ryzen 9 7900 | €355.99 |
| Motherboard | Asus ProArt X670E-CREATOR WIFI | €459.65 |
| Memory | Samsung DIMM 32GB DDR5-4800 CL40 | €127.63 |
| Memory | Samsung DIMM 32GB DDR5-4800 CL40 | €127.63 |
| Memory | Samsung DIMM 32GB DDR5-4800 CL40 | €127.63 |
| Memory | Samsung DIMM 32GB DDR5-4800 CL40 | €127.63 |
| Storage | Samsung 980 500 GB M.2-2280 PCIe 3.0 | €52.90 |
| Storage | TEAMGROUP MP34 4 TB M.2-2280 PCIe 3.0 | €263.75 |
| Storage | TEAMGROUP MP34 4 TB M.2-2280 PCIe 3.0 | €263.75 |
| Video Card | ASRock Intel Arc A380 Challenger ITX Arc A380 | €130.47 |
| Case | Fractal Design Define 7 | €163.14 |
| Power Supply | SeaSonic FOCUS GX 1000 W 80+ Gold | €144.90 |
| Total | | €2345.07 |

I would say there is room to improve. :)

ECC technically works, but part of “ECC support” is the OS reporting side, which lets you watch for corrected errors. That reporting support for Ryzen 7000 was added to the kernel only last year: https://www.phoronix.com/news/AMD-EDAC-Ryzen-7000-Series , so it won’t be available to me in Debian for another year or two.
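For what it’s worth, once an EDAC driver is loaded the kernel exposes corrected and uncorrected error counters under sysfs. Here is a minimal Python sketch of reading them, assuming the standard /sys/devices/system/edac/mc layout. The function name is made up for illustration, and an empty result only means that no EDAC memory controller is registered, not that ECC itself is off.

```python
from pathlib import Path


def read_edac_counts(root: str = "/sys/devices/system/edac/mc") -> dict:
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    ce_count = corrected errors, ue_count = uncorrected errors.
    An empty dict means no EDAC memory controller is registered.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found (driver not loaded or unsupported).")
    for controller, values in result.items():
        print(controller, values)
```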

I don’t think there is a functional difference between registered and unbuffered. The registered design just increases the possible capacity, so there are registered modules of 256+ GB, but unbuffered modules max out at 48 GB.

A real functional advantage of a server platform would be more protection from Rowhammer memory attacks when running untrusted software in VMs: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7021.html
AMD provides the additional protection only on Epycs, not even on Threadrippers.

That looks very good, I have a similar calculation.

Actually…

So many Debian users forget about that amazing repo. :)

Otherwise I agree with your reasoning.

I’ve read a bit about cases. It’s interesting how PC cases have evolved multiple times in the last 10 years.

There was a trend for “silent” cases like the Define 7. Now the trend is open airflow: unrestricted airflow lets fans spin slower, which keeps everything quiet, and it also provides enough cooling for modern super-hot gaming rigs and workstations. The sealed “silent” cases are obsolete for most uses.

In Fractal’s line of cases, the Meshify 2 made the Define 7 obsolete by offering the same design with an open front panel. Both are in turn being made obsolete by the Fractal Torrent and the slow death of HDDs. The Torrent has completely open front and back panels, but no cooled HDD trays.

I’m still inclined to go with the Define 7, because I need some HDDs, and they are the only component that benefits from sound insulation. The Define 7 has a front panel door that can be opened, which turns it into a Meshify 2-like configuration, whereas the Meshify 2 can’t be converted for more sound dampening.

I just wonder why the Define 7 is still so expensive. Who the hell still buys them?

I rather wonder why the Torrent is so expensive. I have it, and it is a good case, but it is plasticky and creaky. Still, it is silent and cool thanks to the 2x180mm fans. I have not used a Define 7, but it seems to be metal and nicely built, and it costs the same as the Torrent.

I think the Define 7 is a great case if it has the features you need (like HDD bays with airflow).