Consolidating Homelab into Threadripper Pro Workstation

I did exactly this. Started with an X10SRL + Xeon 2650L v3 and two X11SPM + Xeon 6139s, and consolidated everything into an M12SWA. This was back towards Thanksgiving or so last year, so anticipating Zen 3 TR Pro, I bought one of the bundles they had with a 3955WX to tide me over for the “few months, surely no more” it’d take for the 5975WX to come out.

Here we are almost a year later, and I’ve finally got the damned thing on order lol.

My justification here was basically that I’d been iteratively buying upgrade after upgrade trying to make everything work in the most cost-effective way possible, and while it was going well, the time investment needed to find deals that were not just “good”, but “if you don’t respond in 5 minutes it’s gone” good, was just way too high. Doing the math, I finally decided to go all in on one new system to do it all, hoping for 5 years out of it.

So far, even with the 3955WX, it’s going pretty well; the thing’s got a lot more horsepower than I think it’s given credit for, for sure. I’ve been augmenting my setup with work’s lab HW in the interim, and while that’s been a huge annoyance, it worked out alright as a stop-gap.

Can’t wait for the new CPU to arrive, and to never have to worry about some clown torching hours of config investment for another repro!!!


This is exactly what I’ve been looking for! I’m glad to hear it’s worked out well for you.

Are you able to elaborate on what OS/hypervisor you’ve been running?

Also, are you able to speak to how much heat it seems to put off? My 5900X is fine in my office in terms of heat output. I know TR Pro is inherently going to run hotter. Just want to make sure it won’t be too hot and leave me sweating it out in my office haha

The amount of heat is directly tied to power used, so I guess it’s all relative really…

If it helps for comparison’s sake, it never dips below 196w, and usually runs closer to 220-250w unless I’ve got transcodes running - this is with all VMs powered down, both GPUs confirmed in their idle power state, and some ~40 containers running. Admittedly they’re not doing a ton - usually Plex has at least one stream going, Nextcloud is syncing some files, the wife’s Mac is doing Time Machine crap, stuff like that. If both gaming VMs are running and active (e.g. wife and I somehow both finding time to game together) and the server’s also busy elsewhere, I’ve seen as high as ~600w.
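If you want to put those numbers in room-heating terms, here’s a rough back-of-envelope sketch (the wattages are the ones I quoted above; the watts-to-BTU conversion is plain unit math):

```python
# Back-of-envelope: essentially every watt the box draws ends up as heat in the room.
# Wattages are the figures quoted above; 1 W = 3.412 BTU/hr is plain unit conversion.
def watts_to_btu_per_hr(watts: float) -> float:
    return watts * 3.412

for label, watts in [("floor", 196), ("typical (midpoint)", 235), ("worst case seen", 600)]:
    print(f"{label}: {watts} W ≈ {watts_to_btu_per_hr(watts):.0f} BTU/hr")

# For scale, a small space heater on its low setting is often ~750 W (~2,560 BTU/hr),
# so even the ~600 W worst case stays under that.
```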

I’d expect much higher with any current-gen GPUs, but I’ve intentionally stuck with a 2070 (blower style, 175w TDP) and 1650S (simple dual fan, 100w TDP) specifically BECAUSE the Ampere cards are all so hungry - compared to what we have and considering the games we play, there’s no way to justify either the upfront cost (they’re still far too pricey for my blood) or the running cost (a legit concern as the system’s on 24x7).

This system’s currently running UnRAID, mainly again for power’s sake. I’ve got 10 HDDs in the thing (with 4 NVMe for VMs and “hot” data), and the extra 5-10w each definitely add up. Which hypervisor you should use, though, is down to your use case - I still use Proxmox heavily, and can’t say enough good things about it!


AM5/Zen4 details will come out very soon. The arrival of PCIe 5.0 in the consumer space has the potential to be really interesting; lots of fun stuff can be done in terms of expansion slots. Whether it’ll become mainstream or remain a niche market where very few vendors (such as the ASRock server team) participate, only time will tell.

Rumors say Zen4 Ryzen provides 24+4 PCIe 5.0 lanes. That’s a total of 28 PCIe 5.0 lanes (4 reserved for the chipset), the equivalent bandwidth of 56 PCIe 4.0 lanes or 112 PCIe 3.0 lanes. An insane amount of bandwidth. The problem on consumer boards will still be the lack of PCIe slots, which I believe AMD and its board partners are not interested in addressing. They would prefer you go for TR/TR Pro if you want more PCIe slots.
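To put rough numbers on that bandwidth claim, here’s a quick sketch; the 28-lane total is from the rumour above, and the per-lane throughput is the usual approximate figure after encoding overhead:

```python
# PCIe 5.0 runs at 32 GT/s per lane; after 128b/130b encoding that's ~3.94 GB/s
# each way per lane. Each older generation is half the per-lane rate.
gen5_per_lane_gbps = 32 * (128 / 130) / 8   # ≈ 3.94 GB/s

lanes = 28  # 24 usable + 4 reserved for the chipset link, per the rumour
print(f"{lanes} PCIe 5.0 lanes ≈ {lanes * gen5_per_lane_gbps:.0f} GB/s each way")

# Per-lane bandwidth doubles every generation, hence the lane equivalence:
print(f"same aggregate bandwidth as {lanes * 2} PCIe 4.0 lanes or {lanes * 4} PCIe 3.0 lanes")
```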

With respect to OP’s question of whether it’s a good idea to consolidate everything into one machine: IMO it really depends on the type of work and experiments you do on your lab machine. If it’s purely software, far away from dealing with drivers and hardware, then I believe you’ll be fine. Otherwise, you’ll likely be rebooting the machine often, bringing down unrelated VMs/services - quite an inconvenience. In that case, multiple machines give you more flexibility.


As far as I know, any new EPYC or Threadripper Pro will be based on Zen 3; also, from what I’ve heard, AMD isn’t producing any more Threadrippers. So we will know tomorrow, unless AMD decides to delay its announcements.

I believe AMD is only going to release AM5/Zen4 details, i.e. Ryzen consumer processors, tomorrow. The Zen4-based EPYC launch will be next year, and Zen4-based TR Pro will come well after EPYC. That’s how AMD has done product launches in the past, though of course they can change as they see fit.

Still, it will be interesting to see some details tomorrow. E.g. is it really 24+4 PCIe 5.0 lanes? Is the dual-chipset X670E really as stupid as rumours say, i.e. chained in series? Or could it be more interesting, with both chipsets independently connected to the CPU?

Just to clarify, is it lane count or slots that you are after? Are the devices you need to add full-on x16, or just a bunch of lower-bandwidth PCIe cards? For example, on the Intel side you can get an ASUS PRIME Z690-P that has four x16 slots, though three of them are electrically x4. I suspect someone will make something similar on the upcoming Ryzen. You can also create x4 slots using the (slightly ridiculous) number of M.2 slots on modern mainstream motherboards. If you need full-on x16, I would do TR, yeah, or wait and see next month what Intel’s new HEDT platform looks like. I have a couple of machines on the old faithful X299 platform.

I have to admit I am curious: if you came from a Linux space, rather than going to Windows with Hyper-V, why not go the other way, with a VM on Linux to run Windows? With Looking Glass and a passed-through GPU I have essentially native performance for Windows inside of Linux. Genuinely wondering why you chose that path?

The way I see things going/what I did

  • 2-4 TB NVMe × 3 M.2 slots in the box = local fast storage
  • Local SATA SSD for tier-two storage
  • 16 cores + 128 GB RAM, enough for a home lab
  • Archive on the NAS
  • x16 slot for GPU
  • 10 Gb NIC in a slot

All of that fits in X570 (rough lane budget sketched below). X670 with PCIe Gen 5 and maybe more lanes will give you even more.
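A sanity-check sketch of that lane budget, assuming the usual AM4/X570 split (CPU: x16 GPU + x4 M.2 + x4 chipset uplink, all PCIe 4.0; chipset: up to 16 general-purpose PCIe 4.0 lanes behind that uplink). Exact slot routing varies by board:

```python
# Sanity check that the build above fits in X570 (assumed lane split, varies by board).
cpu_lanes = {"GPU": 16, "M.2 #1 (NVMe)": 4}          # CPU also spends x4 on the chipset uplink
chipset_lanes = {"M.2 #2 (NVMe)": 4, "M.2 #3 (NVMe)": 4, "10 Gb NIC": 4}
# The SATA SSD hangs off the chipset's SATA ports, so it costs no PCIe lanes.

print("CPU lanes used:    ", sum(cpu_lanes.values()), "/ 20 usable")
print("Chipset lanes used:", sum(chipset_lanes.values()), "/ 16")

# Everything behind the chipset shares the PCIe 4.0 x4 uplink (~8 GB/s),
# which is fine as long as both NVMe drives and the 10 GbE aren't maxed out at once.
```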

Unless you need more than 16 cores and 128 GB of RAM, or are doing VFIO, you probably don’t need Threadripper or EPYC. Especially for a home lab.

Want, yes. Need, probably not. Unless you have a very big case with multi-GPU plans and a lot of money to spend on high-speed SSDs, RAM, and add-in cards.

For a single user you’re generally struggling to push faster than a SATA SSD for most stuff, and you can still get 12 TB of local M.2 NVMe in X570… without using PCIe slots.

Threadripper is already niche: for high-end oil and gas modelling, or for people who are getting paid by the hour for stuff that scales with more cores - VFX, rendering, AI, etc.

Spinning up a bunch of VMs to learn/lab? You’re maybe wasting money you could be spending on hardware you can’t emulate. Or money you could spend on a cheaper cluster of 2-3 X570 machines and get high availability out of it.

My 2c: definitely think long and hard about it. My experience mirrors @wertigon’s.

A lot of people massively overestimate just how much resource consumption is required for a home lab VM workload, and underestimate just how capable consumer platforms are of handling it today.

You don’t have thousands of users in your house hitting your lab! :joy:

If you need Threadripper, by all means get it. But if you’re not sure and are buying into it just because “HEDT!”, you could be blowing a lot of money that could be better spent on network gear, storage, bandwidth, software, training material, books, etc.


True. But the same could be said about cars and other hobbies as well. :slight_smile:

Hey, if you want to buy a Ferrari over, say, fixing that damn leaking roof over your head, I completely understand, a man’s gotta have priorities! :grin: Though personally I would’ve invested in a water boiler, 3 kW of solar power, or a heat pump instead; there’s a lot you can do with that $1000-$4000 of extra money.

Then again, the only one who can actually decide what to do with your money is you (or whoever is holding your wallet). Some choices are wiser than others, no doubt, but I have no right to judge, only to pass on advice about what I see as the better fit.


This. If I made sacrifices and wanted to absolutely cram everything into my 5900X setup (with its limited PCIe lanes), I probably could. There’s also some fun to be had here, along with the learning opportunities provided by some of the expansion capabilities.

Lanes. My graphics card will occupy a full x16, and the rest of the lanes I want to use would go to an HBA, a 10 gig NIC, and a decent amount of NVMe. It just doesn’t seem super ideal, even if I were to take the HBA out of the equation.

Definitely. I had 5 motorcycles in the garage. Back down to 3.

But my point is, don’t assume you need it, that’s all. Because there are so many other toys.

Having seen AMD’s presentation on the power-efficiency increase from 7nm to 5nm and the architectural change from Zen3 to Zen4, it seems a lower-core-count Zen4 EPYC or Threadripper Pro will be a very interesting option for workstations.

At 65W TDP, Zen4 performs 74% faster at the same power budget. At 105W TDP, Zen4 performs 37% faster at the same power budget. At 170W TDP, Zen4 performs 34% faster. In the last case, the Zen3 Ryzen was overclocked, as there is no 170W TDP Zen3 SKU. But we get a sense of the efficiency increase from both the process-node upgrade and the architectural change. Extrapolating a bit to higher TDP points, Zen4 will perhaps be roughly ~30% faster at the same power budget.
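Just to turn those presentation figures into perf-per-watt ratios (the percentages are the ones quoted above; the extrapolation is admittedly a hand-wave):

```python
# "X% faster at the same power" is a perf-per-watt ratio of (1 + X) at that TDP.
# Percentages are the Zen3 -> Zen4 figures quoted above.
gains = {65: 0.74, 105: 0.37, 170: 0.34}

for tdp, gain in gains.items():
    ratio = 1 + gain
    print(f"{tdp} W: Zen4 ≈ {ratio:.2f}x the perf/W of Zen3 "
          f"(very roughly {1/ratio:.0%} of the power for similar performance)")

# The gain shrinks as TDP rises, so ~1.3x at the same power budget looks like a
# reasonable (hand-wavy) floor for higher-TDP EPYC/TR Pro parts.
```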

I believe a 16-core or 32-core Zen4 EPYC/TR Pro will be an efficient choice for a workstation sitting under your desk. That assumes you have the need for seven slots of full-lane PCIe 5.0.

Lots of details about AM5/Zen4 seem to be under embargo until 27 September. Digging up some old rumours, it turns out the connection between CPU and chipset is only PCIe 4.0 x4. The dual-chipset X670 configuration will be chained in series, providing 12 lanes of PCIe 4.0 for PCIe or M.2 slots. The single-chipset config provides 8 lanes of PCIe 4.0 for PCIe or M.2 slots.

So in premium boards, we will see 24 PCIe 5.0 lanes from the Zen4 Ryzen, plus 12 (or 8) PCIe 4.0 lanes from the dual chipsets (or single chipset):

  • 4 PCIe 5.0 dedicated to M.2 storage
  • 4 PCIe 5.0 dedicated to USB4
  • 16 PCIe 5.0 dedicated to GPUs, can be bifurcated into x8 x8 or x8 x4 x4

If AMD allows board partners to provide the 4 PCIe 5.0 (for USB4) as a PCIe slot (for those who don’t need USB4), then the expansion slots could become very interesting. So the ideal layout of PCIe slots for an ATX board in my opinion:

slot 1: GPU x16 PCIe 5.0 (or x8 PCIe 5.0)
|
|
slot 2: x8 PCIe 5.0 (or x4 PCIe 5.0 or unassigned)
slot 3: x4 PCIe 4.0 (or M.2 slot; from chipset; or unassigned)
slot 4: x4 PCIe 4.0 (or M.2 slot; from chipset; or unassigned)
slot 5: x4 PCIe 5.0 (x4 PCIe 5.0 from USB4 or GPU; or x4 PCIe 4.0 from chipset)

I think that’s lots of bandwidth, and plenty of slots for add-in cards. I hope I’ll see such a premium AM5 board.

I really doubt we’ll see a 16-core Threadripper 7000 or EPYC.

I could be wrong, but I suspect that by the time you get into the IO and memory capacities they provide, you’re beyond the capabilities of 16 cores for everything but extremely niche workloads.

Put another way: 16 cores will be handled adequately by the 24 lanes of PCIe 5.0 and the 256 GB of RAM that you’ll no doubt be able to stuff into an AM5 platform. And if you’re scaling beyond those IO/RAM requirements, the CPU cost increase to go from 16 to 24/32 cores or more is negligible vs. the rest of the platform costs.

The 8-core Threadripper 1900X was discontinued due to the above; I suspect that a 16-core Threadripper/EPYC will go the same way this time around.

16 cores is a lot. But I can understand that people’s perception diminishes as the number of cores per processor keeps increasing. A 16-core EPYC/TR Pro configured with 4 RTX GPUs would be an awesome ML machine. ML is perhaps the sensible reason people are buying TR Pro like hotcakes.

BTW, I’m personally indifferent to whether a 16-core EPYC/TR Pro will be available or not.

Btw, I very much doubt that AMD’s board partners will release my ideal AM5 premium board.

I do see 3 mechanical x16 PCIe slots from a couple of vendors, so x8/x8 or x8/x4/x4, all PCIe 5.0, is definitely possible. Whether the 3rd mechanical x16 slot can connect to the x4 PCIe 5.0 (dedicated for USB4) remains to be seen. From their descriptions, it seems not, sadly.

Nevertheless, bifurcating the x16 PCIe 5.0 lanes (for GPU) can comfortably support two GPUs for very high-end gaming, high-end video editing, a high-end audio workstation, entry-level ML, a couple of big VMs plus many smaller ones, and lots of other interesting stuff.

One thing I can’t understand though is the stupid number of M.2 slots and USB ports.

Yeah, but my point is, if you’re buying into the socket with the associated (high) platform costs, why not double the cores for like 10% more money (total system cost)? It’s not quite a rounding error… but the bang for buck at that point is SO MUCH BETTER by stepping up.

Sure, it’s all money, but buying an EPYC or TR system and then crippling it in one area with “only” 16 cores by not spending that last 10% to double the core count just seems… silly to me.
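To illustrate the bang-for-buck point with completely made-up numbers (every price below is a hypothetical placeholder, not a quote):

```python
# Hypothetical illustration of the "last 10%" argument - all prices are made up.
platform = 4000                              # board + RAM + storage + GPU + PSU/case, same either way
cpus = {"16-core": 1000, "32-core": 1500}    # hypothetical CPU prices

for name, cpu_price in cpus.items():
    cores = int(name.split("-")[0])
    total = platform + cpu_price
    print(f"{name}: total ≈ ${total}, ≈ ${total / cores:.0f} per core")

# The 32-core build costs ~10% more in total, but cost per core roughly halves -
# which is why stopping at 16 cores on this platform seems silly to me.
```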

