Consolidating a home lab and desktop onto a single host using Threadripper/TR Pro/EPYC

Hi Everyone,

Recently I have been considering upgrading my TR2970 desktop and my home lab ESXi 6.7 server (an old dual Opteron server, from the last Opteron generation). The idea would be to consolidate both into a single 7th gen Threadripper (Pro?)/EPYC 32-core system, moving the workloads from the old ESXi 6.7 server to a single ESXi 7/8 or Proxmox host and passing my GPU etc. through to a VM for a Windows 11 desktop with light gaming.

I do astronomy and process a lot of data; a high core count, very fast storage and a powerful GPU greatly accelerate my processing. (Most of my software only supports Windows, and some of it macOS, so going Linux for the data processing is not feasible.)

Desktop System relevant specs/info.
W11 Pro
TR2970WX, 24 cores
64 GB
4x 2 TB 990 Pro, RAID 0 (limited to Gen3 speeds due to the platform)
RTX 3080 Ti
Sometimes I run VMware Workstation Pro and host a few occasional-use VMs on this system.

ESXi 6.7 host relevant info.
VM count: 3-7, usually with 12-16 cores assigned across the VMs running at any one time. Lightly loaded.
Peak memory usage: 64 GB +/-
Main usages: NAS, Backups, Secure VDI for downloading/testing, Home Automation, etc…

One goal for doing this would be to improve performance and reduce power. Both my current systems are on nearly full time and the old server pulls 200 watts. Security is also a key aspect that I care about, and I want good VM isolation.
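To put the power goal in numbers, here is a rough sketch of what an always-on load adds up to over a year. The 200 W figure is from the post; the electricity price is a made-up placeholder, and the function names are just for illustration, so substitute your own tariff.

```python
HOURS_PER_YEAR = 24 * 365  # always-on, non-leap year


def annual_kwh(watts: float, hours: float = HOURS_PER_YEAR) -> float:
    """Energy in kWh used by a constant load over the given number of hours."""
    return watts * hours / 1000.0


def annual_cost(watts: float, price_per_kwh: float) -> float:
    """Yearly running cost of a constant load at a flat price per kWh."""
    return annual_kwh(watts) * price_per_kwh


old_server_watts = 200   # from the post
price_per_kwh = 0.30     # hypothetical tariff, not from the post

print(f"{annual_kwh(old_server_watts):.0f} kWh/year")
print(f"{annual_cost(old_server_watts, price_per_kwh):.2f} per year at the assumed price")
```

The same two functions can be run against the measured idle draw of any candidate replacement to see whether consolidation actually saves power.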

Since the VMs are lightly loaded, the thought would be to overprovision the box on cores, e.g. assigning 26-28 cores to the desktop and 16+/- to the remaining VMs. I don’t process data 24x7, so the idea is that the W11 VDI would normally not use most of the CPU power assigned to it, leaving plenty for the other VMs, but could consume the cores when needed.
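As a quick sanity check on that plan, the overcommit can be worked out as a ratio of assigned vCPUs to physical resources. This is only a sketch: the core counts come from the post, while the assumption of 2 threads per core (SMT enabled) and the function name are mine.

```python
from typing import Tuple


def overcommit_ratio(vcpus_assigned: int, physical_cores: int,
                     threads_per_core: int = 2) -> Tuple[float, float]:
    """Return (vCPUs per physical core, vCPUs per hardware thread)."""
    per_core = vcpus_assigned / physical_cores
    per_thread = vcpus_assigned / (physical_cores * threads_per_core)
    return per_core, per_thread


desktop_vcpus = 28   # upper end of the 26-28 range in the post
other_vcpus = 16     # "16+/-" for the remaining VMs
total_vcpus = desktop_vcpus + other_vcpus

per_core, per_thread = overcommit_ratio(total_vcpus, physical_cores=32)
print(f"{total_vcpus} vCPUs on 32 cores: "
      f"{per_core:.2f} per core, {per_thread:.2f} per hardware thread")
```

A per-core ratio above 1 means the VMs will contend for cores whenever everything is busy at once; whether that matters depends on how often the heavy processing overlaps with load on the other VMs.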

Has anyone had good experiences running a virtual environment like this? What are the shortfalls?

I also considered a low-power homelab server plus an HEDT desktop (similar to my current setup), or going EPYC for the server and maybe consumer grade for the desktop, but then I would need multiple high-end GPUs to offload the processing to the server, and that could increase the power usage. I can’t really afford to do both an EPYC server and a Threadripper/HEDT desktop.

One high-end server with a VDI using passthrough seems like the convenient setup, and it allows me to “shift” the compute power to where I need it at the time, but I could be missing something here.

So looking for help/thoughts about this and recommendations.

Free ESXi is now dead, so I’d suggest looking straight at Proxmox.

Can you give us more details about the kind of processing you do? You still have a pretty fast system, so it’d be good to work out what is limiting you.

  • Is it mostly GPU bound, or is some CPU bound?
  • Is it mostly single threaded CPU, or multi-threaded CPU?
  • GPU memory bound, GPU processing bound or GPU bandwidth bound?
  • System memory bandwidth bound, or system memory size bound?
  • Storage speed bound?

(Cliff notes: pick Threadripper if you’re single-thread bound, EPYC if you’re multi-thread bound or memory-bandwidth bound.)

The biggest annoyance with passthrough is that a GPU can only be passed to one VM at a time, but you can move it between VMs as long as only one of them is running. (I have a gaming VM and an AI VM that I can switch between, for example.)

Thanks for the thoughtful response.

The astronomy image data processing appears to be mostly either multi-threaded CPU bound or storage speed bound, depending on the step. The software can easily utilize all 24c/48t at 100% once the data elements are in memory. When the process needs to pull the raw data off disk, the system pushes disk utilization to 80-90%, but the max CPU utilization drops substantially, to around 40-70%. Some steps are all multi-core CPU and others are completely dependent on disk speed.

The raw data set size can be anywhere from a couple of gigs to 50-100 GB, which is why I use a 4-drive RAID 0 (990 Pro NVMe); it helps push the CPU utilization much higher during the raw file processing.
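For a feel of how much the array speed matters at these data set sizes, here is a back-of-the-envelope sketch. It assumes ideal RAID 0 scaling and purely sequential reads, which real processing rarely achieves; the per-drive throughput values are illustrative round numbers, not measurements from this system, and the function name is hypothetical.

```python
def read_seconds(dataset_gb: float, drives: int, per_drive_gb_s: float) -> float:
    """Best-case time to read a data set once from an ideally scaling RAID 0."""
    if drives <= 0 or per_drive_gb_s <= 0:
        raise ValueError("drives and per_drive_gb_s must be positive")
    return dataset_gb / (drives * per_drive_gb_s)


dataset_gb = 100  # upper end of the range in the post

# Illustrative per-drive sequential read rates in GB/s (assumptions).
for label, per_drive in [("slower link", 3.5), ("faster link", 7.0)]:
    seconds = read_seconds(dataset_gb, drives=4, per_drive_gb_s=per_drive)
    print(f"{label}: {seconds:.1f} s per full pass over {dataset_gb} GB")
```

If the software makes many passes over the raw files, or reads them in small random chunks, the real gap will look quite different from this best case, so it is worth measuring with the actual workload.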

The processing takes the dataset and runs it through an algorithm that adds the contents of the raw files together and creates a merged dataset that is only a few gigs in size, so the processed set is fairly efficient in its memory usage. More memory can speed things up, but only marginally compared with faster disks or more compute.

The main processing is not GPU bound, as the GPU is used only in some sub-processing steps for ML and image processing, and you could turn off GPU assistance and do it all on the CPU. That being said, those steps are faster on the GPU than on the CPU.

As for ESXi vs Proxmox, I am leaning towards Proxmox. I do have a license for 7.0, because I was planning on getting off 6.7, but the end of free ESXi is a pain as I missed the chance to get an 8.0 key. It might be good to get out of my ESXi comfort zone and learn Proxmox.

Between the two, which one has better support for GPU passthrough and gives the best virtual desktop experience for my desktop?

Thanks for the info on passthrough. I have never tried GPU passthrough on a VDI like this. If I did hardware passthrough of the keyboard/mouse/sound/GPU/etc., would that VM behave and feel like those devices were actually connected to a dedicated PC, or are there a lot of VDI issues/glitches to be expected? In theory it should work, but realistically doing it may be different, and any feedback you have on that would be helpful.

If you’re using a passed through keyboard/mouse/GPU, you won’t be able to tell that it’s a VM. Especially if you pass through a USB controller or PCIe USB card, so keyboard/mouse are directly connected.

https://pve.proxmox.com/wiki/PCI_Passthrough
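One thing worth checking before buying a board is how its devices fall into IOMMU groups, since on a Linux host passthrough is done per group and everything sharing a group with the GPU or USB controller generally has to go to the same VM. Below is a minimal sketch that lists the groups by reading the kernel's sysfs tree; it assumes a Linux host with the IOMMU enabled, and the function name is just for illustration.

```python
from pathlib import Path
from typing import Dict, List


def iommu_groups(base: str = "/sys/kernel/iommu_groups") -> Dict[int, List[str]]:
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups: Dict[int, List[str]] = {}
    root = Path(base)
    if not root.is_dir():
        # IOMMU disabled, or not a Linux host: nothing to report.
        return groups
    for group_dir in root.iterdir():
        if not group_dir.name.isdigit():
            continue
        devices_dir = group_dir / "devices"
        devices = sorted(p.name for p in devices_dir.iterdir()) if devices_dir.is_dir() else []
        groups[int(group_dir.name)] = devices
    # Return the groups in numeric order for readable output.
    return dict(sorted(groups.items()))


for group, devices in iommu_groups().items():
    print(f"group {group}: {', '.join(devices)}")
```

Ideally the GPU (with its audio function) and the USB controller you want to pass through each sit in a group of their own. The addresses printed here can be matched against `lspci` output to see which device is which.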

This month I built a small form factor Ryzen machine that is a Windows 10 PC and TrueNAS in one, using Proxmox.

I could not get the APU graphics to work; otherwise it would have been two Windows PCs plus a file server. I was able to pass through a GPU to Windows, plus the mouse, keyboard and audio. I fitted a SATA card, which I passed through to TrueNAS. With the VMs set to boot on startup and the PC set to power on when power is connected, it works without problems.

Actually one problem is if the user shuts down Windows they need to reboot the hardware to get Windows back up again.

ESXi is a very feature complete commercial product. Proxmox is much easier to use until you want to do more advanced things and then you’re on your own but with community support.

I think the range of things you can do on Proxmox will expand greatly now that free ESXi is gone. Individuals will still use ESXi but in order to learn it for a client, not because it’s the best tool for the job. Companies will investigate Proxmox to see if they can switch to it. Many are switching.