Questions about hypervisors & VMs on Threadripper

bonkers · February 27, 2019, 9:21am

Hi,

I’m planning to build a server with the following specs:
-ASRock X399 Taichi
-4x16GB RAM
-AMD Threadripper 2950X, 16c/32t
-4x 1080 Ti (8 PCIe lanes per card, 2 cards connected via bifurcation)
-1x 10Gbit NIC
-2x M.2 SSD, 1 per CPU die
-My system won’t have a GPU for the host.
-I don’t need dynamic CPU or disk allocation.
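
As a rough sanity check on the x8-per-card layout listed above, the theoretical PCIe link bandwidth can be worked out by hand. The sketch below assumes the slots run at PCIe 3.0 speeds (8 GT/s per lane with 128b/130b encoding); the function name is only illustrative, and real-world throughput will be lower because of protocol overhead.

```python
# Theoretical PCIe 3.0 bandwidth per direction.
# Assumption: every slot negotiates a Gen3 link.
GT_PER_S = 8.0        # raw transfer rate per lane, in gigatransfers/s
ENCODING = 128 / 130  # 128b/130b line encoding efficiency


def pcie3_gbytes_per_s(lanes: int) -> float:
    """Return the one-direction link bandwidth in GB/s for a lane count."""
    return GT_PER_S * ENCODING * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    for lanes in (8, 16):
        print(f"x{lanes}: {pcie3_gbytes_per_s(lanes):.2f} GB/s")
```

Comparing the x8 and x16 figures gives an idea of how much host-to-GPU bandwidth is given up by running four cards at 8 lanes each.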

I’m planning to run 4 headless Linux VMs for machine learning, which would be accessible over the local network and the internet. I’m not very familiar with virtualisation but eager to get started. I used Unraid in the past, for what it’s worth. Passing through all the GPUs on Unraid was a bit of a pain but worked nonetheless.

I’ve got some questions though about my planned setup:
-I’m torn between XenServer and KVM (I will probably end up testing both). Which would be ideal in my scenario? Performance is the priority, and I’m looking for a free solution.
-Is it a good idea to go with Threadripper for this? I’m thinking it should be fine since I can split up the VMs according to the CPU die each one runs on. However, I’ve seen people recommend against Threadripper because of issues with virtualisation. I would also consider the EPYC 7301, but single-core performance is important too.
-Where can I find a layout showing how the dies map to the PCIe and M.2 slots on the X399 Taichi, so I can avoid slowdowns in communication between devices?
-Can I easily run the host without any GPU?
-Do I need to reserve CPU cores and RAM for the host, or can every VM get 1/4 of the cores and RAM? How much does the host need?
-Out of curiosity: can the host prevent abuse of a VM’s GPU (excessive overclocking, for example)?
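
The per-die split mentioned in the questions above can be explored on a Linux host through sysfs. This is a minimal sketch, assuming the firmware exposes each die as its own NUMA node under /sys/devices/system/node (which may depend on a memory-mode setting) and using hypothetical VM names. It only decides which node each VM belongs to, not which exact cores it gets.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as '0-7,16-23' into CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # a node without CPUs has an empty cpulist
        low, _, high = part.partition("-")
        cpus.extend(range(int(low), int(high or low) + 1))
    return cpus


def cpus_per_node(sysfs_root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map each NUMA node id to the logical CPUs the kernel lists for it."""
    nodes = {}
    for node_dir in Path(sysfs_root).glob("node[0-9]*"):
        node_id = int(node_dir.name[len("node"):])
        nodes[node_id] = parse_cpulist((node_dir / "cpulist").read_text())
    return nodes


def assign_vms(nodes: dict[int, list[int]], vm_names: list[str]) -> dict[str, int]:
    """Fill nodes in order so that VMs are spread evenly across them."""
    node_ids = sorted(nodes)
    if not node_ids:
        return {}
    per_node = -(-len(vm_names) // len(node_ids))  # ceiling division
    return {vm: node_ids[i // per_node] for i, vm in enumerate(vm_names)}


if __name__ == "__main__":
    nodes = cpus_per_node()
    # "vm0".."vm3" are placeholder names, not anything from the thread.
    for vm, node in assign_vms(nodes, ["vm0", "vm1", "vm2", "vm3"]).items():
        print(f"{vm}: NUMA node {node}, candidate CPUs {nodes[node]}")
```

When picking individual CPUs inside a node, it is probably worth keeping SMT sibling threads in the same VM; the kernel lists them in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list.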

Thanks!

Dragon6687 · February 27, 2019, 1:29pm

One hypervisor you might want to try is VMware ESXi. If you create an account with them, you should be able to get a free permanent license to run the host software with some limitations, such as a cap of 8 vCPUs per virtual machine. I’ll have to do some digging to see whether they block PCIe passthrough on the free license. I don’t THINK they do, but they might.

VMware ESXi is pretty much THE name in virtualization in the enterprise market; nothing really comes close, not even Hyper-V from MS (they try hard, but still can’t really compare). The ESXi host can run from a flash drive of less than 2 GB. When the hypervisor boots, it loads the OS from your storage device into a small cache and runs from there, never really accessing the “os” drive aside from maybe writing some logs. Because of this, the hypervisor itself takes up minimal resources, and you certainly don’t need to reserve GPUs or anything for it.

Now, I’ve only installed or run VMware ESXi on old enterprise-class server chassis (like old Dell PowerEdge machines), and it has always run like a dream. VMware has a list of supported hardware. I don’t know offhand whether they support basic AMD Ryzen chips (I know they support EPYC), but just because something is not on the list doesn’t mean it won’t run.

Since it’s free (you should be able to run a 60-day trial version if you don’t want to sign up for a free permanent license), it can’t hurt to at least try.

I know this doesn’t answer all your questions, but I hope it helps a bit at least.

TheCakeIsNaOH · February 27, 2019, 2:47pm

For the die-to-slot layout: lstopo (from the hwloc package)
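
As a complement to lstopo, the kernel also reports a NUMA node for every PCI device in sysfs, which can be read directly. This is a minimal sketch, assuming a Linux host whose firmware exposes the dies as separate NUMA nodes (otherwise every device will likely report the same node, or -1). The function name and the class filter are illustrative choices.

```python
from pathlib import Path

# PCI base class codes (first byte of the class value) worth looking at here.
INTERESTING = {"01": "storage", "02": "network", "03": "display"}


def pci_numa_nodes(sysfs_root: str = "/sys/bus/pci/devices") -> dict[str, tuple[int, str]]:
    """Map each PCI address to (numa_node, class code) as reported by sysfs."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result  # not a Linux host, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        node_file = dev / "numa_node"
        class_file = dev / "class"
        if node_file.is_file() and class_file.is_file():
            node = int(node_file.read_text().strip())  # -1 means unknown
            result[dev.name] = (node, class_file.read_text().strip())
    return result


if __name__ == "__main__":
    for addr, (node, cls) in pci_numa_nodes().items():
        kind = INTERESTING.get(cls[2:4])  # cls looks like "0x030000"
        if kind:
            print(f"{addr}  {kind:8s}  NUMA node {node}")
```

The PCI addresses printed here can be matched against the slot a card sits in, which gives a rough slot-to-die map for GPUs, NVMe drives, and the NIC.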

No GPU for the host: depends on the host OS. With a Linux-based host, yes after install, but quite a bit harder during install. With ESXi, yes after install, dunno during. Just use the boot 1080 Ti for host video during install.

Resources for the host: depends on the host OS and how much you are doing on the host, i.e. whether you are running software RAID on the host OS or something similar.

Preventing GPU abuse from the host: nope.

XenServer or KVM: KVM almost certainly.

bonkers · February 27, 2019, 4:02pm

Thanks for your answers!
@Dragon6687 The reason I only considered KVM and XenServer is that AWS used to use Xen and switched to KVM, so I figure it’s good to get experience with what is used at massive scale. Not only for best performance, but also to look good in my portfolio for future work opportunities.
I’d like to be able to automate the process of setting up such machines in the future.

I’m not planning to use RAID, just splitting 1 drive per 2 VMs. The host OS would be some flavor of Linux, and the VMs Linux as well. The host doesn’t need to do anything besides managing those 4 VMs.

Are setups with no GPU for the host generally ill-advised? Or is it just a matter of changing some configuration post-setup (which can be automated)?

TheCakeIsNaOH · February 28, 2019, 2:24am

If you are not going to be running X, then leaving one CPU thread and 512 MB to 1 GB of RAM for the host and allocating the rest is probably fine.
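
To make that rule of thumb concrete, here is a small sketch of the arithmetic. The function name and the defaults (1 thread and 1024 MB held back for the host) are assumptions picked from the range given above, not fixed requirements.

```python
def split_resources(total_threads: int, total_ram_mb: int, vms: int,
                    host_threads: int = 1, host_ram_mb: int = 1024) -> dict[str, int]:
    """Reserve a little for the host, then divide the rest evenly among VMs."""
    usable_threads = total_threads - host_threads
    usable_ram_mb = total_ram_mb - host_ram_mb
    if vms < 1 or usable_threads < vms or usable_ram_mb < vms:
        raise ValueError("not enough resources left for the requested VMs")
    return {
        "threads_per_vm": usable_threads // vms,
        "spare_threads": usable_threads % vms,  # threads left with the host
        "ram_mb_per_vm": usable_ram_mb // vms,
    }


if __name__ == "__main__":
    # The planned build: 32 threads (2950X) and 4x16 GB of RAM, shared by 4 VMs.
    print(split_resources(total_threads=32, total_ram_mb=64 * 1024, vms=4))
```

For the planned 2950X build this works out to 7 threads and 16128 MB per VM, with 3 threads left over for the host. Giving each VM 8 threads would leave nothing reserved.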

TheCakeIsNaOH · February 28, 2019, 2:29am

It depends on how good you are at not breaking SSH access. If you are running headless and you break networking or SSH, it is time for a host OS reinstall. You could also check whether you have a serial connection available; that would work fine. Linux works fine without a GPU, it is just hard to fix things if you do not have a way of controlling the computer.
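
One way to catch a broken SSH setup early is to probe the port from a second machine after every network or firewall change, while an existing session is still open. This is a minimal sketch; the function name and the command-line handling are illustrative, and it only confirms that something accepts TCP connections on the port, not that logins work.

```python
import socket
import sys


def port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


if __name__ == "__main__":
    # Pass the hypervisor's address as the first argument; defaults to localhost.
    target = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    state = "reachable" if port_open(target) else "NOT reachable"
    print(f"SSH port on {target} is {state}")
```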

gordonthree · April 2, 2019, 6:37pm

Passthrough on ESXi is pretty easy, on par with Unraid as far as doing it all from a GUI and not a command line. I have it working great on an X79 board (Intel Xeon). I could not get it working on my AMD X399 board, but I think that is down to a bad motherboard, BIOS, or user error.

With ESXi 6.7 you’re limited to 16 per VM. Tricky part for my testing was finding enough USB controllers to pass through to guests so I could plug/unplug keyboards and mice as needed.

Unlike Linux distros, VMware had no trouble disconnecting the boot GPU and passing it through to a guest. During boot-up, the console just stops updating, and eventually a guest appears.

bonkers · April 4, 2019, 3:33pm

I was reluctant to look at ESXi for the sake of cost and I’m glad I didn’t end up using it.

I ended up choosing KVM with Proxmox, running on an EPYC 7401P, and it works like a charm. Having IPMI makes things a lot easier when the machine goes to the datacenter.

The current topology:

The machine has a total network bandwidth of 30 Gbit/s. Cards 1-4 are 1080 Tis; cards 1 and 2 are on the same NUMA node but seem to have the lowest P2P bandwidth (10 vs ~15 GB/s). The P2P latencies range from 10 to 15 µs.