My Linux-cursed dream build 3

First of all, I really hope I am doing everything right with this post, since it's my first one on this board. If not, please just tell me.

The story behind the dream build (off-topic; only for those who are interested)

It began many, many years ago, when I was only 11 years old, with my first PC magazines. I was browsing and reading through the magazines, and there were advert pages from shops with tons and tons of hardware: SCSI here, Ultra ATA this, a Matrox card there, a 20GB SCSI drive. So I would mark with a red marker what I would combine and try to build together if I had unlimited funds (I had no idea about this stuff back then… probably would have failed miserably), but of course as an 11-year-old I also did not have the money for all the ideas in my little head. Time went by and I did that again and again and again. And I always told myself that once I made good money, I would build my own dream build system. So time went by: finished school, finished IT training, started working in IT. Around 9 years ago, on my birthday, I did my first dream build. I think it is still somewhere on a forum on the internet. Three years later I did the next one, and a few months back I did the latest, dream build 3 (which was actually the craziest one of them all).

The hardware

  • Asus Zenith Extreme with the newest BIOS, 1601
  • AMD Threadripper 2990WX @ 4.2 GHz, permanently stable on all cores (76xx Cinebench R15 score, verified)
  • 128GB G.Skill Trident Z RGB @ 2933 CL14-14-14-34, stable (memory test runs for more than 48 hrs)
  • Four 280GB Intel Optane 900P 2.5" NVMe drives in software RAID0 (three are attached to the NVMe add-in card from Asus and one to the U.2 port)
  • Two Gainward GeForce RTX 2080 Ti Phoenix GS, NVLinked (slots PCIe1 and PCIe3)
  • Asus 10GBit network card (PCIe4)
  • Corsair AX1600
  • A ton of water cooling / fans, with hard tubing for the CPU (Heatkiller) and VRM (also from Watercool), since they generate an enormous amount of heat at that clock speed
  • Corsair Obsidian 1000D (customized for the quite complex water cooling)

My Linux curse on this otherwise amazing build
I started trying to get Linux up and running as soon as the performance testing (Cinebench / memtest etc.) was done, with some big, big hiccups already back then. Because of the big SEV bug in the BIOS, most Linux distributions had quite some trouble booting. I was able to get Fedora 28 running, but only by blacklisting ccp, which kvm_amd needs to work; so no virtualization, which is my daily business, so no Linux for me back then.

Last weekend I finally got time and moved (or tried to move) again from Win10 to Fedora 29, which started out a lot harder than I thought. The 2080 Ti is not supported by the Nouveau version used in the FC29 build, so I had to go with the NetInstall, which seemed not to like UEFI NVMe very much. That meant taking out both 2080 Tis, putting in an old card, setting up FC29, installing the native NVIDIA drivers (I used the really new 415.27 release) and switching back to the 2080 Tis. The 10GBit card also dropped a lot of packets, so I built the newest driver based on the 2.0.15 version they have on their homepage. So far so good.

Back to daily business, I thought, and here comes the BIG BIG issue: KVM is not working correctly at all. It's sluggishly slow; it feels like you are on a 56k modem trying to remote into a desktop. For example, hitting the Windows key to open the start menu takes 8-10 seconds. Booting is a totally different story: I could go for coffee, come back, and it's still booting. So there must be something seriously wrong, but I can't seem to find out where or what. I know that people (for example Michael Larabel from Phoronix) run quite similar hardware, and for them it seems to work out great. But I have to say I am not a Linux expert.

What I have tried so far

  • Taking out the 2080 Tis completely and putting in an old NVIDIA card
  • Taking out the 10GBit Aquantia card
  • Trying Ubuntu 18.10 instead
  • Trying a normal SATA SSD
  • Trying a config without the software RAID0
  • host-passthrough mode for the CPU
  • virtio for everything there is
  • Windows VM always has the latest virtio-win drivers installed
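In case it helps anyone spot something: this is roughly what a disk stanza with those virtio / cache / io settings looks like in the libvirt XML (a sketch, not my exact config; the image path and target name here are just placeholders):

```xml
<!-- Sketch of a raw virtio disk with the cache='none' / io='native'
     combination mentioned above; path and dev name are placeholders -->
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source file='/var/lib/libvirt/images/win10.img'/>
  <target dev='vda' bus='virtio'/>
</disk>
```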

What I found out

  • I can get decent CPU performance in the VM (tested with Cinebench in the Windows VM), but graphics are unusable, even YouTube
  • I cannot get decent disk performance no matter what I use: raw or qcow2, with or without virtio, with or without io=native / cache=none etc.

The goal right now is not to play AAA titles in the VM; I am not even passing the GPUs through to KVM for gaming (I have another NVMe drive which I quickly attach if I want to). Since I work a lot, and I mean a really lot, with VMs for lab testing, the goal is to get KVM working correctly: 5-10 VMs open at the same time with decent speed, for workflow simulation and deployment testing. Which currently is simply impossible.

So I hope this post finds the right people, who are as passionate and geeky about hardware as I am and have an idea what I can try or how I can debug this issue further. Otherwise I will have to stop and revert back to Win10 again…


I’m on a 1920X. I had a similar issue until I manually typed in “host-passthrough” as the CPU type: the VM took forever to load and performance was terrible. BUT I see you already have that.

What is your topology setup like? This is mine:

<cpu mode='host-passthrough' check='partial'>
    <topology sockets='1' cores='4' threads='2'/>
    <feature policy='require' name='svm'/>
    <feature policy='require' name='apic'/>
    <feature policy='require' name='topoext'/>
</cpu>

That is for 4 real cores passed through. I have everything pinned to the first die, and I have 2 other cores for IO and the emulator.
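The pinning itself lives in the cputune section. A sketch of roughly what that looks like (the host CPU numbers here are examples only; which IDs belong to which die depends on your topology, so check lscpu first):

```xml
<!-- Sketch: pin 8 vCPUs to host threads 0-7 (one die) and keep the
     emulator on two spare cores; host CPU IDs are examples only -->
<vcpu placement='static'>8</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
  <vcpupin vcpu='4' cpuset='4'/>
  <vcpupin vcpu='5' cpuset='5'/>
  <vcpupin vcpu='6' cpuset='6'/>
  <vcpupin vcpu='7' cpuset='7'/>
  <emulatorpin cpuset='8-9'/>
</cputune>
```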

If you post your XML file I’d be happy to take a look.

The 2990WX has 2 dies without direct memory access, so you are definitely going to want to pin all your VM cores to a die with direct memory access to get the best possible performance.
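To go with the core pinning, the guest memory can be pinned as well; something like this (a sketch; node 0 being a die with attached memory is an assumption, check `numactl --hardware` for which nodes actually have memory on your board):

```xml
<!-- Sketch: keep guest memory on NUMA node 0 (assumed to be a die with
     directly attached memory on the 2990WX); the node number may differ -->
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>
```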

You will get your setup working; it’s just going to take time. I think you’ve already discovered what a pain new hardware is in Linux. I only recently bought my 1920X because I was holding off until the LTS distributions had kernel support for it and my other hardware.

Also, you won’t get NVLink working inside the VM. My understanding is that NVLink still requires the old SLI licensing, and none of the chipsets emulated in KVM were ever licensed for SLI.