(in)sanity check and storage questions on AMD Threadripper PRO build (with TLDR)

Note: I’m not a native English speaker so I apologize in advance for the grammatical mistakes I might have made :wink:

I need a workstation for VMs, VFIO and dataset processing, e.g. LOD creation. I looked at X570, Z690, X299, TRX40 and WRX80. The first two lack PCI-e lanes. The third is hard to come by, and based on the availability of parts the last two are equal in price but not in spec. So my questions for the build:

  1. I need 2TB of usable space to begin with, growing to double that. I wish to go all in on m.2 storage with redundancy, but not now because of funds… I am currently testing two bcache devices, each with one 120GB SSD as cache and one 4TB HDD as backing drive, combined in a BTRFS RAID-1 file system. I get a cache-hit ratio of 6~10 percent when running VMs off that file system. Would a bigger cache SSD of 1TB improve performance, or am I just wasting valuable SSD space? If it is a waste: with two 1TB SSDs and two 4TB HDDs, what would be a great and fast storage solution other than ZFS, and other than getting two 2TB m.2 SSDs in RAID 1?
  2. On a WRX80 platform is there a noticeable penalty for running four sticks instead of eight?
  3. How can one get a general idea of the amount of memory that is needed for the foreseeable future?
  4. Does a TRX40 build with 8 more cores but 128GB of memory and no upgrade path (as far as the rumors go) make more sense than a WRX80 build with 16 cores, 256GB of RAM and possibly an upgrade path? I can wait until 4 January to find out what is true about the rumors, but I cannot wait for the parts to come to market.

More explanation is written below…

So I find myself in a situation that kind of s*cks. I have two desktop computers, both Ryzen 3000 systems on an X570 chipset: one Windows machine for gaming and a photography workflow that involves Adobe software, and a sidekick that I bought to get familiar with the Linux desktop, to play around with so to speak. I work in the cultural heritage sector as an IT specialist, and Open Source is important for cultural heritage. So practicing what you preach was the main driver for buying a second system with Linux.

Both systems are degraded. It started a couple of months back with my Windows machine, and for the last couple of weeks it has affected my Linux machine too. Both systems log recoverable CPU errors and do a hard reboot out of the blue. I have since figured out that it is related to the C-states settings in the BIOS, because when I disable C-states there, the errors and spontaneous reboots disappear. So after hours of troubleshooting over the last months, I decided that I do not trust my current setups anymore and that I need something new. The Windows computer is all figured out; the Linux computer is the main subject of this topic.

In the last months I’ve also decided that I want to prepare for an IT career change / skill update, and I started a Certified Ethical Hacker course. Since then my Linux machine is no longer for playing around but is my daily driver. The current system (Ryzen 3600, 16GB 3600 memory, mini-ITX motherboard and 256GB SSD) does not cut it anymore for the things I want to do. That is mainly running a lot of virtual machines, both to hack and to do the hacking with, but also loading production copies of systems/environments that need a security audit (or are given to me to try to hack). In my current profession I also use a lot of big datasets to transform metadata of digital cultural heritage information into Linked Open Data, so a lot of disk IOPS is a nice-to-have too. Also, quiet is king: my desktops are in my living room and I’m not the only occupant of that room. And of course, with the whole pandemic, these computers are my way of making a living, so stability is a factor.

My budget is 5000,- EURO and I live in The Netherlands. I have no preferred supplier, but I try to buy my stuff locally. What I selected for this new workstation is the following list (all based on the immediate availability of parts):

  • AMD Ryzen Threadripper PRO 3955WX
  • Micron MTA18ASF4G72AZ-32G2B1 (8 sticks of 32GB per QVL)
  • 2x Samsung PM9A1 1TB (I’ve confirmed that firmware updates for PM9A1 drives are published/pushed in Linux)
  • 1x Samsung PM9A1 512 GB for the host OS
  • 1x Ubiquiti UniFi Switch Flex XG so I can hook up this workstation to my 10Gb network and have a free port for my Windows machine too. I cannot get more ethernet cables into my living room.
  • EK-Quantum Reflection PC-O11D D5 PWM D-RGB - Plexi (mainly so I can fit a couple of fans on the bottom of the case if necessary, and so I do not have to deal with a pump/res combo that also needs to fit the case)
  • EK-CoolStream PE 360 (Triple)
  • 6x Thermaltake Toughfan 12

I already have:

  • MSI Armor AMD RX580 4GB for the host OS and SPICE 3D acceleration for the Linux virtual machines
  • NVIDIA Quadro 2000 for VFIO purposes and a Windows 10 virtual desktop, because SPICE 3D acceleration does not work with Microsoft VMs.
  • Lian-li O11-dynamic (I will find out whether the claim by ASUS and Lian-li that the board is E-ATX and the case is E-ATX compatible holds true)
  • EK-CoolStream SE 360 (Slim Triple)
  • 2x 4TB 7200rpm mechanical hard drives.
  • 2x PNY 120GB SATA SSDs
  • Seasonic Prime PX-1000
  • CyberPower 1000w UPS
  • Good screen and other peripherals.

This list will get me close to the budget of 5000,- EURO.

(in)sanity check:
The main reasons for me to go TR-Pro are:
a) I think I need more PCI-e lanes than X570 or Z690 can supply. My wish is to eventually go all in on m.2 storage, and with VFIO my belief is that more PCI-e lanes are better. I also believe 128 PCI-e lanes is overkill, but read points b and c.
b) The rumor that there will be no upgrade path if I choose sTRX4. New TR-Pro CPUs are on the horizon, but there is not a word on the non-pro variants. I can wait for the rumored AMD announcement in January but can’t wait for the new (pro) variants to hit the market.
c) A build based on sTRX4 with the current availability of hardware is also around 5000,- EURO, but then with 128GB of memory and a 24-core 3960X. Maybe a better deal?

  • AMD Ryzen Threadripper 3960X
  • ASUS ROG Zenith II Extreme Alpha
  • G.Skill Trident Z Neo F4-3600C16Q-128GTZN
  • All the other stuff from the above list.
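
To put point a) in numbers, here is a rough lane-budget sketch in Python. The per-device link widths (x16 per GPU, x4 per NVMe drive) and the per-platform lane figures other than the 128 mentioned above are assumptions for illustration, not values from a board manual; how slots are actually wired and shared depends on the motherboard.

```python
# Rough PCI-e lane budget for the parts listed in this post.
# ASSUMPTION: every GPU gets a full x16 link and every NVMe drive an x4 link.
DEVICES = {
    "RX580 (host GPU)": 16,
    "Quadro 2000 (VFIO GPU)": 16,
    "PM9A1 1TB #1": 4,
    "PM9A1 1TB #2": 4,
    "PM9A1 512GB (host OS)": 4,
}

# ASSUMPTION: approximate CPU lanes available for devices per platform.
# Chipset-attached lanes and slot sharing are ignored on purpose.
PLATFORM_LANES = {
    "X570 (Ryzen 3000)": 20,
    "TRX40 (Threadripper 3960X)": 56,
    "WRX80 (Threadripper PRO)": 128,
}


def lanes_needed(devices):
    """Sum the link widths of all devices."""
    return sum(devices.values())


def headroom(platform_lanes, devices):
    """Lanes left over per platform; a negative value means a shortfall."""
    needed = lanes_needed(devices)
    return {name: lanes - needed for name, lanes in platform_lanes.items()}


if __name__ == "__main__":
    print("lanes needed:", lanes_needed(DEVICES))
    for platform, spare in headroom(PLATFORM_LANES, DEVICES).items():
        print(f"{platform}: {spare:+d} lanes")
```

Adding future m.2 drives to `DEVICES` shows how quickly the budget grows; a shortfall does not mean a platform cannot work at all, only that devices would have to share lanes or run at narrower links.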

Then there are some things that are uncertain for me and that I could use some advice on.

Question 1, memory:
I could lower the investment costs by swapping the Micron memory for Kingston KSM32ES8/16ME, or by using four 32GB Micron sticks instead of eight. But I’m unsure whether it’s wise to cut the amount of memory in half to 128GB, or whether there is a measurable difference between using 4 or 8 sticks on a TR-Pro motherboard. I hope you can give me some advice on this. How, for instance, can I get a good idea of how much memory is “sane”? Or is it just that more memory is better and I should buy as much as I can afford?
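
One way to get a first estimate is to add up the memory for all VMs that should run at the same time, plus a reserve for the host and for the page cache used during dataset work. The Python sketch below does just that. Every VM name, count, size and reserve in it is a hypothetical placeholder, not a measurement from my system.

```python
def estimate_ram_gib(vm_plan, host_reserve_gib=8, cache_reserve_gib=16,
                     growth_factor=1.25):
    """Estimate total RAM in GiB.

    vm_plan maps a label to (number of VMs, GiB per VM).
    The reserves and the growth factor are assumed defaults, not rules.
    """
    vm_total = sum(count * gib for count, gib in vm_plan.values())
    return (vm_total + host_reserve_gib + cache_reserve_gib) * growth_factor


# HYPOTHETICAL plan of VMs running concurrently; replace with real numbers.
plan = {
    "Windows 10 desktop VM (VFIO)": (1, 16),
    "attack VMs": (2, 4),
    "target/lab VMs": (6, 4),
    "production copies under audit": (2, 16),
}

if __name__ == "__main__":
    print(f"estimated RAM: {estimate_ram_gib(plan):.0f} GiB")
```

Watching real usage while a typical set of VMs runs (for example with `free -h`) gives better inputs than guessing.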

Question 2, storage:
My current Linux system has two 120GB SSDs and two 4TB mechanical drives, and I’ve been experimenting with these. My wish is to eventually go all m.2, but I do not have enough funds to create a storage solution that is both big (roughly 4TB of usable space) and redundant, so that I can still work with the computer while replacement storage is shipped in case of a defect. (The computer is needed to make an income.)
So I’ve experimented with ZFS and came to the conclusion that the OpenZFS team does an amazing job, but sometimes does not have new versions of the packages available when a new kernel is pushed to my OS (this happened with the 5.14 kernels and up). That effectively renders the ZFS pool useless and forces me to boot into older kernels. My opinion is that ZFS has a place, but not on the desktop.
So in my research I’ve also familiarized myself with BTRFS and settled on BTRFS as my filesystem. It allows me to take snapshots of VMs that are configured with a UEFI BIOS (when the VM is off), and with raw disks as VM disks the performance is good (close to native disk performance).
To configure the BTRFS filesystem I’ve made two bcache devices out of the 2 SSDs and 2 mechanical hard drives and put a RAID-1 BTRFS filesystem on top (see Bcache - ArchWiki, solution three). My initial thought was to swap the 120GB SSDs for the two 1TB SSDs in the list above to get higher speeds and have a cache that is big enough to hold most of what I need on a daily basis. But testing with the current bcache solution makes me think that this is not the best way to go: I get a cache hit ratio of only 6 percent when running VMs on this storage solution. So what would be a better solution storage-wise? Creating a RAID-1 set out of the two 1TB SSDs and using the 120GB SSDs with the mechanical drives in another way? Ditching the whole mechanical drive solution, going with 128GB of memory and investing in some bigger m.2 SSDs? I’m a bit at a loss here.
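
For anyone who wants to compare numbers: the hit ratio can be read from the bcache counters in sysfs. Below is a minimal Python sketch, assuming the device is called `bcache0` (that name is an assumption; adjust it to your setup). It also shows what share of requests bypassed the cache altogether, which the plain hit ratio does not reveal.

```python
from pathlib import Path

# Counter files that bcache exposes in its stats_* directories.
COUNTERS = ("cache_hits", "cache_misses",
            "cache_bypass_hits", "cache_bypass_misses")


def read_counters(stats_dir):
    """Read the integer counters from a bcache stats directory."""
    stats_dir = Path(stats_dir)
    return {name: int((stats_dir / name).read_text().strip())
            for name in COUNTERS}


def summarize(c):
    """Return the hit ratio and the share of requests that bypassed the cache."""
    cached = c["cache_hits"] + c["cache_misses"]
    bypassed = c["cache_bypass_hits"] + c["cache_bypass_misses"]
    total = cached + bypassed
    return {
        "hit_ratio_pct": 100.0 * c["cache_hits"] / cached if cached else 0.0,
        "bypass_share_pct": 100.0 * bypassed / total if total else 0.0,
    }


if __name__ == "__main__":
    # ASSUMPTION: first bcache device; stats_total covers the time since boot.
    stats = Path("/sys/block/bcache0/bcache/stats_total")
    if stats.is_dir():
        print(summarize(read_counters(stats)))
    else:
        print("no bcache stats found at", stats)
```

If a large share of the I/O bypasses the cache (bcache skips sequential I/O above its `sequential_cutoff` setting), a bigger cache SSD alone may not raise the hit ratio much.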

Last but not least: I’m running Fedora Workstation 35 and like it better than Pop!_OS or Ubuntu, but that is all the experience I have with desktop Linux. Server-wise I’m working with Red Hat, Ubuntu and Debian machines. And I do like Gnome better than KDE as a desktop environment.

So I hope you can help me out with the questions and the (in)sanity check.