Virtualised WS/Server Build

Hello, I’m trying to build a combined workstation and server based on Qubes; I’m sure many of you are aware of it. It’s been a few years since I last enjoyed poring over the endless stream of component reviews to make my selections, so I was hoping you could help.

The goals are as follows:
I want a relatively secure system (compartmentalisation through Qubes) that I can run a Nextcloud instance and a couple of other services on, and utilise the remaining resources for desktop environments that I may want to pass devices through to. I think it would be quite nice to have all of this on top of the backup and data-integrity features of a ‘next gen’ filesystem such as Btrfs or ZFS; currently I am leaning towards ZFS because it seems less fraught for us mere mortals.

Please point out any issues you notice.


Here’s where I am at with the build:

Already Picked:
PSU - Corsair HX850
CPU - Threadripper 1950X
GPU - Whatever is lying around.

Some questions:
Cooler - Noctua’s still best?
Memory - 4 x 16GB DDR4
3000MHz regular or 2400MHz ECC?
Storage - 4 x 4TB in RAID-Z
Any particular disks I should consider?
+ In what configuration should I layer faster storage?

No idea:
Motherboard -
Expansion cards - (networking maybe?)
Case - Small as possible, no window.




I’d grab a 480GB Intel enterprise SSD for a cache drive and throw a 20GB ZIL and a 100GB L2ARC on it.
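Roughly, that might look like the following on the command line. This is just a sketch: the pool name tank and the device node /dev/sdx are placeholders, and the 20GB/100GB split follows the suggestion above.

```bash
# Carve the SSD into a ~20GB SLOG partition and a ~100GB L2ARC partition (GPT)
sgdisk -n 1:0:+20G  /dev/sdx   # partition 1: intent log (SLOG)
sgdisk -n 2:0:+100G /dev/sdx   # partition 2: L2ARC

# Attach both to the existing pool
zpool add tank log   /dev/sdx1
zpool add tank cache /dev/sdx2
```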

Regarding the drives, they’re all pretty good. I’d recommend RAID-Z2 for anything over 2TB, since the stress of a rebuild after a drive failure can cause a second drive to fail.
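For reference, a minimal sketch of the 4 x 4TB pool from the parts list as RAID-Z2. The pool name and the by-id device paths are placeholders; whole-disk by-id paths are the usual advice so the pool survives device reordering.

```bash
# 4-drive RAID-Z2: any two drives can fail, ~8TB usable out of 16TB raw
zpool create -o ashift=12 tank raidz2 \
    /dev/disk/by-id/ata-DISK1 \
    /dev/disk/by-id/ata-DISK2 \
    /dev/disk/by-id/ata-DISK3 \
    /dev/disk/by-id/ata-DISK4
```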

Also, you can get larger drives for cheap if you like. The WD EasyStore 8TB drives from Best Buy (if you’re in the US) have HGST He10 white-label drives in them for around $170. That’s what I’m rocking in my NAS and it’s wonderful.

Want aircooled or watercooled? Noctua or Enermax Liqtech are my recommendations, depending.

ECC is overkill. I’d grab some 3200MHz units.

This is going to heavily depend on everything else, so I’m going to hold off on making a recommendation until your other components are more fleshed out.

@bsodmike swears by the X399 Asus board. It’s expensive, but it comes with a 10G PCIe NIC, and it gives you the best IOMMU solution I’ve seen.

Asus X399 solves this, if you go that route.
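Whichever board you land on, you can sanity-check the IOMMU grouping yourself with a small loop over sysfs; this is a generic snippet, not specific to the Asus board.

```bash
#!/bin/bash
# Print each IOMMU group and the PCI devices inside it.
# Devices sharing a group generally have to be passed through together.
shopt -s nullglob
for group in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${group##*/}:"
    for dev in "${group}"/devices/*; do
        echo -e "\t$(lspci -nns "${dev##*/}")"
    done
done
```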


Storage:
If I can find those EasyStores in Cannadorf I will give them a look, but I’m trying to maximize redundancy without huge costs.
I assume NVMe for the SSD? (they may not even bother with SATA anymore)
So would the ZIL and the L2ARC go on the same drive? (throwing around acronyms I barely understand)
480GB for performance and longevity?
Mirrored RAID-Z1 vs RAID-Z2?

Coolers:
I think I’d appreciate the dry solution. I’m up for whatever cooler gets me a tight form factor without hampering stock perf.
Is there much difference between the U12S and the U14S?

Memory:
I’m coming from a system with 12GB of 1.6GHz DDR3. My priorities for this system are stability, reliability, and integrity. I don’t want to lose any files to corruption that could have been prevented.
Do you still believe ECC is overkill if that is the goal?
Is the difference in performance so dramatic?

Case:
See “coolers” above ^

MoBo:
Would we be talking about the Zenith, Strix, or Prime? My only concern with these is that they are all E-ATX; see the above ^^
I hear that the Taichi is pretty good now, do you know anything about that?

Expansion:
In regards to networking I somewhat like the idea of having two separate NICs so I can separate them between the WS side of the system and the servers.
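For what it’s worth, in Qubes that split usually ends up as two net qubes with one NIC assigned to each. A hedged sketch, assuming Qubes 4.x; the qube name sys-net-servers and the PCI address below are made up.

```bash
# List the PCI devices Qubes knows about (to find the second NIC's address)
qvm-pci

# Give the second NIC to a dedicated net qube for the server side,
# leaving the first NIC on the default sys-net for the workstation side
qvm-pci attach --persistent sys-net-servers dom0:05_00.0
```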

Yeah, let me find a link to the one I was mentioning:

here

This thing is a beast.

I don’t know too many details regarding the coolers, so I’m hoping someone else can chime in here. My recommendation is to grab whatever fits in the case you want.

Zenith.

You’re going to be hard pressed to find an X399 board that does well in regards to IOMMU and is small. That is a niche that just doesn’t seem to be catered to at the moment.

@wendell might be able to chime in on this one though, as I haven’t gotten my hands on any of this hardware personally, yet.

Hmm, not a bad idea. I think the Zenith has onboard networking as well as the 10G nic in PCIe form, so you can make your decision on that one.

I don’t know much about it, sorry.

Yes and yes. I see about a 10% to 15% CPU performance difference between 2400MHz and 3200MHz.

The reason is that the Infinity Fabric frequency is tied to the memory clock.

Yep!

Uh, I wouldn’t do mirrored RAID-Z1; the two kinda defeat the purpose of each other. Do either mirrors or RAID-Z, not both, and definitely not neither.
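To make the contrast concrete for four drives, the two sane layouts look like this (placeholder device names; the RAID-Z2 one is the same as sketched earlier in the thread):

```bash
# Option A: striped mirrors (two 2-way mirror vdevs)
# ~8TB usable from 4 x 4TB, one failure per mirror, fast resilvers
zpool create tank mirror sda sdb mirror sdc sdd

# Option B: RAID-Z2
# ~8TB usable from 4 x 4TB, survives any two failures, slower resilvers
zpool create tank raidz2 sda sdb sdc sdd
```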

Yes. The ZIL is the ZFS “intent log”; when it sits on a separate device, that device is called the SLOG (separate log). Think of it kinda like the journal in XFS or ext4: ZFS writes to the intent log before committing the data to “cold storage”, which is the array.

The L2ARC is the Level 2 ARC (Adaptive Replacement Cache). Basically, the most frequently and recently used data gets stored there, so reads of that data are faster.
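Once log and cache devices are in place, a couple of read-only commands will show how much work they’re actually doing (pool name is a placeholder; the ARC summary tool ships with ZFS on Linux, though its exact name varies by version, e.g. arc_summary or arc_summary.py):

```bash
# Per-vdev I/O statistics, including the log and cache devices
zpool iostat -v tank

# ARC / L2ARC size and hit-rate summary
arc_summary
```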


As far as coolers go, both could limit how small you can go, and both are fantastic in the cooling department. The Noctua route could limit how narrow your case can be, and also RAM clearance, while the AIO route means you need at least a 240mm radiator mount. There are 120mm AIOs, but they all perform worse than the Noctua while costing about the same. The one that was recommended is probably the best for TR due to its cold-plate size and also serviceability down the road. If it were my money I would go with that, but I also wouldn’t feel bad about having the Noctua.


Exactly. It’s the only one I know of that has full CPU coverage on the cold plate and is an AIO.

Thanks for the explanations.

Storage:
Did some looking at the ZIL and L2ARC. The separate ZIL (SLOG as you pointed out) seems to be more of a write cache than a journal. Going to have to look into whether redundancy is necessary for the SLOG. Do you, personally, run them both on the same drive? Have you tried them separately?

Ah, the 900p; that’s not really an enterprise drive. But I can totally see how Optane is the best technology available for that role, nice pick. I will probably add one further down the line to spread the cost out. A cursory look seems to indicate that would be easy squeezy.

I was referring to using ZFS’s mirror feature across two RAID-Z1 pools, rather than a mix of hardware and software RAID. I’m not seeing how they defeat the purpose of one another. But which of the two configurations do you think provides more reliable redundancy? Would more drives potentially mean more redundancy? Maybe not with RAID-Z2, since only two drives can fail in a pool of any size. Unless I misunderstand.


MoBo:
That 10G NIC is nice to have. I like the suggestion for everything about it other than size and cost. If anyone knows about the state of IOMMU on other, smaller, X399 boards I’d love to hear.


Memory:
That 10-15% is tempting, although I would be interested to hear what any systems architects have to say about the effect of ECC UDIMMs on data integrity.
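One practical note on the ECC route: some board and UDIMM combinations will happily run with error correction silently disabled, so it’s worth verifying after the build. A hedged sketch for Linux, assuming the dmidecode and edac-utils packages are installed:

```bash
# What error-correction type does the firmware report for the memory array?
sudo dmidecode --type memory | grep -i "error correction"

# Does the kernel's EDAC subsystem see a memory controller and track error counts?
sudo edac-util --status
sudo edac-util --report=full
```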

At the moment, no. I have in the past. (well, mirrored)

Not much of a difference either way as far as performance goes, but I do recommend redundancy for the SLOG. The L2ARC, on the other hand, gets checksummed and validated before “serving” data, so if there’s an error or a failure there, you’re safe.
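If you do want the SLOG redundant, it’s just a matter of giving the log vdev two devices; a sketch with placeholder partition names:

```bash
# Mirrored SLOG: a single SSD failure can't take in-flight sync writes with it
zpool add tank log mirror /dev/nvme0n1p1 /dev/nvme1n1p1
```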

Right, not really enterprise, but more enterprisey than most drives.

Let’s see if I can expound on my rationale a bit more:

Mirroring RAIDZ-1 vdevs would let you lose two drives, one from each vdev, and in the best case would even let you lose all 5 drives of one vdev (assuming 5-drive vdevs) plus one drive from the other vdev. I would rather just use a single 10-drive RAIDZ-3 vdev and have 7 drives’ worth of space with the ability to lose any 3 drives.

You don’t typically lose more than two drives at a time, and if you’re concerned about that edge case where you lose more than 3, you might want to consider clustering your storage and using the 3 copies policy.
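For completeness, the single 10-drive RAIDZ-3 layout described above is just one vdev (placeholder device names again). As far as I know zpool won’t actually let you build a mirror of RAID-Z vdevs, so the mirrored-RAIDZ-1 comparison is more of a thought experiment.

```bash
# One 10-drive RAID-Z3 vdev: ~7 drives of usable space, any 3 drives can fail
zpool create tank raidz3 \
    /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
    /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4 \
    /dev/disk/by-id/ata-DISK5 /dev/disk/by-id/ata-DISK6 \
    /dev/disk/by-id/ata-DISK7 /dev/disk/by-id/ata-DISK8 \
    /dev/disk/by-id/ata-DISK9 /dev/disk/by-id/ata-DISK10
```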

The GIGABYTE X399 Designare is far from good.

@Wendell, do you know anything about the X399M Taichi? That one seems like it might be a good fit here.

Let me be clear, I haven’t experienced it first hand on TR, since I don’t have access to that hardware. I’m only going by my experiences with X370 and what people have said in regards to TR.

Should have clarified: I work as a Linux sysadmin, and my last major project was deploying a medium-sized OpenStack datacenter. I don’t wanna say that I know everything, but I do have a fair amount of experience building enterprise solutions.

Regarding data integrity, let me direct you to the following article. It should help you make an informed decision on your own. It’s absolutely worth a read if you’ve got 20 minutes, just to help you understand ZFS a bit better, and for fun!

http://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-your-data/
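Whatever you decide on the RAM, the other half of the integrity story is running regular scrubs so checksum errors actually get detected and repaired from redundancy. A minimal sketch (pool name is a placeholder):

```bash
# Walk the whole pool and verify every block against its checksum,
# repairing from parity/mirrors where possible
zpool scrub tank

# Check scrub progress and accumulated read/write/checksum error counters
zpool status -v tank
```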
