Affordable setup to not bottleneck multiple u.2 drives

I’m currently using TN Scale to serve as backing storage for my XCP-ng cluster. I’m currently serving SATA SSD mirrors over 10G using NFS. While this is adequate, I’m looking at upgrading to u.2 drives and 25G/100G networking.

The issue I’m running into is finding an affordable setup that provides enough PCIe lanes for all of the drives, plus enough CPU that the system itself doesn’t bottleneck the transfers.

I’ve been looking at older Epyc but I’m curious what else I should be considering.


Depends on the size of your cluster. U.2 NVMe drives run pretty linearly at about $100/TB and use four PCIe lanes apiece.

You can run 2x 60 TB U.2 NVMe drives in RAIDZ1 and a 100 Gb Ethernet card on a 7000-series Ryzen with 128 GB of DDR5 ECC RAM and be rockin’ and rollin’.

Don’t need 60 TB of usable NVMe storage?
Then you won’t need 128 GB of RAM.


You need three drives for raidz1. And I’m going to be doing mirrors, not raidz. Additionally, that limits me to the speed of 4 lanes of whatever gen of PCIe I end up using.

I’m currently looking to run at least 6 drives, possibly moving up to 12 depending on what I find available.
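As a rough sanity check on the lane budget (assuming 4 lanes per U.2 drive plus a x16 slot for the NIC, which is the usual arrangement but worth verifying for your board):

```python
# Rough PCIe lane budget: each U.2 NVMe drive uses a x4 link,
# and a 100 GbE NIC typically occupies a x16 slot.
LANES_PER_DRIVE = 4
NIC_LANES = 16

for drives in (6, 12):
    total = drives * LANES_PER_DRIVE + NIC_LANES
    print(f"{drives} drives -> {total} lanes (plus boot drive/chipset)")
# 6 drives -> 40 lanes, 12 drives -> 64 lanes
```

Either way you're past what AM5 exposes, which is why older EPYC (128 lanes) keeps coming up for builds like this.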

Normal for NVMe:
a x4 link is roughly 8 GB/s at PCIe 4.0 speeds, and double that for 5.0.
A 100 Gbit network connection tops out at 12.5 GB/s of throughput, so a two-drive mirror of x4 U.2 NVMe drives can fully saturate it on reads.
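The arithmetic can be double-checked quickly (per-lane figures below are approximate usable throughput after encoding overhead, not raw signaling rate):

```python
# Approximate usable PCIe throughput per lane, in GB/s
PCIE_GBPS_PER_LANE = {"3.0": 0.985, "4.0": 1.969, "5.0": 3.938}

NET_100GBE = 100 / 8  # 100 Gbit/s = 12.5 GB/s

for gen, per_lane in PCIE_GBPS_PER_LANE.items():
    x4 = 4 * per_lane
    verdict = "saturates" if x4 >= NET_100GBE else "below"
    print(f"PCIe {gen} x4: ~{x4:.1f} GB/s ({verdict} 100 GbE)")
```

So a single 4.0 x4 drive (~7.9 GB/s) is under the 12.5 GB/s line on its own, but a two-drive mirror's read throughput clears it, and a single 5.0 x4 link (~15.8 GB/s) clears it outright.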

I suggested RAIDZ1 because you can expand it later, if you’re feelin’ frisky, without offloading and rebuilding the array (not recommended for prod).

Quick and dirty: a two-drive U.2 NVMe mirror (leave lanes available for recovery purposes later) and a couple of 100 Gb cards in a failover/redundant config on a Ryzen 7000 series with ECC RAM.

EPYC is another story: I have deployed Genoa builds with NVMe arrays that, with redundancy and planned future on-prem recovery options, fully saturate all 128 lanes.

Your use case sounded as though you wanted the bang for buck option with at least ECC and redundancy.


Important to note that there are now EPYC processors for the AM5 platform, if you want the extra validation and ECC support.


My use case is VM backing. You keep focusing on transfer speeds instead of IOPS.

Indeed, because U.2 NVMe drives are meant for the enterprise space, with enough IOPS to saturate any network interface you plug in.

If you want high IOPS, then grab a Threadripper with the fastest per-core clock speed you can find, the fastest DDR5 ECC RAM you can find, 100 Gb cards, and fill the rest with U.2 drives. Your bottleneck remains packet size and processing on the VM host side.

We do high-performance enterprise for a living, and megatransfers don’t mean shit without context, so I express speed in the form of bandwidth.
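IOPS and bandwidth are two views of the same number, related by block size; the figures below use hypothetical drive numbers purely to illustrate the conversion:

```python
# bandwidth = IOPS * block size
def bandwidth_gbs(iops: int, block_bytes: int) -> float:
    """Convert an IOPS figure at a given block size to GB/s."""
    return iops * block_bytes / 1e9

# 4K random: even 1M IOPS is only ~4.1 GB/s, far below 100 GbE
print(bandwidth_gbs(1_000_000, 4096))
# 128K sequential: ~95k IOPS already fills 12.5 GB/s (100 GbE)
print(bandwidth_gbs(95_000, 131072))
```

Which is why the small-block random workload of VM backing can max out a drive's IOPS long before its headline sequential bandwidth comes into play.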

This topic was automatically closed after 270 days. New replies are no longer allowed.