AMD RAID 0 on Threadripper Pro


I have a Threadripper Pro 5975WX workstation (Windows 11, 256GB RAM, ASUS SAGE SE WIFI motherboard) with an ASUS Hyper M.2 card configured as RAID 0 (4x Samsung 980 Pro 2TB). The array is not the OS drive. The RAID is set up through the chipset rather than via software. This was done by the shop that built the workstation for me.

The issue I am having is that the read and write speeds are significantly lower than I would expect for a 4-way RAID 0 on PCIe 4.0. Maximum read and write speeds are 11.94 GB/s (see figure). I was expecting about double this rate. On certain real-world file copy-paste operations I barely get above 200MB/s on this drive.
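As a rough sanity check on that expectation, here is a back-of-envelope sketch of the theoretical ceilings. It assumes Samsung's spec-sheet sequential read rating of 7,000 MB/s for the 980 Pro and the raw PCIe 4.0 link rate (16 GT/s per lane, 128b/130b encoding); neither figure was measured on this machine, and protocol overhead is ignored, so real results will be lower.

```python
# Back-of-envelope ceilings for a 4-drive RAID 0 on a PCIe 4.0 x16 card.
# Assumptions (not measurements): spec-sheet drive rating, raw link rate,
# no protocol or RAID-driver overhead.

PCIE4_GT_PER_LANE = 16.0    # gigatransfers per second per lane
ENCODING = 128 / 130        # 128b/130b line encoding
LANES_PER_DRIVE = 4         # each M.2 slot on the card gets x4
DRIVES = 4
DRIVE_SEQ_READ_GBS = 7.0    # assumed: Samsung's rated sequential read


def lane_gbs():
    """Usable GB/s of one PCIe 4.0 lane after line encoding."""
    return PCIE4_GT_PER_LANE * ENCODING / 8


def array_ceiling_gbs(drives=DRIVES):
    """Best-case sequential read of the stripe set, in GB/s."""
    link_limit = lane_gbs() * LANES_PER_DRIVE
    per_drive = min(link_limit, DRIVE_SEQ_READ_GBS)
    return drives * per_drive


if __name__ == "__main__":
    measured = 11.94  # GB/s, the benchmark figure quoted in the post
    ceiling = array_ceiling_gbs()
    print(f"per-lane link rate : {lane_gbs():.2f} GB/s")
    print(f"array ceiling      : {ceiling:.1f} GB/s")
    print(f"measured / ceiling : {measured / ceiling:.0%}")
```

Under these assumptions the drives, not the x4 links, set the per-drive limit, and the benchmark result sits at well under half of the combined ceiling.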

All SSDs in the machine are identical (Samsung 980 Pro).

Has anyone any tips for how I can configure this to improve read and write speeds?

You should keep in mind that the AMD RAID0 is being done through software, so you’re going to be limited by the CPU and RAM speeds. Though on TR Pro with 8 channel RAM you’d expect it to not be hitting CPU or RAM bottlenecks.

If this were Linux I’d suggest software RAID, but I have no knowledge of what’s available on Windows for software RAID.

Are you copying lots of tiny files, or a few large files? The former will never run full speed due to all the metadata updates that need to happen.


There are a lot of considerations with RAID0 arrays, and the result depends heavily on the implementation. Some RAID0 implementations (e.g. Btrfs RAID0), in order to keep the code simple and reliable, can’t load balance the transfer of a single file, so you only see gains when you copy multiple files in parallel.
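For contrast, a classic block-level stripe does spread one large file across all members. Below is a minimal sketch of that mapping. The 64 KiB stripe size is hypothetical (the setting RAIDXpert2 uses on this array is not stated in the thread), and `locate` and `drives_touched` are illustrative names, not part of any RAID tool.

```python
# Sketch of textbook RAID 0 address mapping.
# STRIPE_SIZE is a hypothetical value chosen for illustration only.

STRIPE_SIZE = 64 * 1024
DRIVES = 4


def locate(offset, stripe_size=STRIPE_SIZE, drives=DRIVES):
    """Map a logical byte offset to (drive index, byte offset on that drive)."""
    stripe = offset // stripe_size          # which stripe unit overall
    drive = stripe % drives                 # units rotate across the drives
    row = stripe // drives                  # how many full rows precede it
    return drive, row * stripe_size + offset % stripe_size


def drives_touched(offset, length, stripe_size=STRIPE_SIZE, drives=DRIVES):
    """Set of drives that a contiguous request of `length` bytes lands on."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return {s % drives for s in range(first, last + 1)}


if __name__ == "__main__":
    print(drives_touched(0, 16 * 1024))      # a 16 KiB request
    print(drives_touched(0, 1024 * 1024))    # a 1 MiB request
```

With this layout a request smaller than one stripe unit lands on a single drive, so only requests spanning several units can run on all four drives at once.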

Also keep in mind that Gen 4 NVMe storage is really fast, and with RAID0 you’re getting into the same territory as bonding multiple 10G interfaces with a round-robin algorithm: it just doesn’t work. Packet reordering caused by tiny timing differences, which get out of hand really easily at those speeds, becomes so extreme that it cancels the gains from using multiple links in parallel. So again, depending on the implementation, there could be a lot of bottlenecks.

You may also be limited by a single thread somewhere, maybe even in the filesystem handling. Basically, don’t expect it to scale linearly. Not even close.

Also - isn’t the ASUS Hyper M.2 a PCIe 3.0 device? It’s a really old card; I’ve had one lying around for a few years now… I’m not sure whether it’s a passive device that wouldn’t be affected by the PCIe version, but if it’s not passive then I’m nearly sure it’s not PCIe Gen 4 capable.

Unless you have V2 which seems to be Gen 4 capable…

It’s the V2 Gen 4 version (so PCIe 4.0 compatible) - I should have specified. After some work I have improved the sequential read and write speeds for ‘large’ files (1 - 10GB), but it maxes out at 2.8GB/s, which is basically the same as I get for a single non-RAID Samsung 980 Pro (and still seems a bit slow). Implementation is via AMD RAIDXpert2. I was wondering if there is a way to check that data throughput is optimally configured across all chiplets on the CPU, but I don’t know if this could be a factor.

I will keep the RAID as I need a large contiguous space for data but it is frustrating that there is no other performance benefit compared to a single drive.

The 200MB/s was transferring about 4M tiny files (16kB - 1.5MB) so I wasn’t expecting full performance - but was hoping for better than 200MB/s…
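A simple way to see why small files drag the average down is to model each copy as a fixed per-file cost plus the time to stream the data. The sketch below uses the 2.8GB/s large-file figure from above as the streaming rate; the 1 ms per-file overhead is a purely hypothetical placeholder, not something measured on this system.

```python
# Toy model: time per file = fixed overhead + size / streaming rate.
# ASSUMED_OVERHEAD_S is a hypothetical value for illustration only.

STREAM_MB_PER_S = 2800.0     # large-file rate reported in this thread
ASSUMED_OVERHEAD_S = 0.001   # hypothetical per-file metadata cost


def effective_mb_per_s(file_mb, per_file_overhead_s, stream_mb_per_s):
    """Average throughput when every file pays a fixed overhead."""
    seconds = per_file_overhead_s + file_mb / stream_mb_per_s
    return file_mb / seconds


if __name__ == "__main__":
    # File sizes spanning the range mentioned in the post (16kB to 1.5MB).
    for size_mb in (0.016, 0.1, 0.5, 1.5):
        rate = effective_mb_per_s(size_mb, ASSUMED_OVERHEAD_S, STREAM_MB_PER_S)
        print(f"{size_mb:6.3f} MB files -> {rate:8.1f} MB/s")
```

In this model the fixed cost dominates for the smallest files, so the average rate is set by how many files per second the system can create rather than by the drives' bandwidth.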

I also use WSL2, so I’m tempted to investigate software RAID on Linux if this seems to behave better…

Wendell and I have found that AMD RAID is just actually garbage. You’d be better off setting it up as a software RAID in Linux (using Linux’s own RAID, not AMD’s Linux software), then running Windows in a VM.

I believe it’s called mdadm

mdadm will absolutely be better; RAIDXpert2 really is trash.

That sounds about right to me. Writing all the metadata for those files requires synchronous writes, so the OS needs to wait for the disk for every single file. A copier with multiple threads may improve performance when copying lots of tiny files, especially on an NVMe device (65535 IO queues!).
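As an illustration of the multi-threaded copier idea, here is a minimal sketch using only the Python standard library. `copy_tree_parallel` is a hypothetical helper name and the default of 16 workers is an arbitrary starting point; whether it actually helps on this array would need to be measured.

```python
# Minimal multi-threaded tree copy, so that many small files are in flight
# at once instead of being copied one after another.
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src, dst, workers=16):
    """Copy every file under src into dst using a thread pool.

    Returns the number of files copied. Empty directories are not recreated.
    """
    src, dst = Path(src), Path(dst)
    files = [p for p in src.rglob("*") if p.is_file()]

    # Create the destination directories up front, so the worker threads
    # only ever copy files.
    for folder in {dst / p.relative_to(src).parent for p in files}:
        folder.mkdir(parents=True, exist_ok=True)

    def copy_one(path):
        shutil.copy2(path, dst / path.relative_to(src))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so any copy error is raised here.
        list(pool.map(copy_one, files))
    return len(files)


# Example call (paths are placeholders):
# copy_tree_parallel(r"D:\dataset", r"E:\dataset_copy", workers=32)
```

On Windows, `robocopy` with its `/MT` option is a built-in way to try the same approach without writing any code.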