Maximizing Recording Performance

I have a PCIe 5.0 card that’s sending video to the PC’s main memory at approximately 10 Gbytes/s.
I need to record 20 TB of this data without losing a single bit.
All solutions are on the table: using RAID 0 with multiple SSDs / recording raw data without a file system / Linux / Windows / etc…
What would you do to achieve the high throughput requirement?
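Quick back-of-envelope arithmetic on the requirement (figures from the post; decimal units assumed):

```python
# Sizing the capture: 10 GB/s ingest, 20 TB total (decimal units).
ingest_gb_s = 10            # GB/s from the PCIe 5.0 card
total_tb = 20               # TB to record

duration_s = total_tb * 1000 / ingest_gb_s
print(f"recording time: {duration_s:.0f} s (~{duration_s / 60:.0f} min)")
# -> recording time: 2000 s (~33 min)
```

So the storage has to sustain the full 10 GB/s for roughly half an hour with no sustained dips below the ingest rate.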

Well… an SSD that can sustain 10 GB/s of writes is nonexistent. Even the PCIe 5.0 SSDs that can hit that number only do so for a few seconds, while their write cache lasts, and then the performance tanks. The solution should be pretty much as you said: RAID-0. The better off-the-shelf SSDs could probably fall back to 4 GB/s of sustained write performance, so you would need at least three. Not too sure I would trust software RAID for this purpose. Perhaps someone who actually uses a RAID solution can chime in.
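The drive-count estimate above can be sketched out directly (the 4 GB/s per-drive sustained rate is the assumption from the post, not a measured figure):

```python
import math

ingest_gb_s = 10        # required sustained write rate
per_drive_gb_s = 4      # assumed sustained (post-cache) rate per SSD

drives = math.ceil(ingest_gb_s / per_drive_gb_s)
headroom_gb_s = drives * per_drive_gb_s - ingest_gb_s
print(f"{drives} drives, {headroom_gb_s} GB/s headroom")
# -> 3 drives, 2 GB/s headroom
```

A fourth drive would be cheap insurance, since RAID-0 throughput is only as good as its slowest member at any moment.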

Thanks for your input.

  1. Why wouldn’t you trust software RAID, given that the host PC has enough DRAM to temporarily buffer the data from the PCIe card when the SSDs are busy?
  2. What about writing raw data to the block device (for example with Linux dd) instead of going through a file system? Would this give better performance?
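To put a number on question 1, here is a rough sizing of the DRAM buffer needed to ride out an SSD stall (the stall duration and array rate are illustrative assumptions, not measurements):

```python
# How much DRAM does it take to absorb a worst-case SSD stall?
ingest_gb_s = 10             # data keeps arriving at full rate
array_sustained_gb_s = 12    # assumed: 3 drives x 4 GB/s sustained
stall_s = 5                  # assumed worst-case pause (e.g. cache flush)

# During the stall the buffer fills at the full ingest rate;
# afterwards it drains at (array rate - ingest rate).
buffer_gb = ingest_gb_s * stall_s
drain_s = buffer_gb / (array_sustained_gb_s - ingest_gb_s)
print(f"need {buffer_gb} GB of buffer; {drain_s:.0f} s to drain")
# -> need 50 GB of buffer; 25 s to drain
```

With only 2 GB/s of headroom, every second of stall takes five seconds to recover from, which is why generous DRAM and per-drive headroom both matter.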

Regarding software RAID – I’ve used 12-drive and 16-drive SATA SSD arrays, and calculating the parity data for my redundant stripes ate a lot of CPU (on 6-core and 8-core Zen 3 parts). I believe, though I haven’t dug into the bottlenecks, that it’s capped by pushing all the drives’ data through the PCIe lanes to the CPU and getting the parity calculations back out again, and possibly by keeping the parity drives in sync with the bulk data landing on the array’s non-parity devices, which slows the main writes.
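To make the parity cost concrete, here is a toy RAID-5-style XOR parity calculation – the real Linux md driver does this with optimized SIMD kernels, but the data flow is the same: every data chunk in a stripe has to pass through the CPU to produce the parity chunk.

```python
# Toy illustration: RAID-5 parity is the XOR of all data chunks in a stripe.
def xor_parity(chunks: list[bytes]) -> bytes:
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

stripe = [b"\x0f\x0f", b"\xf0\xf0", b"\xff\x00"]
print(xor_parity(stripe))
# -> b'\x00\xff'
```

A pure RAID-0 stripe skips this step entirely, which is one reason the original RAID-0 suggestion sidesteps the CPU bottleneck described above.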

Using dd on the raw array wasn’t better than using ext4 or btrfs, because the filesystem layout wasn’t what was delaying the writes – the parity stripes were.

I suggest that you use a mainboard with lots of PCIe lanes, put multiple NVMe devices on it, and use mirrors of whole devices rather than parity stripes – but write the data to a filesystem that checksums blocks (ZFS or btrfs, but not Linux IMA). If you lose a single device, there’s a mirror of its data, and you can restore capacity when you stop the device and service it. If you lose two devices from the same mirror, that data is gone.
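Rough capacity math for the mirrored layout suggested above (the per-drive size is my assumption for illustration):

```python
import math

# Mirrors store every block twice, so raw capacity doubles.
data_tb = 20
mirror_copies = 2
raw_tb_needed = data_tb * mirror_copies      # 40 TB raw for 20 TB usable

drive_tb = 4                                 # assumed NVMe drive size
drives_needed = math.ceil(raw_tb_needed / drive_tb)
print(f"{raw_tb_needed} TB raw -> {drives_needed} x {drive_tb} TB drives")
# -> 40 TB raw -> 10 x 4 TB drives
```

That is the trade-off against RAID-0: double the drive count, but a single-drive failure costs nothing and every block read is verified against its checksum.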

K3n.

What sustained write throughput were you able to achieve?

I’m not going to say – it’s embarrassingly low, because I hung SSDs off anything that would connect them and there was too little bandwidth to properly serve the SATA SSDs. I intended to revisit it with a robust configuration, but I haven’t yet.

K3n.


My comment was purely gut instinct – mostly a concern about whether consistent performance can be maintained. It could work, since it’s RAID-0, but software RAID is not something I would trust with a fire hose of data.