I retract my statement: this makes no sense whatsoever; the card looks like just a licensing tool.
ZFS raidz/raidz2 code may not be as speedy as btrfs's or snapraid's raid code… but that should hardly be a bottleneck.
In a world where you get 100+ threads per system, spending a couple of them on raid calculation if/when needed is a much better tradeoff than spending PCIe lanes that could instead be used to connect storage or network to the system.
When it comes to reading raid6, you only need to read the data, not the parity: checksum the data and compare it to the expected checksum you have stored in the filesystem.
Drives do have checksums internally, but for safety you really want end-to-end checksums… ideally, ZFS records would fit into your CPU cache and you wouldn't have to trust RAM/controllers/PCIe/drive checksums or parity.
When writing, you obviously need to compute parity from the data, which is more CPU intensive than not dealing with parity at all… and there's various accounting that needs to happen to ensure you can detect/fix half-written data in the future in case of power loss (metadata journaling).
When reading, if data checksums for a data block don’t match, you pretend the block is not there and reconstruct the data block from parity.
When doing a scrub, you read all the data/parity blocks, and you compare checksums of blocks to what’s expected.
This is where btrfs had a big problem a few years back: btrfs wasn't checksumming the parity, and when scrubbing it was recomputing the raid and relying on the data checksums for verification. Since most of the time everything matches, this is an OK optimization. But when things don't match, you need to try different combinations of what might be missing and cross your fingers to recover the data… this has since been fixed.
@wendell since you have the shiny 7773X and other similar fun gear, any chance you could run raid/test/speedtest.c from snapraid/raid.c at master · amadvance/snapraid · GitHub and send a PR to add new numbers?
It might also be an interesting example of a workload where huge amounts of L3 cache make a difference, if you were to run it multi-threaded.
edit: actually, anyone with a Zen 3 or a recent (10th/11th/12th) gen Intel CPU could probably help refresh these numbers.