Extremely poor write performance in software RAID 5

I would expect much lower numbers if a lot of read-modify-write cycles were hitting the disks.
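For anyone unfamiliar with why read-modify-write hurts, here is a rough Python sketch of RAID 5 parity on a toy stripe. It is not how the kernel md driver does it internally; the function names (`xor_blocks`, `full_stripe_parity`, `rmw_parity`) and the 8-byte chunks are made up for illustration. It only shows that a partial-stripe write has to read the old data and old parity back before it can write, while a full-stripe write needs no reads.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def full_stripe_parity(data_blocks):
    """Full-stripe write: parity comes from the new data alone (0 reads)."""
    return xor_blocks(*data_blocks)

def rmw_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Read-modify-write: needs the old data and old parity read back first
    (2 reads), then new data and new parity are written (2 writes)."""
    return xor_blocks(old_parity, old_data, new_data)

# Toy stripe: 3 data chunks of 8 bytes each (hypothetical sizes).
data = [bytes([value] * 8) for value in (1, 2, 3)]
parity = full_stripe_parity(data)

# Overwrite only the middle chunk via the read-modify-write path.
new_chunk = bytes([9] * 8)
updated_parity = rmw_parity(data[1], new_chunk, parity)
data[1] = new_chunk

# Both paths must agree on the resulting parity.
assert updated_parity == full_stripe_parity(data)
```

So a single-chunk update costs four disk I/Os (two reads, two writes) no matter how wide the array is, which is why heavy read-modify-write traffic would be expected to drag throughput down sharply.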
Looking back at one of the old threads, "Fixing Slow NVMe Raid Performance on Epyc", about 2 GB/s was basically where mdadm write speed maxed out with anywhere from 8 to 24 NVMe drives. That definitely points to a CPU bottleneck, although it wasn’t clear whether the bottleneck was in the CPU parity calculation or was something architecturally specific to AMD. My understanding is that the way mdadm implements stripe thread locking is at least partly to blame if the parity computation is the bottleneck.

I am now very curious what kind of peak RAID 5 write numbers mdadm can muster on the most modern CPUs in an optimal setup.