By the way, when making a stripe of mirrors, you can start with two-drive mirrors, then add drives to existing mirrors if your need for read speed increases, or add more mirrored vdevs as you need more space. The reason for a 3-drive mirror per vdev is so that the pool remains online and performs well even as drives fail, get swapped out, and are resilvered. Also, during writes ZFS knows which vdevs are busy and directs writes to different vdevs; if it is a stripe of mirrors, the pool is forced to perform writes to a degraded vdev, which can get messy. I think a JBOD (just a bunch of disks) where each disk is a vdev is a better strategy. Also, one of the vdevs in a pool may hold frequently accessed data while the others are accessed less often. It is possible to increase the performance of a single vdev either by adding an SSD to that vdev or by adding more rotational drives to it.
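The two ways of growing the pool mentioned above map onto two different zpool subcommands: `zpool attach` widens an existing mirror, and `zpool add` appends a new mirrored vdev to the stripe. Here is a minimal sketch that only builds the command lines and does not run them. The pool name `tank` and the device names are placeholders I made up, so substitute your own.

```python
# Sketch only: builds zpool command lines, never executes them.
# "tank" and the sd* device names are hypothetical placeholders.

def attach_to_mirror(pool, existing_dev, new_dev):
    """Command that adds new_dev to the mirror already containing existing_dev."""
    return ["zpool", "attach", pool, existing_dev, new_dev]


def add_mirror_vdev(pool, devices):
    """Command that adds a new mirrored vdev (more space) to the pool."""
    if len(devices) < 2:
        raise ValueError("a mirror needs at least two devices")
    return ["zpool", "add", pool, "mirror", *devices]


if __name__ == "__main__":
    # Widen an existing two-drive mirror to three drives (more read speed).
    print(" ".join(attach_to_mirror("tank", "sda", "sdc")))
    # Add another mirrored vdev to the stripe (more space).
    print(" ".join(add_mirror_vdev("tank", ["sdd", "sde"])))
```

Review the printed commands before running them by hand; adding a vdev to a pool is not something you can easily undo.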
on a 3 drive vdev during resilver:
drive being used for reads
drive being used for source to resilver
blank drive being filled with data.
on a 3 drive vdev after resilver:
drive being used for reads and writes
drive being used for reads and writes
drive being used for reads and writes
on a 5 drive vdev during resilver:
drive being used for reads
drive being used for reads
drive being used for reads
drive being used for source to resilver
blank drive being filled with data.
Unfortunately, if a drive is going to fail, it usually fails while it is the source drive during a resilver, hence the usefulness of more than 3 drives per mirrored set.
on a 3 drive vdev during resilver and second drive fails:
drive being used for source to resilver
DEAD: drive being used for source to resilver - during resilver drive fails
blank drive being filled with data.
Notice that there are now no drives available for reads. The entire pool may go offline until the resilver is complete, which may take more than 5 hours.
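For a rough feel of how long a resilver can take, you can divide the amount of allocated data on the vdev by the sustained rate the drives manage. This is only a back-of-the-envelope sketch. The 4 TB and 150 MB/s figures are numbers I picked for illustration, and a real resilver on a busy or fragmented pool can be much slower.

```python
def resilver_hours(allocated_bytes, bytes_per_second):
    """Best-case resilver time in hours: data to copy divided by sustained rate."""
    if bytes_per_second <= 0:
        raise ValueError("rate must be positive")
    return allocated_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    # Hypothetical example: 4 TB allocated, 150 MB/s sustained.
    print(f"{resilver_hours(4e12, 150e6):.1f} hours")  # prints "7.4 hours"
```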
on a 5 drive vdev during resilver and the source drive fails:
drive being used for reads
drive being used for reads
drive being used for source to resilver
DEAD: drive being used for source to resilver - during resilver drive fails
blank drive being filled with data.
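The listings above all follow the same bookkeeping, which can be sketched in a few lines. This just reproduces the simplified role accounting from the listings (one source, one blank target, everything else left over for reads). It is not a model of how ZFS actually schedules I/O, and the function name and role labels are my own.

```python
def resilver_roles(n_drives, source_fails=False):
    """Roles of each drive in one mirror vdev during a resilver, per the listings above."""
    if n_drives < 2:
        raise ValueError("a mirror needs at least two drives")
    target = 1                          # blank drive being filled with data
    dead = 1 if source_fails else 0     # source drive that died mid-resilver
    healthy = n_drives - target - dead
    if healthy < 1:
        raise ValueError("no healthy drive left to resilver from")
    source = 1                          # one healthy drive feeds the resilver
    readers = healthy - source          # whatever remains can serve reads
    return ["read"] * readers + ["source"] * source + ["DEAD"] * dead + ["target"] * target


if __name__ == "__main__":
    for n in (3, 5):
        for fails in (False, True):
            roles = resilver_roles(n, fails)
            print(n, "drives, source fails:", fails, "->", roles,
                  "| drives free for reads:", roles.count("read"))
```

With 3 drives and a source failure the count of drives free for reads drops to zero, while with 5 drives two are still free.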
You can see why it is worth spending more money on more independent drives if the data needs to be highly available.
It is a good idea to have at least one of the drives in each mirrored array on an independent disk shelf.
Also remember that redundancy is not backup; backup needs to happen independently of redundancy. It is usually a good idea for the backup server to request data from the storage server instead of the storage server pushing data to the backup server. In case of ransomware, if all of the data on the storage server gets compromised, you don’t want the backup server’s data to be compromised as well. Also, in case of dedupe, if the dedupe table becomes larger than system memory, you don’t want the data on the backup server to become unavailable too.