How much depends on your RAIDZ (or mirroring) configuration, but of course you will need to sacrifice some disk space for redundancy.
First you need to understand the difference between old-school hardware RAID and "RAID" in ZFS. In ZFS all the disks are managed in software, so if you have a RAID card, you need to run it in JBOD mode, which means that all the drives are presented to the OS as-is, just as if you had plugged them into your motherboard's SATA ports, for example.
Next you need to know that in ZFS your whole storage is managed in pools, which consist of so-called "vdevs". Each vdev has its own redundancy, and you are free to choose a different method for each vdev if you want. But if one vdev fails, the whole pool fails, so choose wisely. (Basically all the vdevs are striped like in a RAID0; however, you can also have just one vdev if you want.)
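To make the pool/vdev relationship concrete, here is a hedged sketch of creating a pool striped across two mirrored vdevs. The pool name tank and the device names /dev/ada0 through /dev/ada3 are placeholders, not from this post; adjust them for your system:

```shell
# Sketch only: a pool made of two mirror vdevs (data is striped across them).
# /dev/ada0 .. /dev/ada3 are hypothetical device names.
zpool create tank \
    mirror /dev/ada0 /dev/ada1 \
    mirror /dev/ada2 /dev/ada3
```

Running zpool status tank afterwards would show both vdevs; losing both disks of either mirror would take the whole pool with it, which is exactly the "one vdev fails, the pool fails" rule above.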
Now while you could pick simple mirrored vdevs (comparable to RAID1), you will lose a lot of storage, since you basically cut the available disk space in half. This is great for performance and the fastest method in terms of recovering from a drive failure, but not very economical. This is where RAIDZ comes in. The number of disks that can fail at once (that is, before you have replaced them and the data has been rebuilt) is denoted by the number after the "Z". So RAIDZ2 will still be fine even if 2 disks fail at once. (Technically you could also have just one drive per vdev and get the whole capacity of all drives and the maximum performance, but of course no redundancy.)
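A hedged sketch of what a single RAIDZ2 vdev looks like at creation time (again, tank and the /dev/ada* device names are just placeholders):

```shell
# Sketch only: one RAIDZ2 vdev of six hypothetical disks.
# Any two of the six drives can fail at once without data loss.
zpool create tank raidz2 \
    /dev/ada0 /dev/ada1 /dev/ada2 \
    /dev/ada3 /dev/ada4 /dev/ada5
```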
Since drive failures happen rather rarely, you might be tempted to think that RAIDZ1 is all you will ever need, but beware. The issue is that when you are rebuilding onto a replacement drive, you put a lot more stress on the remaining drives, which might be of a similar age or even come from the same batch. This makes it quite likely that another drive will fail during the rebuild. This is why the FreeNAS team recommends that you use at least RAIDZ2 and get your drives from different vendors, or maybe even different manufacturers, per vdev that is.
The recommended configuration for ZFS vdevs in RAIDZ is a power of 2 (so 2, 4, 8, 16, etc.) number of data disks plus however many disks you want for redundancy. For RAIDZ2 that would be 4+2 = 6, 8+2 = 10, and so on. I am personally running only a small pool of 2+1 drives in RAIDZ1, which I use only for backups, so I am not too worried right now, but when I upgrade I will go for RAIDZ2 for sure.
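The capacity math above can be sketched as simple arithmetic: usable space is roughly (disks minus parity) times the size per disk. The numbers below (a 4+2 RAIDZ2 of 4 TB drives) are example values, and real-world usable space will be somewhat lower due to metadata and padding overhead:

```shell
# Rough usable-capacity estimate for a RAIDZ vdev:
#   (disks - parity) * size_per_disk
# Actual usable space is a bit lower (metadata, padding, etc.).
disks=6; parity=2; tb_per_disk=4   # example: 4+2 RAIDZ2 of 4 TB drives
usable=$(( (disks - parity) * tb_per_disk ))
echo "${usable} TB usable"
```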
I am not sure about FTP, but it works over SSH. You basically pipe zfs send into ssh and run the zfs receive command on the other end. Note that zfs send operates on snapshots, so you have to take one first. So something like:

zfs snapshot mypool/mydataset@backup1
zfs send mypool/mydataset@backup1 | ssh user@backup zfs receive -vudF mypool
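Since zfs send operates on snapshots, subsequent backups normally send only the changes since the previous snapshot using the -i flag, which is much faster than a full send. A hedged sketch (the snapshot names backup1/backup2 and the host user@backup are placeholders, not from the original post):

```shell
# Assumes a snapshot @backup1 was already sent to the remote pool.
# Take a new snapshot, then send only the delta between the two.
zfs snapshot mypool/mydataset@backup2
zfs send -i mypool/mydataset@backup1 mypool/mydataset@backup2 \
    | ssh user@backup zfs receive -vudF mypool
```

The -F on the receiving side rolls the remote dataset back to the most recent snapshot before applying the incremental stream, which keeps the two sides in sync even if the remote copy was modified.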
What's -vudF?
The -v option will print information about the size of the stream and the time required to perform the receive operation. The -u option prevents the file system associated with the received data stream (mypool/mydataset in this case) from being mounted. This was desirable as I'm using backup simply to store the mydataset snapshots offsite; I don't need to mount them on that machine. The -d option is used so that all but the pool name (mypool) of the sent snapshot is appended to mypool on backup. Finally, the -F option is useful for destroying snapshots on backup that do not exist on server.
Source: https://www.iceflatline.com/2015/07/using-zfs-replication-features-in-freebsd-to-improve-my-offsite-backups/
Obviously SSH handles the encryption and authentication, so you don't need to worry about that.
Sorry for the long post
I wanted to make sure you know all the important details about ZFS, so you can make the right choice in how you configure your drives. It might also just be my passion for this amazing file system showing.