I currently cannot create a NAS. My goal is to have two large external hard drives with data mirrored between them, that I can access through a normal computer running Ubuntu. I would like the data to be encrypted and compressed. I’ve heard that ZFS is the way to go, but am fine with BTRFS if that is better for the situation. I would like to be able to then scan for and fix errors with the drives. I’m a complete Linux noob, so if someone could point me to a tutorial (or tutorials) that would be great (including how to add/delete a file from one and have it automatically apply to the other if possible).
What is your plan? Leave this Ubuntu box running 24/7? Do you plan to unplug the drives often? As for ZFS on USB, I would advise against it. Just not on USB. You do that at your own risk: ZFS really complains if it cannot find drives or pools during bootup (if you don’t export them properly). The order in which the drives are detected also matters, which is why referring to drives by WWN is preferable in any setup, rather than by the OS-assigned names.
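To make the WWN point concrete, here is a small sketch that lists the persistent names Linux keeps under /dev/disk/by-id and shows which kernel name (sdb, sdc, ...) each one currently points to. The function name and the optional directory argument are my own additions for illustration. Some USB enclosures may not expose a wwn-* entry at all, in which case the usb-* names in the same directory serve the same purpose.

```shell
#!/bin/sh
# List persistent device names and the kernel device each one resolves to.
# The directory argument is optional and only exists so the function can be
# tried against a scratch directory; by default it reads /dev/disk/by-id.
list_persistent_names() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue    # skip non-symlinks and an empty directory
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

list_persistent_names
```

The names on the left stay the same no matter which port or order the drives are plugged in, so those are the ones to use in any command or config file.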
You can enable zstd compression on btrfs. Given that btrfs is better integrated with the kernel, I suspect it will have an easier time interacting with the USB drives, just like a normal ext4-formatted USB drive. But encryption is not a feature of btrfs; you’d have to use LUKS.
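A rough sketch of how the LUKS-plus-btrfs layering could be set up from scratch on two empty drives, with btrfs mirroring across the two unlocked devices. The device paths and the mapper names (archive_a, archive_b) are placeholders I made up, and the script only prints the commands unless DRY_RUN=0 is set, because the real thing erases both drives.

```shell
#!/bin/sh
# DESTRUCTIVE when DRY_RUN=0: luksFormat and mkfs.btrfs wipe the target drives.
# By default this only prints the commands it would run.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical device paths; substitute your own names from /dev/disk/by-id.
DISK_A="/dev/disk/by-id/usb-EXAMPLE_DISK_A"
DISK_B="/dev/disk/by-id/usb-EXAMPLE_DISK_B"

create_encrypted_mirror() {
    # 1. Turn each raw drive into a LUKS container (prompts for a passphrase).
    run cryptsetup luksFormat "$DISK_A"
    run cryptsetup luksFormat "$DISK_B"
    # 2. Unlock them; the decrypted devices appear under /dev/mapper/.
    run cryptsetup luksOpen "$DISK_A" archive_a
    run cryptsetup luksOpen "$DISK_B" archive_b
    # 3. One btrfs filesystem over both, with data and metadata mirrored.
    run mkfs.btrfs -L archive -d raid1 -m raid1 /dev/mapper/archive_a /dev/mapper/archive_b
}

create_encrypted_mirror
```

Compression is not chosen at this stage; it is a mount option, so it gets applied each time the filesystem is mounted.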
The setup smells fishy. Please give us more details on how you plan on running it.
A mirror does that automatically. If you plan on having data on one drive and a backup on the other, that is something else. Also, RAID is not a backup.
I believe I’ve done more or less what you are proposing. I would not really recommend it, as it is generally warned against, and I generally agree with @ThatGuyB’s points.
But I’ll tell you what I did:
I had two USB-connected external HDDs that I added to a ZFS mirror. Then I backed up filesystems from another zpool via zfs send ... | zfs receive ... , exported the externally connected zpool, and stored the two HDDs away from the computer.
When I felt it was time to update the backups, I attached the drives, imported the external zpool, and sent incremental backups to it. Then I exported it again. It has survived several such cycles; however, I did not connect it very frequently. And it was never my only backup for irreplaceable data.
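For anyone who wants to see the shape of one such cycle, here is a sketch. The dataset name tank/data, the pool name usbbackup, and the date-style snapshot names are all made up for the example, and it assumes the backup pool already received the previous snapshot in an earlier run. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 (and run as root) to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical names: the dataset being backed up, and the mirrored pool
# that lives on the two external drives.
SOURCE_FS="tank/data"
BACKUP_POOL="usbbackup"

# usage: incremental_backup PREVIOUS_SNAPSHOT NEW_SNAPSHOT
incremental_backup() {
    prev="$1"
    new="$2"
    run zpool import "$BACKUP_POOL"
    run zfs snapshot "$SOURCE_FS@$new"
    # Send only the changes made between the two snapshots.
    run sh -c "zfs send -i $SOURCE_FS@$prev $SOURCE_FS@$new | zfs receive $BACKUP_POOL/data"
    # Export before unplugging so the pool is closed cleanly.
    run zpool export "$BACKUP_POOL"
}

incremental_backup 2024-01-01 2024-01-08
```

The export at the end is the step that matters most for removable drives, since it is what lets the pool be imported without complaints the next time.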
One thing to consider is that ZFS is bad at communicating over USB. I don’t know the exact mechanisms behind the caveats, but it is generally recommended against. The other is that most external drives are SMR drives, which also don’t work as well with ZFS (though I’m not sure if this is really a problem if it only receives full datasets).
Again, my story is not really valid as a success story, as I have no real estimate of the risks I took each time I connected the drives… but at least you have my experience there.
It’s a Windows laptop that I’ll run Linux on some of the time, booted from a USB drive. I’ll plug the data drives in maybe a couple of times a week (by USB). If that is a major problem, I’m working on clearing out an old SSD in a USB enclosure so I could use that, though I personally don’t care if the Linux drive fails, so long as it isn’t a major problem for the data drives. I’m also willing to switch distros if need be, but would rather not since I’ve been getting used to this one, and I don’t think it impacts things. The system is intended to run yt-dlp and other archive programs, and store that data on the drives. It would also be nice to be able to run Linux with none of the external drives attached some of the time. To a lesser extent, it would be nice to be able to plug in only one drive to save some stuff, and then plug in the 2nd drive later in the week to mirror it over. The 2nd wish is a much lower priority, and I’m willing to live without the 1st.
I managed to enable LUKS encryption with BTRFS on one of them (though I forgot how, which is what I get for not documenting enough; I’m sure I will be able to figure it out again for the 2nd drive). I’m not sure how to mount the right drive and enable compression, how to set up the parity drive to make it “sync” with the main one, and then how to check parity and fix bit rot, etc. If you could recommend one or a couple of basic step-by-step guides with the commands, that would be great, since I’m unfamiliar with navigating and using the terminal on Linux, and with the commands I’ll need. Thanks.
I’m too unfamiliar with btrfs, but both it and zfs should be running often to take advantage of their automatic data protection features. To protect against bitrot, the array needs a scrub, usually done monthly or bi-weekly, depending on how much data you write to it.
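For reference, a scrub is a single command on either filesystem. The sketch below shows both; the mount point /mnt/archive and the pool name usbbackup are placeholders, and by default it only prints what it would run.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 (and run as root) to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# btrfs: scrub the filesystem mounted at the given path.
scrub_btrfs() {
    run btrfs scrub start -B "$1"   # -B stays in the foreground until finished
    run btrfs scrub status "$1"     # summary, including any errors found
}

# ZFS: scrub the named pool.
scrub_zfs() {
    run zpool scrub "$1"            # starts the scrub in the background
    run zpool status "$1"           # shows progress and any repaired errors
}

scrub_btrfs /mnt/archive            # hypothetical mount point
```

On a mirror, a block that fails its checksum can be rewritten from the good copy on the other drive, which is why both drives need to be attached while the scrub runs.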
For yt-dlp and other files, I run a single USB drive on a RPi 2 as a makeshift NAS. Works great actually, but I’m just running bare basic ext4 with nothing else.
If the data is really so important to secure that it needs encryption, I would do btrfs on LUKS on a single drive and use the other drive for backups once a week via rsync. In this scenario, you don’t get the benefits of scrubbing and other built-in data protection, but at least the chance of both drives experiencing the same kind of bitrot is low, although noticing when a file changes would be a task in itself with just rsync. Still a better option for this scenario.
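A sketch of what that weekly rsync could look like, with both drives already unlocked and mounted. The two mount points are placeholders. The second function is one possible way to spot files whose contents differ between the drives; it uses rsync’s own --dry-run, so it changes nothing, and rsync cannot tell you which of the two copies is the good one. The script itself also just prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical mount points for the main drive and the backup drive.
SRC="/mnt/archive"
DST="/mnt/backup"

# Report files that differ by content, without copying anything.
# --checksum reads every file on both sides, so this is slow on big drives.
list_content_changes() {
    run rsync -a --checksum --dry-run --itemize-changes "$SRC/" "$DST/"
}

# Copy new and changed files; --delete also removes files from the backup
# that no longer exist on the main drive.
weekly_backup() {
    run rsync -a --delete "$SRC/" "$DST/"
}

list_content_changes
weekly_backup
```

The trailing slashes matter: they make rsync copy the contents of one directory into the other rather than nesting a new directory inside the backup.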
If you only download videos and other things off the internet that you do not need to be backed up:
You do not need redundancy, nor backups for it
You would be better served by one single drive
If you want to back up important data and keep it safe, a RAID1 with ZFS or BTRFS is a good idea, but do not use USB drives for that. In this scenario I would build a small NAS (a RockPro64 or similar) and connect the drives over SATA. Definitely ZFS in this case, if the OS supports it.
If btrfs is in a mirror, you only need to mount one of the drives. You would use the cryptsetup luksOpen command to bind the decrypted disks to pseudo-device names under /dev/mapper (you need both of them unlocked), then run the mount command on just one of them with -o compress=zstd (make sure you have zstd installed).
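Spelled out as commands, the plug-in and unplug routine could look roughly like this. The device paths, the mapper names (archive_a, archive_b), and the mount point are placeholders, and the extra btrfs device scan is my own precaution so that btrfs sees both members before mounting. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 (and run as root) to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical names; substitute your own from /dev/disk/by-id.
DISK_A="/dev/disk/by-id/usb-EXAMPLE_DISK_A"
DISK_B="/dev/disk/by-id/usb-EXAMPLE_DISK_B"
MOUNT_POINT="/mnt/archive"

unlock_and_mount() {
    run cryptsetup luksOpen "$DISK_A" archive_a   # prompts for the passphrase
    run cryptsetup luksOpen "$DISK_B" archive_b
    run btrfs device scan                         # register both mirror members
    run mkdir -p "$MOUNT_POINT"
    # Mounting either member brings up the whole mirror.
    run mount -o compress=zstd /dev/mapper/archive_a "$MOUNT_POINT"
}

unmount_and_lock() {
    run umount "$MOUNT_POINT"
    run cryptsetup luksClose archive_a
    run cryptsetup luksClose archive_b
}

unlock_and_mount
```

Run unmount_and_lock before pulling the cables. The compress option only affects data written while it is active, so files already on the drive stay as they were.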
If you still want to go ahead with the raid setup, in this particular scenario, I’d use btrfs on the bare drives and use encrypted vaults in certain folders. Just my $0.02.
I would have to agree with @ThatGuyB and advise against what @level_double_o_null is planning. I tried something like what @level_double_o_null has in mind and had a lot of bitrot. I might be able to retrieve some of my files, but I doubt I will be able to retrieve all of my files. So I am saving my money for a Synology.