Hey all. I have a home server that runs Ubuntu with several docker containers for things like Plex, databases, websites, and just general NAS stuff. It’s a custom built Ryzen machine in a Node 804 case.
Currently it has two 500 GB NVMe drives for the OS and anything that requires quick IO (which is not much, honestly). I had two 4 TB HDDs, and just added two 16 TB HDDs to it.
I’m trying to decide on the best route for handling backups/redundancy. I have two categories of data: personal files, which I do want to back up to a cloud service, and media, which I want to keep in case of a drive failure, but which doesn’t need an offsite backup. I currently have Timeshift installed and running. Everything was originally set up on EXT4. I’ll keep the NVMe drives on EXT4.
Now that I’ve added two more big hard drives, I’m trying to decide how I should structure and back up everything. I could do RAID 1 with the two 4TB drives and RAID 1 with the two 16TB drives, then use the 4TB drives for the personal files with cloud backup, and leave the 16TB drives for Plex.
I could also use LVM to just combine the drives, and any future drives.
I was also considering reformatting the HDDs to ZFS and using ZFS backups instead of RAID. I’m sure there are a lot of options I haven’t even considered.
I haven’t used RAID or ZFS in the past, only Timeshift. I’m looking for some advice on which route to go down. What would you do for both personal files and media files, assuming personal files will also be backed up to a cloud?
If I understand correctly, you have:
2x 0.5TB SSD
2x 4TB HDD
2x 16TB HDD
I will assume that the 4TB is “full”, and that each of the pairs above is already a RAID1 (via LVM of some type). If not, then anything after this line is redundant, unlike the data, so arranging that should be the first priority if you care about it.
It’s not stated where Timeshift is putting data now. If it’s on your current host’s drives, that’s more like a second-stage recycle bin than a backup, though that still has its uses.
I would consider that you have the option of migrating everything ‘slow’ onto the 16T drives, freeing the 4’s for other uses like off-system backup of e.g. personal data, if you want to do that.
- create a zfs mirror of the 16T drives, this is your pool
- create separate datasets in the pool:
  - either for personal, plex & any other datasets that suit you, freeing up the 4T’s as you copy data across to the appropriate places
  - or just a dataset for plex, leaving e.g. personal on the 4T’s
  - also make a backup dataset for anything else (your ssd’s, personal data from the 4T’s if you left it there, other systems that target backups at a network share of it, by rsync, borg, veeam, anything else)
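The steps above can be sketched roughly as follows. This is only an illustration that prints the commands rather than running them; the pool name `tank`, the disk paths and the dataset names are all placeholders I made up, so substitute your own (check `ls -l /dev/disk/by-id/` for the real disk IDs).

```python
import shlex

# All names below are hypothetical placeholders, not taken from the thread.
POOL = "tank"
DISKS = [
    "/dev/disk/by-id/ata-EXAMPLE_16TB_A",
    "/dev/disk/by-id/ata-EXAMPLE_16TB_B",
]
DATASETS = ["personal", "plex", "backup"]


def build_commands(pool, disks, datasets):
    """Return the zpool/zfs command lines for one mirrored pool plus datasets."""
    # One mirror vdev made of the given disks.
    cmds = [["zpool", "create", pool, "mirror", *disks]]
    # One dataset per category of data.
    cmds += [["zfs", "create", f"{pool}/{name}"] for name in datasets]
    return cmds


if __name__ == "__main__":
    # Print only; review the output before running anything as root.
    for cmd in build_commands(POOL, DISKS, DATASETS):
        print(shlex.join(cmd))
```

Printing first is deliberate: `zpool create` will take over whatever disks it is given, so it is worth reading the lines back before pasting them into a root shell.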
why zfs:
zfs send - this lets you send the datasets, in future, to new disk pools or to another system, i.e. a backup target - and you have 2 4tb disks for this (for your personal data, for example).
https://blog.fosketts.net/2016/08/18/migrating-data-zfs-send-receive/
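To make the send/receive idea a bit more concrete, here is a minimal sketch that only builds the command strings. The dataset names (`tank/personal`, `backup/personal`) and snapshot names are assumptions for illustration; the second pool is imagined as living on the 4TB disks.

```python
import shlex


def snapshot_cmd(dataset, snap):
    """Command line that takes a snapshot, e.g. tank/personal@snap1."""
    return shlex.join(["zfs", "snapshot", f"{dataset}@{snap}"])


def send_receive_pipeline(src_dataset, snap, dst_dataset, prev_snap=None):
    """Shell pipeline that replicates a snapshot to another dataset.

    With prev_snap set, only the changes since that snapshot are sent
    (an incremental send); otherwise the full snapshot is sent.
    """
    send = ["zfs", "send"]
    if prev_snap is not None:
        send += ["-i", f"{src_dataset}@{prev_snap}"]
    send.append(f"{src_dataset}@{snap}")
    recv = ["zfs", "receive", dst_dataset]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


if __name__ == "__main__":
    # Hypothetical names: source pool "tank", backup pool "backup".
    print(snapshot_cmd("tank/personal", "snap1"))
    print(send_receive_pipeline("tank/personal", "snap1", "backup/personal"))
    print(send_receive_pipeline("tank/personal", "snap2", "backup/personal",
                                prev_snap="snap1"))
```

The first send copies everything; later runs with `-i` only move the difference between two snapshots, which is what makes this workable as a regular backup.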
if you don’t want to use the 2x 4’s as a local backup, you can still remake them (or not) as a zfs mirror for e.g. personal data only, and cloud backup that pool/dataset/partition.
caveats:
- it’s unknown what your docker containers etc. are stored on now, or how they’re being backed up (I’m guessing ext4). rsyncing them across while they’re running may cause issues with any restore; I’m sure the internet will have further info on that. that’s not a problem with what is presented here, just something that’s already happening now.
- I don’t think LVM will ‘combine the drives’ in any good way if resilience matters to you
- I’d also not delete any data from the original disks for a week or so, until you know the new disks aren’t having any issues at the leading edge of the bathtub curve.
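On the first caveat, one cautious way to copy container data is to stop the containers, copy, then start them again. A minimal sketch of that ordering, again only printing commands; the container names and paths are made up, and whether a plain stop is enough for your databases is something to check per application.

```python
import shlex


def copy_plan(containers, src, dst, dry_run=True):
    """Ordered command lines: stop containers, rsync the data, start them again."""
    # -a archive mode, -H hard links, -A ACLs, -X extended attributes.
    rsync = ["rsync", "-aHAX"]
    if dry_run:
        # Show what would be copied without writing anything.
        rsync.append("--dry-run")
    rsync += [src, dst]

    plan = []
    if containers:
        plan.append(["docker", "stop", *containers])
    plan.append(rsync)
    if containers:
        plan.append(["docker", "start", *containers])
    return plan


if __name__ == "__main__":
    # Hypothetical container names and paths.
    plan = copy_plan(["plex", "postgres"], "/home/user/docker/",
                     "/tank/personal/docker/")
    for cmd in plan:
        print(shlex.join(cmd))
```

The trailing slash on the source path tells rsync to copy the directory’s contents rather than the directory itself. Leaving `dry_run=True` for a first pass fits with not deleting anything from the old disks until the copy has been checked.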
Thanks for the advice. I pretty much decided to do what you advised. I created the ZFS pool on the 16TB drives and ran rsync over a couple days to copy things over. I was surprised how simple it is to create a ZFS pool. It is mirrored.
I’ll leave it as is for a bit, but my plan is to eventually wipe the 4TB drives and turn them into a ZFS pool as well.
I have my home drive (including all docker configs) and all my files and media on one 4TB drive, being backed up to the other 4TB drive. It seems like this might be less efficient than a ZFS mirror, since I imagine the periodic backups mean more write activity than a ZFS mirror would.
Once that is a ZFS pool as well, I’ll store my personal files on it and back that up to a cloud service.
Once I read that LVM doesn’t play well with RAID and may cause more issues if a drive does fail, I decided to just not worry about that.