What backup scheme to use for centralized cloud backup

I have a single server hosting my data on ZFS and a handful of client machines (mostly Linux, but also Android and possibly a Windows box soon). I want to back up some of the data to the cloud with rclone. This would include selected datasets from ZFS and some folders (like /etc, /root and home directories) from the clients. It's just redundancy for critical data; the entire ZFS pool is already backed up to my own offsite backup box.

What's the best way to stage data from the client machines while preserving permissions, keeping the backup incremental, and maintaining as much isolation as possible?

My first idea is to push the data from each client to a dedicated backup dataset on the server, then mount all the necessary datasets read-only over NFS to a backup VM whose only job is to rclone them periodically to the cloud. But I'm not sure exactly how to implement it. The backup VM would probably need to run rclone as root, with no_root_squash on the NFS shares, so it can access everything on the ZFS pool. Should I then create a dedicated user account on the server for each client machine and give it an exclusive directory on the backup dataset? I would like each machine to only be able to access and update its own directory. Each machine would then periodically rsync its data as root to the server, using its respective dedicated account.
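
If it helps, here is a minimal sketch of that push variant. All names (the backup-clientA account, the /backup/clientA path, the key file) are made up:

```
# on each client, run as root from cron or a systemd timer;
# -M--fake-super makes the remote rsync store ownership/permissions
# in xattrs, so the unprivileged per-client account can preserve them
rsync -aAX --delete -M--fake-super \
    -e "ssh -i /root/.ssh/backup_key" \
    /etc /root /home \
    backup-clientA@server:/backup/clientA/
```

Locking each key down with a forced rrsync command in authorized_keys on the server could then keep every client confined to its own directory.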

Or maybe I should make a backup user and/or group on the clients, server and backup VM, do some ACL magic on the specific data I want to back up, and have the server pull the data? Managing ACLs, matching permissions etc. in that case sounds like a nightmare.
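
For reference, the pull variant would look something like this; the backup user and the paths are hypothetical:

```
# on each client: give a dedicated backup user recursive read access
setfacl -R -m u:backup:rX /etc /home
# default ACLs on directories so newly created files stay readable
find /etc /home -type d -exec setfacl -m d:u:backup:rX {} +

# on the server: pull over SSH as that user (ownership is lost unless
# the receiving rsync runs as root or uses --fake-super)
rsync -aAX backup@client:/etc/ /backup/clientA/etc/
```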

Lastly, I could pick some specialized client-server backup software and use that. But that seems like overkill for such a simple task, and I don't want to deal with custom formats, databases or folder structures on the server. I would much prefer to have just a directory per client, with subdirectories for etc, home and anything else I want to preserve from each client.

Take a look at Borg backup. Yes, it does have a custom format, but it works over plain SSH, and it can mount a repo or archive as a virtual directory using FUSE for easier restores.
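
Roughly, a restore through the FUSE mount looks like this (repo path and archive name are invented):

```
# mount the repo; each archive shows up as a top-level directory
borg mount ssh://backup@server/./repos/clientA /mnt/restore
cp -a /mnt/restore/clientA-2024-01-01/etc/fstab /etc/fstab
borg umount /mnt/restore
```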

The great thing about a system like Borg is that it just uses SSH, and everything is stored as plain files on disk. So the repo can be rcloned really easily, and the dedup/compression/encryption means the rclone runs are quite fast. There are also services that let you do Borg backups straight to the cloud, though I'd recommend doing Borg locally and then copying to the cloud.
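
A minimal sketch of that workflow, with made-up repo and remote names:

```
# back up into a local Borg repo (deduplicated, compressed, encrypted)
borg create --compression zstd \
    /backup/borg/clientA::'{hostname}-{now}' \
    /etc /root /home

# then mirror the repo to the cloud; thanks to dedup, most runs only
# upload a few new segment files
rclone sync /backup/borg/clientA remote:backups/clientA
```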

Thanks, I will test it out.

I'm also going to try UrBackup again. I used it briefly a few years ago, and I remember it can somehow leverage ZFS snapshots.

I don't see how UrBackup solves your problem. You have the problem of how to back up your ZFS array, and you want to back things up to … a ZFS array? Plus, syncing changes to the cloud from UrBackup is a bit of a challenge; snapshots don't back up well unless you can zfs send them to another ZFS array.
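
For comparison, the ZFS-native route mentioned here is incremental zfs send between snapshots, along these lines (pool and host names are examples):

```
# replicate only the delta between two snapshots to another ZFS box
zfs snapshot pool/data@today
zfs send -i pool/data@yesterday pool/data@today | \
    ssh backupbox zfs receive backuppool/data
```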

One of the great things about Borg is that it has a built-in prune feature. You can say "keep 5 yearly backups, 6 monthly backups, 4 weekly backups, and 14 daily backups" and keep some older backups without needing to keep every backup.
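
That retention policy maps directly onto prune flags, e.g. against the hypothetical repo from above:

```
# keep 14 daily, 4 weekly, 6 monthly and 5 yearly archives; drop the rest
borg prune --keep-daily 14 --keep-weekly 4 \
    --keep-monthly 6 --keep-yearly 5 \
    /backup/borg/clientA
```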

That said, UrBackup does work well for the right purpose. FYI, the ZFS snapshot support used to be horrible, since it was designed for btrfs snapshots, which don't have parent-child relationships. However, I believe they fixed that.

No, I want to back up my other computers, which run on different filesystems, to the ZFS pool on the server, and then cherry-pick the most important data stored on ZFS and back that up to the cloud.