Offsite backups

I have two NAS servers in two different physical locations and I want some of my files synced between them automatically. I’m currently using rsync for local backups, but I’m looking for the best way to do a comparable thing with two servers that are not on the same network. Any recommendations?

rsync with VPNs. That is what I have done in the past if you don’t want to spend any money.

I have thought about using a VPN, but my concern is that if the two servers are connected over a VPN and one server is compromised for some reason, they both could be, defeating the point of having a server offsite. Do you know of any security measures to mitigate this?

Not really. You could do something like set up the VPN connection with a cron job, then set the rsync job as a cron job that depends on the VPN. Once the rsync job has completed, close the connection. That is essentially what I did. I had a server in the “cloud” that acted as a repository for the different machines. They would connect as needed and write their diffs to their respective sub-repositories. When they were done, they would disconnect. If something went wrong with the VPN or the rsync job, it would send an email to our escalation account and one of the techs would look into it. Again, that was because the company did not want to spend any more money on it.
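
If it helps, here’s a minimal sketch of that pattern, assuming a WireGuard tunnel managed by wg-quick; the interface name, peer address, paths, and alert address are all placeholders:

#!/bin/sh
# bring the tunnel up only for the duration of the sync, then tear it down
set -eu

ALERT="escalation@example.com"          # placeholder escalation address
REMOTE="10.8.0.2:/srv/backups/nas1/"    # placeholder peer and target path

wg-quick up wg-backup

if ! rsync -a --delete /srv/data/ "$REMOTE"; then
    echo "offsite rsync failed at $(date)" | mail -s "backup failure" "$ALERT"
fi

wg-quick down wg-backup

Run it from cron (e.g. 0 2 * * *) so the tunnel only exists while the job is running.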

I am sure that there are better, paid solutions out there that may fit within your budget if you are worried about spillage and cross-contamination.

That’s a pretty good solution. I’m wondering what would be the most secure way to do it. I’m not opposed to paying for a third-party cloud storage provider, or some sort of third-party self-hosted virtual server I can push to and pull from. I’m not sure what the best solutions are for this; are there any industry standards that people use for this sort of cloud storage or offsite syncing?

I am not big in the cloud development world, but if XKCD is any indication, I am sure that there are.

You may do well to ask here; we DevOps folks, sysadmins, and whatnot hang out here.

Yes, that is very true. It’s actually interesting how we ended up with mini and micro USB: Nokia wanted to compete with the iPhone and pushed the USB regulatory body to develop a new, smaller standard for phones. Of course, that didn’t seem to work out the way they had hoped. Thanks for the help.

No worries. I hope you find something that covers your needs.

There’s “syncthing” - it’s more for syncing, e.g., a desktop with a phone, two desktops with each other, or a server with a phone.

Not exactly a backup in a traditional sense.

For regular backups it’s hard to beat rsync with ssh, combined with either LVM or straight-up dmsetup for snapshotting, and/or encfs in reverse mode if you don’t fully trust the machine you’re backing up to.
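
As a rough sketch of that flow, assuming an LVM volume group called vg0 with a logical volume called data, and a placeholder backup host; the encfs step is optional and only shown as a comment:

#!/bin/sh
set -eu

# take a point-in-time snapshot so rsync sees a consistent tree
lvcreate --snapshot --size 10G --name data_snap /dev/vg0/data
mkdir -p /mnt/data_snap
mount -o ro /dev/vg0/data_snap /mnt/data_snap

# if the target machine isn't fully trusted, expose an encrypted view instead:
#   encfs --reverse /mnt/data_snap /mnt/data_snap_enc
# and rsync /mnt/data_snap_enc/ rather than the plaintext.

# ship the snapshot over ssh (host and path are placeholders)
rsync -a /mnt/data_snap/ backup@offsite.example.com:/srv/backups/nas1/

umount /mnt/data_snap
lvremove -f /dev/vg0/data_snap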

For incremental backups of big files e.g. VM snapshots, I know there’s this device mapper ‘era’ target you can use in combination with snapshots to get really cheap (i/o wise) diffs, so if you have a 2T image you don’t have to go through 2T of data to move only the changed blocks to a different machine. But, I don’t know if there’s a tool or a file format to transfer this incrementally.
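
For what it’s worth, here is a very rough sketch of what stacking an era target looks like, purely as a starting point; the device names and granularity are placeholders, and the actual “which blocks changed since era N” query goes through the era tools (era_dump / era_invalidate) from thin-provisioning-tools:

# size of the origin device in 512-byte sectors
SECTORS=$(blockdev --getsz /dev/vg0/vm_images)

# stack an era target on the origin; the last field is the tracking granularity in sectors
dmsetup create vm_images_era \
  --table "0 $SECTORS era /dev/vg0/era_meta /dev/vg0/vm_images 8192"

# ...point the VMs at /dev/mapper/vm_images_era instead of the raw LV...

# at backup time: bump the era counter so new writes land in a fresh era,
# then snapshot the metadata and inspect it to list blocks written since
# the era recorded at the previous backup
dmsetup message vm_images_era 0 checkpoint
dmsetup message vm_images_era 0 take_metadata_snap
# ...run era_dump / era_invalidate against /dev/vg0/era_meta here...
dmsetup message vm_images_era 0 drop_metadata_snap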

ZFS send/receive over VPN is the most efficient and reliable option.

Security-wise, assuming one system is primary and the other is backup, give the backup read-only access on the primary and pull from the backup system instead of pushing from the primary. Use snapshots or rsync’s backup-dir option, and then one system cannot completely compromise the other.
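
A minimal sketch of that pull model with ZFS, assuming a dataset tank/data on the primary, a pool called backup on the backup box, and a restricted user (backup-reader here) that has only been delegated send rights; all names and dates are placeholders:

# on the primary, from cron: take a dated snapshot
zfs snapshot tank/data@2024-06-01

# on the backup host: pull the first snapshot in full
ssh backup-reader@primary zfs send tank/data@2024-06-01 | zfs receive backup/data

# on later runs: pull only the delta between the previous and current snapshots
ssh backup-reader@primary zfs send -i tank/data@2024-06-01 tank/data@2024-06-08 \
  | zfs receive backup/data

On the primary, something like zfs allow backup-reader send,snapshot,hold tank/data grants just that delegation without handing the backup box anything destructive.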

I have used syncthing to sync 17 TiB of data from a home NAS to an offsite NAS. I set the home NAS to send-only and have snapshots set up on the offsite NAS. So far it works great. That’s the biggest setup I have done; my other setup syncs to four other NAS systems and is made up of two QNAPs, two Unraid boxes, and one TrueNAS.

I would say give it a shot; it’s free and easy to set up.

By backup, do you mean a hot standby? In the case of a hot (or warm) standby it doesn’t matter; you need to trust both.

The authentication direction - whether you push or pull backups - matters only from the perspective of not exposing the entire historical archive to a potentially compromised host, in case you want to rely on that archive to restore the host (e.g. a Mac OS X system can authenticate and nuke its own Time Machine history at any time). Making your old backups read-only from the perspective of a source system is usually enough, and that’s not particularly hard: a simple rename out of the location where a source host can read/write usually does the job.
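
A minimal sketch of that rename, run on the backup host after each sync; the drop-off and archive paths are placeholders:

#!/bin/sh
set -eu

DROPOFF=/srv/backups/incoming/nas1   # the only path the source host can write to
ARCHIVE=/srv/backups/archive/nas1    # outside the source host's reach

# move the freshly delivered tree out of the writable drop-off
mv "$DROPOFF" "$ARCHIVE/$(date +%F)"
mkdir -p "$DROPOFF"

If you want the next run to stay incremental, cp -al the archived tree back into the drop-off first; hard links are cheap, and rsync (without --inplace) breaks them as it rewrites changed files, so the archived copy stays intact.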

Backup meaning wherever rsync is sending files to, as opposed to where files are actively being written. If it is pulling read-only and has snapshots and/or backup-dir, then neither machine can maliciously destroy data on the other.

Each machine can corrupt the data that the other machine is pulling. You need some additional mechanism to prevent corruption when receiving data, whether you’re pushing or pulling.
In either case, you need the ability to snapshot data, or another mechanism to flush writes and make sure the data being backed up is consistent at some point in time.
In either case, you need your old snapshots to remain uncorrupted.
Regardless of whether you’re pushing or pulling, you need to limit access to the other host.

Both diffing two snapshots in order to compute a delta and finalizing a backup once it’s complete are easier to do locally, but they have to be done on separate hosts, so at the end of the day it’s just a question of whatever is easier to coordinate.

Personally, I prefer pushing the data, as I find computing the diff and ensuring the server can do its job during a backup to be relatively complicated compared to just uploading the delta or a snapshot. Also, I prefer pulling my restores.

I like this idea of pushing data. Is there a way to open file transfer to a computer in one direction only, so that each computer only has write permission on the other one? That way, if someone were to gain access to one of the machines, they would not have the ability to destroy data on the other one. Is there a way to set this up via ssh/ftp, or perhaps a remote proxy or VPN? Currently I’m using an rsync -a command set up with a cron scheduler to back up to a local computer. Does anyone know of a way to do a similar thing with a remote computer, so that data can only be pushed to the computers and not pulled from them?

Or perhaps there is a way to set up a similar thing with syncthing; if anyone has experience with doing something like this, I would love to hear your thoughts. My main concern is keeping the data safe: it is not a large amount of data, but it is mission-critical, so it is important that it is secure, and I want to mitigate any possible threats as much as possible.

This is how I build rsync over ssh drop-off points.

I start by making a separate user on the target system, dedicated to this purpose.

I’m using the ssh ForceCommand feature on the target system to ensure that the other host can log in as this user with an ssh key, but only call certain commands. For each command (2-4 of them, it depends) there’s a separate ssh key.

The command that ssh actually invokes when the client wants to run rsync is a wrapper which runs rsync from a small chroot-like environment - that way, rsync doesn’t have much access to the host system, nor does it need it.

The rsync environment is one I built from Alpine Linux like this (it takes something like 5M of space):

chroot with rsync

Save this in a hacky script; you might need to repeat it.


ALPINE_ROOT=$HOME/alpine_root

# This is where we'll store our data
mkdir -p "${ALPINE_ROOT}/backup"

mirror=http://uk.alpinelinux.org/alpine/
arch=x86_64

# Step 1. find out the latest version of Alpine apk package manager and get it.

# do simple sed processing of the package index to find the latest apk-tools.
version=$(
  curl -Ss ${mirror}/latest-stable/main/${arch}/APKINDEX.tar.gz \
    | tar -zxf - -O APKINDEX \
    | sed -n '/P:apk-tools-static/,/V:.*/ {/V:/{ s/V://; p}} '
)

curl -Ss "${mirror}/latest-stable/main/${arch}/apk-tools-static-${version}.apk" \
  | tar -C . -f - -z --strip-components=1 --extract sbin/apk.static


# Step 2. make a root with only rsync
unshare --map-root-user \
  ./apk.static -X ${mirror}/latest-stable/main --allow-untrusted --root "${ALPINE_ROOT}" --initdb --no-cache add rsync

This server-side rsync gets started with a wrapper script:

$HOME/rsync_wrapper.sh
...

# sprinkle logging here

ALPINE_ROOT=$HOME/alpine_root

# split the client's original command line into words
args=(${SSH_ORIGINAL_COMMAND})
if [ "${args[0]}" != "rsync" ]; then
  exit 1
fi
unshare --map-root-user --mount --net --pid --fork --mount-proc --ipc --cgroup -R "$ALPINE_ROOT" --wd /backup "${args[@]}"

# rsync is done

# sprinkle a logger command.

Inside your authorized_keys, you can put:

$HOME/.ssh/authorized_keys
restrict,command="$HOME/rsync_wrapper.sh" ssh-ed25519 AAA....
...

Add other keys and commands as needed.


I don’t really have large VM images I care about at home; any large files I care about (archives) I copy over manually. This approach works for my homedirs, and it’s good enough for my photos.

The scripts I use to actually call rsync on the client … invoke btrfs/zfs/lvm snapshots, connect to see what’s the next snapshot that needs copying, then mount that local snapshot using encfs --reverse locally and try to sync that one snapshot to that host:destination_dir. If rsync returns successfully, they ask the remote machine … to snapshot the dir (in one other case, the directory is moved out). Taking snapshots varies between my machines - there are various exclude paths, different filesystems, and such.
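
Condensed, the client side looks roughly like this; btrfs is just the example here, and the subvolume paths, the encfs config, the remote alias, and the separate “finalize” key/command are all placeholders for what the real scripts do per machine:

#!/bin/sh
set -eu

SNAP=/srv/.snapshots/home-$(date +%F)

# 1. read-only local snapshot so rsync sees a consistent tree
btrfs subvolume snapshot -r /srv/home "$SNAP"

# 2. encrypted view of the snapshot; the remote never sees plaintext
#    (password handling via --extpass or similar is omitted here)
mkdir -p /mnt/enc_view
ENCFS6_CONFIG=/root/encfs_backup.xml encfs --reverse "$SNAP" /mnt/enc_view

# 3. push through the restricted drop-off described above;
#    the wrapper already starts rsync in /backup, so the path is relative
if rsync -a /mnt/enc_view/ backup-target:home/; then
    # 4. on success, use the second key (mapped to a different forced command)
    #    to ask the remote side to snapshot or move the directory
    ssh -i ~/.ssh/id_finalize backup-target finalize home
fi

fusermount -u /mnt/enc_view
btrfs subvolume delete "$SNAP"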
