Looking for suggestions to mirror data in two locations

I've been knocking around the idea of setting up two server-type hardware solutions that can mirror each other's data. The data is about 250GB right now, but I want to budget up to 500GB for future growth. The data itself needs to be stored for 7 years for legal reasons, so I need something robust, reliable, and easy to retrieve data from.

The location where the data needs to be accessed the most has very slow internet bandwidth, so it is very hard for the slow side to VPN into the fast side.

My side:

Slow Side

Anyway, the problem that needs to be solved is that other people need access to this data and, as you can see, it's very slow. So I thought of a creative way to help both sides.

My side has a lot more bandwidth for sharing over VPN.

Really, what I'm asking is: what is a good system to set up for VPN that also mirrors itself every night? Automation is key. On the slow side they can work and upload locally, and then overnight it mirrors to my location for people to access by VPN.

And before anyone says Dropbox, etc.: we've thought of that, but we want our own hardware/software to do this. I am pretty familiar with Linux and getting a VPN running, but what out there is good for a DIY server mirroring solution? Price is not really an object, but realistically I'd like to keep it under 1500 dollars. I'd like to keep this as open source as possible and not go with Windows Server.

If you use ZFS as the file system, you can replicate the data every night using cron or something similar. After the initial replication it only has to send the blocks that have changed, so it is very efficient. You can use the sanoid and syncoid scripts to automate snapshots and replication.
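To make the nightly replication idea a bit more concrete, here is a minimal Python sketch that builds a syncoid command and a crontab line for it. The dataset names (`tank/records`, `backup/records`), the host `fastside.example.com`, the `root` user, and the 01:00 run time are all made-up placeholders, not anything from this thread. The sketch only prints the crontab line; it does not touch ZFS or install anything.

```python
import shlex


def syncoid_command(source_dataset, target_host, target_dataset, user="root"):
    """Build the argv for one syncoid run: local dataset -> dataset on a remote host."""
    return ["syncoid", source_dataset, f"{user}@{target_host}:{target_dataset}"]


def cron_line(argv, hour=1, minute=0):
    """Format a crontab entry that runs the given command once a night."""
    return f"{minute} {hour} * * * {shlex.join(argv)}"


# Hypothetical names: replace with your own pool, dataset, and host.
cmd = syncoid_command("tank/records", "fastside.example.com", "backup/records")
print(cron_line(cmd))
```

Sanoid itself (snapshot creation and pruning) is configured separately in its own config file, so this only covers the replication half.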

rsync might be what you're looking for.

Due to the speed of that "slow side" I would recommend doing the initial copy on site, but I think @Dje4321 is right that rsync should get the job done. You could use any hardware you want, really, with (insert favorite distro here). At work we have toyed with the same idea using Synology DiskStations. They support rsync, so that's something to consider. FreeNAS should do the same as well if you want something a little more robust.
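A rough back-of-the-envelope calculation shows why the first copy is better done on site. The sketch below assumes a 1 Mbit/s uplink on the slow side and an 8-hour overnight window. Both numbers are assumptions for illustration, since the actual line speed isn't stated in the thread, and protocol overhead is ignored.

```python
def transfer_days(size_gb, uplink_mbit_per_s):
    """Days needed to push size_gb (decimal GB) over the uplink, ignoring overhead."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (uplink_mbit_per_s * 1e6)
    return seconds / 86400


def nightly_capacity_gb(uplink_mbit_per_s, hours):
    """Decimal GB that fit through the uplink in a window of the given length."""
    return uplink_mbit_per_s * 1e6 * hours * 3600 / 8 / 1e9


# Assumed figures: 250 GB of existing data, 1 Mbit/s up, 8-hour night.
print(f"initial copy: {transfer_days(250, 1):.1f} days")
print(f"nightly budget: {nightly_capacity_gb(1, 8):.1f} GB")
```

With those assumed numbers the initial 250GB would take a bit over three weeks of continuous uploading, while a nightly run has room for roughly 3.6GB of changes.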

FreeNAS or another Linux distro with ZFS support should do the job. I personally duplicate my personal Nextcloud data between 2 systems over a 10/1 Mbit line. Suboptimal, but it works well after you "seed" the box on the slow side over LAN or with a hard drive first. If you use a ZFS file system, you can send just the data that changed between the last snapshot and the current one, and have snapshots auto-destroy after a time of your choosing. The nice thing with ZFS is that snapshots take essentially zero space, so you can make and replicate them to the remote site every 5 minutes without a problem, as long as you don't write data faster than the internet line can handle.
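For reference, sending only the difference between two snapshots is done with `zfs send -i`, piped into `zfs receive` on the other end. Here is a small Python sketch that just assembles that pipeline as a string. The dataset, snapshot names, and remote host are hypothetical, and it assumes the older snapshot already exists on both sides (which is what the initial seeding gives you).

```python
import shlex


def incremental_send_pipeline(dataset, prev_snap, curr_snap, remote, remote_dataset):
    """Build a shell pipeline that sends the delta between two snapshots over ssh."""
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{curr_snap}"]
    recv = ["ssh", remote, "zfs", "receive", remote_dataset]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


# Hypothetical dataset, snapshot names, and host.
pipeline = incremental_send_pipeline(
    "tank/records", "2024-01-01", "2024-01-02",
    "backup@fastside.example.com", "backup/records",
)
print(pipeline)
```

Tools like syncoid wrap this same mechanism and keep track of which snapshot was sent last, so in practice you rarely need to type the pipeline by hand.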

Or, as others suggested, rsync would also fit your use case. It is less efficient, but it can be run on basically anything from enterprise-grade servers to a Raspberry Pi with a USB hard drive (but don't do that).
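If you go the rsync route, a nightly push over SSH could look roughly like the sketch below. The paths, the remote host, and the bandwidth cap are placeholders I made up. The flags are standard rsync options: `-a` (archive), `-z` (compress), `--partial` (keep partially transferred files so an interrupted run can resume), and `--bwlimit` (cap the transfer rate, in KiB/s by default). I left out `--delete` on purpose, since with a 7-year retention requirement you probably don't want deletions on the slow side to propagate automatically.

```python
def rsync_mirror_command(source_dir, remote, dest_dir, bwlimit_kib_s=None):
    """Build the argv for pushing source_dir's contents to remote:dest_dir over ssh."""
    argv = ["rsync", "-az", "--partial"]
    if bwlimit_kib_s is not None:
        argv.append(f"--bwlimit={bwlimit_kib_s}")
    # A trailing slash on the source means "copy the contents", not the directory itself.
    argv += ["-e", "ssh", source_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return argv


# Hypothetical paths and host.
argv = rsync_mirror_command(
    "/srv/records", "backup@fastside.example.com", "/srv/mirror/records/",
    bwlimit_kib_s=100,
)
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from a scheduled job, or the printed command can go straight into a crontab entry.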

The business computer on the slow side will ideally be a Windows box where files are organized. Then this same computer can upload overnight to the newly built Linux VM on my side, and we keep the same CrashPlan on my side for off-site backup. I like this idea of rsync and I don't think I need to build a separate machine at all. I'll just have the slow computer rsync with a VM I set up on this machine. I was thinking either CentOS or Debian. With Debian I like the package manager, but I like CentOS for having more tools built in at setup. I guess I'll get cracking on rsync with that article and jump right in. Thanks for the responses, it's been most helpful.