High Availability SFTP Server

I should start by saying that I have three 45Drives Storinators that I can repurpose for this project.

I currently manage an SFTP server that we use to receive content from customers (I work for a small digital preservation non-profit). This one SFTP server has nearly 100 SFTP accounts on it and can receive data at any time. The storage is local to the machine.

We have some major upgrades to perform on the server, but my coworker claims it is nearly impossible to communicate downtime to our customers. This got me thinking about high availability storage, SFTP, and load balancers. I figure we can set up a load balancer (or something similar) to handle the failover, but then comes the syncing of the data. One option I have seen is to set up SFTP with a load balancer in front of it, plus a separate mechanism/script that syncs the data between the servers. Is this really the only option? Remember, we are a small non-profit, so we don’t want to go out and find an expensive enterprise solution.

Going back to the three 45Drives machines that I can repurpose for this project, I figure I will probably run KVM and spin up some VMs for the SFTP servers and load balancer. I might also run Ceph to cluster the storage (I’ll try getting another machine but might be limited to the three). Does anyone have any ideas on how to actually put this all together and accomplish our goal of high availability SFTP?

You’ve got the right idea. Also keep in mind that SSH/SFTP servers have host keys, which will need to be identical on both hosts or the clients will get a host key mismatch error.
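
If it helps, here is a minimal Python sketch for checking that two servers present the same host key. It computes the SHA256 fingerprint in the same form OpenSSH prints (base64 of the SHA-256 of the key blob, padding stripped) from a public key line. The function names are made up for illustration, and it assumes you have copied each server's public host key file (for example `/etc/ssh/ssh_host_ed25519_key.pub`, the usual OpenSSH location) somewhere you can read it.

```python
import base64
import hashlib


def fingerprint(pubkey_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of one public key line."""
    # A public key line looks like: "<key-type> <base64 blob> [comment]"
    blob_b64 = pubkey_line.split()[1]
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    # OpenSSH prints the digest as base64 without the trailing "=" padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def same_host_key(line_a: str, line_b: str) -> bool:
    """True if both public key lines carry the same key (comments may differ)."""
    return fingerprint(line_a) == fingerprint(line_b)
```

Feed it the contents of the `.pub` file from each SFTP host; if `same_host_key` returns False, clients that land on the other node after a failover will see a changed host key.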

For the storage, you could use a replicated storage system like DRBD, but I expect you’ll have less overhead using something like rsync or Unison, particularly launched from inotify as soon as files are closed (after creating or updating).
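
As a rough illustration of the rsync-on-close idea, here is a Python sketch that reads the paths of closed files from standard input, one per line, and pushes each one to the second server with rsync. Everything specific is an assumption: the upload root `/srv/sftp`, the peer name `sftp2.example.org`, and the function names are hypothetical, and the paths are assumed to come from something like `inotifywait -m -r -e close_write --format '%w%f' /srv/sftp` (from inotify-tools) piped into the script. It is one-way only; two-way syncing is where Unison would come in.

```python
import subprocess
from pathlib import Path

UPLOAD_ROOT = Path("/srv/sftp")   # hypothetical upload root on this server
PEER = "sftp2.example.org"        # hypothetical standby SFTP server


def rsync_command(changed: Path, root: Path = UPLOAD_ROOT, peer: str = PEER) -> list[str]:
    """Build the rsync argv that copies one changed file to the peer."""
    rel = changed.relative_to(root)  # raises ValueError if outside the root
    # With --relative, the "/./" marks where rsync starts recreating the
    # directory structure on the receiving side.
    src = f"{root}/./{rel}"
    return ["rsync", "--archive", "--relative", src, f"{peer}:{root}/"]


def sync_from_stream(stream) -> None:
    """Run rsync once for every path read from the stream (one path per line)."""
    for line in stream:
        path = Path(line.rstrip("\n"))
        # check=False: one failed transfer should not stop the watcher loop.
        subprocess.run(rsync_command(path), check=False)
```

To use it, call `sync_from_stream(sys.stdin)` at the bottom of the script (with `import sys`) and pipe the watcher's output into it. Files renamed into place would also need the `moved_to` event, and a periodic full rsync is a sensible safety net for anything the watcher misses.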

It’s not a one-stop solution for your use case, but it’ll solve the underlying distributed storage issue:
https://www.gluster.org/

HTH!

How are you provisioning the SFTP accounts? What is actually running SSH? Are you manually editing text files with keys, or are you using some web UI?


Running Ceph gives you highly available storage, but it also turns you into a Ceph admin overnight (you might not know what hit you in the morning).

You should be able to move/live migrate the VM running the SFTP frontend from one physical machine to another; whatever data was uploaded would still be there, and no TCP connections would break. The exception is a hardware failure, in which case the network connections would break, but any data that had finished uploading over SFTP and been written to Ceph would still be there.


If you then want an even higher availability setup, you could run SFTP on each of the 3 Proxmox machines in LXC, pointing at the same storage. There are various ways you could health check and load balance a VIP across these SFTP servers, though it might be overkill unless you also have multiple ISPs.
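
For the health check part, one simple approach (a sketch, not the only way) is to open a TCP connection to each SFTP server and confirm it answers with an SSH identification string, which starts with `SSH-2.0-` (or `SSH-1.99-` for servers that also accept the old protocol). The function names and the 3 second timeout below are arbitrary choices for illustration; a real load balancer or VIP manager would call something like this on a schedule.

```python
import socket


def looks_like_ssh(banner: bytes) -> bool:
    """True if the bytes start with an SSH protocol identification string."""
    return banner.startswith((b"SSH-2.0-", b"SSH-1.99-"))


def sftp_backend_healthy(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Connect to host:port and check that an SSH server greets us."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # The server normally sends its identification line right after
            # the connection opens; 255 bytes is the maximum line length.
            return looks_like_ssh(conn.recv(255))
    except OSError:
        # Refused, unreachable, or timed out: treat the backend as down.
        return False
```

This only proves that sshd is listening, not that logins work or that the shared storage is mounted, so a fuller check might also log in with a test account and list a directory.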