I like these practical problem questions, but this one sounds a bit vague. Let me know if I understood correctly.
So, you’re using directories as a kind of “bag of tasks”, and two computers sit in a more or less infinite loop, picking stuff up from one directory and storing the results in another?
Additionally, you have humans VPN-ing in, copying files to the input dir on the server, and browsing and picking up results from the output dir?
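If that's roughly the setup, here is a minimal sketch of what one pass of such a worker could look like, mostly to make the question concrete. Everything in it is my assumption, not your actual code: the three directories (input, work, output), the function names `claim_next`, `process` and `run_once`, and the "processing" step, which is just a placeholder that upper-cases a text file. The idea is that a worker claims a task by renaming the file out of the input dir, so two workers never process the same file. That rename is only reliably atomic when both directories are on the same local filesystem; over SMB/NFS shares you'd want to test it.

```python
import os
from pathlib import Path
from typing import Optional


def claim_next(input_dir: Path, work_dir: Path) -> Optional[Path]:
    """Try to claim one task file by renaming it into work_dir.

    Assumption: input_dir and work_dir are on the same filesystem, so the
    rename is atomic and only one worker can win a given file.
    """
    for candidate in sorted(input_dir.iterdir()):
        if not candidate.is_file():
            continue
        target = work_dir / candidate.name
        try:
            os.rename(candidate, target)
        except FileNotFoundError:
            continue  # another worker claimed this file first
        return target
    return None


def process(task: Path, output_dir: Path) -> Path:
    """Placeholder for the real work: upper-case the text of the task file."""
    partial = output_dir / (task.name + ".part")
    partial.write_text(task.read_text().upper())
    final = output_dir / task.name
    os.rename(partial, final)  # publish the result only once it is complete
    task.unlink()  # the claimed copy is no longer needed
    return final


def run_once(input_dir: Path, work_dir: Path, output_dir: Path) -> Optional[Path]:
    """Claim and process a single task; return the result path, or None if idle."""
    task = claim_next(input_dir, work_dir)
    if task is None:
        return None
    return process(task, output_dir)
```

A real worker would call `run_once` in a loop and sleep for a bit whenever it returns `None`. Files left behind in the work dir after a crash tell you which tasks need to be retried.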
What are your other requirements?
How much data do these workers need to have on hand to process a unit of work?
How much total data storage do you need?
How much total CPU is used?
What’s your current file server based on (CPU/RAM/disks/OS)?
What programming languages do you have experience in that would let you automate / change stuff? Perhaps Python?
When it comes to data durability, cloud is hard to beat … but the cost is relatively steep compared to on-prem storage (IMO, despite there being a gazillion cloud providers, they’re all happy to take super fat margins on what is essentially byte storage, a commodity service at this point). The most cost-effective use case for cloud storage at the moment is off-site backups, using storage tiers that are highly durable but perhaps not highly available (which translates to cheap in terms of $/byte/year).
To summarize for storage, considering how many folks you have: 2 physical machines on prem using DRBD, HAST, or similar, plus snapshots, with all user interaction going through the cloud, where you can keep a self-pruning trace of the last, e.g., week’s worth of ingress data from people in the field.
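The "self-pruning" part can be as simple as a small script run from cron or a timer. Here is a rough sketch, assuming the ingress data lands as plain files in one directory tree and that file modification time is a good enough proxy for "when it arrived". The function name `prune_older_than` and the 7-day default are mine, picked to match the "last week" example above.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_older_than(directory: Path, max_age_days: float = 7.0,
                     now: Optional[float] = None) -> List[Path]:
    """Delete regular files under directory older than max_age_days.

    Age is judged by modification time (an assumption; adjust if uploads
    preserve the original mtime). Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    # sorted() materializes the listing before anything is deleted
    for path in sorted(directory.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

This only removes files and leaves empty directories behind, which is usually harmless for a staging area. Since it deletes data, try it on a copy first.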
These are all considered “easy peasy”: half a day of work for a competent sysadmin to set up on typical Linux + ZFS, recycling scripts from the internet… it would probably take more time to document and test that everything works well. Heck, Wendell himself was even able to make a concise video on ZFS and how to make those snapshots show up in Windows / Shadow Copy.
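To give an idea of how little logic those snapshot scripts actually contain, here is a sketch of just the retention decision, with no ZFS calls in it. The naming scheme `auto-YYYYMMDD-HHMM`, the dataset name `tank/work`, and both function names are made up for illustration. A real script would pass the resulting names to `zfs snapshot` and `zfs destroy`, and the ready-made tools you find online handle more cases than this.

```python
from datetime import datetime
from typing import List, Tuple

SNAP_FORMAT = "auto-%Y%m%d-%H%M"  # hypothetical naming scheme


def snapshot_name(dataset: str, when: datetime) -> str:
    """Build a snapshot name such as tank/work@auto-20240104-0300."""
    return f"{dataset}@{when.strftime(SNAP_FORMAT)}"


def split_keep_destroy(names: List[str], keep: int) -> Tuple[List[str], List[str]]:
    """Return (snapshots to keep, snapshots to destroy), newest first.

    The timestamp is parsed from the part after '@'; names that do not
    follow SNAP_FORMAT raise ValueError rather than being silently dropped.
    """
    def stamp(name: str) -> datetime:
        return datetime.strptime(name.split("@", 1)[1], SNAP_FORMAT)

    ordered = sorted(names, key=stamp, reverse=True)
    return ordered[:keep], ordered[keep:]
```

Keeping the decision separate from the commands makes it easy to dry-run: print the "destroy" list for a few days and check it before letting anything actually delete snapshots.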
Document, document, document your setup. Especially the requirements.
–[ For purists out there: yes, yes, this is all amateurish and so 90s, weak sauce compared to a minimum of 5 replicas with Paxos/Raft/Cockroach on Ceph… a build-your-own-cloud thing. But it’s likely that 2-5 years down the road they’ll just move 100% to cloud, because their needs won’t grow as fast as cloud costs will drop, and connectivity’s likely to get better … it’s not worth the complications of building an on-prem cloud infra ]–