GIS Workflow Needs a Rework/Upgrade

Currently we have two desktop workstation-grade computers doing GIS work and quite an old server box that stores the files needed to run the GIS workflow and make maps.

Long story short, we need to replace that aging on-premise file server. We found the files would corrupt if we used a cloud file server. Currently we use Egnyte for other business workflows.

We are currently debating whether we should just upgrade our on prem data collection point or move to, say, a cloud VM that can process the GIS data in the cloud. We have several offices in Western Canada, and going forward we need the data to be highly available and robust across them.

We are a Windows Server shop with some cloud apps and things to make life easier for our large number of remote/home workers. It would be nice to help our GIS folks to not have to be in an actual office to work on GIS things.

I like these practical problem questions, but this one sounds vague. Let me know if I understood correctly.

So, you’re using directories as a kind of “bag of tasks”, and two computers are in a kind of infinite loop picking stuff up from one directory and storing the results in another?

Additionally, you have humans VPN-ing in, copying files to the input dir on the server, and browsing and picking up results from the output dir?
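To make the question concrete, here is a minimal Python sketch of that “bag of tasks” pattern as I picture it. Everything in it is an assumption for illustration: the directory layout, the function names, and `process_file`, which just copies the input and stands in for whatever the real GIS job does. It also assumes the input and work directories sit on the same volume, so that a rename is enough to claim a task.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def process_file(src: Path, out_dir: Path) -> Path:
    """Hypothetical stand-in for the real GIS job: copies the input as the 'result'."""
    dest = out_dir / f"{src.stem}.result{src.suffix}"
    shutil.copy2(src, dest)
    return dest


def claim_next(input_dir: Path, work_dir: Path) -> Optional[Path]:
    """Claim one task by renaming it into this worker's private work directory."""
    for candidate in sorted(input_dir.iterdir()):
        if not candidate.is_file():
            continue
        claimed = work_dir / candidate.name
        try:
            candidate.rename(claimed)
        except OSError:
            # Most likely another worker claimed it first; try the next file.
            continue
        return claimed
    return None


def run_once(input_dir: Path, work_dir: Path, out_dir: Path) -> Optional[Path]:
    """Process a single task if one is waiting; return the result path or None."""
    task = claim_next(input_dir, work_dir)
    if task is None:
        return None
    result = process_file(task, out_dir)
    task.unlink()  # the claimed copy is no longer needed once the result exists
    return result


def run_forever(input_dir: Path, work_dir: Path, out_dir: Path,
                poll_seconds: float = 30.0) -> None:
    """The 'infinite loop': keep polling, sleeping only when the input dir is empty."""
    while True:
        if run_once(input_dir, work_dir, out_dir) is None:
            time.sleep(poll_seconds)
```

Each workstation would call `run_forever` with its own private work directory, so two workers never process the same file. If that is not how your setup behaves, that difference is exactly what I'm trying to understand.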


What are your other requirements?

How much data do these workers need to have on hand to process a unit of work?
How much total data storage do you need?
How much total CPU is used?
What’s your current file server based on (CPU/RAM/disks/OS)?

What programming languages do you have experience in that would let you automate / change stuff? Perhaps Python?


When it comes to data durability, cloud is hard to beat … but the cost is relatively steep compared to on prem storage (IMO, despite there being a gazillion cloud providers, they’re all happy to take super fat margins on what is essentially byte storage, a commodity service at this point). The most cost-effective use case for cloud storage at the moment is off-site backups using storage tiers that are highly durable but perhaps not highly available (which translates to cheap in terms of $/byte/year).

To summarize for storage, considering how many folks you have: 2 physical machines on prem using DRBD, HAST, or similar, plus snapshots, with all user interaction going through the cloud, where you can keep a self-pruning trace of the last, e.g., week's worth of ingress data from people in the field.
These are all considered “easy peasy”: half a day of work for a useful sysadmin to set up on typical Linux + ZFS, recycling scripts from the internet… it would probably take more time to document/test that everything works well. Heck, Wendell himself was even able to make a concise video on ZFS and how to make those snapshots show up in Windows / Shadow Copy.
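As a rough idea of what the “self-pruning trace” could look like, here is a small Python sketch that deletes files older than a week from an ingress directory. The function name, the 7-day default, and the choice to go by file modification time are all my assumptions; a real setup would want logging and a dry-run mode before being pointed at anything important.

```python
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86_400


def prune_older_than(root: Path, max_age_days: float = 7,
                     now: Optional[float] = None) -> List[Path]:
    """Delete files under root whose mtime is older than max_age_days.

    Returns the list of files that were removed. Directories are left alone.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * SECONDS_PER_DAY
    removed = []
    # Materialize the listing first so deletions don't disturb the directory walk.
    for path in list(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run from a daily scheduled task (cron, Task Scheduler, whatever you have), something like `prune_older_than(Path("/srv/ingress"))` would keep roughly the last week of field uploads around; the path here is only an example.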

Document, document, document your setup. Especially the requirements.


–[ For purists out there: yes, yes, this is all amateurish and so 90s, weak sauce compared to min 5 replicas paxos/raft/cockroach on ceph… build your own cloud thing. But it’s likely that 2-5 years down the road they’ll just move 100% to cloud, because their needs won’t grow as fast as cloud costs will drop, and connectivity’s likely to get better … not worth the complications of building an on prem cloud infra ]–

I guess I should clarify the need for powerful processing in GIS work.
I’ll walk you through a typical use case for this application.

The real world data is collected in the field and then sent via the cloud file server to the GIS workstation computer. That real world data is then used to make a map, and the map itself is created on the server drive. It is then distributed as paper copies and/or as digital maps, which are essentially very large PDF files, in the cloud app. The cloud app, Egnyte, is working great btw. On a side note, we use ArcGIS for our GIS work.

The one thing we need to replace is the on prem server, as it is a 2008 domain server and 2012 file server. Yes, you heard me right: a Windows 2008 and a 2012 server. We need about 1 TB of active storage and about 2 TB for archival.

Again, the debate is whether to recreate the on prem system with new hardware and updated server software or move to something like AWS cloud, for instance.

Cost wise we could do either, but we are wondering which is more worthwhile and allows us to pivot faster, as business has not slowed down for us even in light of this COVID pandemic.

Every project we are brought onto needs a map, and maps also need to be updated as scopes of construction change or the physical landscape itself changes.
