I work as a GIS analyst doing spatial analysis, process automation, tool development and map production with QGIS, ArcGIS, ArcPy and PyQGIS.
Since I often work with large datasets, there is a good chunk of time to be saved by speedy operation, and the same goes for prototyping tools, which may require running them 100 times per day.
Therefore I am looking to optimize my hardware as far as possible. My understanding is that QGIS sadly does not support multithreading in many cases, and ArcGIS is only a little better, so a lot depends on single-thread performance, but I wonder if 3D V-Cache might help as well.
RAM… the more the merrier, the quicker the better. However, I am wondering whether MT/s or fast timings might be more important here, since GIS systems seem to do a lot of small, quick memory accesses.
That brings us to the SSD, which I suppose should be as fast as possible, but I wonder whether random IOPS might be more important than raw sequential speed.
Graphics usually do not get involved here so most iGPUs should do the trick.
My current system is a laptop from XMG, which I selected for its 7840HS, its ability to take 96GB of RAM, and at least decent runtimes on battery. I hope to get more out of my next purchase, though: I currently get about 3 hours at maximum brightness with WLAN enabled and a lot of bursty workloads, meaning 1-3 hard-working threads for a minute or so, followed by research and code modification.
I’m not too familiar with the various tools though I’ve dabbled a little bit in QGIS. What I can say is that I’ve seen a LOT of people try to turn Python into a cobbled-together DBMS and it really doesn’t work.
Have you looked at using PostGIS for Postgres and would that fit with your use-case?
I don’t know that you’d want to do this on a laptop, though you probably could with reasonable specs and RAM. I’d look at a dedicated SSD to store the database if you have another M.2 slot in there.
Geospatial stuff fascinates me but I’ve never had a super high priority need for it.
Hi, thanks. I live in the middle of nowhere in Sweden with acceptable mobile network coverage but no fiber, so VPN and RDP are no fun. I usually download everything I need and do it all on site.
Do you have more suggestions regarding optimizations for database workloads?
Welcome! Fellow GIS analyst here. All your hardware leanings are correct, but I would say that if you do any work with raster or LiDAR data you will benefit immensely from GPU compute. Both Arc and QGIS can utilize GPU resources to supercharge render times and keep rasters or DEMs loaded so they don’t reload as you pan around. Otherwise you seem to be on the right track; stuff as much RAM in there as you can.
Yes, both SSD slots are in use. 96GB of RAM is overkill most of the time and barely sufficient on occasion, but so far there has been no task I wasn’t able to solve. Since most of the work runs on a single core, 128 cores of Turin would probably not help me here. An Intel HX or Dragon Range CPU could give me a couple hundred MHz more frequency though…
Regarding PostGIS… I had it as part of a continuing-education course and hated its guts, so I’m trying my best to avoid it and brute-force my way around it for now.
Can you explain, though, why you deem Python suboptimal for this task? As far as I understand it, most of the ready-to-use tools in QGIS (processing.run algorithms) and ArcGIS are merely invoked via their respective Python commands but are implemented in C++ (at least with ArcGIS I am 100% sure), which makes them quite snappy. But the custom stuff, like iterating through the database, copying it into lists and modifying those, is really quite cumbersome. Better in ArcGIS, but still not exactly breathtaking. It also seems to me that some of those operations slow down past a certain point, maybe even exponentially, despite RAM not being a problem.
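For context, a minimal PyQGIS sketch of both halves of what I mean, run from the QGIS Python console (the layer path and the `area` field are hypothetical):

```python
import processing  # QGIS Processing framework; available inside QGIS

# A "native:" algorithm is a thin Python call into C++ code, so the
# Python overhead here is negligible:
result = processing.run(
    "native:buffer",
    {
        "INPUT": "parcels.gpkg|layername=parcels",  # hypothetical dataset
        "DISTANCE": 50.0,
        "OUTPUT": "memory:",
    },
)
layer = result["OUTPUT"]

# The slow pattern: materialising every feature into Python lists.
rows = [f.attributes() for f in layer.getFeatures()]  # copies everything

# Cheaper: stream the generator and touch only the fields you need.
total = 0.0
for f in layer.getFeatures():
    total += f["area"]  # hypothetical field name
```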
The only time I did that was at university, writing my bachelor thesis, during which my PC died and I had to do it on my old laptop with a dual-core i3 and 8GB of RAM. That was fun…
Do you know if it is possible to increase the size of the area that is rendered in QGIS? It seems to me that if I pan around with vector data even a millimeter, everything gets rendered again, which seems rather unnecessary.
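For reference, the closest knobs seem to be the canvas settings PyQGIS exposes on QgsMapCanvas, though the cache only helps when the extent and layers haven’t actually changed; a genuinely new area still has to render:

```python
# Run from the QGIS Python console.
from qgis.utils import iface

canvas = iface.mapCanvas()
canvas.setCachingEnabled(True)            # reuse rendered layer images where possible
canvas.setParallelRenderingEnabled(True)  # render layers on multiple cores
canvas.setMapUpdateInterval(250)          # ms between partial redraws while rendering
```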
The 7840HS is not the best choice for high single-thread speed; an Intel would have been better. There are not really a lot of ways to improve the speed, apart from raising the power limits or fan curves.
The new AMD Ryzen AI laptops have decent single-thread performance and memory bandwidth together with great battery life, but sadly the RAM is soldered and doesn’t go over 64GB.
Yeah, I know. Sadly my needs seem to be rather specific, and I didn’t actually find any laptop that did exactly what I need, but this one came closest. Intel was considered in the form of the Asus Zephyrus G16 (the prior model) but wasn’t chosen because of lower efficiency. On the upside, that 7840HS is probably the most powerful CPU in the whole company, with the possible exception of the server…
Lunar Lake with 64GB+ would have been nice…
I am looking to upgrade my laptop (it’s 3 years old with an 11800H) and I run into the same issues.
For performance the XMG Neo 16 would probably be best, for those rare times I need to do FEA or 3D scanning. But that system is also big and has awful battery life.
90% of the time the Ryzen AI would be fine, and it has great battery life, but the 64GB models are rare and I kind of want 96GB.
If you can wait a little longer you might want to see what Strix Halo brings to the table. At least in theory it supports a LOT of RAM and single core performance should be at least on par with Strix Point.
I’m ordering parts for a new workstation, mostly for use with ArcGIS Pro. I work for 911, maintaining GIS and raster data. Unfortunately, being county government, I am under some pretty tight budget constraints, so I wasn’t able to go for an actual workstation setup, opting instead for a Ryzen 9700X.
As has been mentioned, ESRI is really behind on multithreading, so the 12- or 16-core variants didn’t make much sense to me; I put that money toward 64GB of RAM instead of 32. I have a TON of custom configuration required for my workflow, so I’m ordering two boot SSDs with the intention of cloning the configured boot drive once complete and stashing that as a backup. I wish I could get hardware RAID 1 in this price bracket instead. I’m also adding an SSD data drive for my working/source data, but my final output ends up on an ArcGIS server, so the data drive is mostly for processing large raster datasets.
The whole build came in just under $1500 but I’m reusing my trusty Quadro GPU from my previous system.
As far as the Python comment goes, I really just mean that a DBMS is a DBMS and Python is Python, so it’s best to use the tool that’s designed for the task at hand. Truth be told, I don’t know enough about your use-case; I’m just speaking in generalities based on my experience.
Iteration in a DBMS can be cumbersome for sure, but the way I typically handle that is: if I’m handing something off to a tool that is better for said task, I just write wrappers that return the response I want (JSON or whatever). So if PostGIS (and again, I’m not super familiar with it) does something way faster than Python (which I’m sure it absolutely does), I simply make sure my data is in there and write a Python wrapper to manipulate and retrieve it (rough sketch below).
It can be a pain, but I’d rather deal with the relatively momentary pain of writing a wrapper around a better tool now than suboptimal performance on a chronic basis.
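To make the wrapper idea concrete, here’s a minimal sketch assuming a hypothetical `parcels` table with a `geom` column in a local PostGIS-enabled database (connection string, table, and column names are all placeholders):

```python
import json
import psycopg2  # any Postgres driver works; this is one common choice

def buffered_parcels_geojson(distance_m: float) -> str:
    # Let PostGIS do the heavy spatial lifting; Python only sees the result.
    conn = psycopg2.connect("dbname=gis user=gis")  # placeholder credentials
    try:
        with conn.cursor() as cur:
            # ST_Buffer runs inside the database, not in Python.
            cur.execute(
                """
                SELECT json_agg(ST_AsGeoJSON(ST_Buffer(geom, %s))::json)
                FROM parcels
                """,
                (distance_m,),
            )
            (features,) = cur.fetchone()
        return json.dumps(features)
    finally:
        conn.close()
```

The point being: the iteration and geometry math stay inside the database, and Python just shuttles parameters in and JSON out.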
In my experience, while throwing hardware at GIS is an option, substantive performance gains come from writing your own code. The 7840HS is already Zen 4, meaning you’ll get maybe ~15% from Zen 5 and bumped DDR5 speeds, while storage changes have a good chance of being a no-op. You can monitor drive utilization (sketch below) to see if workloads even approach PCIe 2.0 x4 NVMe capabilities, much less 3.0 x4, 4.0 x4, or 5.0 x4. In general, I don’t really see that even a 3.5" hard drive is a bottleneck.
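To make "monitor drive utilization" concrete, a quick sketch using the third-party psutil package (an OS tool like Task Manager, Resource Monitor, or iostat does the same job):

```python
# Sample system-wide disk I/O around a running GIS job.
import time
import psutil  # pip install psutil

def watch_disk(seconds: float = 60.0, interval: float = 1.0) -> None:
    prev = psutil.disk_io_counters()
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        time.sleep(interval)
        cur = psutil.disk_io_counters()
        read_mb = (cur.read_bytes - prev.read_bytes) / 1e6 / interval
        write_mb = (cur.write_bytes - prev.write_bytes) / 1e6 / interval
        print(f"read {read_mb:8.1f} MB/s  write {write_mb:8.1f} MB/s")
        prev = cur

watch_disk()  # run while your workload is going
```

If the numbers sit far below what even a PCIe 3.0 x4 drive can deliver (~3.5 GB/s sequential), a faster SSD won’t buy you anything.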
In comparison, the last GDAL algorithm I recoded sped up ~275x without my specifically optimizing for performance. The actual compute part of that is closer to 700x, but GDAL is slow getting data on and off disk, which pulls down the overall figure. In other situations the gains have been upwards of 10,000x.
Depends a lot on the workload. But, in general, I would tend to say this mostly isn’t a performance-critical pattern. The best answer here is going to be to profile what you’re doing to see where it’s limited (sketch after the next paragraph). Most likely it’s software-bottlenecked rather than hardware. Even if you’re coding against GDAL’s C++ bindings there’s still a good bit of unforced inefficiency, because the APIs and GDAL’s internals just aren’t designed to be particularly performant.
(Haven’t particularly looked into the ESRI equivalents because IMO life is too short to deal with Arc more than is necessary but, ESRI being ESRI, I’d anticipate mostly worse rather than better.)
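For the profiling step, a self-contained sketch using only the standard library; `run_my_analysis` is a placeholder for whatever section you’re investigating:

```python
# Profile a suspect section to see whether time goes to Python itself,
# the underlying C/C++ calls, or waiting on disk.
import cProfile
import pstats

def run_my_analysis() -> None:
    ...  # placeholder: your PyQGIS / ArcPy / GDAL code

profiler = cProfile.Profile()
profiler.enable()
run_my_analysis()
profiler.disable()

pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)  # top 20 entries
```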