In a very similar fashion to this question (from mid-2021) I am looking for fast scratch space (in mid-2022).
I do a lot of data analysis, development of scientific software (e.g. simulations, data processing and analysis pipelines), as well as visualizations. The technical constraints are defined by whatever project lands on my desk, which can be fairly diverse. One common denominator is parallelization: either the workloads are already parallelized or it's part of my job to parallelize them. There are usually parallel reads and writes of chunks of data involved.
For “local” development, I am using an AMD Epyc 7443P CPU on an ASRock ROMED8-2T mainboard.
Data safety is less of a concern - important data resides on a NAS that receives regular backups. I'd copy data onto and off the scratch space as needed. If it fails, I lose perhaps a day of computation. That's OK-ish.
Purely as fast general-purpose scratch space, I intend to go for something “simple”, if that's the right term: two to four striped NVMe drives on a dedicated PCIe card, e.g. Samsung SSD 980 PRO 2TB drives on an ASRock Hyper Quad M.2 Card.
I have had fairly good experiences with a similar setup involving a 2nd-gen EPYC (also 24 cores) and two PCIe 3.0 NVMe drives from Intel. There was usually a lot of room to optimize read/write performance by tweaking data chunk sizes (I work a lot with zarr and similar tools) as well as the configuration of the EXT4 file system “on top” and LVM or Linux RAID “underneath”. In other words, I re-formatted and benchmarked this arrangement a few times, depending on the projects' needs.
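To give a concrete flavor of the kind of tuning I mean: for a striped array, ext4's stride/stripe-width should line up with the RAID chunk size and the number of data disks (stride = chunk size / filesystem block size, stripe-width = stride × data disks). A minimal sketch of that arithmetic, assuming the usual 4 KiB filesystem block and hypothetical example values (512 KiB chunk, 4-way RAID-0):

```python
# Sketch: derive ext4 stride/stripe-width for a striped array.
# Assumes a 4 KiB filesystem block size (the mkfs.ext4 default).
BLOCK_SIZE_KIB = 4

def ext4_stripe_params(chunk_kib: int, n_data_disks: int) -> tuple[int, int]:
    """Return (stride, stripe_width) for mkfs.ext4 -E stride=...,stripe-width=..."""
    stride = chunk_kib // BLOCK_SIZE_KIB   # filesystem blocks per RAID chunk
    stripe_width = stride * n_data_disks   # filesystem blocks per full stripe
    return stride, stripe_width

# Example: 512 KiB mdraid chunk, 4-way RAID-0
stride, stripe_width = ext4_stripe_params(512, 4)
print(f"mkfs.ext4 -E stride={stride},stripe-width={stripe_width} /dev/md0")
# stride=128, stripe-width=512
```

Re-running this calculation with a different chunk size (matched to the typical zarr chunk of a given project) is essentially what each of those re-format-and-benchmark rounds boiled down to.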
The common wisdom appears to be to go for enterprise SSDs in a scenario like mine, or even Optane. The latter is crazy expensive at the moment; besides, Intel has just cancelled Optane altogether, although there still appears to be a lot of stock. The arguments against consumer SSDs include that they wear out more quickly and that they usually lack power-loss protection. I rarely have power losses, and if one hits I'd lose a day of work - OK, see above.

As for the wear: if I manage to wear the drives down in two to three years, the market will likely have changed (as usual), so it might not be the worst idea to simply wear them down intentionally and replace them with whatever tech comes next. Besides, as far as I can see, the TBW figures of better consumer and enterprise SSDs no longer differ that much - we're talking the same order of magnitude. For 2 TB drives, for instance, they hover between 1.2 and 2.0 PB.
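To put my wear argument into numbers, a back-of-the-envelope sketch (the daily write volume is a made-up assumption, not a measurement): even a sustained 1 TB of writes per day per drive would take over three years to exhaust a 1.2 PB TBW rating.

```python
# Back-of-the-envelope SSD endurance; both inputs are assumptions.
TBW_TB = 1200          # rated endurance of e.g. a 2 TB consumer drive, in TB written
DAILY_WRITES_TB = 1.0  # assumed scratch-space write volume per drive, per day

days = TBW_TB / DAILY_WRITES_TB
print(f"{days:.0f} days ≈ {days / 365:.1f} years")  # 1200 days ≈ 3.3 years
```

At that point the replace-them-with-whatever-comes-next plan kicks in anyway.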
Any mistakes in my thought process or alternative points of view? What would you guys recommend?