My work involves processing 1-3 TB data sets of electron microscope images (I usually spend a few weeks at a time on a single data set). I have an Ubuntu 20.04 workstation that has limited physical space for locally attached storage. Because of a suboptimal motherboard layout, the installed GPUs block all drive connectors except 1 NVMe and 2 SATA ports. I also have a separate disk server, but there’s only a 1 Gb link between the disk server and the workstation. The software’s access pattern really benefits from fast SSD storage.
In an effort to avoid just throwing money at the problem (large drives, or 10Gb networking), I’m wondering if there’s some way I can use the disks that I already have (a few 1.2 TB SATA SSDs and a 1 TB NVMe) to set up some kind of automatic local caching for an NFS mount (i.e. data that’s physically on the disk server). Read caching is the main thing, but write-back caching would also be important. I really would like the active dataset to be on local flash while I’m working on it. I scan through the dataset many times, so the initial time of caching it over the 1Gb link isn’t the end of the world.
Does anybody know of any solutions for automatic local caching of NFS data that might help me out here? If not, or if the existing solutions come with problems, I could just cough up more $ for larger drives and keep everything local, but I’d like to explore this idea first.
Do you have any open PCIe slots on both machines? If so, you can do high speed networking on the cheap by just buying a NIC for each, plugging them directly into each other, and manually assigning an IP address to the interface on each machine. If you have an x4 slot or greater open on each then you can do 10 Gbps, but even if you have just an x1 slot open you can upgrade to 2.5 Gbps. For 10 Gbps I would personally recommend used Mellanox ConnectX-3s and a DAC cable. For 2.5 Gbps I would look for anything with Intel chips. You can likely pull off this upgrade for less than 100 USD.
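To put rough numbers on the link speeds above, here is a quick back-of-the-envelope sketch in Python. It assumes the link runs at full line rate with no NFS, TCP, or disk overhead, so real copies will take somewhat longer. The function name `transfer_hours` is just made up for this illustration.

```python
# Idealized time to copy a data set over a network link.
# Assumption: the link is saturated at line rate and nothing else
# (protocol overhead, disk speed) slows the copy down.

def transfer_hours(dataset_tb: float, link_gbps: float) -> float:
    """Hours to move dataset_tb terabytes (10**12 bytes) at link_gbps gigabits/s."""
    bits = dataset_tb * 1e12 * 8          # terabytes -> bits
    seconds = bits / (link_gbps * 1e9)    # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    for link_gbps in (1, 2.5, 10):
        for dataset_tb in (1, 3):
            hours = transfer_hours(dataset_tb, link_gbps)
            print(f"{dataset_tb} TB over {link_gbps} Gbps: about {hours:.1f} h")
```

Under those assumptions a 3 TB data set needs the better part of seven hours at 1 Gbps, but well under an hour at 10 Gbps.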
This. If you can’t add enough local storage, and you have reasonably fast remote storage, why not use it?
I’d recommend an InfiniBand card though; they are probably cheaper per GB/s.
If you really don’t have a free PCIe slot, maybe try a PCIe riser, or an M.2 2.5 GbE device (I assume by NVMe you mean an M.2 slot with PCIe).
Keep in mind that you can get quite large and fast M.2 storage devices, and local storage still might be faster. Two SATA SSDs in RAID might also be fast and large enough (you could get 2x 8TB SATA SSDs and stripe them, i.e. RAID 0, for a single 16TB scratch drive).
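For a rough comparison of the local and network options, here is a small Python sketch of theoretical upper bounds. It assumes SATA III (6 Gbit/s signalling with 8b/10b encoding, so at most 600 MB/s of payload per drive), perfect RAID 0 scaling, and Ethernet at line rate. Real drives and real NFS traffic will come in below these figures, and the helper names are invented for the example.

```python
# Theoretical sequential-throughput ceilings in MB/s (decimal megabytes).
# Assumptions: SATA III ports, ideal RAID 0 scaling, Ethernet at line rate.

# SATA III: 6 Gbit/s on the wire, 8 payload bits per 10 line bits (8b/10b).
SATA3_MAX_MBPS = 6e9 * 8 / 10 / 8 / 1e6


def raid0_max_mbps(drives: int, per_drive_mbps: float = SATA3_MAX_MBPS) -> float:
    """Best case for striping: the per-drive ceilings simply add up."""
    return drives * per_drive_mbps


def ethernet_max_mbps(link_gbps: float) -> float:
    """Line rate of an Ethernet link converted to MB/s, ignoring overhead."""
    return link_gbps * 1e9 / 8 / 1e6


if __name__ == "__main__":
    print(f"2x SATA SSD in RAID 0: up to {raid0_max_mbps(2):.0f} MB/s")
    for link_gbps in (1, 2.5, 10):
        print(f"{link_gbps} GbE: up to {ethernet_max_mbps(link_gbps):.1f} MB/s")
```

On paper, two striped SATA SSDs land in the same ballpark as a 10 GbE link, and both are roughly ten times what the existing 1 Gb link can deliver.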