So…that’s a little bit more difficult to precisely ascertain because it depends a LOT on what I am doing/transferring the data for.
i.e. I don’t “copy” the data over the network necessarily all that often. Usually, if I am reading and/or accessing the data over IB, it’s because I am trying to act upon it. (For example, right now, I am running integrity checks on the xz archive files, so the compute nodes in the cluster are reading the data directly off my micro HPC cluster headnode over IB.)
As such, NFS itself, I think, spawns somewhere between 8 and 20 nfsd processes by default. Each of those processes will report 100% CPU utilisation in top, and yet the load average will only be around 0.85, so I don’t really have a particularly good or accurate way to measure CPU utilisation for my deployment.
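For what it’s worth, that thread-count-versus-load-average mismatch can at least be eyeballed with a small script. This is just a sketch assuming Linux procfs (kernel nfsd threads show up with the comm name `nfsd`); the helper names are mine, not anything standard:

```python
# Sketch: compare the 1-minute load average against the number of
# kernel nfsd threads currently running (Linux/procfs only).
import glob

def load_average_1min() -> float:
    # /proc/loadavg: first field is the 1-minute load average
    with open("/proc/loadavg") as f:
        return float(f.read().split()[0])

def count_nfsd_threads() -> int:
    # Each kernel nfsd thread appears as a process whose comm is "nfsd"
    count = 0
    for comm_path in glob.glob("/proc/[0-9]*/comm"):
        try:
            with open(comm_path) as f:
                if f.read().strip() == "nfsd":
                    count += 1
        except OSError:
            pass  # process exited while we were scanning
    return count

if __name__ == "__main__":
    print(f"load(1m)={load_average_1min():.2f}, nfsd threads={count_nfsd_threads()}")
```

On an NFS server this will typically print a thread count well above the load average, since nfsd threads spend most of their time waiting on I/O rather than runnable on a CPU.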
I haven’t tested the headnode CPU utilisation when an application is using said NFSoRDMA though. The headnode hasn’t noticeably slowed down during transfers such that it would’ve caused me to even look at it in the first place.
(CPU utilisation testing when I was setting up my NFSoRDMA wasn’t a part of the testing suite, because it isn’t like I am in a multi-user environment where I am hammering the cluster headnode with I/O requests from multiple users. It’s just me, so unless things slow down significantly, I wouldn’t even think to check what the CPU utilisation is.)
The Highpoint link, sadly, is also a 404 for me.
The lack of scaling that I am seeing is that as I add more spindles to the RAID array, I am not getting a linear increase in read/write performance, regardless of whether that’s sequential and/or random performance.
But as I mentioned, I am using spinning rust, so that might be expected.
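The ideal ceiling you would be comparing against is easy to sketch with back-of-envelope numbers (all figures below are assumptions, not measurements of my array): an ideal stripe scales sequential throughput roughly linearly with spindle count, but each 7200 rpm disk is seek-limited to somewhere on the order of 150-200 random IOPS, so random workloads fall short of that line first:

```python
# Back-of-envelope RAID scaling ceiling (assumed per-disk numbers,
# not measurements): ideal linear scaling with spindle count.
def ideal_scaling(n_disks: int, seq_mbps: int = 200, rand_iops: int = 150):
    """Return (ideal sequential MB/s, ideal random IOPS) for n_disks."""
    return n_disks * seq_mbps, n_disks * rand_iops

if __name__ == "__main__":
    for n in (2, 4, 8):
        seq, iops = ideal_scaling(n)
        print(f"{n} disks: {seq} MB/s ideal seq, {iops} IOPS ideal random")
```

Real arrays land under these numbers because of parity/stripe overhead, controller limits, and (for random I/O especially) seek time, which adding spindles doesn’t reduce.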
(I don’t use NVMe SSDs in my cluster headnode because, yes, they are super fast, but being that I can push hundreds of TBs of data through them, their speed also means that I will burn through their finite write endurance limit that much faster. This is why I have “un”-deployed pretty much ALL SSDs, including SATA SSDs (I would burn through the write endurance limit in hardly any time at all), and why I am (still) using more spinning rust than SSDs of any interface.)
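The endurance maths is simple to sketch with made-up but plausible numbers (both figures below are assumptions, not the spec of any particular drive):

```python
# Rough drive-lifetime arithmetic (illustrative numbers only):
# how long a rated write endurance lasts under a heavy-write workload.
def months_of_endurance(tbw_rating: float, tb_written_per_month: float) -> float:
    """Months until the rated TBW (terabytes written) is exhausted."""
    return tbw_rating / tb_written_per_month

if __name__ == "__main__":
    tbw = 600.0        # hypothetical endurance rating, TB written
    monthly = 100.0    # hypothetical workload: ~100 TB written per month
    print(f"~{months_of_endurance(tbw, monthly):.0f} months to exhaust rated endurance")
```

An HDD has no program/erase-cycle budget to exhaust, which is the whole trade-off being made here.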
My “ideal” solution would likely have been Intel DC Pmem, but that division just got sold off to SK Hynix and Intel killed it, and GlusterFS no longer supports creating distributed volumes out of RAM drives running over the network (a distributed Gluster volume of tmpfs mounts). It USED to be able to do that, but it can’t anymore.
So, I don’t really have any good solutions for high-speed reads/writes that DON’T come with a finite number of write/erase/program cycles, at a decent/reasonable cost as well. (Those solutions are WAYYYY too rich for my blood.)
(I’m running eight HGST 10 TB SAS 12 Gbps 7200 rpm HDDs because they were cheap.)