Weird speed issues with SSDs under Linux (TrueNAS SCALE)

Ok so I have a completely unreasonable TrueNAS setup for a home environment, but it’s my money and I’ll waste it how I want.

Anyway, for the SSDs backing all the non-data vdevs I went with M.2 consumer drives for dedup (1TB mirror) and metadata (2TB mirror), plus U.2 Optanes (partitioned 10GB/40GB/remainder) for log/L2ARC and a small separate apps vdev for Docker and any write-heavy apps, to give the consumer SSDs a break.

For the M.2s I have two of these adapter cards and for the U.2 Optanes I have one of these, and everything ostensibly works great. Bifurcation works as best I can tell, all drives are seen by TrueNAS, all drives are performing their intended roles just fine, and network speeds are fine.

But I wanted to test local speeds too, so I thought I'd do a quick-and-dirty read test by dd'ing 20GB from a partition on each drive into /dev/zero and seeing how fast it went, just to make sure everything was working. Here are the results:

(Note: the Evo drive is from a previous setup that's no longer in use; I just haven't bothered to remove it, to protect the thermal pad on that slot for future use.)

sysctl vm.drop_caches=3
dd if=/dev/nvme0n1p3 of=/dev/zero bs=1024k count=20480 ← U.2 905P Optane 960GB
dd if=/dev/nvme1n1p3 of=/dev/zero bs=1024k count=20480 ← U.2 905P Optane 960GB
dd if=/dev/nvme2n1p3 of=/dev/zero bs=1024k count=20480 ← U.2 905P Optane 960GB
dd if=/dev/nvme3n1p3 of=/dev/zero bs=1024k count=20480 ← U.2 905P Optane 960GB
dd if=/dev/nvme4n1p1 of=/dev/zero bs=1024k count=20480 ← M.2 SABRENT 1TB Rocket
dd if=/dev/nvme5n1p1 of=/dev/zero bs=1024k count=20480 ← M.2 SABRENT 1TB Rocket
dd if=/dev/nvme6n1p1 of=/dev/zero bs=1024k count=20480 ← SAMSUNG 980 SSD 1TB
dd if=/dev/nvme7n1p1 of=/dev/zero bs=1024k count=20480 ← SAMSUNG 980 SSD 1TB
dd if=/dev/nvme8n1p1 of=/dev/zero bs=1024k count=20480 ← SABRENT 2TB Rocket 4 Plus
dd if=/dev/nvme9n1p1 of=/dev/zero bs=1024k count=20480 ← SABRENT 2TB Rocket 4 Plus
dd if=/dev/nvme10n1p1 of=/dev/zero bs=1024k count=20480 ← SABRENT 2TB Rocket 4 Plus
dd if=/dev/nvme11n1 of=/dev/zero bs=1024k count=20480 ← SAMSUNG 970 Evo SSD 1TB

21474836480 bytes (21 GB, 20 GiB) copied, 8.7577 s, 2.5 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 8.77126 s, 2.4 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 8.77417 s, 2.4 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 12.7179 s, 1.7 GB/s ← slow
21474836480 bytes (21 GB, 20 GiB) copied, 10.3195 s, 2.1 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 10.1295 s, 2.1 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 10.978 s, 2.0 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 16.1029 s, 1.3 GB/s ← slow
21474836480 bytes (21 GB, 20 GiB) copied, 8.39923 s, 2.6 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 8.46526 s, 2.5 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 8.41397 s, 2.6 GB/s
21474836480 bytes (21 GB, 20 GiB) copied, 9.68125 s, 2.2 GB/s ← may be slow but probably not
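
(Side note if anyone repeats this: even with the drop_caches up top, these dd reads still go through the page cache, and I only dropped caches once for the whole batch. A cleaner variant would be direct I/O, something like the line below; dd's iflag=direct bypasses the page cache entirely, and the 1M block size satisfies O_DIRECT's alignment requirement. I'm also using /dev/null here since it's the more conventional sink than /dev/zero, though both discard writes.)

dd if=/dev/nvme3n1p3 of=/dev/null bs=1024k count=20480 iflag=direct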

So everything is fine except nvme3 and nvme7, and possibly nvme11, which are all unusually slow. nvme11 is the Evo drive, so it may actually be fine, but the 4th Optane and the 2nd 980 are both unusually slow relative to their neighbors.

Anyone have any clue why the 4th drive on each card, from different manufacturers, in different form factors (M.2 vs. U.2), and even connected to different CPUs (it's a 2-socket board), would be running so much slower?

Is there any way to verify that these drives are getting all 4 PCIe lanes? Feels like they’re missing one or two.
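
One way I could probably check: lspci reports both the capable and the negotiated link width per device. Roughly like this, where the bus address in the second command is a placeholder for whatever the readlink prints for the controller in question:

readlink -f /sys/class/nvme/nvme3/device
lspci -vv -s 0000:xx:00.0 | grep -E 'LnkCap|LnkSta'

LnkCap is what the device supports, LnkSta is what actually got negotiated; something like "Width x2 (downgraded)" in LnkSta on nvme3 or nvme7 would confirm missing lanes.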

I haven't write-tested them; I guess I could offline them in ZFS, do a write test, then online them and resilver.
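
If I do go that route, the write leg would presumably look something like this; the pool name and device label here are placeholders for whatever zpool status shows, and the dd is destructive to everything on that partition, hence the offline first:

zpool offline tank nvme3n1p3
dd if=/dev/zero of=/dev/nvme3n1p3 bs=1024k count=20480 oflag=direct
zpool online tank nvme3n1p3

Onlining the device should then kick off the resilver of that mirror member.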

dmidecode wasn’t much use as it doesn’t seem to break out bifurcated devices.
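
A scriptable alternative I could try, since the PCI core exposes the link attributes per device in sysfs (a sketch, untested on this exact box):

for n in /sys/class/nvme/nvme*; do
  pci=$(readlink -f "$n/device")
  echo "$(basename "$n"): width x$(cat "$pci/current_link_width") of x$(cat "$pci/max_link_width"), $(cat "$pci/current_link_speed")"
done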

Moving drives around in this setup isn't terribly convenient for testing whether the problem follows the particular x4 slot on the adapter card or the drive itself, and I've only got one empty x16 slot I could move a card to, so I'm just hoping someone else might have a less invasive idea.

You have garbage collection etc. running in the background, so that might affect your results, especially if you only ran one iteration.

Oh, I should have mentioned I ran this numerous times, particularly on the Optanes, and the Optanes don't exactly do GC in any meaningful way.

Oh, and these are strictly read tests, so no GC in any sense.
