Tips for improving iSCSI read performance in TrueNAS Core

I’m currently running TrueNAS on an R320 (E5-2407, 12GB DDR3, 10Gb network, and an LSI 9202-16e HBA) hooked up to a DS4243 shelf, with a single RAIDZ1 vdev of 4x 4TB drives. I’m using this pool exclusively for a Steam iSCSI device and want to improve its performance now that I’ve switched to a 10Gb backhaul. I’m seeing the same read speeds I was getting over 1Gb, about 105MB/s, while getting about 225MB/s on writes. (My main PC is limited to 2.5Gb Ethernet.)

Are there any tunables or other options I should be looking at to get the read performance to at least beat a locally installed HDD (170MB/s) before upgrading things like the CPU or RAM?

Have you done any iperf testing between the two systems?
What volblock size did you use?
During use, how does the cpu/mem/disk usage look?
How about CPU wait %?
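For reference, roughly how I’d run those checks myself; the IP address below is just a placeholder for the TrueNAS box, assuming iperf3 is installed on both ends:

# on the TrueNAS box
iperf3 -s

# on the client; the -R run reverses direction so you test both ways over the wire
iperf3 -c 192.168.1.50 -t 30
iperf3 -c 192.168.1.50 -t 30 -R

While a transfer is running, top -SP and zpool iostat -v 1 on the TrueNAS shell cover the CPU and per-disk side of things.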

Just ran a 30-second iperf, and I’m getting 2.1Gb/s minimum, averaging about 2.25Gb/s. As for the volblocksize, I believe it is 4096, as I used the default “Modern OS” option when setting up the zvol.

So far it looks like the CPU stays under 50% in any case, and during reads it was in the 15-20% range overall.

Memory is running at about 8GB for the ZFS cache, and it’s fully used. I’m guessing the amount of RAM is the main bottleneck, since many of the files in newer games are going to be near or in excess of 8GB.

You might also want to have a look at gstat
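Something like this gives a once-per-second view of per-disk throughput, busy % and latency (flags from the FreeBSD gstat man page; the da[1-4] filter is just an example matching your data disks):

gstat -p -I 1s
gstat -I 1s -f 'da[1-4]'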

If the link speed is only 2.1-2.5Gb/s, then the performance isn’t that much lower than the link, is it?
Might want to address the link speed first and see if you get better throughput?

Have you tested the speed you can get locally, and then tried to improve the link speed?

This is not how ARC works. First you have to subtract metadata from the 8GB, then the dirty data that’s not yet written to disk, then some headers, and you end up with roughly 5-6GB for the most frequently and recently used files. Without compression, that’s pretty much the only amount of data ready to be pulled straight from memory. Higher compression effectively gets you more memory (I have a 1.34 compressratio on my Steam dataset). The ratio is worse for media, but much higher on text files (like an OS).
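You can check what you’re getting on the zvol directly; the pool/zvol names here are placeholders:

zfs get compression,compressratio tank/steam
# lz4 is basically free on modern CPUs, but only newly written blocks get compressed
zfs set compression=lz4 tank/steam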

Definitely too little for caching game files. Consider upgrading memory and/or using an SSD as L2ARC (even a cheap $25 250GB drive will do).
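In TrueNAS you’d add the L2ARC through the GUI as a Cache vdev on the pool, but for reference the underlying operation is roughly this, with placeholder pool/device names:

zpool add tank cache ada1
zpool iostat -v tank 1   # the cache device then shows up with its own read traffic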

Higher write speed than read speed is very unusual, especially considering you have RAIDZ going. 225MB/s writes are almost too good to be true (did you copy the same file more than once for testing?). It smells fishy, and my gut says it isn’t related to ARC, but I can’t think of another reason.
I would expect 100-200MB/s sequential writes, 450-600MB/s sequential reads, and memory speed on a cache hit (so basically limited by networking).

edit: limited memory in a multi-TB pool with only a zvol of possibly small volblocksize could lead to an inflation of metadata that is just crippling the ARC.

/proc/spl/kstat/zfs/arcstats gives you a nice overview of what’s currently stored in the ARC. arc_meta_used and arc_meta_max are the relevant values here. Having to pull metadata from the HDDs could be the reason for the underwhelming read speed.
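TrueNAS Core is FreeBSD, so the same counters live under sysctl rather than /proc; roughly:

sysctl kstat.zfs.misc.arcstats.arc_meta_used
sysctl kstat.zfs.misc.arcstats.arc_meta_max
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max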


Something else worth considering is what type of filesystem you’re using; some are much more I/O intensive.

Just for fun I tried iSCSI on my FreeBSD 13.1 boxes, and I was actually positively surprised by the result. I can list all the technical details if you want, but here are the essentials:

Target (server):
Dell T20 (Intel G3220, 12GB of RAM, lightly loaded)
SSD: Sandisk Ultra 3D 1TB (SATA SSD, pretty much the same as WD Blue 3D), ZFS with default settings (no compression), dedicated
Intel GbE NIC
Block size specified: 4096 (see the ctl.conf sketch after the initiator details)

Initiator (client):
PINE64 RockPro64
Intel GbE NIC
File system used on iSCSI target: UFS
Default settings for UFS and iSCSI target
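Not my literal config, but the FreeBSD-native target side (ctld) boils down to an /etc/ctl.conf along these lines, with the 4096-byte block size set per LUN; the names, addresses and paths below are placeholders:

portal-group pg0 {
    discovery-auth-group no-authentication
    listen 0.0.0.0
}

target iqn.2022-07.local.example:testlun {
    auth-group no-authentication
    portal-group pg0
    lun 0 {
        path /dev/zvol/tank/iscsitest
        blocksize 4096
    }
}

On the FreeBSD initiator it’s then attached with something like:

iscsictl -A -p 192.168.1.20 -t iqn.2022-07.local.example:testlun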

root@:/usr/ports # dd if=/dev/zero of=dummy-file bs=4M count=7680
7680+0 records in
7680+0 records out
32212254720 bytes transferred in 424.766804 secs (75835151 bytes/sec)

root@:/usr/ports # dd if=dummy-file of=/dev/null bs=4M count=7680
7680+0 records in
7680+0 records out
32212254720 bytes transferred in 277.146690 secs (116228178 bytes/sec)

Running dd (write) locally on the server on the SSD:

root@hanekawa:/vault2/buildshares # dd if=/dev/zero of=dummy-file bs=4M count=7680
7680+0 records in
7680+0 records out
32212254720 bytes transferred in 300.735652 secs (107111526 bytes/sec)

Looking at gstat I can see that the SSD is getting bogged down by writes in a somewhat bursty manner.

I can also mention that this is much faster than what I experienced using UFS under Hyper-V with another SSD (same model), which would also hammer the device (NTFS as the native file system) when working with rather large Git repos. Switching to ZFS in the guest did wonders and used a lot less I/O; I haven’t dug into why, but there was also a lot less I/O overhead when writing. It’s also a lot faster on the RockPro64, despite being somewhat bottlenecked by the SoC and using UFS, than the Hyper-V VM running on Windows 10 with an i7-3770 CPU and 8GB of RAM dedicated to the VM.

I just checked the TrueNAS wizard settings for zvol creation: the “modern OS 4096 sectors” option results in a 16k volblocksize.

I’m getting around 1GB of metadata per 10TB in my movies dataset (1M recordsize). Scaling that down to your 16k blocks and assuming 10TB of data: 1024k/16k = 64 times as many blocks, so roughly 64 x 1GB = 64GB worth of metadata (rough quick-and-dirty estimate).

Yeah… I can see performance problems there. The amount of metadata vastly exceeds your ARC, and there is no L2ARC or special vdev to hold it either. To what degree this is causing your problems I can’t tell, but it certainly slows down pool performance. The recommended and default recordsize for datasets is 128k, which is why 8GB of memory usually “works” to some degree. But with a 16k zvol, metadata is just all over the place.

16GB+ of memory and an L2ARC will solve this problem. Or you crank down the amount of metadata by using a higher volblocksize, which requires creating a new zvol and copying the data over to it. I’m not sure whether a 128k volblocksize is any good for this workload; it may cause problems of its own. But even with 128k, the ARC will still be limited to mostly caching metadata, and you want the ARC serving game data at lightning speed, not drowning in metadata.
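If you do go the recreate route: volblocksize is fixed at creation time, so it’s a new zvol plus a block copy, and you need enough free space for both to exist at once. A rough sketch with placeholder names and sizes (afterwards you’d point the iSCSI extent at the new zvol in the GUI):

# create a new sparse zvol with a larger block size
zfs create -s -V 4T -o volblocksize=64k tank/steam-new
# block-copy the old zvol's contents into it
dd if=/dev/zvol/tank/steam of=/dev/zvol/tank/steam-new bs=1M status=progress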


I’ll throw this out just to avoid any possible confusion a ZFS newbie might get from reading the above:

For datasets, recordsize is dynamic: blocks can be any power-of-two size at or below the recordsize, down to the ashift.

For zvols, the volblocksize is static: all blocks are the same size, since ZFS is emulating a (virtual) block device. A 1K file written to a 128K-volblocksize zvol occupies a full 128K block, and the entire 128K block has to be read or written, which increases access latency but improves sequential performance on large files. As mentioned, the metadata that has to be stored and accessed is much smaller, so overall performance could end up better or worse.

The extra wasted space may or may not matter, depending on how big or small most of your files are. Most things that eat up space on a regular person’s drives these days are going to be in the MB range, so the change may not even be noticeable from a space-used perspective.
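And if it helps a ZFS newbie follow along, checking which of the two applies is just (placeholder names):

zfs get recordsize tank/somedataset
zfs get volblocksize,volsize tank/somezvol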


I definitely appreciate all of the info here, particularly since this is my first go at TrueNAS. I just added a spare SSD I had lying around (120GB 850 EVO), set it as a cache disk, and tested with a default CrystalDiskMark run:


CrystalDiskMark 8.0.4 x64
[Read]
SEQ 1MiB (Q= 8, T= 1): 119.535 MB/s [ 114.0 IOPS] < 69761.71 us>
SEQ 1MiB (Q= 1, T= 1): 46.132 MB/s [ 44.0 IOPS] < 22702.15 us>
RND 4KiB (Q= 32, T= 1): 172.983 MB/s [ 42232.2 IOPS] < 754.55 us>
RND 4KiB (Q= 1, T= 1): 20.633 MB/s [ 5037.4 IOPS] < 198.12 us>
[Write]
SEQ 1MiB (Q= 8, T= 1): 258.167 MB/s [ 246.2 IOPS] < 31114.48 us>
SEQ 1MiB (Q= 1, T= 1): 197.346 MB/s [ 188.2 IOPS] < 5113.45 us>
RND 4KiB (Q= 32, T= 1): 89.812 MB/s [ 21926.8 IOPS] < 1423.92 us>
RND 4KiB (Q= 1, T= 1): 14.680 MB/s [ 3584.0 IOPS] < 265.99 us>
Profile: Default
Test: 1 GiB (x5) [S: 16% (1744/10691GiB)]
Mode: [Admin]
Time: Measure 5 sec / Interval 5 sec
Date: 2022/07/05 18:51:00
OS: Windows 10 Professional [10.0 Build 19044] (x64)

This seems to be a bit better than without the cache.

I ran gstat while loading up a game, and I’m not quite sure what to pay attention to aside from the read kBps per disk (da1-da4 are the vdev I’m working off of); at one point I saw five-digit read numbers on each disk.

What should I be looking for while running apps off of the iSCSI to evaluate the next bottleneck?

Out of curiosity, does exFAT provide better performance?

Apologies, I have not had a chance to test since adding the SSD as a cache, which did make things much more consistent. I assume exFAT would be on the Windows side, correct?

CPU is fine. You need to sort out your RAM. At least 16GB, 32 desirable.
However, you should also test your setup with a standard SMB share to see what that’s capable of. Can you test the speed of your zvol from within the TrueNAS machine itself, maybe with a VM of some kind?
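For the local test, a rough way to read the zvol sequentially from the TrueNAS shell and take the network out of the picture (placeholder pool/zvol name):

# 10GiB read, large enough not to be served entirely from the 8GB ARC
dd if=/dev/zvol/tank/steam of=/dev/null bs=1M count=10240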

You will get more drive speed from a stripe of two mirrors than from RAIDZ1. Hard drives only do about 150MB/s most of the time, so by striping you might get 300MB/s; heck, if it’s only games, why not stripe all four and get the maximum of roughly 600MB/s? RAM will make a huge difference. Don’t bother adding an SSD cache; it may even make things slower.
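For reference, the layouts being compared look roughly like this at pool-creation time (placeholder pool/disk names; either way it means destroying and rebuilding the pool, not converting it):

# two striped mirrors: ~2x the read/write streams, one disk per mirror can fail
zpool create tank mirror da1 da2 mirror da3 da4

# four-way stripe: maximum throughput, zero redundancy
zpool create tank da1 da2 da3 da4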

Yes, on the Windows side of things