Not getting 10GbE throughput after upgrade

Hi all - first post here so I appreciate the assistance on this. I tried to find a topic similar to this but wasn’t able to uncover one. If this question is already answered elsewhere please let me know.

I upgraded the networking infrastructure from my NAS to my desktop to 10GbE. My NAS is running TrueNAS SCALE with 8 HDDs at 12TB each in two RAIDZ1 vdevs. I’m using a 9240-8i HBA (https://www.amazon.com/gp/product/B0BXPSQ8YV/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1) in my MOBO’s PCIe Gen 3 x16 slot to split out to the SATA connections I need. The drives I’m using are a mix of Seagate EXOS X16 12TB drives and MDD 12TB drives (4 of each for the vdevs). The 10GbE NIC in my NAS is the TP-Link 10Gb PCIe Network Card (TX401). I’m using a Ryzen 5 3600 as this NAS’ CPU with 32GB of DDR4 memory.

On the desktop, I only had a single PCIe Gen 4 x1 slot left in my MOBO, so I used a PCIe Gen 4 x1 to M.2 adapter (https://www.amazon.com/dp/B0CC9424PD?ref=ppx_yo2ov_dt_b_fed_asin_title) so I could install an M.2 Marvell AQtion 10Gbit NIC (https://www.amazon.com/dp/B0BWSLSK78?ref=ppx_yo2ov_dt_b_fed_asin_title&th=1).

Between these two machines I have CAT6 cabling and a 10GbE switch (https://www.amazon.com/dp/B09LNLMH9Y?ref=ppx_yo2ov_dt_b_fed_asin_title). Both machines are plugged into the 10GbE ports on that switch.

After doing all of this, I’m only getting a maximum throughput of 360 MB/s, according to File Explorer. This is better than my 2.5GbE networking, but I know I’m leaving performance on the table.

The problem I’m having is: I don’t know where my bottleneck is. How would I go about testing my issues and trying to claw back some of that performance?

Thanks!

1 Like

so 2 RAIDZ1 vdevs of 4 drives?
interesting choice

32GB of RAM is a lil small for 96 TB of HDD… like 1/3 of what’s recommended

That’s… carry the 2… about 3 Gbit, plus overhead, so around 4 Gbit of throughput in File Explorer.

Try FastCopy, or robocopy with the /MT flag, to remove those barriers and see the absolute max throughput.
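
Something like this, with the share and destination swapped for your real paths:

  robocopy \\NAS\share\testdir D:\scratch /E /MT:16 /J /NFL /NDL

/MT runs multiple copy threads, /J uses unbuffered I/O (helps with big files), and /NFL /NDL keep the logging from getting in the way.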

Otherwise, you can at most hope for a theoretical max of 450 MB/s, as your 4-drive RAIDZ1 vdevs really only have the throughput of 3 drives… which is 150 MB/s max each. This takes into account the parity calculation and write-out.

If you had all the drives in a single array, you’d still only have a theoretical max of 6 Gb/s, or 750 MB/s… minus the parity calculation and writing, you’re at 500 MB/s at most, or 1/3 faster than your current config.

Long story short:
Start saving for your next NAS build
Redoing your current config would effectively be a new build with all the headaches and heartaches of downtime.

1 Like

I did this because I started with 4 12TB HDDs but later needed more space and created another vdev that I added to my pool.

Yeah this NAS started as a smaller array for media and backups but I needed more space for a lot of the LiDAR point cloud based work I do for my job. Thank you for pointing this out, I should throw some more in there anyways.

Feared this was the case but also this is what happens when you’re inexperienced (my first NAS build) and the scope of the product changes. Thanks for the assist. I’ll try the fastcopy/robocopy step when off work.

I appreciate it

2 Likes

Honestly, save it and plan on ECC for your next build.
Also, aim for 6-drive arrays in RAIDZ2. With 22TB CMR drives now a feasible option, I cannot recommend anything else.

What kind of transfers are those workloads? Because the max speed on HDDs really only comes from very sequential reads or writes; as soon as you start throwing in seek time to get to other sectors mid-transfer, the speed drops way down.

2 Likes

Shouldn’t the bandwidth sum over the vdevs? So, given 150 MB/s and 250 IOPS per drive, and two RAIDZ1 vdevs:

  • Each vdev has three data drives, so 450 MB/s bandwidth (as you said)
  • Each vdev’s IOPS is equal to the slowest drive in that vdev.

Bandwidth and IOPS sum over the vdevs, so 900 MB/s for the pool, and total IOPS roughly equal to the two vdevs’ slowest drives combined.

Since the benchmarking was with File Explorer, that probably counts as mostly sequential? So maybe the low speed warrants some further examination.

iX’s ZFS Storage Pool Layout white paper is a great resource for these things!

2 Likes

Theoretical max - yes. Practical - depends. Primarily on block allocation.

Just a note on what seems most likely for the drives.

  • Exos X16 ST12000NM001G: 245 MB/s, 170 IOPS
  • MaxDigitalData MDD12TSAS25672E: no specs that I can find

Assuming the X16s form one vdev and the MDDs the other, do the two vdevs show different characteristics? I’d expect to mostly see 600+ MB/s for large sequential reads from the X16 vdev, though less than that is reasonable if the tested files sit towards the spindle.

Benching the vdevs locally removes networking considerations. It’s probably not unreasonable to use the X16s as a control on the MDDs.
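
For a quick look without a formal benchmark, something like this while a transfer is running (“tank” being whatever the pool is actually called):

  # -v breaks throughput out per vdev and per disk, refreshed every second
  zpool iostat -v tank 1

A lopsided vdev shows up pretty quickly that way.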

1 Like

Have you tested the performance of the pool itself? I don’t recall the commands offhand, but if you have an SSD you can add as a single-drive pool, you can copy files between it and the main pool.
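
If fio is handy, something along these lines run on the NAS itself should also do it (the dataset path is a placeholder, and the read is sized past the 32GB of RAM so ARC can’t serve it all):

  # sequential write into the pool
  fio --name=seqwrite --directory=/mnt/tank/bench --rw=write --bs=1M --size=16G --ioengine=psync --end_fsync=1
  # sequential read, sized larger than RAM
  fio --name=seqread --directory=/mnt/tank/bench --rw=read --bs=1M --size=64G --ioengine=psync

Delete the test files afterwards.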

Use iperf to test your connectivity. You’re using Aquantia on both sides which can be problematic. I switched to a Mellanox for that reason. Make sure to test both single and multiple streams and both directions.
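
For example, assuming iperf3 is available on both ends (the IP here is just a placeholder for the NAS’s address):

  # on the NAS
  iperf3 -s
  # on the desktop: single stream, then 4 parallel streams, then reverse direction
  iperf3 -c 192.168.1.50
  iperf3 -c 192.168.1.50 -P 4
  iperf3 -c 192.168.1.50 -R

If a single stream falls well short of line rate but -P 4 gets close, that points at a per-stream/CPU limit rather than the cabling or NICs.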

What kind of files are you testing with? With only two vdevs you won’t have a lot of iops, but you should have okay sequential performance. If you’re not using large files then you won’t see as high speeds.

Since you mentioned File Explorer, I assume you’re testing using Windows and SMB? SMB is single-threaded and needs high clock speeds for fast transfers. That’s why I switched to using NFS, so I’m not bottlenecked by a single core.

What are your sync settings? Are you copying to or from the NAS?

What firmware is your HBA running? Is it flashed to a recent IT version?
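
If you’re not sure how to check: from a shell on the NAS, assuming the LSI sas2flash utility is present (it isn’t always), or failing that the kernel log:

  sas2flash -list
  dmesg | grep -i mpt

Either should show the firmware version the card is running.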

32GB is fine for a home NAS that’s just doing basic storage. The 1GB per TB that you’re quoting was brought about in reference to people wanting to do things like iSCSI, etc.

3 Likes

Thanks all for the help here. I’m going to respond to all that I can and take your suggestions into account.

I apologize if I don’t have the vocab to properly answer this one, or if I give more information than you’re asking for, but the transfers and work I’m doing off the NAS are primarily: large file storage (video, GIS raster data, LiDAR point cloud data) and data analysis (e.g. using ML to classify features in LiDAR point clouds). A typical movie is 3-4GB and a typical LiDAR file is 500MB-1.5GB. I have tens of thousands of LiDAR files in a directory I use for the ML example.

For the ML workload, I’m using the Nvidia A6000 I’ve got in my desktop and ArcPro to conduct the work. I noticed that when I read the same file from my SATA SSD in my desktop, I was able to process the files ~20% faster than when over my 2.5GbE network. So faster throughput/access to the data definitely helps the process. Even with the 10GbE upgrade and my throughput issue, I’ve cut my process time down by a good 5-8%. So clawing back more performance definitely helps with that haha

When transferring either 1) large individual files or 2) many files, I hit a 360MB/s throughput ceiling. Checking the TrueNAS Dashboard, I’m not getting anywhere near 20-40% CPU usage when doing those transfers, either.

Sorry, but how would I test my vdevs individually? Thanks for the suggestion I really appreciate it

Great questions. Let me check and get back to you. Thank you so much

You’re not going to see good performance with the LiDAR files. You’d be better off putting an NVMe scratch drive in your desktop, working with the files there, and only using the NAS for storage/archive. You’ll see a much bigger performance increase from that than from any network improvement.

Don’t look at the percentage on the dashboard. Look at the usage in the Stats Per Thread. You’ll likely see one thread at 100% and the others not doing much.
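
If poking through the reporting UI is a pain, roughly the same view from a shell (assuming smbd is what’s serving the copy):

  # -H lists individual threads instead of whole processes
  top -H
  # or narrow it down to just the Samba processes
  top -H -p $(pgrep -d',' smbd)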

Before you worry about the individual vdevs, you need to test the pool itself without the network.

2 Likes

Yes, BUT he has a 6Gb HBA card (9200 series LSI)

Or high throughput. RAM is a cheap throughput booster.

1 Like

Drive performance, definitely, workload performance perhaps not so much. My experience is many LiDAR processing tools can’t saturate 1 GbE, much less effectively utilize a 3.5. Within that, there’s 3+ orders of magnitude variation with file format and software stack. Also, tens of thousands of ~1 GB tiles is tens of TB, so syncing to local NVMe wants either 1) good cache locality in the training, which in my experience is often not the case since the point clouds get accessed as a multi-TB stream, or 2) substantial spend on enterprise SSDs.

@Waltz157, in addition to benching on the NAS I’d suggest workload profiling to understand the IO patterns and what read rates the stack’s capable of utilizing from where. FWIW, all of the available tools in the area I’m working in were so non-performant I ended up writing a LiDAR file parser and threading engine. In general, a single read thread can saturate the pool (and thus oversubscribe 10 GbE) but is easily quite sensitive to how much other work’s done on the IO thread versus handed off to compute threads. Most code’s also likely to be using synchronous IO and thus sensitive to latency.

My experience with ESRI’s awful. GDAL is better but is really not built for speed over virtual rasters. My solution was to write my own virtual raster implementation but, even throwing a lot of threads at it, GDAL still stalls out around 3 GB/s.

Also, watch DDR bandwidth. At <1 GB/s I’d hope it’s not an issue but it’s been my experience a lot of code is written without consideration of the costs of L2 or L3 cache misses, sometimes with multithreaded workloads ending up provoking dramatic L3 contention between cores. So sometimes low IO transfer rates end up being a side effect.

6 Gb for each of the six data drives is ~3.3 GB/s potential, within the 9240-8i’s PCIe 2.0 x8 ~4 GB/s ability. Even running x2 it should be able to mostly keep up with a single vdev.

(As an aside, the MegaRAID 9240-8i’s not an HBA, though presumably it’s flashed to IT mode here.)

2 Likes

Let’s go for a dumb question: did you install drivers for that specific network card in Windows? The default Windows drivers can be wonky.

You have a PCIe 4.0 x1 connection, but the AQC107 network chip seems to run at PCIe 3.0, and a 3.0 x1 link is still roughly 7.9 Gbit if it correctly uses the full lane, a bit short of 10GbE line rate but far above what you’re seeing. So this is probably not the reason.

Have you tested how fast the upload to the server is?

RAM doesn’t boost throughput unless the data has already been cached. But if you don’t re-access the same data, the extra RAM is useless and 32GB is fine.

2 Likes

hold on, I think you misread his config
He’s running a 9240-8i
That’s 6 Gigabit/second throughput per port (more than his drives for sure)
PCIe 2.0 x8 interface giving 5.0 Gbps max = 625 MB/second
Now take 3/4 of max theoretical to account for reading from all 4 drives but only serving data from 3 in RAIDZ1 and you’re at ~460 MB/sec
VERY close to his theoretical max for reading from 3 HDD in parallel

He’s getting 360 of the theoretical max ~460 MB/sec in File Explorer; that’s a job well done.

His current setup is well optimized and a good configuration with no significant bottlenecks.

To upgrade, would require such a significant investment and hardware upgrade / reconfiguration that the only way up is a new build.

Yes, an add-in card with some NVMe drives would bump performance, but it’s not necessarily the best path given other hardware limitations.

PCIe 2.0 x8 is about 4-5 GB per second, not 625 MB/s. So more than enough.

1 Like

You’re correct on that, it’s a ~5 GB/s PCIe interface, but the card itself is not passing that.

I spent 18 hours last week transferring 19 TB sequentially through a 9240-8i connected to 4 drives in RAIDZ2 and can comfortably say he’s doing great.

1 Like

Most sequential transfers aren’t going to be in ARC. Regarding the HBA, I can pull 7-8 Gb/s with one and 32GB of memory. It’s been a while since I’ve looked at the specifics.