Hello all, and hopefully a forum topic about this subject is allowed here.
I have over 1000 torrents in my torrent client (qBittorrent), but only a maximum of 50 are active at one time (up to 2TB of data). I've noticed that when a lot of torrents are seeding, my 4 Docker VMs that connect to the NAS via NFS become very slow; sometimes just doing an 'ls -l' on the mount directory can take 3+ minutes to respond. I think it's because the torrenting is using a lot of the read IOPS of my 4 x 20TB pool (striped mirrors), and the pool is very, very slow until I stop seeding as much.
And so I was thinking: should I buy 1 or 2 drives and create a dedicated VM (on Proxmox) for torrent seeding, with a RAID 0 or JBOD storage pool just for torrenting? That way the read IOPS for those specific drives would only be needed for torrenting and wouldn't slow down my NAS ZFS pool.
Or should I tackle it a different way and get a 1.5TB Optane drive (P4800X) and use that as an L2ARC for the pool (I have 74GB RAM for my TrueNAS VM)? Hopefully the actively seeding torrents would then mostly be served from the L2ARC and barely touch the HDDs. It would also help Plex playback if I have 5+ concurrent streams, and since it's Optane I wouldn't have to worry about wearing out the SSD.
Or should I invest in a 2- or 3-way mirrored special metadata vdev for the pool? I don't think small blocks would benefit my workload, since the pool is filled with big files, so I'd only need 375GB-400GB Optane drives (I eventually want to get to 120TB via an 8 x 20TB RAIDZ2). Or I could use PCIe 4.0 M.2s, even though latency, not sequential speed, is key here; they're cheaper and higher capacity.
Also, I should add that my upload speed from my ISP is only 100Mbps, but I will be upgrading to 1Gbps soon.
TL;DR
For torrent seeding, is an L2ARC a good choice? Or a 2- or 3-way mirrored special metadata vdev? Or just buying 1 or 2 drives and putting them in a Linux VM with a dedicated ext4 storage pool for seeding? Or is there a better solution?
Thanks for reading, and looking forward to the responses.
Thanks for seeding Linux ISOs. We all appreciate it!
You’ve got three potential pinch points here:
The disks themselves may be taxed to capacity
The NAS's CPU may be saturated
Your NIC may be saturated.
First things first: start seeding, then check your system stats. Open a terminal and use top to check both CPU and IOWAIT. IOWAIT is basically the kernel waiting for a device to respond with data (this could be a read payload or a write confirmation).
If your IOWAIT is higher than 5ish, your initial hypothesis is correct and your torrents are saturating the disk.
If your CPU is maxed out, well… That’s your answer.
If neither, it’s very likely network saturation. That’s harder to diagnose, but we can work through that if you’re in that category. My money’s on disk being saturated.
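If it helps, here is one way to check all three from a shell (iostat comes from the sysstat package, and the interface name below is just an example):

```
# CPU summary; the "wa" value in the %Cpu(s) line is IOWAIT
top -bn1 | head -5

# Per-disk utilization every 5 seconds; %util near 100 means that disk is saturated
iostat -x 5

# Rough NIC throughput/error counters (replace eth0 with your interface)
ip -s link show eth0
```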
Personally, I keep all my torrenting off my spinning pool because of the disk usage it creates. If you do want to torrent direct to ZFS, an L2ARC can be very helpful, though a metadata device isn’t a bad option either.
The problem here is the random reads.
In an optimal world, a 7200RPM disk will do between 80 and 120 IOPS. A SATA 3 SSD, by comparison, will do upwards of 100k IOPS, depending on the drive.
Each torrent chunk costs at least 2 IOPS: read the metadata, then read the file. This happens on each disk mirror, with a timeout. So with 50 torrents operating at a time, the pool is quite saturated.
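To put rough numbers on it (my assumptions, not measurements): two mirror vdevs of 7200RPM disks give the pool somewhere around 160-240 random read IOPS in total, and 50 torrents each burning 2+ IOPS per chunk request will chew through that quickly. You can watch it live with zpool iostat; the pool name tank here is a placeholder:

```
# Per-vdev read/write IOPS, refreshed every 5 seconds
zpool iostat -v tank 5
```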
An easy solution would be the L2ARC. First reads of the torrents would be slow, but they'd be cached for subsequent reads. That's good.
A metadata device would also go far in improving this. It's hard to say which would be better; frankly, I haven't played with metadata devices yet, so I can't say for sure.
The CPU I use is an i5-12600 (6 P-cores). Based on the Proxmox dashboard it rarely goes above 80%, but I'll double-check the individual VMs.
Network-wise, that could potentially be a problem. I use Proxmox's paravirtualised bridge for the TrueNAS VM and the Docker VMs, so maybe there's a bottleneck somewhere that the basic CPU load percentage doesn't show, or the network side could be fine since it's a newish CPU.
There have been a couple of times where stopping the seeding didn't magically make the NFS responsive again, but the majority of the time it is linked to seeding/rechecking a torrent on completion.
Ayeee, a fellow seeder! And yes, sharing Linux ISOs is a fun way to fully use otherwise idle hardware and help the community.
For a few months, I did the initial download to my NVMe pool and then switched it over to my HDD pool once it was done, which was working well (especially when rebuilding 10TB+ of Linux ISOs over a few days). But once I started my journey into private trackers and seeding became a crucial part, the initial download/recheck wasn't the main problem anymore.
Okay, brilliant; what I thought was happening in theory seems to be the case. I thought maybe ZFS would be slightly different in how it chooses what goes to ARC. Especially if you only seed to one person, it could be a sequential read that doesn't require putting the file into the ARC and instead hits the HDDs directly, taking up IOPS. Oooooor maybe I'm overthinking this and the L2ARC will sort it out fine.
Happy to help! ZFS is one of those interests of mine, so I definitely don’t mind nerding out on it when someone can benefit!
It will definitely help things. I would recommend grabbing a very high endurance SSD for your L2ARC. ZFS absolutely blows through write endurance for L2ARC, so plan accordingly or expect to replace the disk after a couple years.
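As a sketch, adding the SSD as a cache vdev and keeping an eye on its wear would look something like this; the pool name and device paths are placeholders, and smartctl comes from smartmontools:

```
# Add the SSD as an L2ARC (cache) vdev -- device path is a placeholder
zpool add tank cache /dev/disk/by-id/nvme-YOUR_SSD

# Watch NVMe wear over time: "Percentage Used" and "Data Units Written"
smartctl -a /dev/nvme0 | grep -iE "percentage used|data units written"
```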
This is the CPU for the NAS?
If so, it could definitely be pinning the cores assigned to the NAS.
The plot thickens.
This is a circumstance where resource allocation per VM would be incredibly helpful. See, ZFS really likes RAM. If you've got a boatload of disk but little RAM, the VM could be starved and take some time to recover.
Given that you have a lot of moving parts, seeing resource utilization graphs seems like the logical next step.
I see L2ARC promoted a lot when it comes to ZFS, but I think it is overrated.
Sure, for some very niche use cases it works great, but I think the most appealing thing about L2ARC is that you only need one SSD, and the SSD can die without losing data.
What people often overlook when it comes to L2ARC:
L2ARC needs ARC!
L2ARC can only cache stuff that has been evicted from ARC to begin with
L2ARC has an ingress limit by default
Knowing that, you should first check ARC before you think about doing L2ARC.
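Concretely, you can check both points before spending money. The tunable path below assumes OpenZFS on Linux (on TrueNAS CORE it's the vfs.zfs.l2arc_write_max sysctl instead):

```
# ARC size, hit rate, and eviction stats
arc_summary

# Default L2ARC ingress limit in bytes per second (8 MiB/s out of the box)
cat /sys/module/zfs/parameters/l2arc_write_max
```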
There are also potential pool problems. Since you use a zvol for your torrents, you are bound to a fixed volblocksize. That has (depending on the pool) many implications: you could suffer from fragmentation, read/write amplification, and wasted storage due to padding and pool layout if you use RAIDZ and not mirrors.
That is basically reading metadata. That use case can be accelerated enormously by a special vdev. I used two old spare 500GB SSDs I had and really like the results.
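For reference, a minimal sketch of what that looks like, assuming a pool named tank and placeholder device paths. Keep in mind a special vdev holds pool-critical data, so it must be at least as redundant as the rest of the pool:

```
# Add a mirrored special (metadata) vdev -- device paths are placeholders
zpool add tank special mirror /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B

# Metadata only by default; small data blocks are opt-in per dataset
zfs get special_small_blocks tank
```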
2 x 250GB Samsung 870 EVO - boot pool for Proxmox, only used for storing VM ISOs
2 x 1TB Seagate FireCuda 530 - ZFS mirror (RAID 1) where all my VMs are stored
2 x 2TB Seagate FireCuda 530 - ZFS mirror (RAID 1) passed through to the TrueNAS VM
4 x 20TB shucked WD drives - passed through via an LSI 9207-8i HBA, striped mirrors (RAID 10)
1 x 10TB Seagate IronWolf - also connected to the HBA, 'backup drive', although I'm very tempted to use it as a seeding drive for a separate torrenting VM
Proxmox VM setup (the host has 128GB RAM, 6C/12T):
5 LXC containers - autosnap, npm, pihole, adguard, uptime kuma, tailscale (shoutout tteck ^); they barely use any resources
Truenas VM - 74 GB RAM, 8 Cores (threads)
Proxmox backup server VM - 4GB RAM, 4 Cores (threads)
Debian Docker VM 1 (network containers) - 2GB RAM, 2 Cores (Threads)
Debian Docker VM 2 (arr and friends) - 3GB RAM, 6 Cores (Threads)
Debian Docker VM 3 (vpn + containers + Qbit) - 3GB RAM, 6 Cores (Threads)
Debian Docker VM 4 (plex - intel igpu passed through, tdarr and other containers that help plex) - 8GB RAM, 6 Cores (Threads)
Debian Docker VM 5 (nextcloud, syncthing) - 4GB RAM, 4 Cores (Threads)
Debian Docker VM 6 (runs my personal scripts and vscode server) - 3GB RAM, 4 Cores (Threads)
HAOS (shoutout tteck again) - 2GB RAM, 2 Cores(Threads)
Windows VM - 10GB RAM, 8 Cores(Threads)
110-113GB RAM used in total
All these Debian VMs connect to TrueNAS via NFS
I ran top on the VM that is sometimes slow to connect to NFS, and it shows this:

```
Tasks: 297 total, 1 running, 296 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 2869.0 total, 329.6 free, 1715.6 used, 1084.8 buff/cache
MiB Swap: 976.0 total, 17.5 free, 958.5 used. 1153.5 avail Mem
```
And on the torrenting VM:

```
Tasks: 246 total, 1 running, 245 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 0.7 sy, 0.0 ni, 93.0 id, 4.7 wa, 0.0 hi, 0.7 si, 0.0 st
MiB Mem : 3877.0 total, 146.1 free, 1223.4 used, 2786.4 buff/cache
MiB Swap: 976.0 total, 346.2 free, 629.8 used. 2653.6 avail Mem
```
I just caught it while it was seeding 2 x 15GB Linux ISOs. The wa value fluctuates quite a bit actually, mostly around 4-8 but with spikes to 9, 15, or 29; roughly 3-30 wa overall.
And that's only 2 torrents; I wonder what the value is when it's 10, 20, or 50 torrents seeding. An example of a spike (this is on the torrent VM, where it's constantly accessing the NFS shares):

```
%Cpu(s): 0.5 us, 0.6 sy, 0.0 ni, 69.1 id, 29.0 wa, 0.0 hi, 0.8 si, 0.0 st
```
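If you want to watch that wa value continuously rather than catching it by hand, vmstat (part of procps, installed on most distros) will print it on a timer:

```
# One stats line every 5 seconds; the "wa" column is iowait
vmstat 5
```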
I was looking at getting an Optane drive for this, specifically the 1.5TB P4800X, which is rated for 164 PBW.
It seems that my hit rate average is about 91-93%; I assume prefetch is low because it's not always the same data being seeded? Maybe.
My record size for the HDD pool is 1MiB.
I do need to add more RAM to my TrueNAS VM, but I'm waiting for 64GB DDR5 ECC UDIMMs to actually be buyable; then I can switch my motherboard and double the RAM for the whole server, but that's a few months away.
Or I was thinking of going EPYC 7002/7003, where I can potentially go up to 512GB RAM, but I think that would be overkill.
So I was looking at ways to upgrade my pool with my current RAM, via a 1.5TB Optane drive as L2ARC or a 2- or 3-way mirrored special metadata device, especially because I assume this problem will get worse when I upgrade to 1Gbps upload and move to a 6-8 drive RAIDZ2. Luckily, right now there's only so much data I can randomly seed because of my current 100Mbps upload speed.
In my experience a seedbox does not benefit from L2ARC. The biggest help to performance is disks and pool design, like a lot of 2-drive mirrors in a pool.
If your seedbox shares space with a media server, a lot of RAM will be good. But at no point has L2ARC ever helped in these workloads; the reads are too random and are nearly never repeated beyond what even a small RAM ARC already covers.
Ahhh, TrueNAS as a VM… I never understood why people do this.
Why not just set up an NFS share on Proxmox? Is having a GUI really worth the headache you get by virtualizing TrueNAS?
I am not sure why an L2ARC would need a high TBW. Depending on the config, it will mostly only store metadata, so the bandwidth is pretty low for most use cases. And it also does not matter if the drive dies. Optanes are great for SLOG but overkill for L2ARC.
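That config is the secondarycache dataset property; for example (the pool/dataset names are placeholders):

```
# Only cache metadata in L2ARC for this dataset
zfs set secondarycache=metadata tank/media

# The default is to cache everything
zfs set secondarycache=all tank/media
```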
I would make it even simpler. Get one Proxmox system where you have your SSDs in 3-way mirrors, maybe an Optane log for faster sync writes, and a TrueNAS system with boring old RAIDZ2, maybe with a special vdev added.
That would make your setup way simpler and faster.
Thank you for that crucial piece of info. So would you say it's better to just create an additional Linux VM dedicated to seeding? That way I wouldn't have to worry about ZFS and ARC; it would just be a JBOD or RAID 0 (if I use 2 x 10TB HDDs), 4-8GB of RAM, and that's it?
Honestly, it's not much of a headache; you just pass through an HBA and it's pretty stable. I'm pretty sure I would still have the same problems if I had a dedicated NAS and a separate VM machine; it's more to do with torrent seeding and its high read IOPS requirements.
I see; maybe it is overkill to use Optane, especially when I can get a cheaper, higher-capacity M.2/U.2.
My storage is on ZFS. Really, media and seeds use data in a similar way most of the time, so mine are on the same ZFS pool of multiple mirrors. I can max out the network well before maxing out the disks, and I don't have a massive amount of CPU or anything for it.
Use case: my virtualized TrueNAS is connected to Active Directory and handles sharing and connectivity with a few clicks. Adding all of this on top of Proxmox would make it impossible to deal with if Proxmox went down, and even Proxmox updates would have a high likelihood of causing issues.
I always think the inverse of what you said: 'why add a bunch of things to the hypervisor when you could just run it in a dedicated space?'
Ooo, I see, you went the striped mirror route (RAID 10), so you have a lot of IOPS with each stripe. Maybe that is the solution that makes more sense, instead of adding more complexity with a separate VM. I'd just be annoyed by the 50% space utilisation once I reach 8+ drives, haha.
I appreciate the input; it's given me a lot to think about.
I highly doubt that, since your IO delay is at 0%.
I have over 100 torrents that sit on an 8-wide RAIDZ2 (now with a special vdev, but before without), and I can do multi-gigabit uploads. Because my hardware is way, way weaker on paper than yours, I suspect some kind of virtualization complexity layer.
Okay, that's interesting. It seems like I just need to increase the number of drives in my pool first, before considering any more NVMes or VMs.
My 20TB HDDs are WD white-label shucked drives, so maybe that could be a reason, or I just need more drives (6+) so the pool can find the data 'quicker' via the stripes or the RAIDZ2 split of data.
Yeah, I use VirtIO for all the VMs.
And no, haha, no nested virtualisation; they are all separate VMs on Proxmox.
Interesting; it's looking like more drives are the solution.
It isn't enabled on any VMs. I thought NUMA was an AMD EPYC thing, so I didn't enable it. My CPU is all P-cores, so it's pretty standard, unless I've misunderstood NUMA?
Also, I should add that I've mitigated most of the NFS troubles via a Python script I implemented nearly a month ago. It periodically checks whether the NFS share is still mounted and responsive via an 'ls -l' with a timeout, and if it's not responsive, it shuts down the Docker containers until the NFS can be reached again, then restarts them. That way I don't have to worry about a slow NFS mount stalling a container (for example Sonarr).
I guess I'm just trying to find a solution where the script isn't needed and the NFS shares can be traversed quickly regardless of whether I'm seeding a lot of torrents.
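For anyone curious, the same idea as a minimal shell sketch (the original is a Python script; the mount point and container names here are placeholders):

```
#!/bin/sh
# Hypothetical NFS watchdog: stop containers while the share is unresponsive.
MOUNT=/mnt/nas                    # placeholder mount point
CONTAINERS="sonarr qbittorrent"   # placeholder container names

while true; do
    if ! timeout 10 ls -l "$MOUNT" > /dev/null 2>&1; then
        docker stop $CONTAINERS
        # Wait until the share responds again, then restart the containers
        until timeout 10 ls -l "$MOUNT" > /dev/null 2>&1; do
            sleep 30
        done
        docker start $CONTAINERS
    fi
    sleep 60
done
```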
You've got quite a high hit rate on your ARC. Given that you've thrown 74GB at it, I think you're in good shape, and I don't think you'd actually benefit from an L2ARC. I'm inclined to agree with MightBeAFish… go for a metadata SSD and be happy.
So the “IO Delay” graph I’m seeing just above definitely indicates to me that you’re getting some IO issues, but I think this might be metadata lookups, based on the ls -l hint.
While an L2ARC could be helpful, you’re probably best off with keeping your ARC big and going with a metadata device.
It shouldn't have any impact on your CPU. If you were on a Xeon of sorts, I'd have a different position, but a bog-standard Intel desktop CPU shouldn't have any NUMA enablement requirement.
Well, thinking back on it, the 5% is probably a bit aggressive to call “lots”. It’s definitely higher than I’d like it to be, but it’s unlikely to be the culprit here.