Soliciting ZFS pool layout advice for a multi-purpose homelab server

zmezoo · July 27, 2025, 12:13am

Bad advice. The general rule of thumb is zfs needs 1gb of arc for every 1tb of data. Unless your really memory constrained i absolutely would not cap arc. Arc is sort of a star centerpiece of how zfs works and why it performs so well. Kneecapping it without knowing you need to will cause more problems than it will solve.

Arc automatically grows and shrinks to alleviate memory pressure. The default cap is 50% of memory.

Use tools like arc summary to observer your arc before making any changes, arc summary will also let you know if you need a l2arc and or slog.

Arc summary will tell you how many times arc has been shrunk to make room for running processes. And will give you statistics on its current size and how much memory is being used for various purposes.

Generally speaking unless you have more than 10% of arc cache misses you will see no benefit from a l2arc. Knee capping arc is a sure fire way to make misses.

diizzy · July 27, 2025, 4:06am

That highly depends on your workload and if you’re going to run multiple VMs and other memory hungry applications you will need to manage it. The old 1Gb / 1TB is dated and depends on workload.

Just to show from a box running 2 VMs, a bunch of services natively (including Samba) including Poudriere

ARC Size:                               100.01% 12.00   GiB
        Target Size: (Adaptive)         100.00% 12.00   GiB

ARC Efficiency:                                 261.59  m
        Cache Hit Ratio:                97.91%  256.13  m
        Cache Miss Ratio:               2.09%   5.46    m
        Actual Hit Ratio:               97.91%  256.13  m

…and more than 30TB in total storage

zmezoo · July 27, 2025, 6:36am

Still not advisable.The whole point is just like os file cache puts recent files in any unused memory space the zfs arc doe the same with a lot of different zfs related objects including files themselves.

I have never had a oom kill anything because zfs did not shrink arc to allow another program more memory. So unless you have a really good reason to lower the 50% of memory default you should probably leave it alone.

Had you suggested looking at memory usage and arc_summary if they were runing out of memory while testing the new setup, that would be ok advice. But blanket saying to limit, a auto resizing cache is not good advice.

Is that 30tb of data. The guidance assumes your pool is mostly utilized. If your only storing 5tb out of 30tb you may as well treat it as a 5tb drive as far as the advice is concerned.

diizzy · July 27, 2025, 7:32am

Rather pointless to continue given that you’ve proven wrong and seemingly have no grasp of workloads. Race conditions between ARC and processes/kernel isn’t something new, there’s no “one size fits all” approach.

zmezoo · July 27, 2025, 8:02am

zfs operates as a kernel module so there are no arc kernel race condition. Nor are there process arc race conditions as the kernel is the intermediary and its already tightly integrated with and aware of arc. See this is the whole beauty of zfs and arc, because zfs is both a drive controller and filesystem and is a kernel module, when any process goes hey can i have xyz file, the kernel can go oh i have a valid copy of that in arc, because the zfs kernel module has a valid copy there because it knows which files in arc are still valid, since its aware of reads and writes. I don’t think you have any clue of what your talking about.

if your going to pick a cache to dedicate your memory to you want it to be arc, as it servers all vms and the host os. Also its a smarter more advanced caching system since unlike regular LRU page file caches. The last and biggest point is the A in arc for adaptive, meaning if another process needs memory, arc will give it up to that process. There is no harm in letting arc dominate the regular os cache and this is in fact preferable or do you not care to benefit from one of zfs biggest and strongest selling points.

Good point thats what the A in arc is for adaptive! Its why you can overcommit memory to arc and not run out of memory when a process needs more. If anything should be changed out of the box it the minimum memory footprint arc has. You can set that, too. But i would still wait to do that till you have used and observer your new pool and system for a bit.

epower53 · July 27, 2025, 1:00pm

I have a reasonably large (200+) collection of blu-rays, a few of which are 4k, and all have been previously ripped and downsampled for file size minimization. We also inherited an embarrasingly large movie collection recently, and there are a good number of titles there that I’ll want to rip (mostly DVD and blu-ray). Unfortunately some of the most desirable are HD-DVD… even if I get a drive for these, I’m not optimistic they’ll be salvageable. There are many reports of bit rot eating these disks…

Yes, having just figured out that nullfs mounts won’t work the way I want with bhyve, I’m changing plans for this deployment. I’ll put jellyfin in a jail so it can access the media library dataset using nullfs. Cleaner than using an NFS share as a broker to get files from the dataset into jellyfin. I’d prefer not to have the media filesystem embedded in the jellyfin VM…

diizzy · July 27, 2025, 2:54pm

Ahh,

You have multimedia/makemkv , multimedia/dvdread and multimedia/handbrake packages and in ports
For audio you have audio/cyanrip