Proxmox & ZFS newb about to make a jump, looking for sanity check on storage config

Looking into how to set up my storage on a newb-to-proxmox home VM host.

My uses for Proxmox are going to be to host multiple VMs (one passing through a GPU for gaming/Blender, one passing through the HBA to Unraid, others without passthrough for coding and secure browsing). The non-passthrough VMs will just use host graphics on a 1060 6GB after I add a desktop to the Proxmox install.

I’m currently doing the last of my data cleanup from my Windows host before I am ready to reformat. Will probably start looking at doing that tomorrow.

What I’m thinking of for my Proxmox storage setup is either:

A)

  • a ZFS mirror pool with compression on the 2x 4TB drives, using one SATA SSD as cache (rough commands sketched just after option B below). That should be more than enough storage for my non-passthrough VMs.
  • passthrough of NVME for gaming and video editing
  • Using the other SATA SSD with XFS for LXC & unprivileged Docker containers (not sure if that’s a hard requirement; just going on what I’ve read so far about Docker-on-ZFS issues). Basically things that are easy for me to quickly recover if that SSD fails (I will do regular backups of the containers to my storage array).

OR

B)

  • Move the 2x 4TB drives onto the HBA and add them to the Unraid VM
  • Use 512GB SATA SSD to boot Proxmox
  • Use 1TB SATA SSD for the non-passthrough VMs
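
For concreteness, option A’s pool creation would look something like the commands below. This is only a sketch: the pool name (“tank”) and the disk paths are placeholders, and I’d substitute the real /dev/disk/by-id entries on my system.

    zpool create -o ashift=12 tank mirror /dev/disk/by-id/ata-4TB_DISK_A /dev/disk/by-id/ata-4TB_DISK_B
    zpool add tank cache /dev/disk/by-id/ata-1TB_SSD   # one SATA SSD as L2ARC read cache
    zfs set compression=lz4 tank                       # compression on for the whole pool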

I’m about 95% convinced I want to keep my Unraid config rather than converting the 8TB drives over to a ZFS share (I do just barely have enough free space on the 2x 4TB drives to pull off a conversion by creating the pool of 8TB drives first, moving the data, then creating the pool of 2x 4TB). However, I’m open to listening to arguments for going full ZFS. See the Unraid note below for why I lean toward Unraid overall (it has been up and running for me for a while).

Questions:

  • Is either of the above basic configs insane (as in not a good idea)?
  • Is the 4TB mirror + cache safe/sane to also use ZFS compression?
  • Am I being too weird keeping my existing Unraid setup? It’s mostly media files and long term project/document storage (also currently Dockers but I’m moving that function off Unraid).
  • Given that I have excess memory and can give the ZFS pool whatever it wants, how much should I look at giving it for performance reasons?

Unraid note: 3 reasons I’ve used Unraid to date:

  1. Able to add unmatched disk sizes in the future (originally planned on putting the 2x 4TB drives there, plus I have a couple other drives I may install)
  2. Knowing that even in a truly catastrophic case at least some of my data survives, since Unraid puts each file directly onto a single drive (yes, that’s slower; for my uses that’s fine)
  3. Being able to leave drives spun down when not in active use, i.e., reading a file only spins up the one disk that holds it.

Hardware rundown:

  • Aorus Master rev 1 BIOS F34
  • 5950x w/ Prosiphon cooler
  • 128GB (4x32) ECC RAM
  • 1500w PSU (Dark Power Pro 15)
  • RTX 3080ti FE
  • GTX 1060 6GB
  • LSI HBA (PCIE2 x8 but in x4 slot)
  • 2x 7200RPM 8TB HGST drives (HBA)
  • 2x 5400RPM 8TB WD Red drives (HBA)
  • 2x 4TB drives (mobo SATA)
  • 2x 1TB SATA SSD (1 on HBA, 1 on mobo)
  • 1x 512GB SATA SSD (mobo)
  • 1x 1TB NVME PCIE3
  • Be Quiet Dark Base 900 (but with Pro front panel)
  • Added 4 USB 2.0 ports using the motherboard headers
  • 1500w SmartUPS UPS

Just FYI, Proxmox is terrible without a full cache or all-SSD storage. You’ll want 2x SSD for a ZIL as well as an L2ARC. A dedicated ZIL device keeps the extra write off your data disks: without one, the ZIL resides on the pool’s own disks, so every sync write lands there first and then gets written again where it finally belongs.

Basically, you write one block and ZFS ends up writing it twice. On spinning media, this is murder for high IO. Use a ZIL mirror instead and you take that double write off the data disks.
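
If it helps, attaching those SSDs to an existing pool is one command each. A rough sketch, assuming a pool named tank and placeholder device paths:

    zpool add tank log mirror /dev/disk/by-id/ata-SSD_A /dev/disk/by-id/ata-SSD_B   # mirrored SLOG (dedicated ZIL)
    zpool add tank cache /dev/disk/by-id/ata-SSD_C                                  # L2ARC read cache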

Note that with Proxmox, IO overhead is the single most common problem. You are not going to have enough IO to concurrently run all those VMs on a pair of HDDs in a mirror.

To answer your Questions:

Both configs should work, but with only two spinning drives behind Proxmox, IOPS will be the limit: a 7200 RPM drive manages roughly 120 IOPS at best.

ZFS is an amazing filesystem. Compression is fine with it. De-duplication is where things are icky…
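
If you do turn compression on, it is worth checking what it actually buys you. A quick sketch, assuming a pool named tank:

    zfs set compression=lz4 tank   # lz4 is cheap enough that it is usually a net win
    zfs get compressratio tank     # shows how much space compression is actually saving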

Proxmox will likely nuke your storage arrays, so be VERY careful when you install it. I generally install Proxmox with only its own boot drive attached, then manually set up the additional storage afterward.

ZFS does not require massive amounts of RAM. The general rule of thumb is 1GB per 1TB of space, but it will work with much less; you just lose performance. I have about 20TB of ZFS storage on a system with 16GB of RAM and it runs fine, but it’s just an archival server and doesn’t see much load. De-duplication is the only case where ZFS gets greedy. Everyone complains about its memory requirements, but it is really the IO that is deadly with de-duplication.
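
If you ever want to put a hard ceiling on how much RAM the ARC grabs (handy on a Proxmox host whose memory is mostly spoken for by VMs), it is a single module parameter. A sketch, with 16 GiB as a made-up example value:

    # In /etc/modprobe.d/zfs.conf (16 GiB = 17179869184 bytes; pick your own number):
    options zfs zfs_arc_max=17179869184
    # Then refresh the initramfs and reboot so the limit applies at boot:
    update-initramfs -u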

Great write-up, but your RAM recommendation is behind the times. The TrueNAS install docs have specified since early last year that 8GB of RAM is sufficient for up to eight hard drives, plus 1GB of RAM for every hard drive after that. Not per terabyte of drive space.

Deduplication still requires 5GB of RAM per TB of storage. Also add 1GB of RAM per 50GB of L2ARC (read cache) drive space.

edit - as my own recommendation to the OP, I would suggest never overclocking your RAM. Let it rest at its non-OC speed or maybe one click lower. Don’t try XMP.

I wouldn’t use 8GB of RAM with 8x 12TB hard drives; it’ll run like a bag of hammers. Again, the type of usage matters here. If you do lots of reads and writes, more ARC will make a massive difference, especially on a busy fileserver. It is safe to say that if you want performance, double the ZFS minimum at a minimum. You’ll appreciate the extra speed.

Deduplication does not require 5GB of RAM per TB; it can be much lower or even much higher than that. The only way to figure it out is to properly calculate the dedup table for your current and future data. The 5GB figure is recommended because it fits most workloads. If you are storing video files that you archive, you will need less dedup space because the block size will be larger; it’s lots of tiny files that make dedup gobble RAM (and your IOPS).
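
If you want a real number before enabling it, zdb can simulate the dedup table against data you already have; the commonly quoted in-core cost is somewhere around 320 bytes per table entry, so the histogram gives you a usable RAM estimate. A sketch, assuming a pool named tank:

    zdb -S tank   # simulates dedup and prints the DDT histogram plus the projected dedup ratio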

XMP is OK if you know what you are doing. You are overclocking your memory with it, though, so if you want rock-solid stability, leave it off.

I think I should clarify usage here:

  • Allocating memory to ZFS is a non-issue, I’ve already got plenty of RAM that needs a use case :slight_smile:
  • Proxmox system would run from SSD (no spinners)
  • Primary use VM (daily driver) would get SSD space for its system partition and the NVMe for gaming/editing (possibly with Proxmox managing it so I get compression) … I’ll need to add 1 more SSD if I want to separate the VMs from Proxmox’s own drive in this config
  • The 2x 4TB + SSD cache would just be for adding storage to those VMs for things that want -some- more performance than via an Unraid networked share, as well as for me to learn how to use ZFS.
  • In my mind secondary use VMs (locked down browsing, maybe some dockers) would probably go on the 4TB+SSD ZFS drive

I’m not investigating ZFS right now from a file server perspective and I’m not going deep in the weeds on multi-drive setups beyond the one possibly being considered (2 drives + cache).

I’m not running my 24/7 NAS on Proxmox; the big drives will stay in Unraid. For my home use I’m quite happy with the throughput I get from it for networked home/user directories + media serving … if I get far enough into ZFS in the future I might change my mind on that, but right now I don’t see it happening any time soon.

So at this point I think my biggest question remaining is … is there much point in doing the 4TB + SSD ZFS for storage?

The main alternative options I’d appreciate feedback on:

  • The 2x 4TB without SSDs in a RAID-1 (just adding some redundancy, but losing 4TB of space … not a big problem for me, as I’m re-adding these drives after a couple of years sitting in an old PC), using the SSD to provide a dedicated VM image drive separate from the Proxmox SSD.
  • Put the 2x 4TB into the Unraid array, allowing me to use all 8TB from them, and (like the previous item) freeing the SSD for dedicated use by VM guests.
  • [insert new idea here]

The main reason to run ZFS is absolute data integrity. If you don’t need to store things and have them stay intact and verified over a long period of time, and you don’t need ZFS snapshot capability, I’d keep your setup on Unraid.
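
For what that snapshot capability looks like in practice, here is a quick sketch; the dataset and snapshot names are made up:

    zfs snapshot tank/projects@before-experiment   # near-instant point-in-time copy
    zfs list -t snapshot                           # see what snapshots exist
    zfs rollback tank/projects@before-experiment   # roll the dataset back if something goes sideways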

You can always try ZFS out with virtual drives in Proxmox. It won’t get quite the same speeds as a native setup would, but it lets you play with it without nuking your data or waiting a day to transfer all your data around.
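
One low-stakes way to do that, if you would rather not even involve a VM, is a throwaway pool built from sparse files on the Proxmox host itself. Everything below is disposable and the paths are just examples:

    mkdir -p /root/zfs-lab
    truncate -s 4G /root/zfs-lab/d1.img /root/zfs-lab/d2.img      # sparse files, so they use almost no real space
    zpool create testpool mirror /root/zfs-lab/d1.img /root/zfs-lab/d2.img
    zpool status testpool
    zpool destroy testpool                                        # throw it all away when you are done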

Always stick to KISS when it comes to playing with new things. If you can do it simply, you are much less likely to get frustrated and give up than if you try some overly elaborate setup and abandon it halfway through…

Passing an HBA through to a VM is possible and can work fine; I ran FreeNAS that way for years. However, it can lead to unknown issues that can cause havoc with data.

To your actual question, though: if you are just learning and want to try out something new, that would be a fine config (a 2-drive ZFS mirror with SSD cache).

The performance benefit of even a small SSD array for the VMs to sit on is something you should consider once you grow the environment a bit. Even 4x Intel SATA SSDs in some sort of ZFS config of your choice would be orders of magnitude faster, and used datacenter pulls with low hours are not too expensive (the larger ones are a little pricey right now, thanks Chia).
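
For reference, that kind of layout (two mirror vdevs striped together) is a single command; the device paths here are placeholders:

    zpool create ssdpool \
      mirror /dev/disk/by-id/ata-SSD_1 /dev/disk/by-id/ata-SSD_2 \
      mirror /dev/disk/by-id/ata-SSD_3 /dev/disk/by-id/ata-SSD_4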

I run 4x 4TB IronWolfs (configured as 2 mirror vdevs).
I set ARC max to 10GB.
I have the ARC set for metadata-only use (the equivalent settings are sketched at the end of this post).
I have a 1TB SSD set up for L2ARC.
And I have a 900p SSD for the ZIL.

I run 16-22 VMs from this zfs pool.

No complaints
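
For anyone wanting to copy that ARC setup, it boils down to one property and one module parameter. A sketch, assuming a pool named tank and the 10GB figure above:

    zfs set primarycache=metadata tank                           # ARC caches metadata only, not file data
    echo 10737418240 > /sys/module/zfs/parameters/zfs_arc_max    # ~10 GiB ARC cap at runtime (persist via /etc/modprobe.d/zfs.conf)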