I'm having some trouble visualizing and designing the setup I think will fit my needs. I am hoping that a more seasoned ZFS and Proxmox king can lend me a hand.
The needs:
Media server
Backups as a service
Private GPT
Large docker suite
Machine:
12th Gen 12-core
96GB RAM
A2000 ADA
4 x NVMe Gen 4 x 4 - Backplane
1 x NVMe Gen 3 x 8 - MoBo
6 x SATA 3.5 bays
10GbE
On the motherboard I intend to use a P1600X Optane 118GB drive for Proxmox, TrueNAS and any Docker container I want.
For the backplane I would like to use four 2TB NVMe drives.
For the 3.5" bays I intend to use 24TB drives, adding one drive each month/quarter.
In this setup I would like to emphasize performance and storage capacity. Important files/snapshots are replicated off-site, so there is little appetite for investment in redundancy.
Can someone check this:
There are 5 NVMe slots and 6 HDD bays, and I am currently not sure how to lay out the pools.
Initially I thought: one Optane drive for the OS, with namespaces carved out for a SLOG; the four NVMe drives as two mirrors (a striped-mirror pool); and a RAIDZ1 configuration for the HDDs.
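Roughly, that first idea would look like this (a sketch only; device names are placeholders and in practice I'd use /dev/disk/by-id paths):

```
# Striped mirrors from the four backplane NVMe drives:
zpool create fast mirror /dev/nvme1n1 /dev/nvme2n1 \
                  mirror /dev/nvme3n1 /dev/nvme4n1

# Eventual 6-wide RAIDZ1 from the HDD bays:
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

# SLOG for the HDD pool from a second partition/namespace on the Optane:
zpool add tank log /dev/nvme0n1p2
```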
Currently I am still unsure about putting the OS on a single Optane drive; ideally I wanted that mirrored, but I only have 5 NVMe slots. Does anyone have a better layout in mind?
The OS needs to load ~2GB into RAM at boot and otherwise writes a few KB of logs. A cheap 250-500GB SATA or NVMe drive will do (depending on which port you want to sacrifice). Proxmox/TrueNAS boots fine even from a 32GB USB drive.
No point in using mirrors or RAIDZ then. Just do striped vdevs and replicate snapshots off-site. Done. It's the fastest option, and you can add HDDs one by one, which isn't possible otherwise.
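As a sketch (placeholder disk names):

```
# Stripe: every disk is its own top-level vdev; capacity and speed scale per disk.
zpool create tank /dev/disk/by-id/ata-DISK1
# Next month/quarter, just grow the pool:
zpool add tank /dev/disk/by-id/ata-DISK2
```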
I could also lay it out as 2 x Optane mirrored and a 3 x NVMe RAIDZ1 vdev, with a RAIDZ1 HDD vdev, to maintain some resiliency. Striped-only seems too risky; RAIDZ2 or 3 seems excessive if I also replicate off-site.
Thanks for the insights, I am using your thread as a guide for this new setup:
Get the largest SSD that you can afford. You save on PCIe lanes.
I would suggest you also do on-site backups besides off-site. It will speed up the recovery process tremendously.
For redundancy, it depends if you can afford down time when things go wrong. If true, as mentioned, you could just stripe everything.
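The replication side is just snapshots plus send/recv, something like this (pool, dataset and host names are placeholders):

```
# Recursive snapshot of everything in the pool:
zfs snapshot -r tank@backup-2024-01-01

# On-site: full copy to a local backup pool:
zfs send -R tank@backup-2024-01-01 | zfs recv -u backup/tank

# Off-site: incremental stream against the previous snapshot, over SSH:
zfs send -RI tank@backup-2023-12-31 tank@backup-2024-01-01 | \
    ssh offsite zfs recv -u remote/tank
```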
I have multiple machines with P1600X boot drives, including a bare-metal TrueNAS SCALE box. Over the next few weeks I am going to remove the TrueNAS install from the P1600X and put it on a SATA SSD. With consumer motherboards, M.2/U.2 slots are highly coveted, and in my opinion you should not waste one on a boot device.
If you are going to roll your own NAS on a Linux distro, then I do think running that off of NVMe is a better option.
Mirroring your boot device is another huge waste of those premium m.2 slots. If your boot device dies … who cares, your data is safely stored on the pool.
My opinions are mostly focused on attaching as many storage devices as possible to the nas and not necessarily focused on redundancy in non critical areas.
Hours of work in configuring the OS is still something worth preserving.
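Point taken; a periodic dump of the host config to the data pool would cover that (a rough sketch, assuming standard Proxmox paths and a hypothetical /tank/backups dataset):

```
# /etc/pve is a FUSE view of the Proxmox cluster config (pmxcfs),
# so a plain tar of /etc captures it along with the rest of the host config.
tar czf /tank/backups/pve-host-$(date +%F).tar.gz /etc
```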
I went with two USB to 2.5" enclosures for both SATA boot drives, to save even on SATA ports, and just boot from external USB. Works. And I don't trust USB flash drives, thus proper 2.5" SSDs. Server boards used to have an SD-card slot for the boot drive.
boot drive = least expensive you can (reliably) get away with
I was working at iXsystems in 2016 (some of the primary OpenZFS contributors); unless you had a RAM-based ZIL, an SSD with a small ZIL partition and the rest as L2ARC is what they recommended to everyone. The ZIL is only ever read back when the system shuts down unsafely, and even then it only matters for applications that constantly issue sync requests, like certain databases. Most client operating systems are multi-threaded and happy enough if the write eventually occurs; they don't demand immediate compliance.
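You can check and steer this per dataset via the sync property (standard OpenZFS commands; dataset names are placeholders):

```
zfs get sync tank/vms            # standard = honor sync requests from apps
zfs set sync=always tank/db      # force every write through the ZIL/SLOG
zfs set sync=disabled tank/tmp   # skip the ZIL entirely; a crash can lose the last few seconds
```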
If I had 2 m.2 and some disks, I would:
save 1 SSD to make raw partitions to allocate as scratch disks for client operating systems
set the L2ARC as persistent
set 1 SSD as a 16GB ZIL (SLOG) and the rest as L2ARC (see the sketch after this list)
make some raidz2 vdevs, no more than 10 disks per vdev. If more than 12 disks total, set at least one disk as a hot spare.
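A rough sketch of that ZIL/L2ARC split (hypothetical device names; adjust sizes to taste):

```
# Partition the SSD: 16GB for the SLOG, the remainder for L2ARC.
sgdisk -n1:0:+16G -t1:bf01 /dev/nvme1n1
sgdisk -n2:0:0    -t2:bf01 /dev/nvme1n1

zpool add tank log   /dev/nvme1n1p1   # SLOG (the "ZIL device")
zpool add tank cache /dev/nvme1n1p2   # L2ARC

# Persistent L2ARC (OpenZFS 2.0+), so the cache survives reboots:
echo 1 > /sys/module/zfs/parameters/l2arc_rebuild_enabled
```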
The use would be to host Proxmox and run TrueNAS with a few other smaller VMs.
Currently it's still foggy which config fits best. Someone suggested using NVMe namespaces instead of hard partitions. That does help with the regular NVMe layout, but not where it matters most: the Optane drive (only the P5800X supports namespace management). If it did, I could make it a multi-purpose drive.
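For reference, namespace carving would look roughly like this with nvme-cli (a sketch; it only works on controllers with namespace management, and the values are placeholders):

```
# Does the controller support more than one namespace? (nn field)
nvme id-ctrl /dev/nvme0 | grep '^nn '

# Carve and attach a second namespace (sizes are in blocks; placeholder values):
nvme create-ns /dev/nvme0 --nsze=0x2000000 --ncap=0x2000000 --flbas=0
nvme attach-ns /dev/nvme0 --namespace-id=2 --controllers=0
```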
I just don't know which way to go, but I appreciate everyone's input a lot! It really helps in figuring out this new challenge.
Capacity of the machine:
5 NVMe slots and 6 HDD bays
First idea:
2 x Optane P1600X mirrored: OS on a 64GB partition, the rest used as SLOG for the HDD pool.
3 x 2TB NVMe SSDs in RAIDZ1 for the rest: VMs, containers and other programs. The ZIL can go here.
6 x xTB HDD in RAIDZ1 for media, archives and maybe LUN assignment.
Second idea:
1 x Optane P1600X as a special device (probably partitioned, with part used for SLOG), as sketched below.
4 x 2TB NVMe in a striped mirror for the OS, VMs, containers and other programs. The ZIL can go here.
6 x xTB HDD in RAIDZ1 for media, archives and maybe LUN assignment.
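To make the second idea concrete (a sketch with placeholder device names; in practice I'd use /dev/disk/by-id):

```
# Striped mirror of the four 2TB NVMe drives for OS, VMs and containers:
zpool create fast mirror /dev/nvme1n1 /dev/nvme2n1 \
                  mirror /dev/nvme3n1 /dev/nvme4n1

# HDD pool with the Optane partitioned into special vdev + SLOG:
zpool create tank raidz1 /dev/sd[a-f]
zpool add tank special /dev/nvme0n1p1   # metadata / small blocks
zpool add tank log     /dev/nvme0n1p2   # SLOG

# Caveat: a single, unmirrored special vdev is a single point of failure for
# the whole pool, and zpool will demand -f to accept the mismatched redundancy.
```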
There is an abundance of possible configurations and I just feel a little lost about the best course of action. I'm grateful for the advice in this thread, but I could benefit from some more.