To create the main storage pool:
zpool create -f nas-storage raidz1 /dev/disk/by-id/ata-ST16000NM001G-2KK103_ZL2BJ6Q3 /dev/disk/by-id/ata-ST16000NM001G-2KK103_ZL28634N /dev/disk/by-id/ata-ST16000NM001G-2KK103_ZL285QFD
Added a cache disk:
zpool add nas-storage cache /dev/disk/by-id/nvme-eui.0025385991b21308
zfs set recordsize=512K nas-storage
I have added 2 NVMe drives that I intend to use for metadata storage and also for small files.
zpool add nas-storage special mirror /dev/disk/by-id/nvme-eui.002538bc1141edd7 /dev/disk/by-id/nvme-eui.002538bc21bb891c
It’s a mirror. special → mirror → nvme-eui… all fine.
zpool status shows a 3-wide RAIDZ1, a mirrored special vdev, and one drive for L2ARC.
zpool attach pool disk newdisk. Adding another drive to a mirror is done via attach (and detach if you want to remove drives from a mirror). This will make the special a 3-way mirror: triple redundancy, but the same 1TB of capacity.
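A rough sketch of that attach for this pool (the second by-id path is a hypothetical new drive, not a real device):
zpool attach nas-storage /dev/disk/by-id/nvme-eui.002538bc1141edd7 /dev/disk/by-id/nvme-eui.<new-drive-id>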
You can change recordsize on the fly and per dataset; there is no need to keep it the same for all datasets. Some datasets will want lower values, and you can set both recordsize and special_small_blocks on each dataset, as in the example below.
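For example, with hypothetical child datasets (the names are placeholders):
zfs set recordsize=1M nas-storage/media
zfs set recordsize=64K nas-storage/documents
zfs set special_small_blocks=64K nas-storage/documents
Blocks of 64K or smaller in nas-storage/documents would then land on the special vdev, while nas-storage/media keeps large 1M records on the HDDs.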
I have the default 128k for my pool and my datasets vary between 8k and 2M.
Setting special_small_blocks to 512K catches basically 90% of a mixed pool of data. You probably want extra flash capacity in your case.
64k or 128k is the usual value. HDDs do fine at 128k+. 512k blocks are not small and take up a lot of space.
After transferring data to the pool, this is what I have on nas-storage.
I was expecting that with zfs set special_small_blocks=512K, more data would be stored on the NVMe drives, considering the file size structure (around 15 GB of data).
That’s not a lot of small stuff. When we’re talking about a special vdev, we’re talking about 8 or 9-digit block counts. A special vdev is mainly for pools that have tens of millions of small blocks and TB of metadata that can’t economically be cached by main memory anymore.
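If you want to see how the pool's blocks are actually distributed by size, zdb can print a block size histogram (it walks the whole pool, so it may take a while); something along the lines of:
zdb -Lbbbs nas-storage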
But 7G allocated on special seems low. Metadata on media isn’t much; it’s a couple of bytes per megabyte with that recordsize. It should be more from what I see, though I mostly deal with 8k-128k datasets. Did you change the property after you wrote stuff to the pool? That would explain a lot.
Once written to the pool, there is no way to retroactively change the allocation other than deleting and rewriting the data, or restoring from backup. ZFS never moves data by itself, and changes to dataset properties only apply to data written from that point forward.
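One way to force that rewrite without leaving ZFS is to replicate the dataset into a new one after the property is in effect; a minimal sketch with hypothetical dataset and snapshot names:
zfs snapshot nas-storage/data@rewrite
zfs send nas-storage/data@rewrite | zfs receive nas-storage/data-new
# once verified: zfs destroy -r nas-storage/data && zfs rename nas-storage/data-new nas-storage/data
The receive writes fresh blocks, so they get allocated under the special_small_blocks value in effect at that time.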
The pool doesn’t really need a special vdev: little to no small stuff, large recordsize, mostly media. Memory should be able to handle this on its own.
And with RAIDZ you can’t remove the special vdev either. Doesn’t look good.
That looks right to me. ZFS does write combining: incoming writes are batched into transaction groups and flushed out together, which effectively turns random writes into sequential writes and is why it works so well on hard drives. This is also why people will often use a (~$33) 58GB Optane for their small blocks; they are cheap and durable. In your use case a 58GB Optane would be about 15% full.
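To see how full the special vdev actually is, per-vdev allocated and free space shows up with:
zpool list -v nas-storage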