8 x 12TB ZFS setup

Hello everyone,

I’ve just received my new NAS, a QNAP TS-873A that I plan to fill with eight 12TB drives (Western Digital WD120EMFZ), two 500GB Samsung 980s for the system, and 64GB of Kingston’s finest ECC RAM.
The device will mostly be used to store family photos and videos, and hum… Linux ISOs. Hooked to a 1G LAN for now, 2.5G upgrade coming in the next couple of months. If possible, I’d like to be able to saturate the link.

I’d like to dabble with ZFS, and as this would be my first experience with it I’ve done a fair share of research without actually being able to make up my mind. How am I supposed to set up storage (rough zpool sketches of each option after the list):

  • 1x 8 drive vdev in RAIDZ2
  • 2x 4 drive vdevs in RAIDZ1
  • 2x 4 drive vdevs in RAIDZ2
  • ditch the idea of “system SSDs” and use those two drives as cache
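
If I’ve understood the zpool syntax correctly, the first three options would be created roughly like this (tank and the sdX names are placeholders; from what I’ve read, real pools should use /dev/disk/by-id paths):

    # Option 1: one 8-wide RAIDZ2 vdev (any 2 of the 8 drives may fail)
    zpool create tank raidz2 sda sdb sdc sdd sde sdf sdg sdh

    # Option 2: two 4-wide RAIDZ1 vdevs (1 drive per vdev may fail)
    zpool create tank raidz1 sda sdb sdc sdd raidz1 sde sdf sdg sdh

    # Option 3: two 4-wide RAIDZ2 vdevs (2 drives per vdev may fail)
    zpool create tank raidz2 sda sdb sdc sdd raidz2 sde sdf sdg sdh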

I’m coming from about 33TB of usable storage in RAID6 and I’d like to retain as much performance as possible while also not having to compromise too much on data integrity. As most of you have a lot more experience with these sorts of setups, what would you do?

I know this kind of question (or similar) has been asked a couple of times before but none of them actually match my exact use case.

Thanks a lot for the insight!

If you are fresh to ZFS I would say calm down, and play!..
Split the drives you have into three pools: a cold pool, not played with, for storing your data until you’re comfy; a warm pool with a copy of the data that is allowed to die, but which, if it doesn’t, provides a feed to… a hot pool, in whatever config you are currently practicing performance / setup with…

ZFS is pretty stable, and data committed to it is kept safe and error-free (everything is checksummed).
But, you might want to play around with configs etc.

I would suggest starting your practice with a smaller set of drives to begin with.

A pool built from a bunch of mirrors is a good compromise between redundancy (how many drives can die) and performance.
A raid6-style RAIDZ2 with 4 drives (2 parity, 2 data, kinda-but-not-really) is kind of the same trade, with lower performance but higher redundancy (any two of the four can die with no data loss).
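
A minimal sketch of the two layouts, with placeholder pool names and sdX device names (use /dev/disk/by-id paths for anything real):

    # pool of striped mirrors: each pair survives one failure, IOPS scale with vdev count
    zpool create hot mirror sda sdb mirror sdc sdd

    # 4-wide RAIDZ2: any two of the four drives can die, but only one vdev's worth of IOPS
    zpool create warm raidz2 sde sdf sdg sdh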

You already have a bunch of data to use as a performance measure, and a bunch of hardware to play with, so getting to know the system without risk sounds good.

As much as I’d like to experiment with the new hardware and ZFS I honestly just don’t have the time for it, hence my question. I guess I could always go for a single RAIDz2 vdev and see how things go.

Sure, a single raidz2 will be stable

Just be aware it’s not so easy to change after the fact, especially once all 60TB are full.

If you haven’t ordered it yet, you could probably get away with a quarter of the RAM; but if you are going to use it all for caching, make sure to tell the system to use more than the default 40-50%, or else you’ll have most of it just sitting there looking pretty…
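
On Linux OpenZFS the knob for that is zfs_arc_max; a minimal sketch, taking 48 GiB as an arbitrary target for a 64GB box:

    # /etc/modprobe.d/zfs.conf -- cap the ARC at 48 GiB (value is in bytes)
    options zfs zfs_arc_max=51539607552

    # or apply it live, no reboot needed
    echo 51539607552 > /sys/module/zfs/parameters/zfs_arc_max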

First question is, what are your plans for backups?

Hooked to a 1G LAN for now, 2.5G upgrade coming in the next couple of months. If possible, I’d like to be able to saturate the link.

1 Gbps ~= 125 MB/s minus some overhead
2.5 Gbps ~= 312.5 MB/s minus some overhead
An HDD can typically do large sequential transfers somewhere from 200 MB/s down to 120 MB/s, depending on where on the drive the data sits. You will only really see this sort of max speed when using ZFS to do a send/recv. Transfers of small files are often going to be double-digit MB/s, sometimes even single-digit.

While it’s complicated, in practice you can generally consider a RaidZ1 or RaidZ2 vdev to have slightly worse performance than a single normal hard drive. There are edge cases where it may come out ahead, such as large sequential transfers with a 1M recordsize, but the reality is that all your writes and reads now have their latency limited by the worst of multiple devices, plus various overhead.
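
If you want to chase that sequential edge case, the recordsize tweak is a one-liner (tank/media is a placeholder dataset, and note it only affects files written after the change):

    # bigger records favour large sequential files (applies to new writes only)
    zfs set recordsize=1M tank/media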

I would personally do one of three things

  1. If I have a good backup that my main array is automatically send/recv’d to throughout the day and I want the best possible performance = 4 vdevs of mirrored pairs (rough command after this list).
  2. If I have a good backup as above and want the most space, I’d go with 8-wide RaidZ1 or RaidZ2.
  3. If I was a degenerate sinner with no backups, I would use RaidZ2, or even RaidZ3. Performance is not a factor if I actually want to keep this data alive.
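
A minimal sketch of option 1 (tank and the sdX names are placeholders; the RaidZ layouts are already sketched in the original post, and real pools should use /dev/disk/by-id paths):

    # four mirrored pairs striped together: best IOPS, ~50% space efficiency,
    # and a resilver only has to read the surviving half of one pair
    zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf mirror sdg sdh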

ditch the idea of “system SSDs” and use those two drives as cache

Don’t burn disks and ports trying to set up L2ARC. It will not help your “store files and browse network folders via Samba” use case. It’s really only something you use when you’ve gone down the rabbit hole far enough to stare at ARC hit rates.
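
If you ever do end up staring at those hit rates, the tools ship with OpenZFS:

    # one-shot summary of ARC statistics, including hit/miss ratios
    arc_summary

    # live view, refreshed every second: hit%, ARC size, and friends
    arcstat 1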

An SLOG also likely won’t help you. You can test whether you need an SLOG by temporarily setting sync=disabled on the dataset and seeing if that improves performance for whatever you are trying to do. If you find you really do need an SLOG, consider a pair of those old 16GB Optanes with some simple PCIe adapters, or even a Radian RMS-200, which uses RAM to store writes and shuffles the data to internal flash during a power-loss event; as a result it has essentially infinite endurance, with many old units having hundreds of petabytes written.
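
A minimal sketch of that test, and of adding a mirrored SLOG if it turns out to matter (tank/share and the nvme names are placeholders):

    # temporarily skip sync writes -- WARNING: a crash during the test can lose
    # the last few seconds of writes, so don't do this with data you care about
    zfs set sync=disabled tank/share
    # ...run your workload and measure...
    zfs inherit sync tank/share    # back to the default sync=standard

    # if it helped, add a mirrored log vdev so one dead SLOG can't hurt you
    zpool add tank log mirror nvme0n1 nvme1n1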

Be aware that the “usable space” that calculators provide for various RaidZ configurations can be highly deceiving compared to real-life “filled with actual data” situations. There are various factors that can wreck your expected space efficiency on RaidZ vdevs which don’t matter for mirrors. Here’s a brief mention of that exact issue. Also, please don’t use 4-wide RaidZ2; it doesn’t work out how you expect.
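
To make that concrete, here’s a rough worked example for 4-wide RaidZ2, assuming ashift=12 (4K sectors) and the default 128K recordsize:

    128K record  = 32 data sectors of 4K
    rows of 2 data + 2 parity -> 16 rows, so 32 parity sectors
    32 + 32 = 64 sectors, padded up to a multiple of parity+1 = 3 -> 66 allocated
    efficiency = 32/66 ~= 48.5%, i.e. slightly worse than a mirror's 50%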
