ZFS storage server advice

Sorry it took a while to get back to you.

The URE specification is 1 error per 10^14 bits read.
Mean time between failures is 1,000,000 (1 million) hours.

Yeah, I get that it is unlikely. But I don’t really want to risk losing the entire pool, especially since I don’t have a backup. (Yes, I know, RAID is not backup.)
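To put that 10^14 figure in perspective, here’s a rough back-of-the-envelope sketch (my own arithmetic, treating the rating as a uniform per-bit error probability, which real drives don’t strictly follow):

```python
URE_PER_BIT = 1e-14  # rated: 1 unrecoverable read error per 1e14 bits read

def expected_ures(bytes_read: float) -> float:
    """Expected number of UREs when reading bytes_read bytes."""
    return bytes_read * 8 * URE_PER_BIT

# Reading back one full 4 TB drive's worth of data during a resilver:
print(expected_ures(4e12))  # -> 0.32 expected errors
```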

Sorry, but I don’t see how this has more redundancy than, say, RAIDZ2. In a RAID 10 configuration, each vdev can only lose one disk. Even if the rebuild is faster, the risk seems higher than RAIDZ2 (or even RAIDZ3, but that is a bit overkill).

Mirror depth is how you increase redundancy. So with a typical pool-of-mirrors setup, let’s boil this down to 4 disks: you have two 2-disk mirrors striped together. Happiness. You’re guaranteed to survive losing 1 disk, and have a roughly 66% chance of surviving a second disk failure. Technically speaking, this loses out in redundancy to a 4-disk RAIDZ2, where you are guaranteed to survive losing any 2 disks.

However, as you add disks, you can reshape your mirrors. With 6 disks you can have either three 2-disk mirrors and get the performance boost, but still only be guaranteed to survive 1 disk failure, or two 3-disk mirrors. The two 3-disk mirror setup lets you lose any two drives and still be fine, plus roughly a 90% chance of surviving a third drive failure (see the sketch below). With a 6-disk RAIDZ2, you still only get those 2 disk failures. Any third disk failure is catastrophic.
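If you want to check those odds yourself, here’s a small brute-force sketch (my own throwaway script, nothing ZFS-specific): a pool of mirrors survives as long as no single mirror vdev loses all of its disks.

```python
from itertools import combinations

def survival_odds(n_vdevs: int, width: int, failures: int) -> float:
    """Fraction of equally likely failure patterns the pool survives.

    The pool dies only if some mirror vdev loses all `width` of its disks.
    """
    disks = [(v, d) for v in range(n_vdevs) for d in range(width)]
    patterns = list(combinations(disks, failures))
    survived = sum(
        1 for failed in patterns
        if all(sum(1 for v, _ in failed if v == i) < width
               for i in range(n_vdevs))
    )
    return survived / len(patterns)

print(survival_odds(2, 2, 2))  # two 2-disk mirrors, 2 dead disks -> ~0.67
print(survival_odds(2, 3, 3))  # two 3-disk mirrors, 3 dead disks -> 0.90
```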

3-disk mirrors sacrifice more space for that redundancy though.


After all the great feedback from everyone, I see that the 20-bay server is a no-go. Therefore I will be looking into a 24- or 36-bay one instead. This changes quite a bit how the vdevs can be split.

Here are some new configurations based on everyone’s input. The maximum number of vdevs is based on either 24 or 36 bays. I would really appreciate your thoughts on them. What would you choose in this scenario?

Configuration #4
24 bay: (4 vdevs, 6 disks per, RAIDZ2): 65.54 TB usable.
36 bay: (6 vdevs, 6 disks per, RAIDZ2): 98.30 TB usable.

Each vdev can lose at most 2 disks. Rebuilds are slow and will strain the remaining disks. Potentially long scrub and resilvering times (thanks @magicthighs). 66% space efficiency.

Configuration #5 (as @Vitalius suggests)
24 bay: (6 vdevs, 4 disks per, RAID 10): 49.15 TB usable.
36 bay: (9 vdevs, 4 disks per, RAID 10): 73.73 TB usable.

Each vdev can lose at most 1 disk. Rebuilds are fast and won’t strain the other disks in the vdev as much. 50% space efficiency.

Configuration #6 (as @Levitance suggests)
24 bay: (8 vdevs, 3 disks per, 3-way mirror): 32.77 TB usable.
36 bay: (12 vdevs, 3 disks per, 3-way mirror): 49.15 TB usable.

Each vdev can lose at most 2 disks. Rebuilds are fast and won’t strain the other disks in the vdev as much. 33% space efficiency.
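For reference, here is the napkin math behind these figures. It assumes each drive counts as 4096 GB (the per-disk value the numbers work out to) and ignores ZFS metadata and slop-space overhead, so real usable space will be a bit lower:

```python
DISK_GB = 4096  # per-drive capacity implied by the figures above

def usable_tb(vdevs: int, data_disks_per_vdev: int) -> float:
    """Usable TB = vdev count * data disks per vdev * disk size (GB -> TB)."""
    return vdevs * data_disks_per_vdev * DISK_GB / 1000

print(usable_tb(4, 4))   # #4, 24 bay: 6-disk RAIDZ2 -> 4 data disks: 65.54
print(usable_tb(6, 4))   # #4, 36 bay:                              98.30
print(usable_tb(6, 2))   # #5, 24 bay: 4-disk RAID 10 -> 2 data:    49.15
print(usable_tb(9, 2))   # #5, 36 bay:                              73.73
print(usable_tb(8, 1))   # #6, 24 bay: 3-way mirror -> 1 data:      32.77
print(usable_tb(12, 1))  # #6, 36 bay:                              49.15
```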

Personally, I’d go with #5, because I wouldn’t be worried about losing a 2nd disk; but since you care about that a bit more than I seem to, for you, #6.

Yes, it sacrifices a lot of storage, but performance matters for redundancy: how fast an array rebuilds is almost as important as how many disks it can lose. Say a power surge kills one drive and cripples another; you don’t know how long any of the remaining drives will last, so a fast rebuild narrows the window of risk.

It simply boils down to balancing Redundancy against Storage Efficiency against Performance.

You can do well at two of the three if you sacrifice the third, and you’ve already chosen Redundancy as very important, so now you have to choose between Storage Efficiency and Performance.

But Performance is a key component to Redundancy, as I mentioned, in certain situations.


Sorry, I know this is a relatively old thread. Just have a quick question…

I have a 36-bay server. I would like to have five 7-disk RAID-Z2 vdevs + a hot spare. I understand that this is not optimal, but I’m not sure if a nonstandard recordsize is advisable or if it will cause more harm than good. The workload is backup/archive, but I’d still like performance to be as good as it can be with this disk config.


EXAMPLE: 128k vs 120k

128k: 128 / 5 data disks = 25.6k per disk (bad), but log2(128) is an integer, i.e. a power of two (good)

VS

120k: 120 / 5 data disks = 24k per disk (good), but log2(120) is not an integer (bad)
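To make that concrete, here is a tiny sketch of the arithmetic, assuming ashift=12 (4 KiB sectors) and the 5 data disks of a 7-wide RAID-Z2. It only shows the divisibility trade-off, not actual performance:

```python
SECTOR_KIB = 4   # assuming ashift=12 (4 KiB sectors)
DATA_DISKS = 5   # 7-wide RAID-Z2 -> 5 data disks

for recsize in (128, 120):  # recordsize in KiB
    per_disk = recsize / DATA_DISKS
    sectors = per_disk / SECTOR_KIB
    pow2 = recsize & (recsize - 1) == 0
    print(f"{recsize}K: {per_disk} KiB/disk = {sectors} sectors/disk, "
          f"power of two: {pow2}")

# 128K: 25.6 KiB/disk = 6.4 sectors/disk, power of two: True
# 120K: 24.0 KiB/disk = 6.0 sectors/disk, power of two: False
```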

If there is no clear answer, I will run some tests, but if there is, then I’d rather save the time.

I’m not sure which would be better; that’s something you’ll have to investigate on your own. I don’t have any resources that point one way or the other.
