This is not a small task. You have just described a need for an $18,000+ system from 45Drives.
Hardware and Points of Failure
You will need 12TB hard drives to make this happen. You might be able to save some scratch by buying consumer-grade hard drives and SSDs (some would criticize this decision; I’m pretty lukewarm on it).
But if you’re going to do this all in one machine, you’re going to need 24x 12TB hard drives, and that will get you ~103TB of usable space in a pool of mirrors configuration.
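For anyone who wants to sanity-check that number, here is a rough back-of-the-envelope sketch in Python. It assumes 2-way mirrors, decimal-TB drives reported in binary TiB, and 20% of the pool held back as free space. It ignores ZFS metadata and other overhead, so it lands a little above the ~103TB figure (just under 105 TiB). The function name and parameters are mine, not from any tool.

```python
TB = 10**12   # drive vendors sell decimal terabytes
TIB = 2**40   # ZFS tooling reports binary tebibytes

def mirror_pool_usable_tib(drives, drive_tb, mirror_width=2, free_fraction=0.20):
    """Rough usable TiB for a pool of mirrors, holding back free_fraction.

    Assumption: no allowance for metadata or other ZFS overhead.
    """
    vdevs = drives // mirror_width          # each mirror stores one drive's worth
    raw_tib = vdevs * drive_tb * TB / TIB   # capacity before the free-space reserve
    return raw_tib * (1 - free_fraction)

usable = mirror_pool_usable_tib(drives=24, drive_tb=12)
print(f"{usable:.1f} TiB usable")
```

Swap in a different drive size or drive count to see how quickly the chassis fills up.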
Pool of Mirrors
In configuring a server on 45Drives’ site, I selected two different manufacturers of hard drives. You will want each mirror to consist of one drive from each manufacturer, so that a bad batch from one vendor cannot take out both sides of a mirror.
Don’t Forget the Backplane
The next single point of failure is the backplane. I’ve not used 45Drives before, so I don’t know how their backplanes are configured. I’ve set up servers where all drives from one manufacturer go on one backplane, and the other manufacturer’s drives go on the other backplane, so that losing a backplane still leaves one half of every mirror online. For this amount of data, that is worth considering.
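To make the pairing idea concrete, here is a small Python sketch that pairs one drive from each backplane into a mirror, builds the matching `zpool create` command line, and checks that losing a whole backplane still leaves every mirror with a live member. The device labels (`bpA-disk0` and so on) and the pool name `tank` are made up for illustration; on a real box you would use whatever stable device names your system exposes, and I don’t know how 45Drives actually wires their backplanes.

```python
def build_mirror_pairs(backplane_a, backplane_b):
    """Pair drive i on backplane A with drive i on backplane B."""
    if len(backplane_a) != len(backplane_b):
        raise ValueError("both backplanes need the same number of drives")
    return list(zip(backplane_a, backplane_b))

def zpool_create_command(pool, pairs):
    """Build a 'zpool create' command with one 2-way mirror vdev per pair."""
    vdevs = " ".join(f"mirror {a} {b}" for a, b in pairs)
    return f"zpool create {pool} {vdevs}"

def survives_loss_of(pairs, failed_drives):
    """True if every mirror still has at least one member outside failed_drives."""
    failed = set(failed_drives)
    return all(a not in failed or b not in failed for a, b in pairs)

# Hypothetical labels: manufacturer A on backplane A, manufacturer B on backplane B.
vendor_a = [f"bpA-disk{i}" for i in range(12)]
vendor_b = [f"bpB-disk{i}" for i in range(12)]

pairs = build_mirror_pairs(vendor_a, vendor_b)
print(zpool_create_command("tank", pairs))
print("survives backplane A failure:", survives_loss_of(pairs, vendor_a))
```

The script only prints the command, it does not run it, so it is safe to play with on any machine.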
What About Striped RAIDZ?
You may be able to get good performance by striping together 8x 3-disk RAIDZ vdevs, and that would net you more space, assuming the same hardware. But rebuilding RAIDZ* is harder on everything involved in the rebuild process, since RAIDZ* is mathematically more complicated than mirroring. It also wouldn’t provide you with any additional redundancy should something go wrong, and guarding against backplane failure becomes more complicated, if not simply out of the question.
What About Striped RAIDZ2?
Along those same lines, a set of striped RAIDZ2 vdevs would provide you with roughly 4x write performance, compared to roughly 8x for the striped RAIDZ and 12x for the pool of mirrors. You are adding an additional layer of redundancy, but as with the striped RAIDZ setup, you’ve knocked out the ability to guard against backplane failure.
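Here is a rough side-by-side of the three layouts in Python, using the same 24x 12TB drives. It treats vdev count as a stand-in for relative write performance, which is the rule of thumb behind the 12x/8x/4x figures above. The 4x 6-disk RAIDZ2 shape is my assumption for how 24 drives would be split into four vdevs. Capacity is before any free-space reserve and ignores RAIDZ padding and metadata overhead, so treat it as an upper bound.

```python
from dataclasses import dataclass

TB, TIB = 10**12, 2**40

@dataclass
class Layout:
    name: str
    vdevs: int               # also the rough write-performance multiplier
    disks_per_vdev: int
    redundant_per_vdev: int  # drives' worth of mirror copy or parity per vdev

    def total_disks(self):
        return self.vdevs * self.disks_per_vdev

    def data_tib(self, drive_tb):
        """Upper-bound data capacity in TiB, before any free-space reserve."""
        data_disks = self.vdevs * (self.disks_per_vdev - self.redundant_per_vdev)
        return data_disks * drive_tb * TB / TIB

layouts = [
    Layout("pool of mirrors", vdevs=12, disks_per_vdev=2, redundant_per_vdev=1),
    Layout("striped RAIDZ", vdevs=8, disks_per_vdev=3, redundant_per_vdev=1),
    Layout("striped RAIDZ2", vdevs=4, disks_per_vdev=6, redundant_per_vdev=2),  # assumed shape
]

for layout in layouts:
    print(
        f"{layout.name:16} write ~{layout.vdevs:2d}x  "
        f"{layout.data_tib(12):6.1f} TiB  "
        f"tolerates {layout.redundant_per_vdev} failed drive(s) per vdev"
    )
```

The trade-off shows up plainly: the RAIDZ layouts buy space, the mirrors buy write performance and simpler rebuilds.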
Can I Break Up The Data?
The next immediate thought is that you might be able to do something by breaking up the data into actively used vs archival/rarely used. Lord knows that was my first thought. But alas, it will not do. Actively used data will become rarely used data, and so your rarely used pool would need to be configured assuming that all of that actively used data will be coming over sooner rather than later. Your rarely used pool would offer literally the same conundrum that you’ve got right now.
Now how do you back up such a pool? Potentially an identical server with more aggressive compression? I know that video files in their final form are already encoded and compressed to all hell and back, but my understanding is that the pre-edited, raw footage is often more receptive to compression. But I could very well be wrong on that.
In the case of backups, you may be able to gain some ground by separating the data, if they can come to you with what is absolutely critical to back up versus the data they can live without. My thought here, not knowing the industry very well, is that completed videos may not need their raw footage backed up, as long as they can get to the completed product.
Use this when designing your solution. Make sure to check the 20% Free Space limit check box, because I can tell you from experience that when a zpool hits 80% full, performance (both reads and writes) tanks.
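Once the pool is live, it is worth watching for that 80% mark. Below is a minimal Python sketch that flags pools at or over a fill threshold. It assumes you feed it tab-separated `name`, `size`, `allocated` lines in bytes, which is what I would expect from something like `zpool list -H -p -o name,size,allocated`; check that against your own system before relying on it. The sample text is invented.

```python
def pools_over_threshold(zpool_list_output, threshold=0.80):
    """Return (pool_name, fill_fraction) for pools at or above threshold.

    Assumed input: one pool per line, tab-separated name, size, allocated (bytes).
    """
    flagged = []
    for line in zpool_list_output.strip().splitlines():
        name, size, allocated = line.split("\t")
        fill = int(allocated) / int(size)
        if fill >= threshold:
            flagged.append((name, fill))
    return flagged

# Invented sample data, not real command output.
sample = "tank\t1000\t850\nbackup\t1000\t400\n"

for name, fill in pools_over_threshold(sample):
    print(f"{name} is {fill:.0%} full, time to add capacity or prune")
```

Hooking this up to cron or your monitoring of choice is left to taste.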