Petabyte Server for 8K Video Editing

Hey guys, I’m a newbie and would very much like your opinion on building my very cool server with very cool hardware for editing very cool videos.

I own a video editing company; we make car commercials worldwide for many cool brands (if you want to see some of our work, check out our Instagram @abdalabrothers).
We mainly shoot 8K RED RAW, 6K CDNG, and H.264 files.

I will try to keep it short, but I’ll describe everything we have in terms of hardware and how I think I should set it up, and I would very much love your opinion on it.

I currently have two systems:

(working system) “theflash” is a QNAP TS-h886 with the ZFS file system that is supposed to be a “current editing projects” NAS: 8x 3.84TB IronWolf Pro SATA SSDs in RAID-Z1 for a total of 22TB of usable space. All the projects being edited right now live on this system because it is all flash, so it’s very snappy. We usually have 6-8 editors hitting it at the same time, connected to my switch (QSW-M1208-8C) through a single 10G connection (each editor gets a 2.5G connection). I could be using 2x 10G to the switch with LAG, but it glitches out every time, and 10G has been fine for now.
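
(For anyone following along, here is a quick back-of-envelope check of that single 10G uplink against the editor load, written as a throwaway Python sketch. The ~160 MB/s per RED stream figure comes from later in the thread; the assumption that every editor pulls a stream at once is mine.)

```python
# Back-of-envelope: can one 10G uplink feed all the editors at once?
# Assumptions: ~160 MB/s per 8K RED stream (figure quoted later in the
# thread) and every editor streaming simultaneously (worst case).
editors = 8
stream_mb_s = 160                  # MB/s per RED stream
uplink_gbit = 10                   # single 10GbE link from NAS to switch
editor_port_gbit = 2.5             # each editor's access port

demand_gbit = editors * stream_mb_s * 8 / 1000   # MB/s -> Gbit/s
print(f"aggregate demand : ~{demand_gbit:.1f} Gbit/s")
print(f"NAS uplink       : {uplink_gbit} Gbit/s")
print(f"per-editor port  : {editor_port_gbit} Gbit/s (~{editor_port_gbit * 125:.0f} MB/s)")
```

With all eight editors pulling a stream, the demand lands right around 10 Gbit/s, which lines up with a single 10G link being “fine for now”.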

(building new system) “bigbro” is supposed to be our main storage server, where every video is stored unless it is currently being edited; even backups of “theflash” could be stored there in the future. It has 12x 20TB IronWolf Pro HDDs, which at the end of the year (2023) will be upgraded to a total of 24x 20TB IronWolf Pros, a 64-core AMD EPYC, 128GB of DDR4 RAM, and a 1TB IronWolf 525 NVMe.

How I think I should configure the new system:
motto: everything is upgradable. Performance/speed of reads and writes of video files is the main goal here, even if it costs; if the performance gains are worth it, we may gravitate towards it. Being able to edit videos straight from this system would be a blast (RED files are usually 4GB each, CDNG files are saved with each frame as a ~7MB DNG, and H.264 files vary between 100MB and 1GB).
I am thinking in phases of upgrades: Phase 1 is 240TB, Phase 2 would be 480TB, and Phase 3 (end of 2024) should be 1 petabyte, AND I’M PRETTY EXCITED FOR IT, so I want to build things the right way from the beginning.

System: given that my main need is video file reads/writes, with maybe a Plex server or something like that in the future, should I go CORE or SCALE?

CPU: I currently have this maxed out, but it would be interesting to know whether we actually need 64 cores or I could have gone with a 24- or 32-core EPYC.

RAM: currently sitting at 128GB DDR4, which I don’t think is enough given this will very soon be a 240TB system; since ZFS uses RAM as cache, I think we should start with 512GB DDR4.

Drives: I am thinking of setting up a pool of 12x 20TB drives in RAID-Z2, and when I expand at the end of the year, adding another 12x 20TB in RAID-Z2.
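
(Rough usable-capacity math for those phases; the little helper below is just an illustration and ignores ZFS overhead, slop space, and the usual advice to keep pools below ~80% full.)

```python
# Usable capacity per phase, assuming 12-wide RAID-Z2 vdevs of 20 TB drives.
def raidz2_usable_tib(vdevs, width=12, drive_tb=20):
    data_drives = vdevs * (width - 2)             # RAID-Z2: 2 parity drives per vdev
    return data_drives * drive_tb * 1e12 / 2**40  # decimal TB -> TiB

print(f"Phase 1 (12 drives): ~{raidz2_usable_tib(1):.0f} TiB usable")
print(f"Phase 2 (24 drives): ~{raidz2_usable_tib(2):.0f} TiB usable")
```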

Cache?: I thought of using the 1TB IronWolf NVMe as an L2ARC in case memory is full. (1TB may not be enough for us, because every client project we edit is usually 1-3TB, so maybe using one of the 3.84TB SSDs from the QNAP NAS could be a solution?)

and here is a crazy idea I had:
since I now have room, I could take 2x 3.84TB SSDs from the QNAP NAS and use them as the special metadata vdev that Wendell from Level1Techs talks so much about. Is this a good idea? Should it be good for performance in my use case scenario?

TrueNAS settings: are there any settings I should specifically enable/disable given my use case? I have seen hundreds of YT videos about setting up TrueNAS, but most of them just teach you to create a “simple” TrueNAS build without the kinds of cache disks that could potentially make big storage really fast with large video files.

thanks for reading all the way through, and I would very much appreciate the community’s two cents on my case haha


If you need a chassis for that many drives, you can get a 72-bay 8U server chassis with all hot-swap bays and redundant PSUs on AliExpress or Alibaba for $1780 plus shipping (shipping will be expensive, though).

You mentioned things like CPU and memory, but I don’t see whether you have a way yet to plug in as many drives as you want, and the high-drive-count chassis like those from Supermicro are typically quite pricey.


ZFS performance is still better on FreeBSD, where it has matured over a decade. SCALE on Linux is still a new product with flaws and some problems. I’d always recommend CORE for a pure storage server.

CPU usage is mainly driven by compression/decompression. If your ZFS does a lot of this, you need CPU. Usually 8 modern cores are plenty for everything related to a 10Gbit connection. Storage doesn’t need much CPU in general; there is a reason commercial NAS/storage systems are the ones with the fewest cores.
Having some additional cores for servicing NFS/CIFS threads certainly helps, but anything over 16 cores is overkill when talking HDD storage. Given your use case with already-compressed media files, compression settings probably aren’t impactful.

More memory more better when talking ZFS. You may want to dig into & tune some parameters to make the most out of it. Depending on the actual usage, there are diminishing returns kicking in where more memory gets you less than optimizing other cache mechanics.
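
(One concrete example of such a parameter is the ARC size cap. On Linux/OpenZFS, i.e. TrueNAS SCALE, it’s the zfs_arc_max module parameter; on CORE the FreeBSD sysctl is vfs.zfs.arc_max. The sketch below is only illustrative, and the 90%-of-RAM figure is an assumption, not a recommendation from this post.)

```python
# Illustrative only: read the current ARC cap on a Linux/OpenZFS box and
# print a candidate value. On TrueNAS you'd normally apply this through the
# built-in tunables mechanism rather than poking files by hand.
from pathlib import Path

ram_gib = 512                              # planned system memory
candidate = int(ram_gib * 0.9 * 2**30)     # leave ~10% headroom for OS/services

param = Path("/sys/module/zfs/parameters/zfs_arc_max")
if param.exists():
    current = int(param.read_text())
    print(f"current zfs_arc_max : {current if current else 'auto (0)'}")
print(f"candidate zfs_arc_max: {candidate} bytes (~{candidate / 2**30:.0f} GiB)")
```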

A 12-wide RAID-Z2 vdev is at the upper limit of recommended vdev width, but pretty much the sweet spot for best storage efficiency. There is an argument for going Z3, but that’s a matter of taste.

I personally wouldn’t use 20T drives at this point in time. Price/TB for 16T is better, and resilver only gets worse with more capacity. Enterprise drives like the Seagate Exos or Toshiba MG series get you the most for your buck; it’s just the economy of scale that works for these models.

First: Memory is always full, that’s the nature of ZFS.
Second: this is not how ZFS works; read up on how ZFS uses caching. But I’d recommend using an NVMe L2ARC in any case. Size entirely depends on what percentage of the pool is frequently/recently read. A couple of TB will do; NVMe is cheap these days.
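
(Putting rough numbers on that, using the 1-3TB-per-project figure from earlier in the thread; how many projects count as “hot” at once is an assumption.)

```python
# Rough L2ARC sizing from the working set: 1-2 "hot" projects at a time,
# each 1-3 TB (per the figures quoted in this thread).
project_tb = (1, 3)       # low / high size per project
hot_projects = (1, 2)     # low / high number of active projects

low = hot_projects[0] * project_tb[0]
high = hot_projects[1] * project_tb[1]
print(f"hot working set: ~{low}-{high} TB")
# => a couple-of-TB NVMe covers one or two active projects;
#    a 1 TB device holds at most part of a single project.
```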

With lots of memory and an L2ARC, the impact of a special vdev isn’t that huge, but it helps get some (nasty 4K reads) off the HDDs. If you have the slots, a special vdev will improve things when the regular caches miss. There is no reason not to use a special vdev. The amount of metadata from mostly huge files is usually trivial, although when talking about a PB, tiny things add up.
In general, getting metadata from the HDD along with reading the corresponding 20-gig file isn’t any random-read HDD madness, so this is far from an ideal use case for a special vdev. But considering the scale of the system, it’s still worth it.

The general setup for a pool looks like this:

  • plenty of memory (ARC)

  • one or more cache vdevs (L2ARC) (commodity hardware)

  • mirrored LOG vdev (SLOG), 8-32GB is all you need (for sync writes) (enterprise-grade endurance)

  • mirrored special vdev (for storing metadata, small blocks or dedup table) (commodity hardware)

You don’t always need all of them (a lack of slots/space often prevents this anyway), but they each fill a role and are very good at dealing with specific problems; a rough sketch of such a layout follows below.
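
(To make that concrete, here is a hypothetical layout assembled as a zpool create command in a small Python sketch. The pool name, device names, and drive counts are placeholders, not a recipe; the SLOG arithmetic at the end is just the reasoning behind the 8-32GB figure above.)

```python
# Hypothetical pool layout showing the vdev roles from the list above.
data = " ".join(f"sd{c}" for c in "abcdefghijkl")   # 12-wide RAID-Z2 data vdev
cmd = (
    "zpool create bigbro "
    f"raidz2 {data} "
    "log mirror nvme0n1 nvme1n1 "   # SLOG: only buffers a few seconds of sync writes
    "special mirror sdm sdn "       # metadata / small blocks -- must be redundant
    "cache nvme2n1"                 # L2ARC: a single device is fine, it's only a cache
)
print(cmd)

# Why a tiny SLOG is enough: it only ever holds the last couple of open
# transaction groups of sync writes (~5 s each by default), so at 10 GbE:
print(f"max buffered in ~10 s over 10 GbE: ~{1.25 * 10:.1f} GB")
```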

And if you’re just transferring large files (sequential reads and writes) from the editing server to that pool, you don’t need any of that, because 20+ HDDs will saturate the network speed limit just fine. In that case a quad-core with 16GB of memory and no caching is all you need.


Part of the footage I deal with is RAW files, CDNG and uncompressed files, while H.264 and R3D RAW files do have compression. Should I be using LZ4 compression, or could that damage my video files? And would it speed up reads/writes at the cost of using those CPU cores I have so many of?

OK, but using a real-world example, given that each client project has about 500GB to 1TB of total video files, ideally what should my RAM be while editing one project?
(Just to clear things up: each of the 8 editors has a high-performance machine with at least an i9 and an RTX 3080, so they edit using their own hardware but access the NAS as storage for the video files. A RED stream is about 160MB/s. The first option is to transfer the files from the 240TB server to the editing server that is all flash SATA SSD, so they edit off of that. But the second option, which would be very interesting to know about, is whether it’s also possible to access files on the 240TB server and edit without transferring to the faster SSD server; that would be super useful too.)
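
(Not an answer from the thread, just arithmetic on the figures quoted in this post: aggregate editor demand versus the raw sequential ceiling of one 12-wide RAID-Z2 vdev, with per-disk throughput as an assumption.)

```python
# Editing straight off the HDD pool: demand vs. raw sequential ceiling.
# Assumptions: ~160 MB/s per RED stream, ~180 MB/s sustained per HDD,
# one 12-wide RAID-Z2 vdev = 10 data disks.
editors, stream_mb_s = 8, 160
data_disks, hdd_mb_s = 10, 180

print(f"aggregate demand        : ~{editors * stream_mb_s} MB/s")
print(f"sequential ceiling (raw): ~{data_disks * hdd_mb_s} MB/s")
# Caveat: 8 independent streams force the heads to seek between files, so
# real concurrent throughput lands well below that raw sequential number.
```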

But should I be worried about being right at that limit? Going for 2 vdevs of 6x 20TB RAID-Z2 would give me double the IOPS, but as I understand it, my use case does not require a lot of IOPS and is much more sequential, which would benefit from the wider vdev performance-wise because read speeds would scale across roughly 10 data disks instead of 4 per vdev, right? Although that would be a little more dangerous because of resilvering time, but that’s not crazy dangerous, right? I’m not being irresponsible doing that, correct?
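
(For comparison, here are the two layouts side by side with 20TB drives; the arithmetic ignores ZFS overhead.)

```python
# 1 x 12-wide RAID-Z2 vs. 2 x 6-wide RAID-Z2, with 20 TB drives.
layouts = {"1 x 12-wide RAID-Z2": (1, 12), "2 x 6-wide RAID-Z2": (2, 6)}
for name, (vdevs, width) in layouts.items():
    data_disks = vdevs * (width - 2)              # 2 parity disks per vdev
    usable_tib = data_disks * 20e12 / 2**40
    print(f"{name}: {data_disks} data disks, ~{usable_tib:.0f} TiB usable, "
          f"{vdevs}x per-vdev IOPS")
```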

okay, given what you said to me, should I do this:

I have enterprise-class 3.84TB SATA SSDs (IronWolf Pro) that I could pull from the SSD editing server if the performance is worth it (taking one or two would not be so bad for me; there are currently 8 of those 3.84TB SSDs). The size looks perfect for my use case as an L2ARC, but you are saying a consumer NVMe is much faster, and since it’s just a cache there’s no need for parity, so speed is more interesting.
So should I use the 1TB IronWolf NVMe that I have as the L2ARC,

buy another 2 enterprise-class NVMe drives for a mirrored LOG vdev (why is that important, and why just 32GB?) (and do you have a product to recommend?)

and should I take 2x 3.84TB SSDs from the editing server to use for a mirrored special vdev, or is that too much and would add no performance, and is it better just to buy new consumer SSDs? If yes, about how big should those SSDs be (considering 240TB), and why commodity hardware? I thought that if they failed, they would result in corrupt data.

I don’t know about a petabyte, that sounds expensive; you’re building more or less a backup/archive server.

For half a petabyte:

  • TrueNAS Scale because Samba has always liked Linux more, and k3s might be handy.
  • 1MB record size
  • Something like Intertech 4F28
  • Fill half of it with 14x or 12x 20TB (≈18.2TiB) drives, or 14x or 12x 18TB (≈16.4TiB) drives, e.g. Seagate Exos, MG09ACA or MG10ACA, or HC550 … whatever is cheapest.
  • one NVMe for L2ARC… don’t bother with enterprise, power-loss protection, Optane, or this or that; get something that isn’t going to die in the first month, 2TB (or a bit less is fine).
  • get a pair of boot drives
  • don’t bother with SLOG/ZIL, don’t bother with dRAID, don’t bother with a special metadata vdev (if you’re only working with big files - many megabytes each - you’ll have very little small metadata, and reads will come from the L2ARC anyway)

That’ll give you around 200T (160-240) usable from the start for about 4000 bucks.
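
(Roughly where that 160-240 range comes from, assuming one RAID-Z2 vdev per option and ignoring ZFS overhead.)

```python
# Usable space for the two ends of the suggested drive options (RAID-Z2).
for drives, size_tb in [(12, 18), (14, 20)]:
    usable_tib = (drives - 2) * size_tb * 1e12 / 2**40   # 2 parity drives
    print(f"{drives} x {size_tb} TB in Z2: ~{usable_tib:.0f} TiB usable")
```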

How it works:

A SAS expander is like a network switch for hard drives. An HBA is like a router for hard drives. SAS is a fancy hard drive protocol for servers; even its ancient cheap version does 3Gbps per lane (300MB/s, more than spinning-rust HDDs can do).
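
(A rough sketch of why even old, cheap SAS gear is plenty here; the lane count, per-disk throughput, and drive count are assumptions.)

```python
# HBA -> expander -> many HDDs: rough bandwidth check.
lanes, lane_mb_s = 8, 300      # 8-lane uplink, 3 Gbps/lane ~= 300 MB/s usable
hdds, hdd_mb_s = 24, 200       # 24 drives at ~200 MB/s sustained each

print(f"HBA<->expander uplink: ~{lanes * lane_mb_s} MB/s")
print(f"24 HDDs flat out     : ~{hdds * hdd_mb_s} MB/s")
# Even the old 3 Gbps generation gives ~2.4 GB/s of uplink -- far more than
# a 10 GbE network (~1.25 GB/s) will ever ask of this box.
```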

Next:

  • No need to go big on RAM, what are you gonna use it for? 32GB of slow RAM is fine … maybe splurge on ECC? … or don’t; ECC is awesome, but if you don’t have it, meh, don’t bend over backwards for it.
  • Get an ASRock Rack board with IPMI and 10Gb ports, X470/B550 or something

… or … even better…

  • Get any board you want; several x16 slots would be a godsend:
    • 1 slot for a 1x or 2x 40G QSFP+ NIC, hook it up to a cheap MikroTik switch
    • 1 slot HBA
    • 1 slot SAS expander
    • 1 slot for an ASRock Rack PAUL card, … or a PiKVM if you can get your hands on one

This could be a Ryzen 5600G on an X570 or X470 motherboard or something; it doesn’t matter much, you’re building more or less a backup server.

So about 5000-5500 total, maybe 6000-6500 if you want to go with a 40G switch; 10,000 incl. another batch of drives to bring it up to half a petabyte. (0.25/terabyte isn’t that bad on a small scale.)

For 1PB, you’d repeat the exercise with newer components and build a second system.

