I need help with TrueNAS configuration for video editing

I need help configuring a TrueNAS system.

I am building a system for a TV station. It will only be used for a Premiere Pro media file workflow.
It will have a 220TB spinning rust pool for media files only and a 400GB SSD pool for Premiere Pro projects. I have this separate Premiere project pool to avoid constantly writing a lot of small project files to the spinning rust media file pool.

Some questions:

What would be the best recordsize for the media-only pool? I assume media workflows would benefit from a larger recordsize.

What would be the best recordsize for the Premiere-projects-only pool? I would think smaller than the standard recordsize.

Will a SLOG device for the ZIL make any difference for a media-files-only workflow? Writes will mainly come from Windows machines on the network.

I guess the amount of metadata will be limited, and there might be close to zero files below 128K?

Will a Special VDEV make any difference for a media-files-only workflow?

What is a realistic size for a Special VDEV given my requirements?

Will an L2ARC make any difference for a media-files-only workflow?

Is it possible to make a Special VDEV for metadata ONLY - and another Special VDEV for small files only? That way I could use the fastest but most expensive storage for metadata, and cheaper, still fast but big, storage for the smaller files. Let's say 8TB for files smaller than a 1M recordsize. Here I assume 1M is a good recordsize for a media-only workflow.

I know it is possible to tune the fill rate from ARC to L2ARC. By default it is 8MB/s, which is a bit slow for my big media files. What would be a decent rate?

In the future I will have 3 VMs running on the TrueNAS servers; they will do some administration and transcoding of the media files with NVENC.
I plan to add a separate SATA SSD pool for the VMs, or perhaps make a separate dataset on the Premiere project pool for the VMs. Which is the better solution?

With my media workflow requirements, how do I configure the system for the best balance between L2ARC and Special VDEV? Only one can be Optane-based - which one should it be, if any?

Will the Optane DC P4800X be a better choice than the 905P? The DC P4800X supports reconfiguring namespaces, something I am not sure I need (honestly, I don't know what it is).

A little background information - a lot actually.

The main media format will be AVC-Intra 100, a 100 Mbit/s HD format.

There will be a few 4K productions, but they may very well be edited in a proxy format.

The system will be connected to about 25 editing workstations, half of them in full resolution (100 Mbit/s per video track), the rest with an 8 Mbit/s proxy resolution.
Some of the editors will work with the same material, because they do versions of the same story to different platforms.

We will ingest up to 1TB of new material a day and I estimate we don’t touch more than 4TB of media files on the NAS a day. Some files are touched every day.

Previews and temporary files will be rendered to a local SSD in the editing workstation.

Today we have a similar workflow (except for 4K) with Avid Media Composer and 4 old Avid ISIS 7500. The network is 2x10 GigE and each workstation is connected with 1 GigE. That works perfectly - we never see stuttering playback or any kind of lag.
I guess the new system can run on the same or a similar network, since it is the same type of media files and workload we will use. I am planning to upgrade the network though, probably to 2x25 Gig from TrueNAS to the switches. Perhaps some of the workstations will get a faster connection to the switch; it depends on how the 4K workflow actually evolves. It could be a new NIC or a teamed connection.

I have read a lot about TrueNAS; I feel I have good knowledge of the system, but I actually don't know much about filesystems - I am more of a hardware guy. I don't know enough about the workload of a media file workflow, and I don't understand whether a media workflow like mine will take advantage of a SLOG device for the ZIL, an L2ARC, or a Special VDEV.

I have focused on optimizing the “cache” options supplied by TrueNAS. I have around 350GB of ARC. That should be enough to hold the video material that is being edited at any time. I will add a 4TB SSD L2ARC. That should be more than enough to hold all video material being touched in one day. I will add an SSD Special VDEV for metadata (and probably small files) so the spinning rust can concentrate on delivering high sustained throughput.

The system is being built on a Supermicro platform.

Actually, I am building 2 identical systems for redundancy - the systems will be rsynced, so I have (most of) the video material backed up.

On a third TrueNAS I will store snapshots of the main system as another backup, and I have an LTO library which will back up all new media files to tape every night.

We have an emergency power generator and a redundant UPS installed.

I feel I am covered on safety. But I also need high uptime - we are doing news production, so really no downtime at all.

The motherboard is an X11SPH-nCTF with a Xeon Gold 6210U (20 cores @ 2.5 GHz).

The chassis has room for 36 3.5" drives and some 2.5" drives.

Right now it is configured like this:

Network is still 2x10 GigE, but I will add faster (probably 2x25 Gig) networking before we take the system into production. Still under configuration.

2 Supermicro 64GB SATA-DOMs in a mirrored configuration for boot.

25 EXOS 20TB SAS drives in a 3-mirror configuration plus a hot spare. This is a pool for media files only. The drives are 4Kn and the pool uses ashift=12.
It will be expanded to 33 drives, a total of 220TB in a 3-mirror configuration plus hot spares, before we go into production.
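
As a sanity check once the pool exists, the ashift can be verified from the shell (a minimal sketch - "mediapool" is a placeholder for whatever the pool ends up being called):

    # Dump the cached pool configuration and filter for ashift;
    # every HDD vdev should report ashift: 12 for these 4Kn drives.
    zdb -C mediapool | grep ashift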

3 Intel DC S3700 400GB SATA SSDs in a 3-mirror configuration. This is a pool for storing Premiere Pro project files only (not media files).

384GB RAM - for system and the ARC.

128GB Optane Persistent Memory module (DDR4 form factor). I have 2 for mirroring, if possible. This will be used as a SLOG device for the ZIL on the media file pool.
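
Assuming the Optane modules show up as block devices, attaching them as a mirrored SLOG would look roughly like this (a sketch - pool and device names are placeholders):

    # Add a mirrored log (SLOG) vdev to the media pool;
    # /dev/pmem0 and /dev/pmem1 are hypothetical persistent-memory block devices.
    zpool add mediapool log mirror /dev/pmem0 /dev/pmem1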

I need help configuring the L2ARC and the Special VDEV for metadata on the media file pool.

I think around 4TB of L2ARC would fit the project - it's about 10 times the ARC. I see 4 options (the budget unfortunately does not stretch to 4TB of redundant Optane DC P5800X); a command sketch for attaching cache devices follows the list:

1 Samsung M.2 NVMe 990 Pro 4TB (upcoming).

4 Optane 905p 960GB in a striped configuration.

1 or 2 NVMe U.2 drives in a striped configuration - around 4TB in total.

1 or 2 SAS SSDs in a striped configuration - around 4TB in total.
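
Whichever option wins, attaching the cache devices would be along these lines (a sketch with placeholder names - cache vdevs are always striped and need no redundancy, since losing one only drops cached copies):

    # Add L2ARC (cache) devices to the media pool;
    # multiple cache devices are striped automatically.
    zpool add mediapool cache /dev/nvme2n1 /dev/nvme3n1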

I am not sure how big a Special VDEV for metadata should be for this project - input will be appreciated. The rule of thumb says 0.3% of the pool, which is around 750GB. The media file pool will ONLY hold media files. I see 4 options (capacity to be determined); a sketch for adding the Special VDEV follows the list:

3 Samsung M.2 NVMe 990 Pro drives in a 3-mirror configuration.

3 Optane 905P 960GB drives in a 3-mirror configuration.

3 NVMe U.2 drives in a 3-mirror configuration.

3 SAS SSDs in a 3-mirror configuration.
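
For reference, adding the Special VDEV as a 3-mirror would look roughly like this (a sketch with placeholder device names - unlike a cache vdev, a special vdev holds real pool data, so losing it means losing the pool):

    # Add a 3-way mirrored special vdev for metadata (and small blocks)
    # to the media pool; device names are hypothetical.
    zpool add mediapool special mirror /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1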

I only have 4 possible NVMe connections - 3 on the motherboard and one 4-lane PCIe-to-NVMe add-in card.

All PCIe slots on the motherboard will be used for GPU or network, except for a single 4-lane PCIe slot which can be used for NVMe.

So I can only add 4 NVMe drives to the system. I prefer the Special VDEV to be a 3-mirror configuration so it is self-healing.

That is why I am also thinking of using some SAS SSDs for the Special VDEV.

Unfortunately I can’t go all-Optane.

Did I miss something? Any comments are appreciated.

Thank you.

To me, the network is the most important variable for answering your questions.
If I understand your use case correctly, there will be 25 workstations connecting to this TrueNAS and using it exclusively as file storage (as opposed to running additional VM- or container-based workloads).
In this case your ZFS setup only needs to be fast enough, in most (all) data access patterns, to saturate your network.

Your media pool will have to be able to read/write all the files it serves at up to network speed, with the parallelism expected for your workloads.

If you upgrade to a 25 Gbit NIC, there would be only 1 Gbit/s per workstation in a worst-case scenario. That would probably not be considered sufficient bandwidth.
Do you know what speed/bandwidth any one editor would consider sufficient? Then you need to figure out and negotiate an SLA for concurrent access by workstations (I assume that not all 25 workstations are fully active scrubbing through media content 24x7).
Now you know what network speeds your ZFS pool needs to hit.

For reference you may want to look at the well-documented storage challenges of the Linus Media Group. The workload seems pretty similar (a room full of editors doing 4K editing). They had some 20TB of SSD-based storage for daily work, upgraded to a 24x NVMe storage server just for editing, and by now have in excess of 3 petabytes for permanent archival storage of media files. They have videos about setting up 10 Gbit networking to all their workstations, using 100 Gbit networking for their servers, etc.

To get to your questions, let's focus on the media pool first, because this seems to be the most complex part.
HDDs perform incredibly poorly on small block sizes and, by 2023 standards, still have pretty poor bandwidth even in optimal access scenarios.
Your 25 HDDs in a (not recommended, but performance-optimal) striped configuration would produce no more than 25 x 200 MB/s = 5 GB/s, or ~40 Gbit/s. That cannot even saturate a 2x25 Gbit network. A 3-mirror configuration of HDDs and mixed workloads will drop that number considerably.
To deliver acceptable scrubbing performance of ~10 Gbit/s for each workstation, way more spindles are needed - or faster NAND-based storage.
I will continue with the assumption of a 200 Gbit/s network requirement.

I would, again, make sure to write this number into the SLA.

To handle the up to 4TB of media files touched per day, I recommend a strategy that allows reading/writing all of that content at network speed.

The performance of a ZFS pool is dominated by the performance of its vdevs. SLOG/ZIL, L2ARC, and special devices help overcome weaknesses for certain workloads, but they are not designed to lift the pool's overall performance level.

I think it is fair to say that the 25 workstations will likely not access much overlapping content. This means the benefit of caching should be expected to be minimal. You did not elaborate on the software running on the workstations, but I would assume it offers local caching mechanisms.

As a result, the media storage pool should be able to deliver about 200 Gbit/s of sequential read/write bandwidth, or about 25 GB/s. Assuming operation at optimal performance, the NAS needs at minimum 120-150 (striped) HDDs, 50 SATA/SAS SSDs, 8-10 Gen3 NVMe SSDs, or 4-6 Gen4 NVMe SSDs in its vdevs to hit that requirement. The numbers increase once you add redundancy.

This is where you need to tell me that I got the requirements all wrong or you need to reconsider your design.


Thank you for your input. Appreciated. I am sure you overestimate my bandwidth requirements. We are very different from Linus. Actually, following Linus was what led me to a TrueNAS solution. We do news in HD - mainly 1 video track per workstation. Linus does 4K and 8K productions with several video tracks.

Today we have a similar workflow for our news production (except for 4K) with Avid Media Composer and 4 old Avid ISIS 7500. The network is 2x10 GigE and each workstation is connected with 1 GigE. That works perfectly - we never see stuttering playback or any kind of lag.
I guess the new system can run on the same or a similar network, since it is the same type of media files and workload we will use. I am planning to upgrade the network though, probably to 2x25 Gig from TrueNAS to the switches. Perhaps some of the workstations will get a faster connection to the switch; it depends on how the 4K workflow actually evolves. It could be a new NIC or a teamed connection.
Some of the editors will work with the same material, because they do versions of the same story to different platforms.
Another TV station like ours already has their system up and running. They reused their 2x10 Gig network and 1 Gig connections to their workstations. They use a single Synology RS4017+ with 2 expansion bays and SSD cache. It works well for them.
I consider TrueNAS a superior product compared to Synology, but I understand the laws of physics and the requirement for enough spindles to reach a certain bandwidth. I have 2 tools:

  1. Only half of the 25 workstations will do full-resolution video (100 Mbit/s); the rest will work in an 8 Mbit/s proxy resolution.
  2. I will optimize the “cache” options supplied by TrueNAS. I have around 350GB of ARC. That should be enough to hold the video material that is being edited at any time. I will add a 4TB SSD L2ARC. That should be more than enough to hold all video material being touched in one day. I will add an SSD Special VDEV for metadata (and probably small files) so the spinning rust can concentrate on delivering high sustained throughput (see the monitoring sketch after this list).
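
To verify those caching assumptions once the system is running, the stock OpenZFS tools will show whether the ARC and L2ARC actually get hits (a minimal sketch):

    # Summary of ARC size, hit ratio, and L2ARC statistics.
    arc_summary

    # Live ARC hit/miss rates, one line per second.
    arcstat 1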

So that is why I ask very specific questions about optimizing the “cache” options in my TrueNAS. I don't know enough about the workload of a media file workflow.


Thanks for clarifying the requirements. Glad you are confident in those.

Let’s try to clarify the expected workloads and impacts on ZFS.

  • Ingest: This will probably happen in bursts and not with a high level of concurrency. I'm not quite sure whether these will result in sync or async writes. In the case of sync writes, a SLOG/ZIL device will help; otherwise it will not. The SLOG workload for sync writes is sequential. Your proposed setup (Optane memory) seems overkill to me. Optane is good for this use case, not because of its great random read/write performance, but because of its lack of wear compared to NAND flash. The SLOG needs enough devices to not bottleneck writes.
  • Editing: As I discussed above, I expect this to be performance-critical. Sticking to HDDs for the main vdevs, you want to use special devices to pick up any small-blocksize reads/writes to keep the HDDs at their optimal performance. To do that you'll configure the dataset(s) holding media with a large block size (recordsize=1M) and have smaller blocks written to the special vdevs (special_small_blocks=64K or even higher). Media files don't compress well; you'll save CPU cycles and increase speed by turning off compression (compression=off). A sketch of these settings follows this list. I'll talk about L2ARC later.
  • Rendering: Rendering operations are mainly compute-intensive and don't require quite the same bandwidth/performance.
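
A minimal sketch of the dataset settings mentioned in the Editing bullet, assuming the media lives in a dataset called mediapool/media (placeholder name):

    # Large records so the HDDs stream sequentially.
    zfs set recordsize=1M mediapool/media
    # Blocks of 64K or smaller land on the special vdev instead of the HDDs.
    zfs set special_small_blocks=64K mediapool/media
    # Already-compressed video gains nothing from recompression.
    zfs set compression=off mediapool/media

Note that recordsize and special_small_blocks only affect newly written blocks, so set them before ingesting data.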

L2ARC works quite differently from a normal cache and as a result may not behave as initially expected. The ARC independently reserves capacity for frequently accessed blocks/files and recently accessed blocks/files. Because I expect the software on the workstations to implement local caching, I don't think you'll see a lot of ARC/L2ARC activity. I expect workstations to work on separate projects with little overlap in files (little opportunity for caching) and little re-reading (due to local caching). That's something for you to verify. Assuming I am correct, you could save money and hardware and get away without an L2ARC altogether.
If you do see active use, watch out for the L2ARC only getting loaded very slowly by default. I'm not sure about the TrueNAS defaults or how to influence them, but on Linux the maximum fill rate of the L2ARC is a ZFS module parameter (l2arc_write_max, l2arc_write_boost).
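
For example, raising the fill rate to 256 MiB/s would look like this on Linux (a sketch; on TrueNAS CORE/FreeBSD the equivalent sysctls are vfs.zfs.l2arc_write_max and vfs.zfs.l2arc_write_boost):

    # Values are in bytes; both default to 8 MiB. write_max is the steady
    # fill rate, write_boost is added on top while the L2ARC is still warming up.
    echo 268435456 > /sys/module/zfs/parameters/l2arc_write_max
    echo 268435456 > /sys/module/zfs/parameters/l2arc_write_boost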

Finally, the Special VDevs. Similarly, I expect to see some, but not too much, activity. They will be used to access metadata that is not already cached, and blocks smaller than the threshold configured for your datasets. You will want to make sure that these don't become a bottleneck to the overall operation. I expect these vdevs to see little write and more read activity. Enterprise NVMe drives would be great; Optane 905Ps maybe even better due to their superior low-queue-depth performance. I'd go for a striped mirror to ensure sufficient performance.

With that, I expect the HDD-based main vdevs to severely bottleneck your use case.
25 HDDs in a triple-mirror config will only result in 8 vdevs plus 1 spare. Consider a regular (2-way) mirror setup, which would increase the number of vdevs to 12.

Thank you.

I will increase the number of spinning disks in the media file pool to 33, giving me 11 vdevs of 3-mirrors, before we go into production.

Once you get to the test phase, you can monitor drive access from the command line with zpool iostat -v [<time in sec>] (very long output for all the drives). This will tell you if and how much each drive and vdev is being used. You can use that to improve the resources allotted to SLOG/L2ARC/Special.

You can see the request sizes being written in real time with zpool iostat -r [<time in sec>] (add -v to see them broken down by drive). You can use this to dial in the special_small_blocks parameter.

Finally, you can see the service time, most importantly broken down into sync and async wait times, using zpool iostat -w [<time in sec>]. This will essentially tell you whether your strategy worked. Try to get the bulk of sync operations to around 1 us or less; HDD operations take 4-14 ms according to spec.
