It would be nice to see some encoding benchmarks for the 2990WX that reflect more real-world encoding usage. We use FFmpeg directly with its ProRes encoder, which supports heavy multithreading to some degree. For our workload, what matters is performance across multiple simultaneous encodes. You can use AnotherGUI as the front end on Windows, or I would be happy to write a front end that gathers performance data such as fps and encode time.
With our 1950X, just two simultaneous encodes of standard definition widescreen ProRes LT video will peg the CPU, with a combined maximum frame rate of 600-1000 fps depending on the storage configuration. I would love to see how this workload scales on the 2990WX at SD-WS, 1080i59.94, 2160p30, and 2160p60 with multiple simultaneous encodes. This type of workload does require a dedicated NVMe drive for each higher resolution encode so that it is not storage bound. Using specific preset options in FFmpeg directly should be much more representative of real-world usage in the broadcast sector. I am also interested in Linux vs Windows performance. I am working right now to transition our workstation used for CasparCG graphics and FFmpeg transcoding from Windows 10 to Linux for greater stability.
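A front end for the multiple-encode test described above could start from something like this Python sketch. It is only a rough outline under assumptions: ffmpeg is on the PATH, the `prores_ks` encoder with `-profile:v 1` stands in for ProRes LT, and the source and destination file names are placeholders. The helper names (`build_prores_cmd`, `parse_progress`, `run_parallel`) are made up for illustration and are not part of any existing tool.

```python
import re
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

# Matches ffmpeg's periodic progress line, e.g. "frame=  250 fps=125 q=-0.0 ..."
PROGRESS_RE = re.compile(r"frame=\s*(\d+)\s+fps=\s*([\d.]+)")


def build_prores_cmd(src, dst, profile=1):
    """Build an ffmpeg command line for one ProRes encode.

    Assumption: prores_ks profile 1 is used as the ProRes LT setting.
    """
    return [
        "ffmpeg", "-hide_banner", "-nostdin", "-y",
        "-i", src,
        "-c:v", "prores_ks", "-profile:v", str(profile),
        "-an",  # video only, so audio work does not skew the numbers
        dst,
    ]


def parse_progress(text):
    """Return (frames, fps) from the last progress line in ffmpeg's stderr."""
    matches = PROGRESS_RE.findall(text)
    if not matches:
        return None
    frames, fps = matches[-1]
    return int(frames), float(fps)


def run_one(src, dst):
    """Run a single encode to completion and return its final progress."""
    result = subprocess.run(
        build_prores_cmd(src, dst),
        stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True,
    )
    return parse_progress(result.stderr)


def run_parallel(jobs):
    """Run all (src, dst) jobs at the same time and report combined totals."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        results = list(pool.map(lambda job: run_one(*job), jobs))
    elapsed = time.perf_counter() - start
    frames = sum(r[0] for r in results if r)
    return {
        "seconds": elapsed,
        "frames": frames,
        "combined_fps": frames / elapsed if elapsed else 0.0,
    }
```

Calling `run_parallel([("a.mov", "/mnt/nvme0/a_lt.mov"), ("b.mov", "/mnt/nvme1/b_lt.mov")])` (hypothetical paths, one drive per output) would give the wall-clock encode time and a combined frame rate for two simultaneous encodes. The combined figure is total frames over wall time rather than a sum of ffmpeg's own per-process fps, so it stays meaningful when the encodes finish at different times.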
Phoronix did some x264 and x265 benchmarks, and it can do 1080p encoding at 144fps on the slow preset. That’s impressive. At medium preset, x265 1080p encoding is 11fps. So far that’s all the data we have.
I’m trying to get ProRes to DNxHD and DNxHR added to the Phoronix Test Suite, since H.264 to NTSC DV is dated.
http://openbenchmarking.org/result/1808130-RA-CPUUSAGED10 (includes CPU usage during x264 and x265. The usage isn’t even 100%.)
Currently, they’re reluctant to take the 4K ProRes samples because the file sizes are too big. 500MB is more reasonable than the 15GB sample on samples.ffmpeg.org.
Also, DNxHD/DNxHR will get you past broadcaster QC. ProRes on FFmpeg is reverse engineered and doesn’t pass broadcaster QC.
Yeah, they all seem to be focused on those two formats. H.264 may be a little dated, but it is still pretty standard for live streaming. Most cable, IPTV or ATSC streams are still either MPEG-2 or MPEG-4. All of the MPEG encoders in ffmpeg seem to be limited on the multithreading front. H.264 and H.265 are both delivery formats that are not suitable to edit from in many cases. True media/broadcast workstation formats for editing intermediaries are MPEG-2 I-frame (old), ProRes, DNxHD, FFV, Cineform, REDCODE, DCP, etc. Actual workstation loads are more likely to involve those types of formats.
I know the ffmpeg ProRes encoder is reverse engineered, but I have had a hard time figuring out how to use the DNx codecs with an SD frame size. (Yes, the small studio I work with is still SD until we get the $100k to upgrade to HD or 4K cameras.) FFmpeg is also the library used in many open source or Linux-compatible video projects, so it is relevant there too. DNxHD/DNxHR are probably more relevant encoders to benchmark, but ProRes decode is still relevant.
15GB is a fairly small file for a 4K workstation to handle. Most workstations will need a scratch drive for the editing program anyway, so just keep an SSD around with the benchmark program folder and sample footage on it. It is also possible to make a 500MB clip that is one or two seconds long and use it with a looped concat operation to create the benchmark input file. ProRes and DNxHD/HR don’t do interframe compression, so visual clip length shouldn’t affect the compression algorithm too much. An NVMe drive, or dedicated SSDs for each clip, are important for multiple encodes though. I can get storage limited even with my SD clips when doing more than two sequential writes, even on a 960 PRO SSD. Even NVMe experiences decreased overall throughput under multiple sequential operations.
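For what it's worth, the looped concat idea could look roughly like this with ffmpeg's concat demuxer. This is a sketch under assumptions: ffmpeg is on the PATH, the short clip already exists, and its path contains no single quotes. The function names and file names are hypothetical.

```python
from pathlib import Path


def write_concat_list(clip, repeats, list_path):
    """Write a concat-demuxer list that names the same clip `repeats` times."""
    clip = Path(clip).resolve()
    lines = [f"file '{clip.as_posix()}'" for _ in range(repeats)]
    Path(list_path).write_text("\n".join(lines) + "\n")
    return list_path


def build_concat_cmd(list_path, dst):
    """Build an ffmpeg command that stream-copies the listed clips into one file."""
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-f", "concat", "-safe", "0",  # -safe 0 permits absolute paths in the list
        "-i", str(list_path),
        "-c", "copy",  # no re-encode: packets are copied as-is
        dst,
    ]
```

With a 500MB clip, `write_concat_list("clip.mov", 30, "list.txt")` followed by running `build_concat_cmd("list.txt", "bench_input.mov")` through `subprocess.run` would produce an input of about 15GB without re-encoding anything, so only the small clip needs to be downloaded.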
Yeah, but to a benchmark package maintainer, 15GB is huge.
DNxHD has no SD support. It’s called DNxHD for a reason. For SD, you want to do MPEG-2 I-frame like MPEG IMX.
The Convergent Design NanoFlash has IMX recording capability over SDI.
What if the benchmark package included a render step to generate the 15GB file? I can easily think of a way to use a couple of 1-second clips and stills along with some animation to produce a longer clip. If the purpose is to evaluate encode/decode performance rather than encoder fidelity, then the file shouldn’t have to be too perfect…
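One way a render step like that might be scripted is shown below. Note that this swaps in a different technique from the clips-and-stills idea: it uses ffmpeg's built-in `testsrc2` pattern generator, so nothing has to be downloaded at all. It assumes ffmpeg is on the PATH with the `lavfi` input device and the `prores_ks` encoder available; the function name, default duration and output name are placeholders. A synthetic pattern will not compress like camera footage, so treat it as a throughput test only.

```python
def build_render_cmd(dst, seconds=60, size="3840x2160", rate="60000/1001"):
    """Build an ffmpeg command that renders a synthetic 10-bit 4:2:2 ProRes clip.

    Assumptions: testsrc2 stands in for real footage, and prores_ks
    profile 3 (HQ) is an arbitrary choice for the intermediate file.
    """
    source = f"testsrc2=size={size}:rate={rate}"
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-f", "lavfi", "-i", source,  # generated video, no input file needed
        "-t", str(seconds),           # stop after this many seconds
        "-c:v", "prores_ks", "-profile:v", "3",
        "-pix_fmt", "yuv422p10le",    # 10-bit 4:2:2
        dst,
    ]
```

Passing the result of `build_render_cmd("bench_2160p5994.mov")` to `subprocess.run` would generate the sample locally, at the cost of a one-off render before the benchmark proper starts.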
It would have to grab it from a public URL, and contain public domain material. The benchmark grabs from samples.ffmpeg.org because it’s an easy to download file for wget. Remember, easy to script commands is the key to a successful benchmark set.
A “Big Buck Bunny” render to 3840x2160 at 59.94fps 10bit 4:2:2 would be a good sample file. Big Buck Bunny is a freely redistributable animated film (released under a Creative Commons Attribution license rather than strictly public domain).
Bone stock - 2133 memory speed baseline:
MSI MEG Creation - 64G (B-die, but done stock)
3200C14 memory - all else the same
3200C14 + Precision Boost
single: 5399 multi: 62045
This last iteration of my first night tweaking is pretty impressive… it’s nearly double the Windows multi-core scores I could find before buying this thing (note I’m running Linux here).
If you search you will find:
- it seems mine is now the highest multi-core score
- Note the drop from Linux to Windows: ~60k to ~30k for the multi-core score
This latest score is gratifying in that it now has an air-cooled system “trading blows” with this water-cooled, 350W (CPU alone) 4.5GHz beast.
single: 5903 multi: 66559
All of this seems to support the thesis (not mine) that Windows’ scheduler is a weak point for this chip that Linux does not share. This opens up broader applications for this chip running Linux until or unless Windows is fixed.
Windows 10 results @ 4.1GHz
single: 4020 multi: 36291
More to follow, was debating a thread…