FFmpeg AV1 Encoding Using Intel Arc GPU tips?

Hey there,
I recently got an Intel Arc A380 graphics card for AV1 encoding, and I’ve had some success getting it running on Xubuntu 22.04 (although I’m not sure which combination of steps did the trick, and I ended up having to reinstall my desktop packages and NVIDIA drivers in the process).

I can successfully encode a video using my arc GPU by running:

ffmpeg \
  -i input.mp4 \
  -init_hw_device vaapi=va:/dev/dri/renderD128 \
  -c:v av1_qsv \
  -preset veryslow \
  -look_ahead_depth 99 \
  -b:v 1M \
  -low_power 0 \
  output.av1.1000kbps.veryslow.mp4 

Even at the veryslow preset, it whips through a video at lightning speed (and quietly) compared to a CPU-based multi-pass encode with HandBrake. However, the output is obviously nowhere near as good in quality for a given file size as a CPU-based encode, which takes waaaay longer.

I was wondering whether it’s possible to run a multipass encode using the Arc GPU and, if so, whether anybody could provide the command?

In case it comes up, the GPU is in the secondary PCIe x16 slot, with the primary allocated to my NVIDIA GPU that the displays run through. These are paired with an AMD Ryzen 5 7600.

I don’t use multipass, so I’ll leave that to someone else, but here’s what I used to optimize AV1 encoding quality on my A380 (on Jellyfin) according to Intel’s quality optimization guide:


-look_ahead_depth 99 \

-look_ahead_depth (LAD) requires -extbrc 1, and the recommended value for quality is 40. It also requires -extra_hw_frames to be set to the same value:

-extbrc 1 \
-look_ahead_depth 40 \
-extra_hw_frames 40 \

  -b:v 1M \

For CBR, Intel also recommends setting -bufsize to at least 2 times the bitrate to maximize quality. In this case, 2M should be used. -rc_init_occupancy should also be set for the initial buffer delay. The recommended value is half of the bitrate:

-b:v 1M \
-bufsize 2M \
-rc_init_occupancy 512K \
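
The arithmetic above can be scripted so the buffer values always follow the bitrate. This is only a minimal sketch: the variable names are made up for illustration, the bitrate is assumed to be given in kbit/s, and it uses an exact half (500k), where the post uses the nearby 512K.

```shell
#!/bin/sh
# Hypothetical helper: derive the rate-control flags from one target bitrate.
bitrate_k=1000                    # target bitrate in kbit/s (1M in the thread)
bufsize_k=$((bitrate_k * 2))      # -bufsize: at least 2x the bitrate
init_occ_k=$((bitrate_k / 2))     # -rc_init_occupancy: half the bitrate

# Assemble the flags as text; nothing is encoded here.
rate_flags="-b:v ${bitrate_k}k -bufsize ${bufsize_k}k -rc_init_occupancy ${init_occ_k}k"
echo "$rate_flags"
```

The printed flags can then be pasted into the ffmpeg command in place of the hand-written values.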

Other optimizations:

  • -adaptive_i 1 for adaptive scene change detection
  • -adaptive_b 1 for adaptive miniGOP
  • -b_strategy 1 -bf 7 activates the full 3-level B-Pyramid. Note that -bf should be set to a lower value than LAD. (If LAD is 40, then the valid values go up to 39.)
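
The -bf versus LAD constraint from the last bullet can be checked before launching an encode. A small sketch, assuming the two values are kept in shell variables (the names lad, bf and bframe_flags are hypothetical); it only builds and prints the flags and does not call ffmpeg:

```shell
#!/bin/sh
lad=40   # -look_ahead_depth value
bf=7     # -bf value; per the note above it must stay below lad

if [ "$bf" -lt "$lad" ]; then
  # Constraint holds: emit the look-ahead and B-frame flags together.
  bframe_flags="-extbrc 1 -look_ahead_depth $lad -b_strategy 1 -bf $bf"
else
  # Constraint violated: warn and emit nothing.
  echo "bf ($bf) must be lower than look_ahead_depth ($lad)" >&2
  bframe_flags=""
fi
echo "$bframe_flags"
```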

So final command (untested):

ffmpeg \
  -i input.mp4 \
  -init_hw_device vaapi=va:/dev/dri/renderD128 \
  -c:v av1_qsv \
  -preset veryslow \
  -extbrc 1 \
  -look_ahead_depth 40 \
  -extra_hw_frames 40 \
  -b:v 1M \
  -bufsize 2M \
  -rc_init_occupancy 512K \
  -low_power 0 \
  -adaptive_i 1 \
  -adaptive_b 1 \
  -b_strategy 1 -bf 7 \
  output.mp4

Typically I run VBR, but these settings halved the encoding speed while giving much better quality. SVT-AV1 still produces a better result at a smaller filesize, however (but it’s wayyyy too slow compared to QSV).

Forcing ExtBRC to use the new EncTools by setting -extbrc 1 -look_ahead_depth 40 (or any value above 1) seems to be what helps with quality on QSV AV1.

Thanks for that.

I tried running the command as you posted it, but got the following error:

So I ran it again without the -extra_hw_frames 40 \ line and it worked. I’ve been running a variety of bitrates, and I swear this has “fixed” the quality of the 1000kbps version so that I don’t have to run at 2000kbps to get the same level of perceptual quality, which is fantastic (a major improvement). The full command that is currently working for me is:

ffmpeg \
  -i input.mp4 \
  -init_hw_device vaapi=va:/dev/dri/renderD128 \
  -c:v av1_qsv \
  -preset veryslow \
  -extbrc 1 \
  -look_ahead_depth 40 \
  -b:v 1M \
  -bufsize 2M \
  -rc_init_occupancy 512K \
  -low_power 0 \
  -adaptive_i 1 \
  -adaptive_b 1 \
  -b_strategy 1 -bf 7 \
  output.mp4

Can you try putting -extra_hw_frames before -i? I remember the order of arguments mattering quite a bit for this one.
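
For reference, this is what that reordering might look like. It is an untested sketch: the thread does not confirm whether moving the option resolves the error. The script only assembles the argument list and prints it, so it can be inspected before being run for real.

```shell
#!/bin/sh
# Same options as the earlier command, with -extra_hw_frames moved ahead of -i.
set -- ffmpeg \
  -init_hw_device vaapi=va:/dev/dri/renderD128 \
  -extra_hw_frames 40 \
  -i input.mp4 \
  -c:v av1_qsv \
  -preset veryslow \
  -extbrc 1 \
  -look_ahead_depth 40 \
  -b:v 1M \
  -bufsize 2M \
  -rc_init_occupancy 512K \
  -low_power 0 \
  -adaptive_i 1 \
  -adaptive_b 1 \
  -b_strategy 1 -bf 7 \
  output.mp4

# Print the assembled command; replace this line with "$@" to execute it.
echo "$@"
```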

I was thinking of getting an Intel Arc GPU for OBS encoding on Linux, but it seems it’s STILL not there yet. Only the Windows drivers function atm?!?

Setting up a Windows server for OBS streaming is depressing! But it seems that may be the only option atm for Arc until the Linux drivers catch up (which could take a while).