On H265/HEVC as a Viable Alternative to H264

H265/HEVC is a viable alternative to H264 encoding in some circumstances.

Terms:
H265/HEVC- Formal Specification
H264- Formal Specification
Notable H265 HEVC Implementations: x265 (used by FFmpeg), Intel MSS HEVC Software/GAcc
Notable H264 Implementations: x264 (used by FFmpeg and VideoLAN)

Intro:
Next-generation codecs promise identical quality to H264 at half the file size. In practice, this promise has yet to be fully realized. To understand the use cases where H265 currently excels, consider that codecs are designed to balance quality, encoding/decoding time, and file size (bitrate). H264 is an incredibly well-balanced codec, excelling in all areas. H265 is an imbalanced codec that shines compared to H264 when given computational ability (time) and when allowed to slightly reduce quality.

Quality:
Encoding quality, while mostly subjective, can be measured objectively via SSIM (structural similarity). By both this metric and subjective tests, HEVC is superior at any given SSIM value/bitrate.

1) http://x265.ru/en/bolshoe-sravnenie-kodekov-h265-ot-mgu/

2) https://blogs.gnome.org/rbultje/2015/09/28/vp9-encodingdecoding-performance-vs-hevch-264/

Please see Moscow State University's research “HEVC/H.265 Video Codecs Comparison” especially 5.3.3 Figure 17 on page 24 (1). Also see rbultje's blog “VP9 encoding/decoding performance vs. HEVC/H.264” especially the section “Encoding quality” (2).
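For anyone who has not met SSIM before, here is a minimal Python sketch of the single-window form of the formula, applied to two tiny grayscale blocks. The function name `global_ssim` and the sample pixel values are made up for illustration. Real measurement tools usually compute SSIM over many small windows and average the results, so treat this as a picture of the idea and not as what the cited comparisons ran.

```python
from statistics import fmean

def global_ssim(a, b, max_value=255):
    """Single-window SSIM between two equal-length lists of pixel values."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    # Stabilising constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mean_a, mean_b = fmean(a), fmean(b)
    var_a = fmean([(x - mean_a) ** 2 for x in a])
    var_b = fmean([(y - mean_b) ** 2 for y in b])
    cov = fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])
    return ((2 * mean_a * mean_b + c1) * (2 * cov + c2)) / (
        (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    )

original = [52, 55, 61, 59, 79, 61, 76, 61]
mild = [53, 55, 60, 59, 78, 62, 76, 60]    # small per-pixel errors
harsh = [60, 60, 60, 60, 70, 70, 70, 70]   # detail flattened into two bands

print(global_ssim(original, original))
print(global_ssim(original, mild))
print(global_ssim(original, harsh))
```

A score of 1 means the two blocks are identical. The flattened block scores noticeably lower than the mildly distorted one, which is the kind of structural loss the metric is meant to catch.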

Encoding Time:
In practical terms, however, this increase in quality comes at a significant cost in encoding time, with very little reduction in bitrate. In the best-case scenario, as per Figure 17, an x265 encode takes 3.25 times longer than an x264 encode to match its quality. In addition, the average bitrate of x265 at a fixed quality is 83% of x264's (5.3.5 Table 5).

This means that if a video encodes using x264 to 300 MB in 1 hr, the x265 equivalent would take 3hrs 15min and be 249 MB.
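The arithmetic behind that example can be sketched in a few lines of Python. The function name `estimate_x265` is hypothetical, and the default factors are simply the best-case 3.25x time and 83% bitrate figures quoted above, so swap in other values for other presets.

```python
def estimate_x265(x264_hours, x264_size_mb, time_factor=3.25, bitrate_ratio=0.83):
    """Scale an x264 baseline by the time and bitrate factors quoted above."""
    hours = x264_hours * time_factor
    size_mb = x264_size_mb * bitrate_ratio
    return hours, size_mb

hours, size_mb = estimate_x265(1.0, 300.0)
whole_hours = int(hours)
minutes = round((hours - whole_hours) * 60)
print(f"x265 estimate: {whole_hours}h {minutes}min, {size_mb:.0f} MB")
```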

Due to H265 codec design principles (more on this below), certain panning videos that contain lots of small details will take 2.8x longer to encode and will use slightly more bitrate, resulting in larger file sizes (5.3.3 Figure 18). In situations where time is more of a factor, the average bitrate of x265 at a fixed quality is 94% of x264's and the encode will still take 40% longer (5.2.5 Table 4, 5.2.3 Figure 10). In a third scenario, if time and quality are both key factors, all the benefits of HEVC disappear completely, with x265 taking 117% of the bitrate that x264 does for a given quality (5.1.5 Table 3).

File Sizes:
Some people don't care about file sizes, some (like me) do. The bitrate of a stream directly determines its filesize. To compare quality at a given bitrate is to measure the resulting filesize. Thus to desire low file sizes means to want a low bitrate for a stream. Larger resolution videos take more bits to represent. A 1080p video at a given quality/codec will have a larger size than the 720p equivalent.
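To make the bitrate-to-filesize link concrete, here is a small Python sketch. The function name `stream_size_mb`, the 24-minute episode length, and the sample bitrates are assumptions for illustration only. It ignores container overhead and any audio or subtitle streams.

```python
def stream_size_mb(bitrate_kbps, duration_s):
    """Size in megabytes (10^6 bytes) of a stream with the given average bitrate."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000

episode_s = 24 * 60  # assumed 24-minute episode
for kbps in (1000, 2000, 4000):
    print(kbps, "kbit/s ->", stream_size_mb(kbps, episode_s), "MB")
```

Halving the average bitrate halves the file, which is why comparing codecs at a fixed quality is the same as comparing their file sizes.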

The appropriate file size, resolution and quality depend upon the intended use case:

  • to play on a 240p iJunk screen
  • to play on a 4k TV
  • to stream at 480p to a netbook
  • end-user archival
  • editing archival

In addition, the appropriate balance between file size and resolution/quality is user-specific and I will not attempt to discuss that here objectively. Instead, I'd like to outline HEVC design considerations and lay out certain heuristics to make it clear when HEVC should and should not be used.

Codec design considerations:

  • The idea in designing HEVC was for 4k video to be the same size as a 1080p H264 one.
  • Due to H264's macro block size limit, H264 tends to hit a computational “wall” at which point throwing more computational ability at the encode will no longer reduce the bitrate.
  • HEVC does not have that same “computational wall.” The difference between a preset “slow” and preset “very slow” H265 encode is about 15-25% (this is anecdotal), whereas it was 1-5% with H264.

Given enough time:

  • On average, an identical quality x265 encode will take 82% of the bitrate of an x264 one (6.1.3 Figure 24). In practice, this varies dramatically depending upon the video source.
  • An x265 encode of certain sources will encode to a similar size as the audio stream (KILL la KILL EDs) and other sources show no improvement over x264 -at all- while still taking longer to encode (5.3.3 Figure 18).
  • Video, in general, encodes well (to small sizes):
    1. when it does not have a lot of change between frames
    2. when change is minimal
    3. and when large blocks of a video are the same color.
  • This describes certain anime series very well, much more so than live-action footage.
  • H265 can take more advantage of these fundamental characteristics than can H264.
  • A slightly lower quality 720p x265 encode will take about 20-25%, or 1/5th to 1/4th, of the space of a typical quality 1080p x264 encode (anecdotal). This implies the same amount of space could be used for 4-5 series at 720p x265 instead of one series at 1080p x264.
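The "4-5 series" figure in the last point is just the reciprocal of the anecdotal space ratio. A tiny Python sketch, with a hypothetical helper name:

```python
def series_that_fit(space_ratio):
    """How many smaller encodes fit in the space of one reference encode."""
    return 1 / space_ratio

for ratio in (0.20, 0.25):
    print(f"{ratio:.0%} of the size -> {series_that_fit(ratio):.0f} series")
```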

Decoding Time and Compatibility:
Decoding time is mostly a measure of how well a given decoder implementation can use its current hardware (available instruction sets) and partly a measure of the software's efficiency. If hardware assistance is not available, then a software-only decoder must be used instead. An incompatible video encode can refer either to hardware devices that do not have the option of using software-only decoders (TVs/iJunk) or to decoders that do not have sufficient computational capacity to decode the bitstream in real time using only software (tablets/9” netbooks playing 4k HEVC).

While decoding time and device compatibility are king for most video encoding scenarios, they have taken a back seat in the anime community (note the near-instant switch from xvid->h264 and the popularity of mkv, Advanced SubStation Alpha subtitles, and 10-bit). Personally, I couldn't care less about stream compatibility. Anime does not exist to play on your device; your device exists to play anime.

H264 and H265 Recommendations (tldr):

Use x264 when:

  • Time is a key factor
  • Compatibility is important
  • Desiring more consistent file sizes (why?)
  • Retaining a minimum of 99% the quality of the original is a must

Use x265 when:

  • Time is not a factor
  • Low file sizes are important
  • 95% the quality of the original is “good enough”

Note: There is very little difference in quality/bitrate between HEVC and H264 when HEVC implementations are not given sufficient computational capacity (in the form of time), and using HEVC will still make the encode take longer. In that situation, use an H264 implementation instead.

Random:
Most Anime: x264, 720p-1080p, crf=16, aac, 10-bit, yuv420p
Personal fav atm: x265, 720p, crf=17, opus, veryslow, 10-bit, yuv444p
but might switch to 1080p @ crf=20
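As a rough illustration of how the "personal fav" settings could be passed to FFmpeg, here is a Python sketch that only builds the argument list. The function name `x265_command` and the file names are placeholders, and this is not the exact command used for the encodes above. It assumes an FFmpeg build that includes libx265 with 10-bit support and libopus.

```python
import shlex

def x265_command(src, dst, height=720, crf=17, preset="veryslow"):
    """Build an ffmpeg argument list for a 10-bit 4:4:4 x265 encode with Opus audio."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",   # scale to the target height, keep aspect ratio, even width
        "-c:v", "libx265",
        "-preset", preset,
        "-crf", str(crf),
        "-pix_fmt", "yuv444p10le",     # 10-bit samples, full-resolution chroma
        "-c:a", "libopus",
        dst,
    ]

cmd = x265_command("input.mkv", "output.mkv")
print(shlex.join(cmd))
```

Passing `height=1080, crf=20` gives the alternative mentioned on the last line. The list can be handed to `subprocess.run` as-is once the paths point at real files.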

Edits:
* Grammer and spelingz
* Updated references
* I do not really understand how up-scaling works in relation to quality/bitrate, so I have removed most references to it pending more testing. The post is not really about that anyway.

Hmmm... it's late now but I look forward to reading through this when I have time. A large part of what I do for a living is encoding video. I've not had a proper look at h265 yet; it's something I need to get to when work calms down! But based on some simple tests I've done so far I'm certainly impressed, though the CPU overhead is a lot atm.

isn't h.265 supposed to replace h.264?

Yes. H.265 is supposed to replace h.264 for archival purposes but not quality.
h.265 allows for smaller file sizes compared to h.264. And soon the new AMD GPUs will have a coprocessor built in for accelerating h.264 and h.265 encoding and decoding (it could be only one of them or both; I am not entirely sure at this time and need to go back and read the specs).

Note a key finding of the MSU paper, and the blog I listed, is that HEVC implementations (with x265 as the focus) do in fact offer better quality than x264.

This makes sense given that the design goal was to achieve that, but at half the bitrate.

:S. Sorry, I did speed read it and I must have missed the configuration that gives better quality at half the space of H.264. I did notice that it does take longer in nearly all configurations. If H.265/HEVC can be GPU accelerated I would personally start using it. If not, then I will keep using H.264.

Currently I find x264 a better encoder than HEVC. The speed vs filesize tradeoff is far too big. Generally it saves about 15-25%, but it takes about 5x as long to encode if you do proper encodes.
For animation the filesize difference is larger, though there are still a lot of optimizations to be done before it becomes viable to switch. Also, HEVC is absolutely horrible when it comes to high-grain video.
HEVC also still has some quality issues that need to be fixed and that are being worked on.

So no I still prefer x264 over HEVC at this time.

x265 is at 82% of the file size as x264, so not quite 50%. On server grade hardware that figure goes down to 74% (6.2.3 Figure 28). So still not 50%. That was really one of the key findings of the research.

agreed, The only time I'd explicitly recommend HEVC is if time is not a factor and for certain video types.

This is actually a key area where H264 implementations shine compared to most other codecs. In order to represent high grain situations and extreme details accurately, x265 will actually use more bitrate than x264. HEVC encodes that attempt to use less bitrate for these video types will wash out the image quality so I would still recommend H264 for extremely detailed images.

Will follow this because I'm very eager to know more. As I produce content I'm always filling disks, and some of it is kept just for the content itself and not for super-duper quality, so some quality loss will not be hard to live with.

If my nvenc can perform fast at h265 I will for sure transcode things to it. As nvenc is super fast, even if it takes 3x as long it is still fast.

I hadn't even considered that nvenc would already support HEVC. And nvenc is in the Kepler GPUs, ...and I just so happen to be using a 660 GTX. o.o...

Will test this extensively when I have more time.

Also found this interesting post that has the following conclusion regarding nvenc when used for HEVC:

"The fight nvenc wins is not against x264, it[, nvenc HEVC,] matches it in quality and is somewhat faster but the next generation from Intel or Skylake should erase that advantage, the fight nvenc wins is against Divx Hevc and x265, which require significantly more cpu resources to run fast, nvenc matches or beats their quality depending on source characteristics and flat out levels them in the speed department."

So quality concerns, unlike x265.exe, but worth testing.

Source: http://forum.videohelp.com/threads/371187-Testing-NVENC-with-the-GTX-960 ~9 months old

I only entered the nvenc world for rendering output on Black Friday, when I bought a 950, but I love it. It cut my render times by a factor of 50.

Nvenc h265 is possible in the exporter I mainly use and you can even choose the level. I will PM you two photos I took with my phone of the settings.

Can you detail the best way to compare two clips? I can help with getting data while you don't have a newer GPU.

One tool I use a lot with nvenc is Movavi Converter, to change my old avi movies to mp4 with progressive download settings for the NAS hdd. The rendering is a blast. Wish it had H265 so I could try it already.

I think my GPU supports it actually.

However, I'd like to know what software you use other than Movavi Converter. The snipping tool can quickly cut small portions of the screen for quickly capturing settings. Images can also be uploaded directly to posts.

For comparing clips, I usually use MPC -> File -> Save Image -> Png.
It's possible to find exactly the same frame via
Navigate -> Go To... -> Copy the frame number -> paste it to the MPC instance with the second video -> Go
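A scripted alternative to the MPC steps above, sketched in Python: build one FFmpeg command per clip that saves the same frame number as a PNG. The function name `frame_grab_command`, the file names, and the frame number are placeholders. FFmpeg's `select` filter counts frames from 0, which may not match the frame number MPC shows, so check one frame by eye before trusting a batch.

```python
def frame_grab_command(video, frame_number, png_path):
    """ffmpeg arguments that save one frame (0-based index) as a PNG."""
    return [
        "ffmpeg", "-i", video,
        # The comma is escaped so the filter parser does not treat it as a filter separator.
        "-vf", f"select=eq(n\\,{frame_number})",
        "-frames:v", "1",
        png_path,
    ]

for name in ("x264.mkv", "x265.mkv"):
    print(frame_grab_command(name, 1234, name.replace(".mkv", "_1234.png")))
```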

well to be fair no anime encoding group ever uses the correct settings when encoding h.264. so odds are good people are using the wrong x265 settings as well. for example anime encoders mostly use profile 4.0+ when their video should be only profile 3.2 at highest. most web video only has a bit rate of about 2000 which technically makes it 320p video quality wise.
see:

also if you play around with the advanced settings in h265 it's possible to get reasonable speed while also maintaining good quality/compression. namely, playing with the motion estimation settings can have a big impact on results.

as for quality the best thing you can do is convert your source video to a lossless format like FFV1 or FFvHUFF then run it through filters to get it as clean as possible before compressing it down. this is how pro AMV makers manage to get such great quality at small file sizes.

The point of profiles is hardware support/compatibility, thus they are kinda pointless for anime group encodes. I always leave the profiles at auto.

I've been looking at playing around with avisynth for filtering since it doesn't seem to require dumping the raw bitstream (in y4m or the formats you mentioned). What programs would you recommend for filtering video?

selur's hybrid (free) and mediacoder (paid) can handle most encoding tasks, but virtualdub still has the most powerful filters out there.

as for the filters themselves, I find MSU's smart deblocking and TTempSmooth do a pretty good job for most anime. you may also want to try gauss resizing if you're downscaling; it seems to produce less aliasing around edges. if you need things to look sharper, use mftoon to darken edges a little.

Anime groups use profile 4.0+ because that officially supports 1920x1080; profile 3.2 does not. Not that it matters, since nearly all anime is done in Hi10P, which no hardware device supports, and the chips themselves can't handle the software decoding.
They mainly use 10-bit encoding because it works so much better for animation. The most common issue with 8-bit is banding, which 10-bit encoding eliminates. The smaller file sizes are a nice bonus, but for animation 10-bit is a plus in its own right.
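The 1080p point can be checked against the frame-size limits in the H.264 specification, where the values called "profile 3.2" and "4.0" in this thread are listed as levels. The Python sketch below uses the maximum frame size in 16x16 macroblocks for a few levels. It only checks frame size; levels also cap macroblocks per second and bitrate, which this ignores. The helper names are made up for illustration.

```python
import math

# Maximum frame size in 16x16 macroblocks for a few H.264 levels.
MAX_FRAME_MACROBLOCKS = {"3.1": 3600, "3.2": 5120, "4.0": 8192, "4.1": 8192}

def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)

def fits_level(width, height, level):
    """True if the frame size alone is within the level's limit."""
    return macroblocks(width, height) <= MAX_FRAME_MACROBLOCKS[level]

for w, h in ((1280, 720), (1920, 1080)):
    for level in ("3.1", "3.2", "4.0"):
        print(f"{w}x{h} @ level {level}: {fits_level(w, h, level)}")
```

A 1920x1080 frame needs 120 x 68 = 8160 macroblocks, which is over the level 3.2 limit and just under the level 4.0 limit.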

h264 is very good for anime because rasterized vectors (of which is most anime) are easy to compress. Too bad most new anime sucks. ("pantsu", "onniichan" and "I hope senpai will notice me") Though I do like your attention to detail in your codec comparison and you really know how to encode. I can't stand colour banding when people use a low bit depth.

Opus <3 Seriously, fuck that 23 year old codec known as MP3. Using MP3 today makes less sense than using IE6. At least IE6 is more updated.

From what I understand, banding could be described as the inability of an encoder to represent granular changes in the color spectrum at a given bitrate with the color settings specified. To reduce banding, then, increasing the color's resolution helps just as much as increasing the bitrate. As you pointed out, the bit depth helps with this, but so does increasing the resolution of the color in the bitstream.

By color "resolution" I mean both the possible colors (10/12/16 bit) and the bit stream's chroma information. It is possible to increase the chroma information, the number of represented pixels, from 1/4th of the colors represented in the encoded bitstream (yuv420p) to 1/2 (yuv422p) or all the way to representing every pixel (yuv444p).
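The chroma fractions above can be worked out directly from the subsampling layout. This Python sketch counts samples per frame for each pixel format and the resulting uncompressed bits per pixel. The 1280x720 frame size and the helper names are just for illustration, and the encoded size will of course differ from the raw size.

```python
# (horizontal, vertical) chroma subsampling divisors for each pixel format.
CHROMA_DIVISORS = {"yuv420p": (2, 2), "yuv422p": (2, 1), "yuv444p": (1, 1)}

def samples_per_frame(width, height, pix_fmt):
    """Return (luma_samples, chroma_samples_per_plane) for one frame."""
    div_w, div_h = CHROMA_DIVISORS[pix_fmt]
    luma = width * height
    chroma = (width // div_w) * (height // div_h)
    return luma, chroma

def raw_bits_per_pixel(pix_fmt, bit_depth):
    """Uncompressed bits per pixel: one luma plane plus two chroma planes."""
    div_w, div_h = CHROMA_DIVISORS[pix_fmt]
    return bit_depth * (1 + 2 / (div_w * div_h))

for fmt in CHROMA_DIVISORS:
    luma, chroma = samples_per_frame(1280, 720, fmt)
    print(fmt, f"chroma/luma = {chroma / luma:.2f},",
          f"raw 10-bit bits per pixel = {raw_bits_per_pixel(fmt, 10):.0f}")
```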

I found it's more efficient (space and quality) to increase the chroma than to just throw more bits at the stream (with a lower crf value) when banding is a serious concern. That is, 10-bit yuv420p helps to reduce banding significantly when compared to 8-bit yuv420p (as you already know), but it can still be very noticeable. For those cases, 10-bit yuv444p helps decrease banding just as much as going from 8 to 10-bit without having to resort to increasing the bitrate and hoping for the best.
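The bit-depth half of that comparison is easy to see with a toy gradient. This Python sketch quantizes a shallow brightness ramp across one 1920-pixel row and counts how many distinct code values it lands on at 8 and 10 bits. The ramp endpoints (40% to 50% brightness) and the function name are assumptions for illustration, and it models quantization only, not what an encoder does to those values afterwards.

```python
def distinct_levels(start, stop, width, bit_depth):
    """Count the code values a linear ramp from start to stop (0..1) lands on."""
    max_code = (1 << bit_depth) - 1
    codes = set()
    for x in range(width):
        value = start + (stop - start) * x / (width - 1)
        codes.add(round(value * max_code))
    return len(codes)

width = 1920  # one row of a 1080p frame
for depth in (8, 10):
    levels = distinct_levels(0.40, 0.50, width, depth)
    print(f"{depth}-bit: {levels} steps, about {width / levels:.0f} px per band")
```

At 8 bits the ramp has only a couple of dozen steps, each tens of pixels wide, which is what shows up as visible bands. At 10 bits there are roughly four times as many, narrower steps.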

yuv444p can also take the same amount of space as yuv420p when using some psy hacks, so to be honest, I don't understand why the anime community is still using yuv420p. Going from yuv420p to yuv444p does not result in quality gains, but it does decrease artifact build-up and helps with banding significantly, without increasing file sizes or encoding time.

Also: lol @ your hatred of mp3, it's fine after 192k but yes we need to move on to better audio codecs.

Edit: I found another description of banding from the avisynth docs on dithering:

"Color banding generally occurs on low gradients or almost flat colors covering large areas, which is often the case in cartoons and animes. Because of the limited video bit-depth, the gradient color values show staircases in its curve, making apparent the discontinuity between two adjacent color steps." -From: http://avisynth.nl/index.php/Dither

I love VirtualDub, an old, old friend that has served me well over the years.

Recently used Selur's Hybrid out of simplicity to help transcode a very large amount of video for a client, but it should be possible to use X.265 with MEGUI and have access to many of the same functions as VirtualDub, including the super powerful & flexible AVISynth.

Anyway, using a custom config 2-pass Main/Main profile I was able to achieve about 10% of orig file size with no discernable loss of quality even at high zoom levels (subjectively attained by obs-viz), but the encode time was massive so I had to setup batch job files & get Hybrid to del completed tasks in-volo. Playback of resulting files was also much more demanding, but the intent was archival.

It seems like x265 has improved a bit since this thread was started back in 2016. On my ryzen 1700x the speed gap is more like 2-3 times slower, with 50% smaller file sizes.

I made the switch to x265 this week.