HEVC backstage story

Wednesday April 23, 2014

It was Thursday, April 17, 2014. I had sinus pressure and nasal congestion. A not-so-extreme but abrupt temperature change, coupled with showers of rain and snow, proved a potent concoction that left me helpless under a blanket, fighting the thudding pressure that banged in my head from time to time. I decided to take a quick look at where the HEVC and x265 tools stood that day.

It was worth checking: FFmpeg now officially supports x265 in its codebase, as reported by Phoronix. Luckily, I found a thread on StackOverflow on how to get it done (H.265 encoding with FFmpeg as the front end). It all dawned on me as if heaven had been bestowed upon me…

… until the sinus pressure killed the joy of discovery.

I started with one of Veritasium’s videos (I forget which one I used for the first experiment). The encode took way too long to complete, and the x265 output came out at a much higher bitrate than the x264 one.

packing-more-bitrate

With only an Intel i5 clocked at 2.50 GHz (MacBook Pro, Mid 2012), waiting for the encode to finish wasn’t fun. I had to use my friends’ battlestations (Isam with his SSD-based battlestation and Adib with his 3.40 GHz behemoth; loving the Corsair K60). Thanks to Zeranoe for providing fairly bleeding-edge Windows builds of FFmpeg, which made it possible for me to test x265 on Windows without compiling.

Inline note #1: ArchWiki has a good all-in-one tutorial for FFmpeg

On OSX, I took the liberty of compiling x265 myself (thanks to MulticoreWare). Clang is there by default on OSX, though the Linux guys don’t like clang, and compiling x265 also needs cmake (the two are not interchangeable: cmake generates the build files, clang does the actual compiling). God permitting, Brew has cmake.

compiling-x265-from-source
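For those who want to retrace the compile step, here is a minimal sketch of an out-of-tree cmake build. It assumes the x265 sources are already checked out and that the cmake project sits in the `source` subdirectory of the checkout; the `build_x265` function name and the `build/sketch` directory are made up for illustration, and this is not necessarily the exact sequence I typed that day.

```shell
# Sketch: configure and build x265 with cmake.
# Assumption: the checkout contains source/CMakeLists.txt.
build_x265() {
    local checkout build_dir
    [ -n "${1:-}" ] || { echo "usage: build_x265 CHECKOUT [BUILD_DIR]" >&2; return 2; }

    # Resolve the checkout to an absolute path so the later cd does not break it.
    checkout="$(cd "$1" 2>/dev/null && pwd)" || { echo "cannot enter checkout: $1" >&2; return 1; }
    build_dir="${2:-$checkout/build/sketch}"   # hypothetical build directory

    if [ ! -f "$checkout/source/CMakeLists.txt" ]; then
        echo "no CMakeLists.txt under $checkout/source" >&2
        return 1
    fi
    command -v cmake >/dev/null 2>&1 || { echo "cmake not found (try: brew install cmake)" >&2; return 1; }

    mkdir -p "$build_dir" || return 1
    # cmake writes the Makefiles; make then drives the compiler (clang on OSX).
    ( cd "$build_dir" && cmake "$checkout/source" && make )
}
```

Calling `build_x265 ~/src/x265` (a hypothetical path) should leave the `x265` binary inside the build directory if everything goes well.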

With the x265 binary compiled on my OSX machine, the next step was to get it merged into FFmpeg’s main binary. No luck: according to Brew, the FFmpeg formula in its repo doesn’t have an option to include x265 (as of this writing). I tried the --enable-libx265 and --with-x265 options when re-compiling FFmpeg, but the fairy godfathers had no spare magic wands to cast the miracle.

Let’s do the piping method, because StackOverflow says so.

piping-ffmpeg-to-x265
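The piping idea, roughly: FFmpeg decodes the input into raw YUV4MPEG2 frames on stdout, and the standalone x265 binary reads them from stdin. Below is a hedged sketch of that pipeline; the `pipe_to_x265` function name and the file names are placeholders, and the exact flags I used may have differed.

```shell
# Sketch: decode with ffmpeg, encode with the standalone x265 binary.
pipe_to_x265() {
    local input="${1:-}" output="${2:-}"
    [ -n "$input" ] && [ -n "$output" ] || { echo "usage: pipe_to_x265 INPUT OUTPUT" >&2; return 2; }
    [ -f "$input" ] || { echo "input not found: $input" >&2; return 1; }

    # ffmpeg writes y4m frames to stdout ("-");
    # x265 reads stdin ("--input -") and is told the stream is y4m.
    ffmpeg -loglevel error -i "$input" -f yuv4mpegpipe -pix_fmt yuv420p - \
        | x265 --input - --y4m -o "$output"
}
```

Something like `pipe_to_x265 input.mp4 output.hevc` gives a raw HEVC bitstream with no audio; the audio would still need to be encoded and muxed separately.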

confusing terms

H.265 is the name of a video standard, while x265 is a tool that encodes to it. Defining FFmpeg was a problem, because end users are never going to see it, let alone use it directly. Thanks to LICEcap for saving me the trouble of making a screencast, because uploading to YouTube was another thing I wanted to avoid.

I got a bit into the AVI, MP4 and MKV stuff, but I had to stop myself not even halfway. Why? Because if I were to write more, I would have to introduce a counterintuitive piece of terminology: AVI, MP4 and MKV aren’t video formats, but rather multimedia containers that hold bitstreams of video, audio, and subtitles. Inside a container there might be some AAC audio streams (great for dual-audio), ASS or SRT subtitle tracks, and an encoded video stream such as H.264. I might as well froth at the mouth (or fingers) about video interlacing, deblocking filters, etc… but believe me, at that point I should email the draft to a publishing company to get a book published. Specifically for ASS or SRT, I might as well introduce Aegisub or Subtitle Editor, the process of hardsubbing and softsubbing subtitle(s), and also MKVToolNix.

Elaborate, and fun… but for the sake of brevity, let’s talk about that later on (or, even better, I won’t. Wikipedia has it all!).
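If you would rather see the container-versus-stream idea than read about it, ffprobe (it ships with FFmpeg) can list what sits inside a file. A small sketch follows; `list_streams` and `count_streams` are helper names I made up for illustration.

```shell
# Sketch: print one block of key=value lines per stream in a container.
list_streams() {
    ffprobe -v error -show_entries stream=index,codec_name,codec_type \
        -of default=noprint_wrappers=1 "$1"
}

# Count the streams of one type (video, audio, subtitle)
# in list_streams output read from stdin.
count_streams() {
    grep -c "^codec_type=$1\$" || true   # grep exits 1 on zero matches; still prints 0
}
```

For example, `list_streams movie.mkv | count_streams audio` on a dual-audio file (hypothetical name) should report 2.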

3 trials / 3 experiments

Having learned statistics, I know I should run more than 30 trials so that the sample mean of the group of interest is approximately normally distributed…

… except it won’t happen. It would take too much time.

I ran one trial with the Zeranoe build of FFmpeg, and two with FFmpeg (Brew, OSX) piping to x265. After 3 tests I called it off, as I guessed the results were conclusive enough to write up.

Anime Encoding / Pirated Film Encoding

Inline note #2: This Wikipedia page might pique your interest.

Done reading that Wikipedia page? Now this is the fun part. First of all, please be thankful to all the anime encoding groups out there (Anime2Enjoy, Animekens, Cyber12, ONS… and also the anime fansubbers, e.g. HorribleSubs, Hadena, etc…), and the movie pirating groups (YIFY, aXXo, Ganool, etc…). They have put a hell of a lot of effort into the dark side of creative and digital content distribution.

Why?

Because encoding takes time and it chugs resources. Typically an encoding task spans hours per run, and movie pirating groups usually run the encode twice or thrice (2-pass encode, 3-pass encode). You might want to Google what a 2-pass encode is. Generally speaking, it is about getting the best quality out of a given file size: the first pass analyses the video and logs how many bits each part needs, and the second pass uses that log to spend the bitrate budget where it matters while still hitting the target size.
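To make the 2-pass idea concrete, here is a rough sketch with FFmpeg and libx264. It is only an illustration of the technique, not what any of the groups above actually run; the function names and the sizes in the example are made up, and it assumes the existing audio track is simply copied.

```shell
# Video bitrate (kbit/s) that fits SIZE_MB megabytes into DURATION_S seconds
# after reserving AUDIO_KBPS for the audio track. Integer arithmetic only.
target_video_kbps() {
    local size_mb="$1" duration_s="$2" audio_kbps="$3"
    # 1 MB = 8000 kbit (10^6 bytes * 8 bits / 1000)
    echo $(( size_mb * 8000 / duration_s - audio_kbps ))
}

# Two-pass encode at a fixed average video bitrate.
two_pass_x264() {
    local input="$1" output="$2" kbps="$3"
    # Pass 1: analysis only; the stats land in a log file, the video itself is thrown away.
    ffmpeg -y -i "$input" -c:v libx264 -b:v "${kbps}k" -pass 1 -an -f mp4 /dev/null &&
    # Pass 2: reuse the stats to decide where the bits go.
    ffmpeg -i "$input" -c:v libx264 -b:v "${kbps}k" -pass 2 -c:a copy "$output"
}
```

With a hypothetical 90-minute film squeezed into 700 MB and 128 kbit/s of audio, `target_video_kbps 700 5400 128` prints 909, which would then be passed on as `two_pass_x264 input.mkv output.mp4 909`.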

To simplify the encoding process, these people usually don’t use the FFmpeg or x264 binaries directly from the command line; instead they use point-and-click GUI applications like HandBrake, MeGUI, and StaxRip. Noob-friendly? Well, I prefer Cyko (with HandBrakeCLI as its backend engine).

Imagine if these people adopted HEVC. Please imagine.

Inline note #3: There’s a video by Anime2Enjoy on how to use MeGUI. I bet you wanna use the simplified version of MeGUI, called MiniX264 (tutorial in Bahasa).

The VLC and HEVC

The very first trouble with testing a new technology is its adoption in the industry. Surely MPlayerX (OSX) is not joining the H.265 fad at this time, given that it was late to the x264 10-bit party. Phoronix reported that VLC supports HEVC videos, but after testing with the latest stable version 2.1.4 and the 2.1.5 nightly, nothing good came of it. The experiment depended solely on FFplay, and seeking backward/forward in FFplay was horrible, nowhere near the likes of mplayer.

Being a Linux and OSX user, I wasn’t able to test DivX Player (the website says it supports HEVC) or PotPlayer. Maybe you could run a simple experiment on that and check it for me. As of this writing, I can only depend on FFplay.

The Chronicles of 8-bit & 10-bit Profiles

I was on an Intel Atom back in 2012 (Samsung N148P, thanks to SKMM for the netbook). I was really frustrated because at that time most of the encoders released Shakugan No Shana Season 3 in 10-bit format (not to mention how frustrated I was with evil Yuuji), and my processor kinda sucked at playing it (good lord, I still had my Pentium 4 HT 3.0 GHz at that time). I would love to talk about how the 10-bit profile reduces the file size compared to 8-bit without sacrificing quality, how it works, and all the banding stuff, but hey, I posted links! Have fun reading the extra materials.

The Zeranoe FFmpeg builds don’t ship x264 with 10-bit support, but you can get x264 precompiled with 10-bit support from x264.nl and use it with MeGUI.

Inline note #4: In a short chat on Anime2Enjoy, the team leader said they don’t favor 10-bit. I don’t know why, but their encodes are awesome. Loving the size and quality. I downloaded Log Horizon and Nisekoi from Anime2Enjoy, and have been in love with them ever since.

Inline note #4.5: I decided not to explain what banding is, because it is quite counterintuitive (it feels like explaining a 64-bit versus a 32-bit address space).

compile-x264-options-on-brew

A bit on WebM and libvpx!

When I was in the midst of writing the conclusion and polishing the whole article, I bumped into Google’s VP8 codec library. I was shivering, fearing that I had to rework the whole article to include VP8 encoding results instead of having H.265 and H.264 only. Fortunately, the VP8 encoding outputs told me not to.

As for H.264 and H.265, the default encode settings produce reasonably good quality, but VP8 sits on the other end of the spectrum (or it was just FFmpeg being unfriendly towards VP8). With no extra encoding options, the default settings produce what I wouldn’t even call mediocre quality.

VP8-default-encode-scheme

My eyes were bleeding. The pixelated spots were noticeable! Well, FFmpeg Trac already states that encoding with libvpx without specifying a video bitrate or a constant rate factor tanks the quality. The correct way to do this:

ffmpeg -i input_file.avi -c:v libvpx -quality good -cpu-used 0 -b:v 500k -qmin 10 -qmax 42 -maxrate 500k -bufsize 1000k -threads 4 -vf scale=-1:480 -c:a libvorbis output.webm

Yeah it is quite long.

comparing-VP8-WebM-with-X264-8bit

The picture above shows the final file sizes after encoding (VP8 vs x264 8-bit).

Another thing: under the default encoding settings, the libvpx engine doesn’t use the full capability of my CPU; only core 1 and core 3 were being maxed out. x264 is better in terms of handling resources, IMHO.

VPx-V-orbis-testing
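One thing worth trying about the idle cores is to hand libvpx an explicit thread count instead of relying on the default. A small sketch; `cpu_count` is a helper name I made up, and whether libvpx really saturates every core with it depends on the build and the resolution, so treat it as an experiment rather than a fix.

```shell
# Sketch: report the number of online CPU cores, falling back to 1.
cpu_count() {
    local n
    n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)" || n=1
    # Guard against empty or non-numeric output.
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

# Possible use with the command from earlier (not run here):
#   ffmpeg -i input_file.avi -c:v libvpx -b:v 500k \
#       -threads "$(cpu_count)" -c:a libvorbis output.webm
```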

A bit into Flash player

Did you know that Adobe has stopped providing new versions of Flash Player on Linux? And did you know that there is no longer an Adobe Flash Player for mobile platforms? Indeed, the rise of HTML5 discouraged Adobe from working on it. Currently, the popular ways of playing Flash videos are 1) using Google Chrome, because it has a built-in Flash plugin, 2) installing the OSS alternative called Gnash, 3) installing another OSS alternative called Lightspark.

Spoiler alert: Gnash and Lightspark are subpar (as of writing) when it comes to performance. Tested on my Arch Linux machine (Acer Aspire One D255, Atom 1.6 GHz, 2 GB RAM), playback stuttered. I ditched both and installed flashplugin: no more stuttering playback on YouTube.

Why was FOSS (e.g. Firefox) reluctant, at first, to provide H.265 support in its applications?

If you are using Linux and your wireless adapter is a Broadcom, odds are high that you have to install Broadcom’s proprietary wireless driver. And I dare say you have installed the restricted codecs to play videos through the package named ubuntu-restricted-extras (which includes libavcodec-extra-52, which contains libx264, et cetera).

Ubuntu explains the meaning behind “restricted formats” here.

It is related to licensing policy and the philosophy of open source software. Safe to say, the OSS movement doesn’t endorse proprietary software or anything that has restricted usage (obfuscated source code, royalties, etc…) because these things go against openness. Restricted software is perceived as a shady entity in the OSS community because OSS people can’t improve it to their own liking and they can’t audit it. That situation, for the OSS people, is a huge letdown.

Inline note #5: Richard Stallman: Clang vs. Free software.

Inline note #6: Although software licensed under OSS-friendly terms is superior in terms of hackability, we have to accept the fact that sometimes proprietary and restricted technologies are far better. End users can’t argue with this when we are debating the UI/UX of Apple devices vs Android-based devices (inconsistent Holo on Android vs consistent Cocoa on iOS). What’s more, Clang (LLVM-based; open source too, but under a permissive license rather than the GPL) provides better performance compared to GCC (GPL) when it comes to compiling Unreal Engine 3. Maybe this is the reason why ART (Android Runtime, Dalvik’s successor) is being developed with LLVM technology.

Anyhow, all hail open source. We live in a better world of supercomputing today thanks to open source applications and technologies. Verily, 90% of the servers deployed are Linux-based. Shame on you Steve Ballmer!

The trend (I got this from pali7x)

First things first: before I start, I would like to thank @pali7x for supplying me with information (as always, since my GPGPU article).

Year 2000. At this time DivX was the popular format (the trendsetter for AVI video files). But due to limited storage on PCs, the VCD was regarded as crucial in the video industry. You can’t store too many files on a PC with a 10 GB HDD, can you? As for downloading movies / videos from the internet, a 56 Kbps modem (7 KB/sec download speed) would get you nowhere. As you might have guessed, DivX was (and still is) proprietary technology.
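To put a number on "nowhere", here is the back-of-the-envelope arithmetic as a tiny shell sketch. The 700 MB file size is a hypothetical CD-sized movie, `download_seconds` is a made-up helper name, and the modem is assumed to run at its full 56 Kbps the whole time (real dial-up was slower).

```shell
# Seconds needed to download SIZE_MB megabytes over a LINK_KBPS link.
download_seconds() {
    local size_mb="$1" link_kbps="$2"
    # 1 MB = 8000 kbit, so 56 kbit/s is the 7 KB/s quoted above.
    echo $(( size_mb * 8000 / link_kbps ))
}

secs="$(download_seconds 700 56)"
echo "$secs seconds, roughly $(( secs / 3600 )) hours"
```

That works out to 100000 seconds, so more than a full day of tying up the phone line for one movie.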

In the years that followed, when disk storage and bandwidth became reasonably affordable and internet speeds became more bearable, more video players emerged. For example, VideoLAN’s VLC media player. Around this time, DivX also became a commercial application (meaning you had to pay to use it), and the free version was ad-laden. People (on Windows XP) would be presented with an error saying “codec not found” when they tried to play certain video files, and they started to search for alternatives. At the time, there was a group that wanted to compete with DivX, and it released a free alternative to DivX called Xvid (such genius naming technique!).

Talking about K-Lite and VLC, VLC actually gained popularity earlier than K-Lite because the VLC installation process was simple and dumbed-down. Installing K-Lite required you to choose your codecs and determine the settings and all that technical stuff. At the same time, the iPod (capable of video playback) was announced and became available on the market, so people at that time had more motivation to start collecting movies.

A few years later, the K-Lite codec pack started gaining popularity during the HD-DVD vs Blu-ray war over which of them should be the standard for video distribution (the PS3 was announced within this timeframe). At this time, GPU acceleration (GPGPU in some sense) technologies were also competing and trying hard to get into the market (CUDA/FireStream). It was the era of DirectX 10 as well (Windows Vista). K-Lite gained popularity because it was the first decoder that implemented GPU acceleration technology, so instead of revving and exhausting the CPU, the load was shared between CPU and GPU (though not in the same memory space, which brings us to the HSA technology developed by AMD. More on this next time, or rather, maybe not).

If you want to read more about the trend, jump to this article (I have to admit that I haven’t gone through it, but nice article anyway).

Thanks to the contributors!

I love doing peer review. My article is inundated with technicalities, and I need people to verify that the information is within the range of the end user’s comprehension capability (such term!).

Throughout the first phase of this article, I was using Isam’s and Adib’s desktops to do the experiments (both run Windows 8 on powerful setups). Isam is more into videography and needs a fast desktop for rendering, while Adib is more of a hobbyist setting up a burly and tanky rig. This Fall semester he will help me get a decent AMD FX-8350 (or newer) setup so I can start being serious with Linux development.

Thanks to @pali7x for supplying the information. You are always helpful when it comes to verifying the article and providing addenda so the article can be more meaningful while packed with information. Good lord I have you at my disposal hahaha.

Special thanks to @aribismail for being my loyal editor and proofreader (I still can’t fight stage fright when it comes to publishing articles directly on AmanzMY, for fear I might screw up the editor hahaha). Thanks to Zulh for reading my draft and giving feedback. Loving it! Thanks to @izwannizam for reading the draft and assuring me that my article is readable though heavy. You guys rock!

Feeling curious? Get the video files here.