AWS for M&E Blog
Optimizing compression settings for cost efficiencies with AWS Media Services
Video compression is complex. Quality, bitrate efficiency, latency, and cost optimization all seem to compete against each other. Determining the right balance of each takes trial and error. This blog post showcases examples of different codecs, color spaces, encoding settings, and output settings to give you a starting point for your own testing. We look at the differences between chroma-subsampling, color bit depths, group-of-pictures (GOP), codecs, codec profiles, rate-control modes, dynamic ranges, and cost per output, using the same master file as our reference point. All these outputs will be generated using AWS Elemental MediaConvert, a fully managed file-based cloud transcoding service from Amazon Web Services (AWS). If some of these terms are new, I recommend starting with my colleagues’ exceptionally helpful posts – Back to Basics: GOPs Explained as well as Back to Basics: Mechanisms Used Behind the Scenes in Video Compression.
The codecs
We will explore three video codecs throughout these tests: AVC, HEVC, and AV1.
AVC (H.264 or MPEG-4 Part 10)
By far the most widely adopted video codec. MPEG-4 Advanced Video Coding (AVC), also referred to as H.264, is supported in virtually every media device. It offers good compression performance, and the ability to be readily encoded and decoded on most compute platforms. It’s the least CPU intensive of the three.
HEVC (H.265)
High Efficiency Video Coding – often associated with 4K or 8K video workflows. Also referred to as H.265 and MPEG-H Part 2, it is a successor to the AVC codec, and due to its compression architecture, additional features such as High Dynamic Range (HDR) are supported. It is more CPU intensive than AVC both in the encode and the decode.
AV1 (AOMedia Video 1)
The Alliance for Open Media Video 1 codec (AV1) is an open, royalty-free video codec that was designed to be the successor to the VP9 video codec. It boasts 20% increased efficiency over VP9, which is a codec used by YouTube and Netflix, particularly for streaming UHD content. AOMedia is a collaboration between Amazon, Google, Intel, Microsoft, Netflix, and now Apple. AV1 offers great compression performance for HD and higher resolutions, and is exceptionally efficient at lower bitrates. It requires drastically more CPU resources than HEVC.
The starting point
Our Master.mp4 file for this series of tests has the following specifications:
Resolution | 8K (7680×4320) |
File Size | 2.99GB |
File Duration | 00:07:45:00 |
Codec | HEVC (H.265) |
Codec Profile | Main10@L6.1@Main |
Framerate | 60.000 (True 60fps) |
Bit Depth | 10 |
Chroma Subsampling | 4:2:0 |
Color Primaries | BT.2020 |
Transfer Function | PQ |
Color Space | YUV |
We will be transcoding in the us-west-2 region using the master file stored in Amazon Simple Storage Service (Amazon S3). All outputs will have the same aspect ratio (16:9), color space (YUV), and frame rate (60), and their resolution will be HD (1920×1080). The first encode sample uses the following output specification to showcase the difference between Accelerated Transcoding and On-demand with regard to transcoding speed and output file size, and to demonstrate what to expect when we leave almost every setting as Auto in the MediaConvert console.
The only setting difference in these two jobs is that Accelerated Transcoding has been turned on in the second job.
Codec | Codec Profile | GOP | Accelerated? | Transcode Duration | Bit Depth | Rate Control | Chroma SS | Single or Multi Pass | File Size | Cost |
AVC | Profile: Main Level: 4.2 | m=3, n=120 | NO | 35min 16sec | 8 | CBR 10 Mbps | 4:2:0 | Single pass | 587.1 MB | $0.1457 |
AVC | Profile: Main Level: 4.2 | m=3, n=120 | YES | 11min 51sec | 8 | CBR 10 Mbps | 4:2:0 | Single pass | 598.6 MB | $0.2325 |
Notice that MediaConvert determined a 2-second GOP (120 frames), with the structure IBBPBBPBBP…..I taking advantage of the temporal compression that B-frames offer. If Scene Change Detection is on, slight variations to this pattern may occur as additional I-frames will be injected at the scene breaks. The bit depth was dropped to 8-bit, which caused serious banding in some of the shots, making the outputs fall short of the requirements for High Dynamic Range (HDR). The default codec profile is Main, which is widely supported among players, but the recommendation is to use High in any modern player, especially when outputting HD or higher resolution content. The other item that we notice is that for $0.0868 more (just under 9 cents), the Professional tier cost of Accelerated Transcoding allowed this job to finish roughly 3x faster. For all remaining tests, Accelerated Transcoding will be selected.
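As a quick check of that trade-off, the figures from the table above can be worked through directly. This is only an arithmetic sketch: the durations and costs are copied from the two rows, and the variable names are illustrative.

```python
from datetime import timedelta

# Values copied from the On-demand and Accelerated rows of the table above.
on_demand = {"duration": timedelta(minutes=35, seconds=16), "cost_usd": 0.1457}
accelerated = {"duration": timedelta(minutes=11, seconds=51), "cost_usd": 0.2325}

# Dividing one timedelta by another yields a plain float ratio.
speedup = on_demand["duration"] / accelerated["duration"]
extra_cost_usd = accelerated["cost_usd"] - on_demand["cost_usd"]
minutes_saved = (on_demand["duration"] - accelerated["duration"]).total_seconds() / 60

print(f"Speed-up:       {speedup:.2f}x")
print(f"Extra cost:     ${extra_cost_usd:.4f}")
print(f"Minutes saved:  {minutes_saved:.1f}")
print(f"Cost per minute saved: ${extra_cost_usd / minutes_saved:.4f}")
```

Whether the extra spend is worthwhile depends on how much turnaround time matters for your workflow; for a single job of this size the premium is small in absolute terms.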
The only settings that change in the next two tests are Single Pass HQ and Multi Pass HQ.
Codec | Codec Profile | GOP | Transcode Duration | Bit Depth | Rate Control | Chroma SS | Single or Multi Pass | File Size | Cost |
AVC | Profile: Main Level: 4.2 | m=3, n=120 | 15min 23sec | 8 | CBR 10 Mbps | 4:2:0 | Single pass HQ | 598.6 MB | $0.2325 |
AVC | Profile: Main Level: 4.2 | m=3, n=120 | 22min 34sec | 8 | CBR 10 Mbps | 4:2:0 | Multi pass HQ | 605.7 MB | $0.2325 |
The file sizes are still quite large as the Constant Bitrate rate control mode maintains the use of all the available bits while trying to keep the highest possible quality. For over-the-top (OTT) delivery, we need to focus on keeping the quality high, but at the same time drop the bitrate as low as possible. This is where the switch to Quality-Defined Variable Bitrate (QVBR) is key. QVBR is supported in AVC, HEVC, and AV1. Let’s do a codec and rate control comparison next.
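To see why the CBR outputs all land near 600 MB, the expected size can be estimated from the bitrate and the duration of the master file alone. The sketch below assumes the table reports decimal megabytes and ignores audio and container overhead, which is why the estimate comes out slightly below the measured sizes.

```python
def cbr_video_size_mb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate video payload in decimal megabytes for a constant-bitrate encode."""
    # megabits per second * seconds = megabits; divide by 8 for megabytes
    return bitrate_mbps * duration_s / 8


duration_s = 7 * 60 + 45          # master file duration: 00:07:45
estimate_mb = cbr_video_size_mb(10, duration_s)

measured_mb = [587.1, 598.6, 605.7]  # CBR outputs from the tables above
for size in measured_mb:
    overhead = (size - estimate_mb) / estimate_mb * 100
    print(f"measured {size} MB vs estimated {estimate_mb:.2f} MB (+{overhead:.1f}%)")
```

Because CBR spends the full bitrate regardless of scene complexity, the only way to shrink these files is to lower the bitrate or switch rate control modes.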
There are two changes in the following outputs: the codec and the rate control mode.
Codec | Codec Profile | GOP | Transcode Duration | Bit Depth | Rate Control | Chroma SS | Single or Multi Pass | File Size | Cost |
AVC | Profile: Main Level: 4.2 | m=3, n=120 | 22min 34sec | 8 | CBR 10 Mbps | 4:2:0 | Multi pass HQ | 605.7 MB | $0.2325 |
AVC | Profile: Main Level: 4.2 | m=3, n=120 | 22min 41sec | 8 | QVBR Level Auto 10 Mbps Max | 4:2:0 | Multi pass HQ | 246.6 MB | $0.2325 |
HEVC | Profile: Main Level: 4.1 Tier: Main | m=3, n=120 | 37min 22sec | 8 | CBR 10 Mbps | 4:2:0 | Multi pass HQ | 606.1 MB | $3.255 |
HEVC | Profile: Main Level: 4.1 Tier: Main | m=3, n=120 | 25min 22sec | 8 | QVBR Level Auto 10 Mbps Max | 4:2:0 | Multi pass HQ | 240.6 MB | $3.255 |
AV1 | Profile: Main Level: 4.1 | n=81 | 13min 30sec | 8 | QVBR Level Auto 10 Mbps Max | 4:2:0 | N/A | 235 MB | $13.392 |
Notice that there are no B-frames in the AV1 output, and you are required to use the QVBR rate control mode, so there is no multi-pass comparison for AV1. The AV1 file is the smallest of the five outputs, and far smaller than either CBR output, due to the advanced compression algorithm and the use of QVBR, but the cost is far higher than AVC or HEVC due to the sheer amount of compute resources required to encode it. This needs to be considered when deploying AV1 in your workflows, as does player capability, since OTT specifications like HLS don’t support AV1. It’s worth noting that AV1 performs exceptionally well at very low bitrates, even for larger resolutions, making it a better alternative to low bitrate AVC or HEVC.
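One way to weigh the higher encode cost against the smaller file is to estimate how many full views it takes before delivery savings pay back the encoding premium. The sketch below uses the QVBR rows from the table above; the CDN rate of $0.085 per GB is a hypothetical placeholder, not a quoted price, and the model assumes every view downloads the whole file at a single rendition.

```python
import math

# Encode cost and file size for the QVBR outputs in the table above.
avc = {"encode_usd": 0.2325, "size_mb": 246.6}
hevc = {"encode_usd": 3.255, "size_mb": 240.6}
av1 = {"encode_usd": 13.392, "size_mb": 235.0}

CDN_USD_PER_GB = 0.085  # hypothetical delivery rate, for illustration only


def break_even_views(cheap, costly, cdn_usd_per_gb):
    """Views needed before the costlier encode pays for itself in delivery savings."""
    extra_encode_usd = costly["encode_usd"] - cheap["encode_usd"]
    saved_gb_per_view = (cheap["size_mb"] - costly["size_mb"]) / 1000
    if saved_gb_per_view <= 0:
        return None  # the costlier encode is not smaller, so it never breaks even
    return math.ceil(extra_encode_usd / (saved_gb_per_view * cdn_usd_per_gb))


print("HEVC vs AVC:", break_even_views(avc, hevc, CDN_USD_PER_GB), "views")
print("AV1 vs AVC: ", break_even_views(avc, av1, CDN_USD_PER_GB), "views")
```

With these inputs the break-even point is in the thousands of views, so the more expensive codecs make the most sense for popular titles; substitute your own delivery rate and audience size before drawing conclusions.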
The number of B-frames present can lower the bitrate quite drastically, so for the next comparison we are going to bump the B-frames between reference (P) frames to 5 (m=6), and explicitly set the GOP to 120 frames. The QVBR quality setting ranges from 1 (low) to 10 (high), so we will lock the QVBR quality level at 8, which is high enough quality for almost every device. We will also run each codec at both 8-bit and 10-bit color depth for comparison.
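For readers who submit jobs through the API rather than the console, the settings just described might be expressed roughly as follows. This is a sketch of one video description only, not a complete job, and the field names are written from memory of the MediaConvert job settings schema, so they should be checked against the current API reference before use. The helper name `h264_qvbr_video_description` is illustrative.

```python
import json


def h264_qvbr_video_description(ten_bit: bool = False) -> dict:
    """Build an AVC video description matching the settings described above.

    Field names are assumed to follow the MediaConvert job settings schema;
    verify them against the API reference.
    """
    return {
        "Width": 1920,
        "Height": 1080,
        "CodecSettings": {
            "Codec": "H_264",
            "H264Settings": {
                "CodecProfile": "HIGH_10BIT" if ten_bit else "HIGH",
                "RateControlMode": "QVBR",
                "QvbrSettings": {"QvbrQualityLevel": 8},
                "MaxBitrate": 10_000_000,            # bits per second (10 Mbps cap)
                "GopSize": 120,                      # n=120
                "GopSizeUnits": "FRAMES",
                "NumberBFramesBetweenReferenceFrames": 5,  # m=6
                "QualityTuningLevel": "MULTI_PASS_HQ",
            },
        },
    }


settings_8bit = h264_qvbr_video_description(ten_bit=False)
settings_10bit = h264_qvbr_video_description(ten_bit=True)
print(json.dumps(settings_10bit, indent=2))
```

A real job would also need an IAM role, input and output group definitions, and the acceleration setting, all of which are omitted here.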
Color bit depth, QVBR level, and GOP structure have been set.
Codec | Codec Profile | GOP | Transcode Duration | Bit Depth | Rate Control | Chroma SS | Single or Multi Pass | File Size |
AVC | Profile: High Level: 4.2 | m=6, n=120 | 23min 25sec | 8 | QVBR Level 8 10 Mbps Max | 4:2:0 | Multi pass HQ | 132.7 MB |
AVC | Profile: High10 Level: 4.2 | m=6, n=120 | 23min 2sec | 10 | QVBR Level 8 10 Mbps Max | 4:2:0 | Multi pass HQ | 127.4 MB |
HEVC | Profile: Main Level: 4.1 Tier: High | m=6, n=120 | 25min 41sec | 8 | QVBR Level 8 10 Mbps Max | 4:2:0 | Multi pass HQ | 173.9 MB |
HEVC | Profile: Main10 Level: 4.1 Tier: High | m=6, n=120 | 29min 20sec | 10 | QVBR Level 8 10 Mbps Max | 4:2:0 | Multi pass HQ | 150.9 MB |
AV1 | Profile: Main Level: 4.1 | n=120 | 11min 13sec | 8 | QVBR Level 8 10 Mbps Max | 4:2:0 | N/A | 167.7 MB |
AV1 | Profile: Main Level: 4.1 | n=120 | 11min 3sec | 10 | QVBR Level 8 10 Mbps Max | 4:2:0 | N/A | 175 MB |
Inspecting the GOP structure of the AVC or HEVC output with ffprobe, we now see the IBBBBBPBBBBBPBBBBBP….I structure applied. The file sizes of all six outputs are drastically lower, but the image quality is still very high. You may want to run tests at QVBR quality levels 7 and 9 to compare visually, especially if most of your content is similar (e.g. sports or news), as the quality may be acceptable at lower levels, which would further reduce file size and, in turn, distribution costs when serving through a Content Delivery Network (CDN) like Amazon CloudFront.
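A check like this can be scripted. The sketch below assumes ffprobe is installed and that a command such as `ffprobe -v error -select_streams v:0 -show_entries frame=pict_type -of csv=p=0 output.mp4` prints one picture type (I, P, or B) per line; the file name and helper names are hypothetical. The parsing function works on that text, so it can be tried on a synthetic pattern without any media file.

```python
import subprocess


def read_pict_types(path: str) -> str:
    """Run ffprobe and return one picture type per line (requires ffprobe on PATH)."""
    cmd = [
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "frame=pict_type", "-of", "csv=p=0", path,
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def gop_stats(pict_types: str) -> dict:
    """Summarize GOP length (n) and the longest run of B-frames from ffprobe text."""
    # Tolerate blank lines and a trailing comma that some builds append.
    frames = [line.strip().rstrip(",") for line in pict_types.splitlines() if line.strip()]
    i_positions = [idx for idx, kind in enumerate(frames) if kind == "I"]
    gop_lengths = [b - a for a, b in zip(i_positions, i_positions[1:])]
    longest_b_run = run = 0
    for kind in frames:
        run = run + 1 if kind == "B" else 0
        longest_b_run = max(longest_b_run, run)
    return {"gop_lengths": gop_lengths, "max_b_run": longest_b_run}


# Synthetic example: two short GOPs of I + 2 x (BBBBBP), followed by the next I-frame.
one_gop = ["I"] + (["B"] * 5 + ["P"]) * 2
sample = "\n".join(one_gop * 2 + ["I"])
stats = gop_stats(sample)
print(stats)
```

On a real output, a `max_b_run` of 5 corresponds to m=6, and GOP lengths of 120 correspond to n=120; shorter GOPs may appear where scene change detection inserts extra I-frames.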
What’s unexpected and fascinating is that both the AVC and HEVC 10-bit outputs have a lower file size than their 8-bit counterparts, with every other setting being identical. This is due to the encoder efficiencies that we can take advantage of when we have more available bits to use. The AV1 10-bit output is only slightly larger than the 8-bit output, meaning both could be usable. At this point in time, if your source is 10-bit, it makes sense to use 10-bit across the board: it’s better quality, generally a lower file size, and even if the cost to encode it is slightly higher, the distribution costs at scale will be much lower.
Notice that we have not changed the chroma subsampling: the master file for these tests started as 4:2:0, so bumping it up would not produce more color information. If we had received a 4:4:4 or 4:2:2 file, it would be worth inspecting the visual differences after lowering the subsampling, as uncompressed 4:2:2 carries two-thirds of the sample data of 4:4:4 (one-third less), and 4:2:0 carries half. There can be cost savings found in subsampling if the quality remains acceptable and the color intent is honored.
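The sample-count ratios follow from the J:a:b notation itself, which describes a block J pixels wide and two rows high: every pixel has a luma sample, the first row has `a` samples for each of the two chroma channels, and the second row has `b`. A small sketch (helper names are illustrative) makes the arithmetic explicit. Note that these are raw sample counts before compression, so the encoded bitrate savings will not match them exactly.

```python
from fractions import Fraction


def samples_per_block(j: int, a: int, b: int) -> int:
    """Total samples in a J-wide, 2-row block for a J:a:b subsampling scheme."""
    luma = 2 * j            # one luma sample per pixel, two rows
    chroma = 2 * (a + b)    # two chroma channels (Cb and Cr), a + b samples each
    return luma + chroma


def relative_to_444(scheme: str) -> Fraction:
    """Raw sample count of a scheme as a fraction of 4:4:4."""
    j, a, b = (int(part) for part in scheme.split(":"))
    return Fraction(samples_per_block(j, a, b), samples_per_block(j, j, j))


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, relative_to_444(scheme))
```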
Our key takeaway is to balance encoding costs against overall delivery costs when dealing with media assets. 10-bit can actually be more cost-effective than 8-bit, and the proper use of QVBR and B-frames can substantially reduce bitrate while maintaining exceptionally high quality. Remember to start with the end clients and players and work backwards. Be certain they can decode the desired output, then adjust the encoders to achieve the best quality with the fewest bits.