How to configure a low-latency HLS workflow using AWS Media Services

The HTTP Live Streaming (HLS) protocol allows delivery of live streams to global-scale audiences. Historically, HLS has favored stream reliability over latency. Low-Latency HLS (LL-HLS) is an extension of the protocol that appeared in 2020 in the 2nd edition of the specification, enabling low-latency video streaming while maintaining scalability. It can roughly halve live streaming latency: while regular HLS latency usually ranges between 12 and 30 seconds depending on the workflow configuration and the player capabilities, LL-HLS brings end-to-end workflow latency down to between 5 and 10 seconds. This enables streaming video latency to rival broadcast video latency, which is 6 seconds on average, and typically prevents viewers of a streamed live sport event from being spoiled by surrounding TVs or by social media posts coming from the stadium or from broadcast viewers. The technology also opens up exciting new use cases like augmenting the broadcast viewing experience with additional synchronized camera feeds delivered over LL-HLS on second screen devices, as demonstrated by Amazon Web Services (AWS) partner NativeWaves.

In May 2023, AWS Elemental MediaPackage launched support for the packaging of media streams in LL-HLS, both with transport stream (TS) and CMAF segments. This blog post explains how to configure multiple AWS services – namely AWS Elemental MediaLive, MediaPackage and Amazon CloudFront – to support LL-HLS workflows. DRM content protection with SPEKE v2 and multi-key encryption is also supported in these workflows, and documented in a previous blog post. Let’s start with a summary of the LL-HLS mechanisms, so that the configuration settings can be mapped to them.

How does LL-HLS reduce latency?

LL-HLS uses a combination of approaches to achieve this goal.

Blocking Playlist Reload: While it was already possible to somewhat reduce latency with regular HLS by using short segments, this approach cannot guarantee predictable latency, because the video player requests HLS media playlists at arbitrary times. As a cascading effect, it also requests media segments at arbitrary times. This unpredictable request timing leads to varying latencies and an unpredictable positioning of the playhead relative to the edge of the live stream. LL-HLS solves this core problem by making sure that the player always requests the most recent media playlist: the player reads the most recent media sequence number and partial segment number from the initial media playlist response, and adds incremented values as query string parameters (respectively _HLS_msn and _HLS_part) to subsequent media playlist requests. The Blocking Playlist Reload mechanism keeps the playlist request open until the requested media sequence number and partial segment number are reached on the origin, at which stage the media playlist, including references to the latest available partial segments, is returned to the player through the CDN.
Rendition Report: this mechanism complements the Blocking Playlist Reload mechanism. Rendition Report signaling tells the player the most recent media sequence number and partial segment number of the other media playlists. When it switches bitrate, the player can therefore request the new rendition's playlist with the right query string parameters straight away, instead of first fetching that playlist without query string parameters at switch time to rediscover the latest media sequence number and partial segment number.
Partial segments: LL-HLS playlists include a mix of classical full duration segments (usually 6 seconds) and partial segments (also known as “parts”) that have a much smaller duration (usually between 500 milliseconds and 2 seconds) and roll off the playlists after exceeding a duration of three full segments from the edge of the live stream. LL-HLS players will consume these partial segments, as they are available before the full duration segments, which will allow play head positioning at a shorter distance than a full segment duration from the edge of the live stream. Partial segments are referenced in the playlists as soon as they are pushed successfully to the origin.
Hinted partial segments: the end of LL-HLS playlists can include EXT-X-PRELOAD-HINT tags that reference partial segments that don’t yet exist at the time when the media playlist is returned to the player. Requests to such predictive parts are then issued by the player. The origin will hold onto the request until it can respond, resulting in the shortest time to partial segment delivery to the player.
HTTP delivery optimizations: LL-HLS requires HTTP/2 on the CDN side in order to benefit from the request multiplexing of this HTTP version. On the public hls-interest mailing list, Roger Pantos recently announced that iOS 17 would add support for HTTP/3 and that LL-HLS delivery over HTTP/3 would require the use of server-defined priorities described by RFC 9218 (Extensible Prioritization Scheme for HTTP), to prioritize playlist delivery. AWS will continue to update its solutions as such improvements are added to the specification.
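
To make the Blocking Playlist Reload handshake more concrete, here is a minimal Python sketch of the player-side logic: it reads the newest media sequence number and part index from a media playlist, then builds the next blocking request with _HLS_msn and _HLS_part. The playlist excerpt, the file names and the example.com URL are hypothetical, written for illustration rather than captured from MediaPackage. The sketch also assumes that the origin resolves a part index one past the end of a segment to the first part of the next segment.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical playlist excerpt, for illustration only (not MediaPackage output).
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:6
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=3.0
#EXT-X-PART-INF:PART-TARGET=1.0
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:6.0,
segment100.ts
#EXT-X-PART:DURATION=1.0,URI="segment101.part0.ts"
#EXT-X-PART:DURATION=1.0,URI="segment101.part1.ts"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="segment101.part2.ts"
"""


def latest_position(playlist_text):
    """Return (msn, part) for the newest part listed in a media playlist.

    part is None when the playlist ends on a complete segment.
    """
    first_msn = 0
    completed = 0        # full segments listed so far
    trailing_parts = 0   # parts listed after the last full segment
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            first_msn = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-PART:"):
            trailing_parts += 1
        elif line and not line.startswith("#"):
            completed += 1
            trailing_parts = 0  # earlier parts belonged to the segment just closed
    if trailing_parts:
        return first_msn + completed, trailing_parts - 1
    return first_msn + completed - 1, None


def blocking_reload_url(playlist_url, msn, part):
    """Build the next playlist request, asking for the part after (msn, part)."""
    if part is None:
        next_msn, next_part = msn + 1, 0
    else:
        next_msn, next_part = msn, part + 1
    scheme, netloc, path, query, fragment = urlsplit(playlist_url)
    # Drop any previous LL-HLS directives, keep other query parameters.
    params = [(k, v) for k, v in parse_qsl(query)
              if k not in ("_HLS_msn", "_HLS_part")]
    params += [("_HLS_msn", next_msn), ("_HLS_part", next_part)]
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))


msn, part = latest_position(SAMPLE_PLAYLIST)
print(blocking_reload_url("https://example.com/live/video_1.m3u8", msn, part))
```

In this example the newest listed part is part 1 of media sequence 101, so the next request asks for part 2, which is also the part announced by the EXT-X-PRELOAD-HINT tag.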

If you are looking for more information on LL-HLS, a good summary of all its mechanisms is available on the Apple Developer website.

Workflow architecture and latency measurement

The configuration steps in the following sections aim to build an end-to-end workflow that combines a contribution encoder (AWS Elemental Link in the following example, but another contribution encoder could be used) pushing to MediaLive for the ABR encoding of HLS with 1s segments, then to MediaPackage for the repackaging of this ingest stream into LL-HLS with 1s parts, and finally to CloudFront for the last mile delivery of the streams. The easiest way to measure the glass-to-glass latency of such workflows is to film a clapperboard application running on a tablet, sitting alongside the LL-HLS player screen, and to take a picture of the two screens side-by-side. The difference between the two timecodes is the glass-to-glass latency.
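
The arithmetic behind this measurement is a simple timecode subtraction. The following Python sketch assumes that both screens show a wall-clock timecode in HH:MM:SS.mmm form and that the picture was not taken across midnight; the function names and sample values are illustrative only.

```python
def timecode_to_seconds(timecode):
    """Convert an 'HH:MM:SS.mmm' timecode into seconds."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def glass_to_glass_latency(clapperboard_timecode, player_timecode):
    """Latency in seconds: what the source shows minus what the player shows."""
    delta = (timecode_to_seconds(clapperboard_timecode)
             - timecode_to_seconds(player_timecode))
    return round(delta, 3)


# Hypothetical reading taken from one side-by-side picture.
print(glass_to_glass_latency("10:15:42.350", "10:15:37.300"))
```

Taking several pictures and averaging the results smooths out the effect of display refresh and camera shutter timing.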

Workflow for measuring glass-to-glass latency in live streaming using AWS Elemental Link, MediaLive, MediaPackage and CloudFront

Figure 1 Workflow for measuring glass-to-glass latency

For more information about how to perform fine-grain latency measurements per workflow component, refer to a previous blog post outlining methodologies to do so.

MediaPackage configuration

First, you need to configure a Channel Group. Channel Groups are logical containers that host your channels and define the ingest and origination DNS entries of these channels. Combined with other user-defined parameters like Channel name, Endpoint name, and Manifest name, MediaPackage v2 makes all ingest and origination URLs fully predictable. In the MediaPackage console, select “Channel Groups” in the Live v2 section, and click on the “Create channel group” button.

Figure 2 Selecting Channel group under Live v2 and creating channel group

Enter a name for the Channel Group and a description (optional).

Assigning name and description to channel group under Channel group details

Figure 3 Assigning name and description to channel group

You can start creating Channels with the “Create channel” button once your Channel Group is created.

Channel group created and showing egress domain name. Start creating a channel by selecting the Create channel button

Figure 4 Channel group Egress domain and ARN

Enter a name for the Channel and a description (optional), then select “Attach a custom IAM policy”.

Enter name and description of channel and select Attach a custom policy under Channel policy

Figure 5 Assigning channel name and description

The policy field expects a JSON snippet that allows MediaPackage to identify MediaLive through the AWS Signature Version 4 (SigV4) headers that MediaLive adds to the ingest requests. The following is a reference model:

{
  "Version": "2012-10-17",
  "Id": "AllowMediaLiveChannelToIngestToEmpChannel",
  "Statement": [
    {
      "Sid": "AllowMediaLiveRoleToAccessEmpChannel",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::AWS account number:role/MediaLiveAccessRole"
      },
      "Action": "mediapackagev2:PutObject",
      "Resource": "arn:aws:mediapackagev2:AWS region:AWS account number:channelGroup/Channel group name/channel/Channel name"
    }
  ]
}

Replace the placeholder parts of the policy with your actual AWS account number, AWS region (e.g. “us-west-2”), Channel group name (e.g. “demo-channel-group”) and Channel name (e.g. “LL-HLS-demo-channel”) values. Once your parameters have been entered, copy and paste the IAM policy into the policy field and select “Create”.
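
If you prefer to generate the policy rather than edit it by hand, the substitution can be sketched in a few lines of Python. The function name is hypothetical, the account number below is a dummy value, and the role name defaults to the MediaLiveAccessRole used in the reference model; adjust it if your MediaLive channel uses another role.

```python
import json


def build_ingest_policy(account_id, region, channel_group, channel,
                        role_name="MediaLiveAccessRole"):
    """Fill the reference channel policy with concrete values."""
    policy = {
        "Version": "2012-10-17",
        "Id": "AllowMediaLiveChannelToIngestToEmpChannel",
        "Statement": [{
            "Sid": "AllowMediaLiveRoleToAccessEmpChannel",
            "Effect": "Allow",
            "Principal": {
                "AWS": f"arn:aws:iam::{account_id}:role/{role_name}"
            },
            "Action": "mediapackagev2:PutObject",
            "Resource": (
                f"arn:aws:mediapackagev2:{region}:{account_id}:"
                f"channelGroup/{channel_group}/channel/{channel}"
            ),
        }],
    }
    return json.dumps(policy, indent=2)


# Dummy account number; group and channel names taken from the examples above.
print(build_ingest_policy("111122223333", "us-west-2",
                          "demo-channel-group", "LL-HLS-demo-channel"))
```

The printed JSON can be pasted as is into the policy field.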

Once your channel is created, select “Settings” to view the MediaPackage HLS ingestion points. Take note of the ingestion endpoint URIs as these will be required to create the ABR encoding channel in MediaLive in the next section.

Figure 6 Channel created and showing endpoint details for ingestion of Live channel

Select “Origin endpoints” and then select “Create endpoint”. The endpoint defines the media segment parameters. Enter the name and description (optional) of the endpoint. You can select either TS or CMAF (also known as fMP4) as the container type, depending on your player support constraints. CMAF is recommended unless legacy devices are in scope. In the additional settings, the default startover window is 15 minutes (900 seconds). You can raise this value up to 14 days (1209600 seconds), depending on how long you need MediaPackage to retain your ingest media segments on disk for origination.

Create origin endpoint by providing endpoint name and description. Select CMAF and segment duration as 6 seconds

Figure 7 Create origin endpoint

The Segment duration value is the total length of a full segment, generated through the concatenation of multiple shorter ingest segments. It’s not the duration of partial segments, which is defined by the duration of the ingest segments coming from the ABR encoder (set to 1 second in our example MediaLive configuration that follows). The “Include IFrame-only streams” option generates one distinct IFrame-only track per video rendition present in your ingest streamset. The “Enable SCTE support” option, if selected, defines the ad insertion behavior of your endpoint. You can find detailed information on this setting in the “SCTE-35 messages” user guide section.
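
The relationship between the two durations can be checked with a short Python sketch. It assumes that the full segment duration is an exact multiple of the ingest segment duration, as in the reference configuration (6-second segments built from 1-second ingest segments); the function name is illustrative and the sketch deliberately rejects other combinations rather than guessing how they would be handled.

```python
def parts_per_segment(segment_duration_s, ingest_segment_duration_s):
    """Number of parts in one full segment, for whole-second durations."""
    count, remainder = divmod(segment_duration_s, ingest_segment_duration_s)
    if remainder:
        raise ValueError("this sketch only handles exact multiples")
    return count


print(parts_per_segment(6, 1))  # reference configuration of this post
print(parts_per_segment(6, 2))  # alternative with 2-second ingest segments
```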

Attaching a public policy to the endpoint

Figure 8 Attaching public policy to endpoint

The “Encrypt content” option allows you to configure all the encryption and DRM settings that can be applied to your endpoint. MediaPackage v2 supports different options in terms of encryption schemes and DRM types, depending on the endpoint type (TS or CMAF). Please refer to the Encryption fields and SPEKE Version 2.0 presets pages in the user guide for comprehensive information on all the exposed configuration parameters.

In the Endpoint policy section, you define the origination behavior of your endpoint. Using “Don’t attach a policy” will totally disable the origination, while using “Attach a custom policy” will allow you to restrict origination based on a variety of conditions: AWS SigV4 if your CDN supports this authentication method when forwarding requests to origins, AWS accounts, or IP ranges for other use cases. If you select “Attach a public policy”, MediaPackage will populate the policy field for you with a policy allowing public access to all of your endpoint objects, and will include the correct AWS account number, AWS region, Channel Group, and Channel name values relevant to your endpoint. For more details on these endpoint policies, please refer to the “Origin endpoint authorization” in the user guide.

On the same screen, you will now be able to create multiple manifests sharing the same media segments produced by the endpoint. LL-HLS playlists should be backward compatible with legacy HLS players – meaning that these players should be able to ignore all the new HLS tags related to the low latency mode. If some of your HLS players actually don’t properly ignore these tags, you can always create regular latency HLS streams through the first half of the Manifest definitions section.

Add a HLS manifest and provide manifest name and description. Additionally provide manifest window duration and program date/time

Figure 9 Add HLS manifest and enter details

The LL-HLS playlists should be configured in the second half of the Manifest definitions section. It’s important for the resulting latency to configure a Program date/time interval that is aligned on the partial segments duration (1 second in our reference configuration).

Add low latency HLS manifest and enter name and description of the manifest

Figure 10 Adding low latency HLS manifest and enter details

Once created, you will get the playback URL as follows.

https://xxxxxx.egress.yyyyyy.mediapackagev2.us-east-1.amazonaws.com/out/v1/MPv2-LL-HLS-Demo/MP-LL-HLS-Demo-Channel/MP-LL-HLS-Demo-Endpoint/LL-HLS-demochannel.m3u8

Contribution encoder and MediaLive configuration

In most of our tests we used an AWS Elemental Link contribution encoder, for which latency can be set in milliseconds. 200 milliseconds works well as a value, but you might need to increase this buffer level depending on your network conditions. On other contribution encoders you should find similar buffering parameters that you can tweak, as well as encoding options that you can simplify to reduce the encoding latency (e.g. look-ahead or B-frames). Generally speaking, it’s good to activate timecode burning on the video whenever this encoding option is available, as it gives you a finer-grained view of how latency is split between the contribution/ABR encoders and the downstream packaging/delivery part of the workflow.

In the AWS Elemental Link device, enter latency 200 milliseconds

Figure 11 Latency value in Link device

As mentioned earlier, the partial segment length is configured in MediaLive: the duration of the segments that MediaLive sends downstream determines the duration of the partial segments in the LL-HLS manifests. In the MediaLive console, create a channel, attach an input, then add an HLS output. Select the HLS output and enter the ingestion URLs created earlier.

Take note of LL-HLS origin ingestion endpoints

Figure 12 LL-HLS origin ingestion endpoints

Configure the ingestion endpoints in MediaLive HLS output URLs

Figure 13 Configure endpoint URLs in MediaLive output destination

In the output configuration, under “Manifests and Segments”, change the segment length to 1 second, and in “Stream settings” change the GOP size to 1 second. Since we are defining a segment length of 1 second here, this will be the actual length of the partial segments that players get in the LL-HLS manifests. A 1-second GOP size ensures that each fragment created by the encoder has a keyframe, so the player can start playback from it. GOP size is one of the main encoding parameters, with a direct impact on video bitrate and video quality, and an indirect impact on end-to-end latency. It determines how often a keyframe (or IDR frame) is available. In LL-HLS, the player requires a keyframe to start decoding, meaning it can start playback only at GOP boundaries. Longer GOPs cause higher start-up delay and higher latency. Apple’s recommended GOP size is 2 seconds. Typical LL-HLS workflow implementations have about 5 seconds of end-to-end latency when the GOP is set to 1 second.
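
The constraint between segment length and GOP size can be expressed as a small check. This Python sketch assumes fixed-size GOPs that start at the beginning of the stream, so that a segment starts on a keyframe exactly when its duration is a whole number of GOPs; the function names and the 30 fps frame rate are illustrative assumptions, not values taken from the configuration above.

```python
from fractions import Fraction


def every_segment_starts_with_keyframe(segment_seconds, gop_seconds):
    """True when the segment duration is a whole number of GOPs."""
    segment = Fraction(str(segment_seconds))
    gop = Fraction(str(gop_seconds))
    return segment % gop == 0


def gop_size_in_frames(gop_seconds, frames_per_second):
    """GOP length expressed in frames, for encoders configured that way."""
    return Fraction(str(gop_seconds)) * Fraction(str(frames_per_second))


print(every_segment_starts_with_keyframe(1, 1))   # 1 s segments, 1 s GOP
print(every_segment_starts_with_keyframe(1, 2))   # 1 s segments, 2 s GOP
print(gop_size_in_frames(1, 30))                  # frames in a 1 s GOP at 30 fps
```

With 1-second segments, a 2-second GOP would leave every other part without a keyframe, which is why the GOP size is lowered together with the segment length.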

Configure segment length =1 under manifest and segments

Figure 14 Setting values in MediaLive output manifest

Set GOP Size =1 in video settings in MediaLive

Figure 15 GOP size configuration

CloudFront configuration

Amazon CloudFront is a global Content Delivery Network (CDN) that securely delivers web content to users with low latency and high transfer speeds. CloudFront consists of over 120 Edge Locations located close to your viewers. CloudFront lowers the latency of delivering content using caching, request-collapsing, and TCP optimization across Amazon’s global infrastructure. In addition, CloudFront includes Regional Edge Caches (RECs), located within most AWS regions, to provide features such as mid-tier caching.

You need to create 3 custom policies in CloudFront to use a MediaPackage v2 configuration: a cache policy, an origin request policy, and a response headers policy. The cache policy customizes the cache key and the time-to-live (TTL) settings. The cache key settings determine whether a viewer request results in a cache hit, and including fewer values in the cache key can help increase your cache hit ratio. The TTL settings work together with the Cache-Control and Expires headers to determine how long objects in the CloudFront cache remain valid. In the CloudFront console, select “Policies”.

Under CloudFront console, click on Policies

Figure 16 Defining CloudFront policy

Create a custom cache policy with the following parameters and name it MediaPackage-LL-HLS-CachePolicy. Save the changes.

Under Policies, click on Cache create policy

Figure 17 Creating Cache policy

Figure 18 Configuring cache policy parameters
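
The effect of the cache key settings can be illustrated with a simplified Python model. It is not CloudFront's actual key computation, and it does not reproduce the parameters shown in the figure: it only assumes that the LL-HLS query strings described earlier (_HLS_msn and _HLS_part) are part of the cache key while any other query string is left out. The URLs and the session parameter are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption: only the LL-HLS directives discussed in this post are in the key.
LL_HLS_QUERY_STRINGS = ("_HLS_msn", "_HLS_part")


def cache_key(url, included_query_strings=LL_HLS_QUERY_STRINGS):
    """Simplified cache key: URL path plus the included query strings."""
    parts = urlsplit(url)
    kept = sorted((name, value) for name, value in parse_qsl(parts.query)
                  if name in included_query_strings)
    return parts.path, tuple(kept)


base = "https://example.cloudfront.net/channel/endpoint/video_1.m3u8"
viewer_a = cache_key(base + "?_HLS_msn=101&_HLS_part=2&session=aaa")
viewer_b = cache_key(base + "?_HLS_msn=101&_HLS_part=2&session=bbb")
next_part = cache_key(base + "?_HLS_msn=101&_HLS_part=3&session=aaa")

print(viewer_a == viewer_b)   # same playlist version, shared cache entry
print(viewer_a == next_part)  # different part, distinct cache entry
```

Leaving the LL-HLS directives out of the key would make all playlist versions collapse into one cached object, while adding unrelated parameters to it would fragment the cache between viewers.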

Now you will create the second custom policy, the origin request policy. Some information from the viewer request (such as URL query strings, HTTP headers, and cookies) is not included in the origin request by default. You need to make sure that CloudFront passes all the parameters required for LL-HLS to MediaPackage v2. Under the “Origin request” tab, click on “Create origin request policy”.

Under origin request click on Create origin request policy.

Figure 19 Create origin request policy

Create a custom policy with the following parameters and name it MediaPackage-LL-HLS-OriginRequest. Save the changes.

Figure 20 Parameters for origin request policy

Finally, you need to create the third policy, the Response headers policy. This policy will specify one or more HTTP headers for CloudFront to add to the responses that it sends to viewers. With a response headers policy, you can specify the desired headers and their values without changing the origin or writing code. If your origin already sends one or more of the headers that are in your response headers policy, you can choose whether CloudFront uses the header from the origin or the one specified in the policy. You will also specify the CORS headers that CloudFront adds when it responds to CORS requests. CloudFront only adds these headers in responses to CORS requests.

Create a custom response headers policy with the following parameters and name it MediaPackage-LL-HLS-ResponseHeader-policy. Save the changes.

Under policies select 'Response Header' and Create response headers policy.

Figure 21 Configure Response header policy

Configure the name and description of the response headers policy. Set “Configure CORS” to true, Access-Control-Allow-Origin to true, and Access-Control-Allow-Headers to true. For Access-Control-Allow-Methods, choose “Customize” and select GET, HEAD and OPTIONS. Set Access-Control-Max-Age to 86400. Leave the rest of the parameters as is and save the policy.

Figure 22 Configure response header policy parameters
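
For readers who script their CloudFront setup, the CORS settings listed above could be expressed as a configuration dictionary along these lines. This is a sketch under assumptions: the key names follow our understanding of the ResponseHeadersPolicyConfig structure accepted by the CloudFront API (for example through boto3's create_response_headers_policy) and should be checked against the API reference, “true” for origins and headers is interpreted here as allowing all values ("*"), and the credentials and origin override values are illustrative choices that the console steps above do not specify.

```python
def quantity_items(values):
    """CloudFront list shape: an item count plus the items themselves."""
    return {"Quantity": len(values), "Items": list(values)}


def build_cors_response_headers_policy(
        name="MediaPackage-LL-HLS-ResponseHeader-policy"):
    """Assemble the CORS part of a response headers policy configuration."""
    return {
        "Name": name,
        "CorsConfig": {
            # Assumption: "true" in the console steps means "all origins/headers".
            "AccessControlAllowOrigins": quantity_items(["*"]),
            "AccessControlAllowHeaders": quantity_items(["*"]),
            "AccessControlAllowMethods": quantity_items(["GET", "HEAD", "OPTIONS"]),
            "AccessControlMaxAgeSec": 86400,
            # Assumptions, not taken from the walkthrough:
            "AccessControlAllowCredentials": False,
            "OriginOverride": True,
        },
    }


config = build_cors_response_headers_policy()
print(config["CorsConfig"]["AccessControlAllowMethods"])
```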

Now that you have created all three policies, you can create the CloudFront distribution. In the CloudFront console, select “Create distribution”.

Under CloudFront console select 'Create distribution'

Figure 23 Creating CloudFront distribution

Enter the “Origin domain” and “Origin path” from the MediaPackage endpoint you created earlier.

Please note that CloudFront will not automatically populate the origin domain name in the drop-down menu as it does for MediaPackage V1. You will need to manually enter the domain name and origin path in the respective fields.

Here you define one CloudFront distribution per MediaPackage channel group. Put the channel group name in the origin path, as it is common to all the channels that run under this distribution. The channel group is the top-level resource, which contains the channels and origin endpoints associated with it. All underlying channels and endpoints are then served by the same CloudFront distribution, which minimizes the number of distributions that need to be managed.

Configure Origin domain as endpoint from MediaPackage configuration. Select HTTPS only, define origin path as '/out/v1/MPv2-LL-HLS-Demo' and click save

Figure 24 Creating origin for CloudFront distribution

Next you need to define the behaviors and attach the policies that you created in the previous steps to the distribution. Navigate to “Cache key and origin requests” and select the three policies, then leave every other setting at its default value.

Under behaviors select the 3 policies created in previous steps and save

Figure 25 Selecting policies under behaviors in CloudFront console.

Save the configuration and the CloudFront distribution will be deployed in a few minutes. Check the playback using the CloudFront domain name. The playback URLs will follow the {CDN-hostname}/{channel-name}/{endpoint-name}/{manifest-name} pattern. In our example, it will be as follows:

https://CloudFront_domain_name/MP-LL-HLS-Demo-Channel/MP-LL-HLS-Demo-Endpoint/LL-HLS-demochannel.m3u8
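
The mapping between the MediaPackage playback URL and the CloudFront playback URL can be sketched in Python as follows. The function name and the CloudFront domain name are hypothetical; the origin path is the one configured on the distribution in this walkthrough.

```python
from urllib.parse import urlsplit


def cloudfront_playback_url(mediapackage_url, cloudfront_domain, origin_path):
    """Swap the MediaPackage host and origin path for the CloudFront domain."""
    path = urlsplit(mediapackage_url).path
    if not path.startswith(origin_path + "/"):
        raise ValueError("playback URL is not under the configured origin path")
    # CloudFront prepends the origin path itself, so viewers omit it.
    return f"https://{cloudfront_domain}{path[len(origin_path):]}"


mediapackage_url = (
    "https://xxxxxx.egress.yyyyyy.mediapackagev2.us-east-1.amazonaws.com"
    "/out/v1/MPv2-LL-HLS-Demo/MP-LL-HLS-Demo-Channel"
    "/MP-LL-HLS-Demo-Endpoint/LL-HLS-demochannel.m3u8"
)
print(cloudfront_playback_url(mediapackage_url,
                              "d111111abcdef8.cloudfront.net",  # hypothetical
                              "/out/v1/MPv2-LL-HLS-Demo"))
```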

Player support status and configuration recommendations

In the Apple ecosystem, you will be able to play LL-HLS streams reliably in compiled applications that leverage AVPlayer. This player doesn’t integrate a latency drift compensation mechanism that would skip in time or accelerate the playback rate when drift happens, so you will have to implement the mechanism of your choice. Once you’ve implemented it, the URL should play fine on macOS, iOS 17/iPadOS 17 and tvOS 17. It will also play on versions 14 to 16, but with less-optimized heuristics. Direct playback in Safari mobile is not a suitable option, as this browser still needs multiple improvements (bitrate up-switching, playhead positioning predictability, and drift compensation) to offer a good user experience with LL-HLS streams.

There is still an option to properly play LL-HLS streams on Safari mobile, and that is through the use of the hls.js player, which can leverage the new (as of iOS 17) Managed Media Source API. There are a couple of LL-HLS related improvements planned for hls.js v1.8 (especially the support for EXT-X-PRELOAD-HINT signaling parts not yet available), but this player has been stable and usable in production since version 1.4.2, on all browsers including Safari mobile. The recommended settings are the following:

{
  "debug": false,
  "enableWorker": false,
  "lowLatencyMode": true,
  "backBufferLength": 90,
  "maxLiveSyncPlaybackRate": 1.05,
  "liveDurationInfinity": true
}

The maxLiveSyncPlaybackRate parameter impacts both the speed at which the player catches up with the live edge in case of latency drift, and the audio pitch. It’s generally considered that listeners start to notice the audio pitch change beyond a 6% acceleration. The value of maxLiveSyncPlaybackRate is therefore a careful trade-off between how fast you want to reduce latency drift and how important it is to preserve the audio perception. A too-aggressive value will also be perceptible visually, when the player fast-forwards in the video timeline. Playback rate acceleration works well to compensate for small latency drifts, but it’s not a good approach if, for example, your player was in a suspended browser tab and suddenly wakes up. In this case you would probably want it to jump straight to the live edge, which can be achieved by adding the two following parameters to your hls.js configuration:

"liveSyncDurationCount": 0
"liveMaxLatencyDurationCount": 6

The first parameter tells the player to position the playhead at the live edge at the beginning of the playback session. The second one tells the player to jump to the live edge if the playhead is more than 6 full segment durations behind the live edge.
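
To get a feel for the trade-off behind maxLiveSyncPlaybackRate, the following Python sketch estimates how long a player needs to absorb a given latency drift. It is a simplified model, not the hls.js algorithm: it assumes that the player runs at the maximum rate for the whole catch-up period, so the result is a lower bound. The jump threshold calculation assumes 6-second full segments, as in the endpoint configuration of this post.

```python
def catch_up_seconds(drift_seconds, max_playback_rate):
    """Wall-clock time needed to absorb a drift at a constant playback rate."""
    if max_playback_rate <= 1.0:
        raise ValueError("the playback rate must be above 1.0 to catch up")
    # At rate r, the player gains (r - 1) seconds of media per second.
    return drift_seconds / (max_playback_rate - 1.0)


def jump_threshold_seconds(live_max_latency_duration_count,
                           segment_duration_seconds):
    """Distance from the live edge beyond which the player jumps forward."""
    return live_max_latency_duration_count * segment_duration_seconds


print(round(catch_up_seconds(1.0, 1.05), 1))  # about 20 s for 1 s of drift
print(jump_threshold_seconds(6, 6))           # 36 s with 6-second segments
```

With a 1.05 rate, even one second of drift takes around twenty seconds to recover, which shows why large drifts are better handled by a jump to the live edge.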

Regarding other open source players, ExoPlayer on Android also includes a production-grade implementation, which doesn’t require specific configuration to reach optimal latency. Shaka Player also has an implementation visible in its nightly player build, but at the time of writing this blog post, we haven’t achieved satisfying results with this player.

On the commercial players side, we successfully validated playback on THEOplayer and JW Player. Additional player partners are currently working on their implementations.

Latency results

Using the workflow described previously (Link > MediaLive > MediaPackage > CloudFront > Player), we tested two scenarios with different MediaLive segment and GOP durations: 1 second and 2 seconds. On the MediaPackage output, these result respectively in 1-second parts with a PART-HOLD-BACK value of 3 seconds, and 2-second parts with a PART-HOLD-BACK value of 6 seconds. The first scenario prioritizes latency, while the second one prioritizes encoding efficiency. Using the players that we consider production-ready, we obtained these results:

Parts duration	PART-HOLD-BACK	hls.js (v1.4.2)	AVPlayer (v16.5)	ExoPlayer (v2.18.6)	THEOplayer (v4.11)	JW Player (v8.27)
1 second	3 seconds	5.0s latency	5.95s latency	4.04s latency	4.70s latency	5.90s latency
2 seconds	6 seconds	8.0s latency	9.75s latency	5.32s latency	6.06s latency	9.45s latency

As a comparison, an alternative workflow where the ABR encoding is done on-premises (AWS Elemental Live > MediaPackage > CloudFront > Player) will result in a latency reduced by 400ms.

Conclusion

This blog post explained how to set up end-to-end LL-HLS workflows leveraging MediaLive, MediaPackage, and CloudFront. The walkthrough includes all the configuration parameters required for reducing latency and provides detail about commercial and open source video players – allowing you to reduce glass-to-glass latency to 5 seconds. Give it a try, it’s easy to configure! Low latency allows you to compete with broadcast latency and to create new user experiences not possible before. Stay tuned for further latency improvements and additional low latency standards support on the AWS for M&E Blog.
