AWS for M&E Blog
What is origin storage?
(or Why We Need AWS Elemental MediaStore)
Streaming live video online is hard. You may think completing the Kessel Run in less than 12 parsecs is hard, but that’s nothing compared to live streaming.
In the early days, back when the internet was young and users got online by calling a phone number and listening to the screeches and pings of a dial-up connection, online video was also hard, but for different reasons than it is today.
Back then, broadband connections didn’t exist, and any online live video had to be compressed to fit into the 56 Kbps (kilobits per second) bandwidth that dial-up connections provided. Video came in postage-stamp sizes with some resolutions barely breaking out of double digits.
With the rise of broadband and advances in video compression technology, by 2008, online video was a much bigger business. Protocols had moved from Real Media to Flash video with RTMP, average bandwidths had risen tenfold, and audience appetite had increased to the point where it became a challenge to meet demand for delivery of live online video.
Real Time Streaming Protocol (RTSP) and Real Time Messaging Protocol (RTMP) both stream by establishing and maintaining a persistent connection from server to client. This is great for low-latency video streaming, but requires specialized servers, and video served this way cannot be cached. This means as the size of the audience grows, the size of the streaming server pool has to be increased to cover the additional viewers. Content Delivery Networks were used to deal with this requirement, but distribution was still expensive and had a finite capacity.
The next advancement came with a switch to HTTP distribution (Hyper Text Transfer Protocol, the same protocol used for web content delivery to browsers) for live online video. This protocol had been used for on-demand video content for a while, by playing video as a progressive download. Some will remember the days of waiting for the grey line to progress enough before hitting play and hoping that the rest of the video would complete downloading before the red line caught up. The change also meant a switch to smaller chunks of video, and then Adaptive Bitrate (ABR) video. This enabled HTTP distribution of live video with the added benefit of providing a better-quality experience, with clients able to switch between bitrates as network conditions change. This means a viewer could avoid buffering video (the red line never catches up with the grey line) by dropping to a smaller bitrate stream (the grey line accelerates, if required).
Most HTTP video protocols follow the model of using a master playlist or manifest that points to a list of secondary playlists or manifests, which describe the available bitrates for a live stream. Each bitrate-specific playlist or manifest has a list of the video chunks that make up the live stream, and is constantly updated as new chunks are created. (There are also some HTTP protocols that use rule-based naming for video chunks to avoid having to constantly update playlists.)
Take Apple’s HTTP Live Streaming (HLS) protocol as an example. Each bitrate-specific manifest (an .m3u8 playlist) is updated every time a new video chunk for the live stream is created. Video chunks are uniquely named and can be cached to optimize efficiency for distribution to large audiences. The manifests can also be cached, but only for a few seconds (best practice is to cache for half the duration of a video chunk). This provides the benefit of caching through peaks of requests, while still ensuring that the latest version of a manifest is served shortly after it is updated.
Fig. 1: HTTP ABR streaming with HLS
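To make this concrete, here is a simplified, hypothetical example of the two playlist levels for an HLS live stream with four-second chunks; the bitrates, resolutions, and file names are illustrative only:

```
# Master playlist (master.m3u8) – lists the available bitrates
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=960x540
540p/index.m3u8

# Media playlist (720p/index.m3u8) – rewritten in place as each new chunk lands
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:271
#EXTINF:4.000,
segment_271.ts
#EXTINF:4.000,
segment_272.ts
#EXTINF:4.000,
segment_273.ts
```

Following the best practice above, a media playlist like this would be cached for roughly two seconds (half the chunk duration), while the uniquely named .ts chunks can be cached for much longer.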
HTTP distribution helped address the challenge of delivering live streams to growing audiences and with increased quality. By 2012, HD-quality live streams had raised bitrates to over 3 Mbps (megabits per second), and hundreds of thousands of viewers were watching live streams at the same time. HTTP distribution also enabled new features for live online video that had not been possible before. Video chunks for live streams could be stored instead of just broadcast and lost, which meant DVR-like functionality was now possible, as were pausing and rewinding a live stream, re-starting a live broadcast from the beginning of the program, and creating on-demand versions of live shows or events without needing another transcode pass. All of these features add value for audiences beyond what is possible for traditional broadcast. However, these new benefits brought with them new challenges by introducing a requirement for storage that did not exist before. This is where origin storage for live video streaming becomes essential.
When considering storage for live video streams, the cloud, with object storage like Amazon S3, is a very attractive solution. It provides elastic scale, durability, and security, with no up-front cost and pay-as-you-go pricing. For live streaming, you don’t want to run out of storage during an event, nor do you want to lose content. The new challenge then becomes meeting the performance demands of live streaming while maintaining all the other benefits of the cloud.
Fig. 1 shows the flow for HTTP ABR video streaming. The encoder in this example is writing out a chunk of video every four seconds for each bitrate in the ABR set. These chunks of video have to be available to read immediately after they have been written. The encoder is also updating the HLS manifest for each bitrate after each video chunk is created. The manifest must keep the same name but have updated content, so that the latest version is the one served for new requests. With six bitrates in the ABR set, that adds up to 60 writes (30 video chunks + 30 manifest updates), plus a varying volume of reads, for every 20 seconds of video. If a video chunk is not available, or an old version of a manifest is served, audiences of the live stream will see buffering or errors, and have a poor-quality experience as a result.
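As a rough illustration of the write pattern just described (a sketch, not any particular encoder's implementation), the loop below shows what the origin has to absorb every chunk interval: one new, uniquely named chunk per bitrate, plus one in-place rewrite of that bitrate's manifest. The directory layout, rendition names, and rolling-window length are assumptions for the example.

```python
from pathlib import Path

CHUNK_SECONDS = 4
RENDITIONS = ["1080p", "720p", "540p", "432p", "360p", "234p"]  # six bitrates, as in the example above

def write_interval(origin_root: Path, sequence: int, encoded_chunks: dict) -> None:
    """Push one chunk interval to the origin: a new chunk plus a manifest rewrite per bitrate."""
    for rendition in RENDITIONS:
        rendition_dir = origin_root / rendition
        rendition_dir.mkdir(parents=True, exist_ok=True)

        # 1) Write the new, uniquely named chunk; it must be readable the moment this returns.
        (rendition_dir / f"segment_{sequence}.ts").write_bytes(encoded_chunks[rendition])

        # 2) Rewrite the media playlist under the SAME name so players pick up the new chunk.
        #    Serving a stale copy here is what causes buffering or player errors.
        lines = ["#EXTM3U", "#EXT-X-VERSION:3",
                 f"#EXT-X-TARGETDURATION:{CHUNK_SECONDS}",
                 f"#EXT-X-MEDIA-SEQUENCE:{max(sequence - 2, 0)}"]
        for seq in range(max(sequence - 2, 0), sequence + 1):  # rolling window of up to 3 chunks
            lines += [f"#EXTINF:{CHUNK_SECONDS:.3f},", f"segment_{seq}.ts"]
        (rendition_dir / "index.m3u8").write_text("\n".join(lines) + "\n")

# With six renditions and 4-second chunks, 20 seconds of video means
# 5 intervals x 6 renditions x 2 files = 60 writes, matching the numbers above.
```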
This performance challenge is compounded by the desire to reduce the latency of live streams. With all the benefits of HTTP distribution, one downside compared to RTSP and RTMP is the increased end-to-end latency. It is not uncommon for live online video to be 30 to 40 seconds behind real time, as video chunks are cached at multiple stages of distribution and on the playback client, compared to around 5 seconds or less for RTMP. One way to address this is to reduce the duration of video chunks, which increases the rate at which video files are written and manifests are updated. And as audience numbers keep climbing, while the writes and updates are flying in, the origin also has to serve reads quickly and consistently to maintain a high-quality viewing experience.
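To see how chunk duration drives the write rate, here is a small back-of-the-envelope calculation in Python (the six-rendition ladder is carried over from the example above):

```python
def origin_writes_per_second(chunk_seconds: float, renditions: int) -> float:
    """Chunk writes plus manifest rewrites per second across the ABR ladder."""
    return renditions * 2 / chunk_seconds  # one chunk + one manifest update per rendition per chunk

print(origin_writes_per_second(chunk_seconds=4, renditions=6))  # 3.0 writes per second
print(origin_writes_per_second(chunk_seconds=1, renditions=6))  # 12.0 writes per second
```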
This is why a reliable, high-performance origin for live streaming is important. What you need is storage that can handle continuous, frequent, and fast writes alongside high volumes of continuous, frequent, and low-latency reads, with an absolute requirement that each object be available to read immediately after its write completes, and that updated versions be served immediately after each update completes. Easy, right (write)?
Object storage on its own struggles to meet these demands. Adding caches and faster file-based storage helps, but it also adds complexity and infrastructure to manage. What you really want is a service built to address the challenges of serving live online video.
This is where AWS Elemental MediaStore comes in. Simply put, it provides the performance, consistency, and low latency required to deliver live streaming video content. And, for anyone wanting to optimize for low end-to-end latency while retaining the scale of HTTP distribution, AWS Elemental MediaStore enables solutions that can be faster than broadcast TV.
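As a minimal sketch of what writing to a MediaStore origin can look like from Python with boto3 (the container name, region, paths, and cache settings here are assumptions for illustration, not a prescribed configuration):

```python
import boto3

# Control plane: look up the container's data endpoint (assumes a container named "live-origin" exists).
mediastore = boto3.client("mediastore", region_name="us-west-2")
endpoint = mediastore.describe_container(ContainerName="live-origin")["Container"]["Endpoint"]

# Data plane: write objects against the container endpoint.
data = boto3.client("mediastore-data", endpoint_url=endpoint, region_name="us-west-2")

# A video chunk is immutable and uniquely named, so it can be cached aggressively downstream.
with open("segment_271.ts", "rb") as chunk:
    data.put_object(
        Path="live/720p/segment_271.ts",
        Body=chunk,
        ContentType="video/MP2T",
        CacheControl="max-age=31536000",
    )

# The manifest keeps the same name but changes constantly, so cache it for
# roughly half the chunk duration (2 seconds for 4-second chunks).
with open("index.m3u8", "rb") as manifest:
    data.put_object(
        Path="live/720p/index.m3u8",
        Body=manifest,
        ContentType="application/x-mpegURL",
        CacheControl="max-age=2",
    )
```

In practice, an origin container like this sits behind a CDN, with the Cache-Control values above governing how long each type of object is cached at the edge.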
So, streaming live video online is hard, but we’ve made it easier by dealing with a lot of the challenges that make it so difficult.
Finally, looking ahead: since low-latency live streaming is a hot topic, we’ll be publishing a series of blog posts about how to compete with broadcast levels of latency using current HTTP/ABR streaming technologies.
Part one on how to define and measure latency is here.