Gain observability of live streaming workflows with AWS Elemental MediaLive and AWS Elemental MediaPackage

Amazon Web Services (AWS) offers a wide range of media services with which you can build live and on-demand video workflows. In this blog post, we discuss how users can gain observability by integrating scattered metrics and analyzing logs produced by AWS Media Services when configuring a live streaming workflow.

Use metrics from Amazon CloudWatch, which collects and visualizes near-real-time logs, metrics, and event data in automated dashboards, to configure an integrated dashboard for live streaming workflow monitoring.
Use Amazon CloudWatch Logs and AWS CloudTrail (which monitors and records account activity across AWS infrastructure) to analyze logs from AWS Elemental MediaLive (which encodes live video for broadcast and streaming to any device) and AWS Elemental MediaPackage (which prepares and protects video for internet delivery).
Set up an event-based alarm that can respond in near real time in the event of a live streaming workflow problem.

Solution overview

Live streaming consists of content ingest, processing, origination, and delivery. Each stage can be configured using AWS Media Services as follows:

Source (ingest). Streaming videos can be ingested in various ways, such as URL_PULL, RTMP_PUSH, RTMP_PULL, RTP_PUSH, AWS Elemental MediaConnect (secure, reliable live video transport), and AWS Elemental Link devices (which connect a live video source, like a camera or video production equipment, to MediaLive).
MediaLive (processing). This broadcast-level video processing service compresses live sources into high-quality streams by encoding videos in near real time.
MediaPackage (origination). This origination service ingests streams from MediaLive and packages them in near real time using just-in-time packaging and provides a content encryption feature.
Amazon CloudFront (delivery). This content delivery network service delivers content at low latency and high transfer speeds.

Generic Live Streaming Architecture using AWS Elemental services

Over time, users might need to monitor the status of each stage of the media workflow through metrics and logs, and to receive alarms when necessary. However, because logs and metrics are scattered across services, finding the right log to diagnose an issue often takes too long in an emergency. To resolve this situation, use the following steps to configure a monitoring system that integrates the metrics provided by each service and sends event-based alarms.

Final architecture, which you can configure by following the steps below

Step summary

Step one: configure an Amazon CloudWatch dashboard with metrics.
Step two: use Amazon CloudWatch Logs.
Step three: use AWS CloudTrail logs.
Step four: enhance monitoring with Amazon CloudWatch Events, which delivers a near-real-time stream of system events.

Prerequisites

To follow the instructions in this blog post, you need to prepare the following:

AWS Account
Media workload

*If you don’t have your own workload, see the next section.

Configure with an AWS CloudFormation template

1. Click CloudFormation Template Link and click
2. Enter the stack name and click Next twice without any additional settings.
3. Checkmark the box at the bottom of the page (related to identity and access management) and click
4. Go to the Events tab to see resources that are created or being created.

*This template automatically configures MediaLive and MediaPackage resources

CloudFormation Resources generated after deploying CloudFormation template

5. Search MediaLive in the console search box and access the MediaLive console to verify that the following channel is configured.

MediaLive Channel configured after deploying CloudFormation

6. Click the channel name and then click Start on the upper right and check to confirm that pipelines are running.

MediaLive channel with duplicated pipelines

7. Access the MediaPackage console and click the channel ID (my-mediapackage-channel).

8. Click the Preview button and verify the streaming video.

MediaPackage channel created via CloudFormation. Click 'Preview' for review.

Congratulations! You have completed the prerequisites for the lab.

Step one: Configure an Amazon CloudWatch dashboard with metrics

Amazon CloudWatch collects raw data received from media services, converts it into readable metrics in near real time, and keeps them for 15 months. Users can monitor critical media service metrics, such as network input/output, video frame rate, and lost frames, and configure dashboards to gain visibility. Use the following steps to configure the dashboard comprising key metrics of MediaLive and MediaPackage.

Go to the AWS Management Console → Amazon CloudWatch and click
Click Create dashboard and enter LiveStreamingDashboard as the dashboard name.
After the dashboard is created, click + in the upper right and select a widget type.

Configure MediaLive metrics

The metrics used to configure the MediaLive dashboard are as follows:

1. Configure the MediaLive Channel section (Widget type: Text).
Enter the text that will serve as a section header in the dashboard, as shown in the example below, and click Create widget.

Create a text widget and enter your statement

2. Network input/output indicators for Pipeline0 and Pipeline1, respectively (Widget type: Line → Metrics → MediaLive → ChannelId, Pipeline)
NetworkIn: The rate of push and pull traffic received by MediaLive. If you first establish the average rate over a long period and then switch to a short period, you can spot deviations from the normal rate or gauge how bursty a channel is.
NetworkOut: The rate at which traffic exits MediaLive. It includes media output, HTTP GET requests, and Network Time Protocol (NTP) and Domain Name System (DNS) traffic.

We want to distinguish the indicators for each pipeline in MediaLive. Therefore, first select NetworkIn, NetworkOut for Pipeline0 and click Create widget.

First select NetworkIn, NetworkOut for Pipeline0

Then, create indicators for Pipeline1 in the same way. Consequently, you will see the following screen.

*Change the configuration of the dashboard using drag and drop.

Rename the widget.

3. InputVideoFrameRate (select both Pipeline0, Pipeline1)
(Widget type: Line → Metrics → MediaLive → ActiveInputFailoverLabel, ChannelId, Pipeline)
Monitor the input frame rate by pipeline. If the input frame rate is not stable, investigate whether there is a problem with the source or with the network between the upstream system and MediaLive.

4. ActiveOutputs (select both Pipeline0, Pipeline1)
(Widget type: Bar → Metrics → MediaLive → ChannelId, OutputGroupName, Pipeline)
Number of outputs successfully transmitted to the destination. The absence of a data point means that no output audio has been generated on the channel, which might still be starting or waiting for initial input.

Configure the contents and click the Save button in the upper right to save the dashboard. Alternatively, toggle Autosave: Off to Autosave: On to avoid loss in case of an emergency.

Thus, you can configure the following dashboards. When you create a MediaLive Channel with standard class, two encoder pipelines (Pipeline0, Pipeline1) are created. Monitoring metrics such as network input/output for each pipeline provides better visibility and insight.

A dashboard you can configure by following the steps above.
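The console steps above can also be expressed as a dashboard body in JSON. The following is a minimal sketch, not the method used in this post: it assumes the AWS/MediaLive namespace with the ChannelId and Pipeline dimensions named in the widget paths above, and the channel ID and Region are hypothetical placeholders. It only builds and prints the body; the publishing call is left commented out because it needs boto3 and AWS credentials.

```python
import json

NAMESPACE = "AWS/MediaLive"


def metric_widget(title, metrics, x, y, region, view="timeSeries"):
    """Return one CloudWatch dashboard metric widget as a dict."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": 12, "height": 6,
        "properties": {
            "title": title,
            "metrics": metrics,
            "view": view,  # "timeSeries" draws a line graph
            "stat": "Average",
            "period": 60,
            "region": region,
        },
    }


def medialive_dashboard_body(channel_id, region):
    """Build a text header plus one NetworkIn/NetworkOut widget per pipeline."""
    widgets = [{
        "type": "text", "x": 0, "y": 0, "width": 24, "height": 1,
        "properties": {"markdown": "# MediaLive Channel"},
    }]
    for pipeline in ("0", "1"):
        dims = ["ChannelId", channel_id, "Pipeline", pipeline]
        widgets.append(metric_widget(
            title=f"Network In/Out (Pipeline{pipeline})",
            metrics=[[NAMESPACE, "NetworkIn", *dims],
                     [NAMESPACE, "NetworkOut", *dims]],
            x=12 * int(pipeline), y=1, region=region,
        ))
    return json.dumps({"widgets": widgets}, indent=2)


# Hypothetical channel ID and Region; replace with your own values.
body = medialive_dashboard_body("1234567", "us-east-1")
print(body)

# Publishing requires boto3 and AWS credentials:
# import boto3
# boto3.client("cloudwatch").put_dashboard(
#     DashboardName="LiveStreamingDashboard", DashboardBody=body)
```

Keeping the two pipelines in separate widgets mirrors the layout built in the console, so a problem on one pipeline stands out next to the other.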

Configure MediaPackage metrics

The metrics used to configure the MediaPackage dashboard are as follows:

1. Configure the MediaPackage section (Widget type: Text).
Enter the text that will serve as a section header in the dashboard, as shown in the following example, and click Create widget.

Create a text widget for AWS Elemental MediaPackage and enter your statement

2. IngressBytes (select two channels) (Widget type: Line → Metrics → MediaPackage → Ingest Endpoint per Channel)
The number of bytes of content that MediaPackage receives for input requests.

3. IngressResponseTime (select two channels) (Widget type: Line → Metrics → MediaPackage → Ingest Endpoint per Channel)
The amount of time it takes for MediaPackage to process each input request. Data is not provided if the request is not received for the specified interval.

Select IngressResponseTime Widget (Widget type: Line → Metrics → MediaPackage → Ingest Endpoint per Channel)

4. EgressRequestCount (Widget type: Number → MediaPackage → Per Origin Endpoint)
Number of content requests that MediaPackage receives.

5. EgressBytes (Widget type: Number → MediaPackage → Per Origin Endpoint)
Number of bytes sent successfully by MediaPackage for each request.

6. EgressResponseTime (Widget type: Number → MediaPackage → Per Origin Endpoint)
Time taken for MediaPackage to process each output request.

Configure the contents and click the Save button in the upper right to save the dashboard. Alternatively, toggle Autosave: Off to Autosave: On to avoid loss in case of an emergency.

When you configure MediaPackage metrics, you can configure the following dashboards along with existing MediaLive metrics.

*The following configuration might vary from user to user, and the indicators might look different depending on the state of the network.

A dashboard you can configure with AWS Elemental MediaLive and AWS Elemental MediaPackage.
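The same MediaPackage metrics can be read outside the console. The sketch below builds the query list for the CloudWatch GetMetricData API under a few assumptions: the namespace is AWS/MediaPackage, ingress metrics are keyed by a Channel dimension, and egress metrics by Channel and OriginEndpoint. The endpoint ID and the chosen statistics are hypothetical, so check the dimension names against the metric browser in your account before relying on them.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "AWS/MediaPackage"


def metric_query(query_id, metric_name, dimensions, stat):
    """Return one MetricDataQueries entry for get_metric_data."""
    return {
        "Id": query_id,  # must start with a lowercase letter
        "MetricStat": {
            "Metric": {
                "Namespace": NAMESPACE,
                "MetricName": metric_name,
                "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
            },
            "Period": 60,
            "Stat": stat,
        },
        "ReturnData": True,
    }


def mediapackage_queries(channel_id, endpoint_id):
    """Queries for the five metrics placed on the dashboard above."""
    ingest = {"Channel": channel_id}
    egress = {"Channel": channel_id, "OriginEndpoint": endpoint_id}
    return [
        metric_query("ingress_bytes", "IngressBytes", ingest, "Sum"),
        metric_query("ingress_time", "IngressResponseTime", ingest, "Average"),
        metric_query("egress_requests", "EgressRequestCount", egress, "Sum"),
        metric_query("egress_bytes", "EgressBytes", egress, "Sum"),
        metric_query("egress_time", "EgressResponseTime", egress, "Average"),
    ]


def fetch_last_minutes(queries, minutes=15):
    """Fetch recent values; requires boto3 and AWS credentials."""
    import boto3
    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=queries,
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
    )
    return {result["Id"]: result["Values"] for result in response["MetricDataResults"]}


# "my-endpoint" is a hypothetical origin endpoint ID.
queries = mediapackage_queries("my-mediapackage-channel", "my-endpoint")
print([q["Id"] for q in queries])
```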

By configuring dashboards, users can monitor the status of MediaLive and MediaPackage by channel, either by pipeline or by endpoint. Consequently, one can detect segment-specific issues and anomalies and make decisions quickly.
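As a rough illustration of the anomaly check described for NetworkIn (compare short-period samples against a long-period average), the following sketch flags samples that stray too far from the baseline. The 20 percent tolerance and the sample values are hypothetical; a suitable threshold depends on how bursty your channel normally is.

```python
def find_deviations(long_period_average, short_period_samples, tolerance=0.2):
    """Return (index, value) pairs whose relative distance from the baseline exceeds tolerance."""
    if long_period_average <= 0:
        raise ValueError("long_period_average must be positive")
    flagged = []
    for index, value in enumerate(short_period_samples):
        relative_change = abs(value - long_period_average) / long_period_average
        if relative_change > tolerance:
            flagged.append((index, value))
    return flagged


# Hypothetical values in Mbps: a long-run average and five recent one-minute samples.
baseline_mbps = 5.0
recent_mbps = [5.1, 4.9, 0.5, 5.0, 7.5]
print(find_deviations(baseline_mbps, recent_mbps))
```

Here the third sample (a drop) and the fifth sample (a burst) are flagged, while small fluctuations around the baseline are ignored.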

The architecture after configuring the dashboard.

Step two: Use Amazon CloudWatch Logs

MediaLive and MediaPackage provide useful logs in different ways.

MediaLive logs

MediaLive produces a channel log that contains detailed information about activity and provides sequential descriptions of the activity. Logs are useful when the indicators you see on the dashboard do not provide sufficient information. In addition to the as-run logs provided by default, you can activate encoder logs to receive rich logs. Make sure the channel is idle before changing the log level.

Choose Channel → If the state is Start, change it to Stop → Edit Channel → General settings → Change Log level of Channel logging. (In this lab, change it to INFO.)

After activating logs, you can access them by clicking Amazon CloudWatch → Logs → Log groups. You can also access them directly from the MediaLive console. Logs might not appear immediately after you change the log level; to see them quickly, stop the channel and then start it again.

You can access the MediaLive pipeline logs directly from the console.

You can view logs configured in JSON format in sequence. Examples include (1) probing input media, (2) found master manifest (check the path to M3U8 file), and (3) activity logs by pipeline and events at specific points in time.

Log examples (1) probing input media, (2) found master manifest (check the path to M3U8 file), and (3) activity logs by pipeline and events at specific points in time.

MediaPackage logs

MediaPackage provides access logging that captures detailed information about requests sent to channels or packaging groups. The log contains information such as the time the request was received, the client’s IP address, the delay time, the request path, and the server response. Access logging is optional, so you need to enable it explicitly. Choose MediaPackage Channels → Edit → Enable Ingress/Egress access logs → Update. After it’s updated, go to the Settings tab to make sure logs are enabled.

After activating logs, you can check ingress/egress logs by clicking on each log group name. You can also query them by choosing CloudWatch → Logs Insights → Select log groups → /aws/MediaPackage/EgressAccessLogs. Click the log stream for the time range you want and look up the details for each timestamp, as shown in the following.

*Logs take time to appear after you enable them, so wait about 5 minutes before verifying.

Example of MediaPackage logs
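For repeated checks, a Logs Insights query can be prepared in code. This is a sketch under assumptions: the log group name is the one shown above, and statusCode is assumed to be the field that holds the server response in the egress access log, so confirm the field name against an actual log entry. The snippet only builds the parameters for the CloudWatch Logs StartQuery API; the calls themselves are commented out because they need boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone

EGRESS_LOG_GROUP = "/aws/MediaPackage/EgressAccessLogs"

# Assumption: "statusCode" is the access log field for the server response.
ERROR_QUERY = """\
fields @timestamp, @message
| filter statusCode >= 400
| sort @timestamp desc
| limit 20"""


def egress_error_query(now, minutes=15):
    """Build start_query parameters covering the last `minutes` minutes."""
    start = now - timedelta(minutes=minutes)
    return {
        "logGroupName": EGRESS_LOG_GROUP,
        "startTime": int(start.timestamp()),  # epoch seconds
        "endTime": int(now.timestamp()),
        "queryString": ERROR_QUERY,
    }


params = egress_error_query(datetime.now(timezone.utc))
print(params["queryString"])

# Running the query requires boto3 and AWS credentials:
# import boto3
# logs = boto3.client("logs")
# query_id = logs.start_query(**params)["queryId"]
# results = logs.get_query_results(queryId=query_id)  # poll until status is "Complete"
```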

Step three: Use AWS CloudTrail logs

AWS CloudTrail records the actions that a user or role with credentials performs in a service. You can simply go to AWS CloudTrail → Event History to find logs of the latest events. Using the information gathered in AWS CloudTrail, you can check who made requests to MediaLive and MediaPackage and when those requests were made.

Use event history

Users can monitor event history in detail by filtering by event source or username. For example, to see the detailed history of MediaPackage, set Lookup attributes to Event source and enter mediapackage.amazonaws.com to filter the history.

In AWS CloudTrail, you can access event history to monitor events in detail.

For further information, click the entry in the Event name column to view the detailed log.
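The same event source filter can be applied through the CloudTrail LookupEvents API. The sketch below builds the request arguments and summarizes a response into (time, user, event name) rows. The sample events are hypothetical stand-ins for what the API returns, and the actual call is left commented out because it needs boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def lookup_kwargs(event_source, now, hours=24):
    """Arguments for cloudtrail.lookup_events filtered by event source."""
    return {
        "LookupAttributes": [
            {"AttributeKey": "EventSource", "AttributeValue": event_source},
        ],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "MaxResults": 50,
    }


def summarize(events):
    """Return (time, user, event name) rows sorted by time."""
    rows = [
        (event["EventTime"].isoformat(), event.get("Username", "unknown"), event["EventName"])
        for event in events
    ]
    return sorted(rows)


# Hypothetical events shaped like entries in the "Events" list of a response.
sample_events = [
    {"EventTime": datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc),
     "Username": "alice", "EventName": "UpdateChannel"},
    {"EventTime": datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
     "Username": "bob", "EventName": "DescribeChannel"},
]
for row in summarize(sample_events):
    print(row)

# Calling the API requires boto3 and AWS credentials:
# import boto3
# kwargs = lookup_kwargs("mediapackage.amazonaws.com", datetime.now(timezone.utc))
# events = boto3.client("cloudtrail").lookup_events(**kwargs)["Events"]
```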

Use AWS CloudTrail Lake

AWS CloudTrail Lake is a fully managed data lake for storing and analyzing AWS CloudTrail logs, so that you can collect, store, optimize, and query events from multiple regions and accounts. Previously, users sent AWS CloudTrail logs to Amazon Simple Storage Service (Amazon S3) (object storage built to retrieve any amount of data from anywhere) and analyzed them through Amazon Athena, a serverless, interactive analytics service. But AWS CloudTrail Lake lets them analyze logs in an integrated environment.

1. Go to AWS Management Console → CloudTrail → Lake and click Create event data store.
2. Enter a name and click Create.
3. Confirm that the event data store you created appears in the list.
4. Click Run query.

Click Run query to use AWS CloudTrail Lake.

5. Write and run SQL queries.

6. See example queries on the Sample queries tab if needed.

Query example

[Situation]

The user has recently noticed that a MediaLive channel that had been in the Stop state has changed to the Start state. The user would like to check who, among the 50 team members, started the channel.

[Action]

Enter the code in Query1 as shown below and click Run.

For ##Event data store ID##, enter your unique ID. (Check your ID location as shown in the following picture.)

SELECT eventID, eventName, eventSource, eventTime, userIdentity.arn AS user
FROM ##Event data store ID##
WHERE eventName = 'StartChannel'

The user will thus get the desired information, including the time of the event and who started the channel.
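If you run this investigation often, the query text can be generated rather than edited by hand. The following sketch fills in the event data store ID and adds two optional refinements that are not part of the query above: a filter on the MediaLive event source and ordering by newest event first. The ID check is a simple guard that assumes IDs contain only letters, digits, and hyphens, and the example ID is a made-up placeholder.

```python
import re


def start_channel_query(event_data_store_id):
    """Return a CloudTrail Lake query that lists who started MediaLive channels."""
    # Guard against pasting anything other than a plain ID into the FROM clause.
    if not re.fullmatch(r"[A-Za-z0-9-]+", event_data_store_id):
        raise ValueError("unexpected characters in event data store ID")
    return (
        "SELECT eventID, eventName, eventSource, eventTime, userIdentity.arn AS user "
        f"FROM {event_data_store_id} "
        "WHERE eventName = 'StartChannel' "
        "AND eventSource = 'medialive.amazonaws.com' "
        "ORDER BY eventTime DESC"
    )


# Hypothetical ID; use the ID of your own event data store.
query = start_channel_query("11111111-2222-3333-4444-555555555555")
print(query)

# Running the query requires boto3 and AWS credentials:
# import boto3
# cloudtrail = boto3.client("cloudtrail")
# query_id = cloudtrail.start_query(QueryStatement=query)["QueryId"]
# rows = cloudtrail.get_query_results(QueryId=query_id)["QueryResultRows"]
```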

Query example using the code above.

Step four: Enhance monitoring with Amazon CloudWatch Events

By integrating with Amazon CloudWatch Events, you can receive alerts for specific events that affect channels and endpoints. This lets users learn of issues in near real time and respond quickly. Here, we’ll describe how to send alarms through Amazon Simple Notification Service (Amazon SNS), a fully managed publish/subscribe service for application-to-application and application-to-person messaging.

Create subscription

Go to AWS Management Console → Amazon Simple Notification Service (Amazon SNS) → Topics and click Create topic.
Choose Standard for Type, enter a name in the Name field, and click Create topic.
Click Create subscription after the topic is created.
Select Email from the drop-down list in the Protocol field, and enter the actual email address for the endpoint.
Click Create subscription.
Check the inbox of the email address you entered and click Confirm subscription.

Create Amazon CloudWatch Events

1. Go to AWS Management Console → CloudWatch → Events → Rules. (This directs you to Amazon EventBridge, a serverless service that uses events to connect application components together.)
2. Click Create rule.

3. Enter “Name” and click

4. Scroll down to check the event pattern.

5. Choose MediaLive for AWS service and choose MediaLive Channel State Change for event type.

*In addition to the example above, MediaLive and MediaPackage offer the following event types:

MediaLive

MediaLive Channel State Change: alarm for pipeline start/stop.

MediaLive Channel Alert: alarm when network data is not received.

MediaLive Multiplex State Change: alarm for pipeline start/stop for multiplex.

MediaLive Multiplex Alert: alarm when data such as User Datagram Protocol input is not received.

MediaLive Channel Input Change: alarm when input is switched and a change occurs.

MediaPackage

MediaPackage Input Notification: alarm when maximum input stream is exceeded or when there is an ingest issue such as input switch.

MediaPackage Key Provider Notification: alarm if the endpoint is using content encryption and cannot reach the key provider.

6. Choose SNS topic in the Select a target drop-down list and choose a topic that you created.

7. Leave the other settings as they are; click Next and then click Create.

This rule sends an email alarm whenever the MediaLive channel state changes (Start, Stop).
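The rule configured above corresponds to a small event pattern. The sketch below shows that pattern, a deliberately simplified local matcher for sanity-checking it against a sample event, and a function that would create the rule and SNS target with boto3. The rule name, target ID, and sample event are hypothetical. Note that when a rule is created through the API rather than the console, the SNS topic's access policy typically has to allow events.amazonaws.com to publish to it.

```python
import json

RULE_NAME = "medialive-channel-state-change"  # hypothetical rule name

EVENT_PATTERN = {
    "source": ["aws.medialive"],
    "detail-type": ["MediaLive Channel State Change"],
}


def matches(pattern, event):
    """Simplified matcher: each pattern key must list the event's top-level value."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


def create_alarm_rule(topic_arn):
    """Create the rule and point it at the SNS topic; requires boto3 and AWS credentials."""
    import boto3
    events = boto3.client("events")
    events.put_rule(Name=RULE_NAME, EventPattern=json.dumps(EVENT_PATTERN), State="ENABLED")
    events.put_targets(Rule=RULE_NAME, Targets=[{"Id": "sns-email", "Arn": topic_arn}])


# Hypothetical event using the detail fields that appear in the notification email.
sample_event = {
    "source": "aws.medialive",
    "detail-type": "MediaLive Channel State Change",
    "detail": {"pipelines_running_count": 0, "state": "STOPPED"},
}
print(matches(EVENT_PATTERN, sample_event))
```

To cover other event types from the list above, such as MediaLive Channel Alert, add their names to the detail-type list.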

Example

1. Go to AWS Management Console → MediaLive and click Channels.
2. Find your channel and click Stop.

3. Log in to your email and click the message named “AWS Notification Message.”

4. You can see notification details such as "pipelines_running_count":0,"state":"STOPPED".
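If the notification is consumed by a script instead of read in an inbox, the same fields can be extracted from the message body. This is a minimal sketch that assumes the message is the event JSON with the state and pipelines_running_count fields under detail; the sample message is a hypothetical, trimmed-down event.

```python
import json


def describe_state_change(message):
    """Summarize a MediaLive channel state change notification."""
    event = json.loads(message)
    detail = event["detail"]
    state = detail["state"]
    running = detail["pipelines_running_count"]
    return f"Channel state is {state} with {running} pipeline(s) running"


# Hypothetical message containing only the fields used above.
sample_message = json.dumps({
    "detail-type": "MediaLive Channel State Change",
    "detail": {"pipelines_running_count": 0, "state": "STOPPED"},
})
print(describe_state_change(sample_message))
```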

Monitoring system configuration diagram

The following diagram shows the monitoring system built with Amazon CloudWatch metrics, logs, and event-based alarms. As illustrated:

Use the near-real-time indicators of MediaLive and MediaPackage to configure an integrated dashboard.
Analyze resource health using Amazon CloudWatch Logs and Amazon CloudWatch Logs Insights.
Monitor AWS CloudTrail Event history to track activity history, and use the AWS CloudTrail Lake query features for near-real-time investigation.
Configure event-based alarms with Amazon CloudWatch Events and Amazon SNS.

Final architecture

Clean up

Resources used in this blog post should be deleted to prevent unnecessary billing in the future. If you deployed the AWS CloudFormation template, deleting everything takes several steps.

1. Go to the MediaLive channel and click Stop to change its state to idle. (If the channel is running, the stack deletion fails with DELETE_FAILED.)
2. Go to AWS CloudFormation, click the stack you installed, and then click Delete. On the Resources tab, verify that all resources have been deleted.

3. Delete the AWS CloudTrail Lake event data store from the Actions menu. Termination protection must be turned off first: under Change Termination Protection, click Disable before deleting.

4. Delete the Amazon CloudWatch resources (the dashboard and the event rule) and the Amazon SNS topic that you created.
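The cleanup order matters, so it can help to write it down as an explicit plan. The sketch below lists the corresponding boto3 operations in the order of the steps above and runs them only when asked to. All resource names passed in are hypothetical, and the runner assumes boto3, AWS credentials, and the MediaLive channel_stopped waiter.

```python
def cleanup_plan(channel_id, stack_name, event_data_store, rule_name,
                 target_id, dashboard_name, topic_arn):
    """Return (service, operation, kwargs) tuples in the order they should run."""
    return [
        ("medialive", "stop_channel", {"ChannelId": channel_id}),
        ("cloudformation", "delete_stack", {"StackName": stack_name}),
        ("cloudtrail", "update_event_data_store",
         {"EventDataStore": event_data_store, "TerminationProtectionEnabled": False}),
        ("cloudtrail", "delete_event_data_store", {"EventDataStore": event_data_store}),
        ("events", "remove_targets", {"Rule": rule_name, "Ids": [target_id]}),
        ("events", "delete_rule", {"Name": rule_name}),
        ("cloudwatch", "delete_dashboards", {"DashboardNames": [dashboard_name]}),
        ("sns", "delete_topic", {"TopicArn": topic_arn}),
    ]


def run_plan(plan):
    """Execute the plan; requires boto3 and AWS credentials."""
    import boto3
    for service, operation, kwargs in plan:
        client = boto3.client(service)
        getattr(client, operation)(**kwargs)
        if operation == "stop_channel":
            # Wait until the channel is idle so the stack deletion does not fail.
            client.get_waiter("channel_stopped").wait(ChannelId=kwargs["ChannelId"])


# Hypothetical identifiers; replace every value with your own before running.
plan = cleanup_plan(
    channel_id="1234567",
    stack_name="my-media-stack",
    event_data_store="11111111-2222-3333-4444-555555555555",
    rule_name="medialive-channel-state-change",
    target_id="sns-email",
    dashboard_name="LiveStreamingDashboard",
    topic_arn="arn:aws:sns:us-east-1:111122223333:my-topic",
)
for service, operation, _ in plan:
    print(f"{service}.{operation}")
```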

Summary

This blog post introduced a way for you to gain observability of MediaLive-based and MediaPackage-based live streaming workflows and monitor your environment by using metrics, logs, and alarm features. You don’t need to access MediaLive and MediaPackage consoles separately to check the limited metrics and aggregate additional information from scattered logs. By building an integrated monitoring dashboard with Amazon CloudWatch metrics and event-based alarms, you can enhance operational reliability. In addition, AWS CloudTrail log records and AWS CloudTrail Lake search capabilities make it simple to obtain sensitive information, such as account-specific activity or the cause of a specific problem. As a result, you can gain observability of live streaming workflows without buying a solution or spending a lot of time.

AWS for M&E Blog