Live content moderation using machine learning
Managing live channels requires humans to monitor streams for unexpected content. It is particularly important to detect harmful content quickly and take appropriate measures for moderation. This guide describes the steps you can take to implement an automatic solution to extract, store, analyze, and sort images from your live channel using cloud services from Amazon Web Services (AWS) to help maintain a video stream that is clear of offensive content. If content deemed harmful is identified, it is possible to stop a live channel. For reference, and if you need to check that your stream contains the expected content, read the Automate broadcast video monitoring using machine learning on AWS blog post.
For this example, we use the following Amazon Web Services (AWS) services:
- Amazon Simple Storage Service (Amazon S3), object storage built to retrieve any amount of data from anywhere, which stores extracted images
- Amazon Simple Notification Service (Amazon SNS), a fully managed messaging service for both application-to-application and application-to-person communication, which notifies stream owners if potentially harmful content has been detected
- AWS Elemental MediaLive, a broadcast-grade live video processing service, which streams the live channel and extracts images
- Amazon Rekognition, a solution to automate image and video analysis with machine learning, which analyzes images and returns moderated content criteria scores
- AWS Lambda, a serverless and event-driven compute service, which initiates image analysis, sorts the image based on analysis results, and stops the live channel if content is inappropriate
AWS Elemental MediaLive automatically extracts an image from the video stream and uploads it to Amazon S3 every 30 seconds. An AWS Lambda function then asks Amazon Rekognition to analyze the image. Amazon Rekognition uses a built-in model to detect moderated content and returns a confidence score for each criterion to the AWS Lambda function, which makes a decision based on that output. If moderated content is detected, the AWS Lambda function stops the AWS Elemental MediaLive channel, uploads the image to a location dedicated to harmful content, and sends a notification through Amazon SNS. If no offensive content is identified, the AWS Lambda function uploads the image to an Amazon S3 bucket folder dedicated to harmless content and leaves the AWS Elemental MediaLive channel unchanged.
The upload step is required so that a moderator can review the image later.
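The decision step in this workflow can be sketched in a few lines of Python. This is a minimal illustration rather than the function used in this guide: the 80 percent threshold and the `offensive/` and `approved/` prefixes are assumptions, and the input mirrors the `ModerationLabels` list that Amazon Rekognition returns for moderation requests.

```python
from typing import Any, Dict, List

# Hypothetical threshold and folder prefixes; tune them for your own channel.
MIN_CONFIDENCE = 80.0
OFFENSIVE_PREFIX = "offensive/"
APPROVED_PREFIX = "approved/"


def flagged_labels(response: Dict[str, Any],
                   min_confidence: float = MIN_CONFIDENCE) -> List[str]:
    """Return the names of moderation labels at or above the threshold."""
    return [
        label["Name"]
        for label in response.get("ModerationLabels", [])
        if label.get("Confidence", 0.0) >= min_confidence
    ]


def decide(response: Dict[str, Any], image_name: str,
           min_confidence: float = MIN_CONFIDENCE) -> Dict[str, Any]:
    """Map an analysis result to the actions described in the workflow."""
    labels = flagged_labels(response, min_confidence)
    harmful = bool(labels)
    prefix = OFFENSIVE_PREFIX if harmful else APPROVED_PREFIX
    return {
        "destination_key": prefix + image_name,  # where the frame is stored
        "stop_channel": harmful,                 # stop the live channel
        "notify": harmful,                       # publish to Amazon SNS
        "labels": labels,
    }
```

A frame with no label above the threshold is routed to the approved prefix and leaves the channel untouched; any label at or above the threshold triggers all three actions.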
The AWS Lambda function uses the Python 3.8 runtime.
To complete this how-to guide, you need access to the following:
- A Linux system to run Shell and Python commands
- AWS Elemental MediaLive to stream content and extract images
- Amazon S3 to store the images from AWS Elemental MediaLive
- AWS Lambda to initiate Amazon Rekognition analysis and make decisions
- Amazon Rekognition to analyze images
- Amazon SNS to be notified when moderated content is detected
- AWS Identity and Access Management (AWS IAM), which provides fine-grained access control across all of AWS
Because Amazon S3 content needs to be in an Amazon Rekognition Region, check that Amazon Rekognition is available in your Amazon S3 bucket Region or export frames to an AWS Region supporting Amazon Rekognition. Please check Amazon Rekognition availability here.
Amazon S3 is used to store the images extracted from the video live stream, upload the images for review, and potentially store a replacement video.
The upload to Amazon S3 is used to initiate the AWS Lambda function to run on the latest frame sent to the source Amazon S3 bucket.
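When an upload initiates the function, Amazon S3 passes an event that identifies the new object. The sketch below shows one way to read the bucket and key from that event; the helper name `uploaded_objects` is ours, and the sample event in the usage note is trimmed to the fields the helper reads.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(event: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for every object-created record."""
    objects = []
    for record in event.get("Records", []):
        # Skip anything that is not an upload (for example, a deletion).
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

For example, a record with bucket name `my-movies-bucket` and key `captures/asset+1.jpg` yields `("my-movies-bucket", "captures/asset 1.jpg")`.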
Your channel can use a single pipeline or a standard pipeline. A single pipeline requires one destination (an Amazon S3 bucket) for a frame capture group. A standard pipeline offers resiliency through dual output, so AWS Elemental MediaLive needs two destinations and you must create at least two Amazon S3 buckets, for example “my-movies-bucket” and “my-bucket2.” A standard pipeline also affects costs because it doubles the amount of stored data.
This example is based on single pipeline.
An Amazon SNS topic is required so that a message containing the event details (picture name and channel ID) can be published each time moderated content is identified.
To create an Amazon SNS topic, follow these instructions:
On the AWS console, go to the Amazon SNS service and create a topic.
Once you have created a topic, note its Amazon Resource Name (ARN).
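As an illustration of the event details mentioned above, the following sketch assembles a notification carrying the picture name and channel ID. The subject line, the JSON layout, and the helper names are assumptions; `sns_client` stands for any object with a `publish` method, such as `boto3.client("sns")`.

```python
import json
from typing import Any, Dict, List


def build_notification(topic_arn: str, picture_name: str, channel_id: str,
                       labels: List[str]) -> Dict[str, str]:
    """Build the keyword arguments for an Amazon SNS publish call."""
    return {
        "TopicArn": topic_arn,  # the ARN noted when the topic was created
        "Subject": "Moderated content detected on channel " + channel_id,
        "Message": json.dumps({
            "picture": picture_name,
            "channelId": channel_id,
            "labels": labels,
        }),
    }


def notify(sns_client: Any, notification: Dict[str, str]) -> Any:
    """Send the notification through the supplied SNS client."""
    return sns_client.publish(**notification)
```

Keeping the message as JSON lets subscribers such as email filters or other functions parse the picture name and channel ID without string matching.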
AWS Identity and Access Management (IAM)
The AWS Lambda function requires permission to implement the following actions:
- s3:GetObject on your input buckets
- s3:PutObject to sort the images in an output bucket after analysis
- medialive:StopChannel to stop the channel
- AWSLambdaBasicExecutionRole (AWS managed policy) to upload function logs to Amazon CloudWatch, a monitoring and observability service for AWS resources and applications on AWS and on-premises
- sns:Publish to notify the channel’s administration team
An execution role (named “lambda-reko-content-detection” in this example) needs to be created for the AWS Lambda function to access the services. This role is restricted to AWS Lambda through a trusted relationship. More details on AWS IAM roles can be found in the AWS IAM documentation.
- Open the AWS console, browse to the AWS IAM service, and then create a new policy.
- Adapt the following AWS IAM policy example by replacing the AWS Elemental MediaLive channel, Amazon S3 buckets, and Amazon SNS topic ARN and save it.
- Create a new role. Select the AWS Lambda service as a trusted entity.
- Attach the permission policy created previously.
- Give a name to this role and save it.
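The permission and trust policies for this role might look like the sketch below, which builds both documents in Python and prints them as JSON. The bucket names, channel ARN, and topic ARN are placeholders. The sketch also grants `rekognition:DetectModerationLabels`, which the list above does not mention but which we assume the function needs in order to request image analysis. The AWSLambdaBasicExecutionRole managed policy is attached to the role separately.

```python
import json
from typing import Any, Dict


def build_permission_policy(input_bucket: str, output_bucket: str,
                            channel_arn: str, topic_arn: str) -> Dict[str, Any]:
    """Least-privilege policy for the content detection function."""
    def allow(sid: str, action: str, resource: str) -> Dict[str, str]:
        return {"Sid": sid, "Effect": "Allow",
                "Action": action, "Resource": resource}

    return {
        "Version": "2012-10-17",
        "Statement": [
            allow("ReadCapturedFrames", "s3:GetObject",
                  f"arn:aws:s3:::{input_bucket}/*"),
            allow("WriteSortedFrames", "s3:PutObject",
                  f"arn:aws:s3:::{output_bucket}/*"),
            # Assumption: needed for the analysis call, not listed above.
            allow("AnalyzeFrames", "rekognition:DetectModerationLabels", "*"),
            allow("StopChannel", "medialive:StopChannel", channel_arn),
            allow("NotifyOwners", "sns:Publish", topic_arn),
        ],
    }


def build_trust_policy() -> Dict[str, Any]:
    """Trust relationship that restricts the role to AWS Lambda."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


if __name__ == "__main__":
    policy = build_permission_policy(
        "my-movies-bucket", "my-sorted-frames-bucket",
        "arn:aws:medialive:us-east-2:123456789012:channel:1234567",
        "arn:aws:sns:us-east-2:123456789012:moderation-alerts",
    )
    print(json.dumps(policy, indent=2))
    print(json.dumps(build_trust_policy(), indent=2))
```

The printed JSON can be pasted into the policy editor of the AWS IAM console after the placeholder values are replaced.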
AWS Elemental MediaLive
In the AWS Elemental MediaLive console, edit your AWS Elemental MediaLive channel and add an output group through the following steps:
- Select Frame capture.
- Configure its destination to your Amazon S3 buckets (it is mandatory to set a file name, “asset.m3u8” in this example) and then configure content delivery network settings to “Frame capture S3.”
- If your channel has a standard pipeline, set two destinations for the frame capture group. The second destination isn’t used in this example.
- Configure the frame capture output to set resolution and capture frequency.
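If you configure the channel through the API rather than the console, the resolution and capture frequency belong to the video description attached to the frame capture output. The sketch below builds such a fragment; the field names reflect our reading of the AWS Elemental MediaLive API and should be verified against the API reference, and the description name is a placeholder.

```python
from typing import Any, Dict


def frame_capture_video_description(name: str, width: int, height: int,
                                    interval_seconds: int) -> Dict[str, Any]:
    """Build a video description that captures one frame per interval."""
    if interval_seconds < 1:
        raise ValueError("interval_seconds must be at least 1")
    return {
        "Name": name,
        "Width": width,      # capture resolution, in pixels
        "Height": height,
        "CodecSettings": {
            "FrameCaptureSettings": {
                "CaptureInterval": interval_seconds,
                "CaptureIntervalUnits": "SECONDS",
            }
        },
    }


# Matches the example in this guide: a 1920x1080 frame every 30 seconds.
description = frame_capture_video_description("frame_capture", 1920, 1080, 30)
```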
Initiate content analysis through Amazon Rekognition and move the captured frame to the correct destination (approved or offensive):
- First, create your AWS Lambda function in the same AWS Region as Amazon S3 (and as Amazon Rekognition). You can use arm64 architecture, which is less expensive because the function does not require x86.
- Attach the execution role created earlier.
The following settings are sufficient for your function.
In Designer, click Add trigger and configure your Amazon S3 bucket and directory where capture images will be uploaded by AWS Elemental MediaLive.
Copy the following code into your AWS Lambda source code and set the DestinationBucket (line 7), DestinationDirectory (line 8), DestinationBucketRegion (line 9), ChannelId (line 11), ChannelRegion (line 12), and SNSTopicArn (line 13).
- Save your AWS Lambda function.
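For orientation, here is a minimal sketch of what such a function can look like. It is not the listing referenced above, so the line numbers mentioned there do not apply to it. The setting names come from this guide, all values are placeholders, and `MinConfidence` is an extra setting we assume. The frame is sorted with a server-side copy, which is one possible way to implement the upload step.

```python
import json
from urllib.parse import unquote_plus

# Settings named in this guide; every value below is a placeholder.
DestinationBucket = "my-sorted-frames-bucket"
DestinationDirectory = "frames"
DestinationBucketRegion = "us-east-2"
ChannelId = "1234567"
ChannelRegion = "us-east-2"
SNSTopicArn = "arn:aws:sns:us-east-2:123456789012:moderation-alerts"
MinConfidence = 80.0  # assumption: not a setting from the original listing


def process_frame(bucket, key, rekognition, s3, medialive, sns):
    """Analyze one captured frame, sort it, and react if it is flagged."""
    response = rekognition.detect_moderation_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=MinConfidence,
    )
    labels = [label["Name"] for label in response["ModerationLabels"]]
    verdict = "offensive" if labels else "approved"

    # Copy the frame into the folder matching the verdict for later review.
    file_name = key.rsplit("/", 1)[-1]
    s3.copy_object(
        Bucket=DestinationBucket,
        Key=f"{DestinationDirectory}/{verdict}/{file_name}",
        CopySource={"Bucket": bucket, "Key": key},
    )

    if labels:
        medialive.stop_channel(ChannelId=ChannelId)
        sns.publish(
            TopicArn=SNSTopicArn,
            Subject="Moderated content detected",
            Message=json.dumps(
                {"picture": key, "channelId": ChannelId, "labels": labels}
            ),
        )
    return verdict


def lambda_handler(event, context):
    """Entry point for the Amazon S3 trigger."""
    import boto3  # bundled with the AWS Lambda Python runtimes

    rekognition = boto3.client("rekognition")
    s3 = boto3.client("s3", region_name=DestinationBucketRegion)
    medialive = boto3.client("medialive", region_name=ChannelRegion)
    # The SNS client must target the Region encoded in the topic ARN.
    sns = boto3.client("sns", region_name=SNSTopicArn.split(":")[3])

    verdicts = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys are URL-encoded
        verdicts.append(
            process_frame(bucket, key, rekognition, s3, medialive, sns)
        )
    return {"verdicts": verdicts}
```

Passing the service clients into `process_frame` keeps the decision logic testable with stand-in objects, without AWS credentials.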
We use FFmpeg to generate a Real-time Transport Protocol (RTP) stream as an input to AWS Elemental MediaLive. You may use any RTP stream as a source.
IMPORTANT LEGAL NOTICE: Before you start, check that you are familiar with the terms of the FFmpeg license and legal considerations as listed here. The FFmpeg static build used in this demo is licensed under the third version of the GNU General Public License as mentioned here.
- Send the stream to your AWS Elemental MediaLive endpoint.
- In order to simulate a piece of offensive content, upload an image containing moderated content into the Amazon S3 source bucket.
- The AWS Lambda function is initiated, and a decision is taken to stop the live channel.
Notes on services
In this example, the AWS Elemental MediaLive channel uploads an image to Amazon S3 every 30 seconds, leading to 120 invocations per hour, hence 2,880 invocations per day.
You can set a shorter time period for faster detection of moderated content; however, this will increase the number of calls and affect the cost.
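To see how the capture interval drives the number of calls, the arithmetic can be written as a short helper. The function names are ours, and a month is taken as 31 days, as in the rest of this guide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def invocations_per_day(capture_interval_seconds: int) -> int:
    """Number of frames, and therefore invocations, in one day."""
    if capture_interval_seconds < 1:
        raise ValueError("capture interval must be at least 1 second")
    return SECONDS_PER_DAY // capture_interval_seconds


def invocations_per_month(capture_interval_seconds: int, days: int = 31) -> int:
    """Number of invocations in a month of the given length."""
    return invocations_per_day(capture_interval_seconds) * days


for interval in (30, 10, 1):
    print(interval, invocations_per_day(interval), invocations_per_month(interval))
```

Moving from a 30-second to a 1-second interval multiplies the number of invocations, and the per-image analysis charges, by 30.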
The longest runtime occurs when moderated content is identified, because our AWS Lambda function then performs an extra step: stopping the AWS Elemental MediaLive channel.
The following pricing is based on AWS US East (Ohio) Region prices. If you plan to use other AWS Regions, please read service pricing in your target AWS Regions.
Both x86 and arm64 architecture gave the same runtime.
The average billed time during tests was 1.3 seconds.
The maximum billed time was 2.5 seconds.
The AWS Lambda free tier includes one million free requests per month and 400,000 GB-seconds of compute time per month.
Because our AWS Lambda function is run every 30 seconds, in a month of 31 days, it runs 89,280 times.
As per our tests, mean runtime is 1.3 seconds, and memory consumption is 128 MB.
This leads to a total of 116,064 seconds per month (89,280 invocations × 1.3 seconds).
Because we use 128 MB of memory, the consumption of gigabyte-seconds per month is equal to 116,064 seconds × 128 MB / 1024. In other words, it is equal to 14,508 GB-seconds per month.
If you are not using the free tier, the AWS Lambda cost is 14,508 GB-seconds × $0.0000133334/GB-seconds. In other words, it will be equal to $0.193440967.
Because the number of invocations per month is lower than one million, requests cost at most an extra $0.20 per month when the AWS Lambda free tier is excluded. This leads to a maximum of approximately $0.40 ($0.193440967 + $0.20) per month for a 30-second frequency.
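The AWS Lambda estimate above can be reproduced with the following sketch. The prices and measurements are the ones quoted in this guide, the helper names are ours, and `Decimal` is used to avoid floating-point rounding. The $0.20 request charge is treated as a ceiling, as in the text.

```python
from decimal import Decimal

PRICE_PER_GB_SECOND = Decimal("0.0000133334")  # price quoted in this guide
REQUEST_COST_CEILING = Decimal("0.20")         # price of one million requests


def lambda_gb_seconds(invocations: int, mean_runtime_seconds: str,
                      memory_mb: int) -> Decimal:
    """Monthly compute consumption in GB-seconds."""
    total_seconds = Decimal(invocations) * Decimal(mean_runtime_seconds)
    return total_seconds * Decimal(memory_mb) / Decimal(1024)


def lambda_monthly_cost_ceiling(invocations: int, mean_runtime_seconds: str,
                                memory_mb: int) -> Decimal:
    """Upper bound on the monthly cost, free tier excluded."""
    gb_seconds = lambda_gb_seconds(invocations, mean_runtime_seconds, memory_mb)
    return gb_seconds * PRICE_PER_GB_SECOND + REQUEST_COST_CEILING


gb_seconds = lambda_gb_seconds(89_280, "1.3", 128)
ceiling = lambda_monthly_cost_ceiling(89_280, "1.3", 128)
print(gb_seconds, ceiling)
```

With the figures from our tests, the consumption is 14,508 GB-seconds and the ceiling stays just under $0.40 per month.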
More details on AWS Lambda pricing are available here.
The first million images per month are billed at $0.001 per image. With 2,880 images per day, or 89,280 images in a 31-day month, the maximum billed amount is $89.28 per month.
More details on Amazon Rekognition pricing are available here.
Because Amazon S3 and AWS Lambda are in the same AWS Region, there is no data transfer charge, as per this documentation.
Storage depends heavily on capture resolution and content. With the 1920×1080 resolution, content, and compression used in our tests, AWS Elemental MediaLive never output a file larger than 250 KB.
On a base of 89,280 images per month, the total storage amount is equal to 21,796.875 MB (89,280 × 250 KB / 1024).
Because we push captured frames to one destination and the first 50 TB are charged at $0.023 per GB-month, the storage cost is $0.501328125 (21.796875 GB × $0.023, counting 1,000 MB per GB).
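The storage estimate can be checked with the short calculation below. The helper names are ours. To match the figures in this guide, kilobytes are converted to megabytes by dividing by 1,024 and megabytes to gigabytes by dividing by 1,000; this is an observation about the arithmetic, not a statement of how Amazon S3 meters storage.

```python
from decimal import Decimal

PRICE_PER_GB_MONTH = Decimal("0.023")  # first 50 TB, as quoted in this guide


def monthly_storage_mb(images: int, max_image_kb: int) -> Decimal:
    """Worst-case volume stored in one month, in megabytes."""
    return Decimal(images) * max_image_kb / 1024


def monthly_storage_cost(images: int, max_image_kb: int) -> Decimal:
    """Worst-case storage cost for one month of captured frames."""
    gigabytes = monthly_storage_mb(images, max_image_kb) / 1000
    return gigabytes * PRICE_PER_GB_MONTH


print(monthly_storage_mb(89_280, 250))
print(monthly_storage_cost(89_280, 250))
```

This covers a single month of frames stored in one destination; frames kept from earlier months add to the bill until they are moved or deleted.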
As long as AWS Elemental MediaLive keeps uploading frame captures to Amazon S3, the stored volume, and therefore the storage cost, will keep growing. You may consider moving the images to a different storage class or deleting them. You can automate the transition to a different storage class or automate file deletion through lifecycle policies, as documented here.
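A lifecycle policy for captured frames could be sketched as follows. The rule ID, the prefix, the 30-day transition to S3 Standard-IA, and the 90-day expiration are all assumptions to adapt to your own review process.

```python
from typing import Any, Dict


def frame_lifecycle_configuration(prefix: str, transition_days: int = 30,
                                  expiration_days: int = 90) -> Dict[str, Any]:
    """Build a lifecycle configuration for objects under the given prefix."""
    if expiration_days <= transition_days:
        raise ValueError("expiration must come after the transition")
    return {
        "Rules": [{
            "ID": "archive-then-delete-captured-frames",  # hypothetical name
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            # Move frames to a cheaper storage class first ...
            "Transitions": [
                {"Days": transition_days, "StorageClass": "STANDARD_IA"}
            ],
            # ... then delete them once they are no longer needed.
            "Expiration": {"Days": expiration_days},
        }]
    }


config = frame_lifecycle_configuration("frames/approved/")
```

The resulting dictionary can be applied with the boto3 call `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)` on an Amazon S3 client, or recreated as a rule in the console.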
More details on Amazon S3 pricing are available here.
In the worst-case scenario, we stop the AWS Elemental MediaLive channel for every AWS Lambda invocation, hence a maximum of 89,280 application programming interface (API) calls per month.
Service is charged at $0.30 per one million API calls and $0.017 per gigabyte of payload data.
Because our message is only a few characters long, our function won’t send more than a few megabytes per month.
More details on Amazon SNS pricing are available here.
The overall cost for an image analysis every 30 seconds can be estimated at less than $91.01 per month per live channel.
The following table details monthly costs based on frame captures every 30 seconds, 10 seconds, and 1 second.
| Capture frequency | AWS Lambda compute cost | Amazon Rekognition cost | AWS Lambda request cost | Amazon S3 cost | Amazon SNS cost | Total monthly cost |
|---|---|---|---|---|---|---|
| Every 30 seconds (2,880 per day) | <$0.20 | $89.28 | $0.20 | $0.50 | $0.317 | <$90.5 |
| Every 10 seconds (8,640 per day) | <$0.60 | $267.84 | $0.20 | $1.50 | $0.317 | <$270.5 |
| Every 1 second (86,400 per day) | <$6 | $2,678.4 | $0.20 | $15.04 | $0.317 | <$2,700 |
In this post, we created a workflow that detects moderated content in a live channel, stores the captured frames for further analysis, informs our media administration team when moderated content is detected, and stops the live channel. Visit this page to learn about multi-modal content moderation capabilities in AWS AI solutions and services, try all services free of cost, and get in contact with our team to brainstorm ways to protect your users, brands, and online communities with accuracy and at a lower cost.
You can now deploy this workflow in your AWS account and start moderating your live channels.
Learn more about content moderation on the AWS Media Blog, including a post about how to Automate broadcast video monitoring using machine learning on AWS and how to Use machine learning to filter user-generated content and protect your brand.