AWS Machine Learning Blog
Video analytics in the cloud and at the edge with AWS DeepLens and Kinesis Video Streams
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.
Yesterday we announced the integration of AWS DeepLens with Amazon Kinesis Video Streams, allowing you to easily and securely stream a video feed from AWS DeepLens to Amazon Kinesis Video Streams for analytics, machine learning and other processing.
To help you understand the solution that integrates AWS DeepLens and Kinesis Video Streams, we’ll recap the concept of inference in machine learning (ML). Inference is the process of using a trained machine learning model to make predictions on new data samples. In IoT deployment scenarios with a connected device, such as a camera, we can build and train the models in the cloud and then deploy them to the device. Performing inference locally on the device avoids the round-trip latency of sending device data to the cloud, and it lets the device act on the inference results right away. With a camera, for example, we might want to stream video only when an object is detected. We can then perform more sophisticated analysis of that video by using a service like Amazon Rekognition Video, or use the time-indexed video stored in Kinesis Video Streams for model training.
We love dogs at Amazon, so we thought we’d build a dog monitoring project, to help keep an eye on our furry friends, perhaps to see how often they are eating, or how many times they jump on the sofa when they are home alone! When AWS DeepLens detects a dog, it will send the next 60 seconds of video to Kinesis Video Streams, long enough to get a decent report of what the woofers are up to.
In this blog post, we will set up AWS DeepLens and deploy the object detection model that detects dogs. When a dog is detected, AWS DeepLens will stream the video directly to Amazon Kinesis Video Streams. We will then index the dog detection events in Amazon OpenSearch Service using AWS IoT and Amazon Kinesis Data Firehose. This allows us to search the videos where dogs are detected and then play them in the Kinesis Video Streams console.
We use the new Amazon Kinesis Video Streams for the AWS DeepLens Video Library – a Python module that encapsulates the Kinesis Video Streams Producer SDK for AWS DeepLens. We can use this module to send video feeds from an AWS DeepLens device to Kinesis Video Streams, and to control when to start and stop video streaming from the device.
Amazon Kinesis Video Streams makes it easy to securely stream video from millions of connected devices to AWS for real-time machine learning, storage, and batch-oriented processing and analytics. The time-indexed video from AWS DeepLens can be stored durably for as long as you want, and you don’t need to manage any infrastructure.
We are going to build the following architecture using these AWS services:
- AWS DeepLens (capture video)
- Amazon Kinesis Video Streams (stream, store, and play back video)
- AWS IoT (capture detected metadata)
- Amazon Kinesis Data Firehose (stream dog detection events from multiple cameras to Amazon OpenSearch Service and Amazon S3)
- Amazon OpenSearch Service (querying)
We’ll use the following steps to implement the architecture:
- Set up AWS DeepLens and deploy the object detection model that detects dogs. When a dog is detected it streams video directly to Amazon Kinesis Video Streams and publishes events to AWS IoT.
- Set up Amazon OpenSearch Service, Amazon Kinesis Data Firehose, and an AWS IoT rule. Also create an index for storing incoming events with appropriate mapping and index patterns for visualization. After the index is created, enable the IoT rule.
- Visualize detected dog events across the cameras using Kibana Dashboards.
- Play the video that is stored and indexed in Kinesis Video Streams.
Step 1: Set up AWS DeepLens and deploy the object detection model.
We can use any device with the Kinesis Video Producer SDK or a device enabled by AWS Greengrass to capture video and stream it to Amazon Kinesis Video Streams for processing. Here we are using the new AWS DeepLens-KinesisVideo APIs. We are going to develop our code using AWS Greengrass, which runs on the IoT device (the AWS DeepLens camera in our case). The AWS Greengrass core can run AWS Lambda functions that are deployed to it. For more information, see the AWS Greengrass Getting Started Guide.
AWS DeepLens automates most of these steps, so we’ll focus in this blog post on developing an AWS Lambda function that streams h264 data (video data) from AWS DeepLens to Amazon Kinesis Video Streams when a particular object (“dog”) is found in the local inference.
Develop an AWS Lambda function to start the camera, capture the video, and send it to Kinesis Video Streams
AWS DeepLens runs the Ubuntu operating system (OS). AWS DeepLens is preloaded with AWS Greengrass, which lets you run local compute, messaging, data caching, sync, and ML inference capabilities in a secure way. We’ll use the AWS Lambda blueprint greengrass-hello-world and implement the function to stream video to Kinesis Video Streams.
Open the AWS Lambda console and select the deeplens-object-detection function:
We’ll select the Python 2.7 runtime and update the deeplens-object-detection function code.
For the Lambda function, follow these steps in the AWS Lambda console:
- Import appropriate modules.
- Create a Kinesis video stream.
- Apply logic that detects the “dog” object.
- Start the camera and stream the data to Kinesis Video Streams.
- Publish and deploy the function on AWS DeepLens.
Import appropriate modules
The AWS DeepLens device comes preinstalled with the library DeepLens_Kinesis_Video. Use this library with the Kinesis Video API to handle the video data that is captured. We also added the following libraries that are required to run the function:
Create a Kinesis Video Stream, apply logic that detects a “dog” object, publish MQTT with timestamp, and start the camera to stream the data to Kinesis Video Streams
As mentioned earlier, we modified the existing deeplens-object-detection function and published a new version. We’ll create a new Kinesis video stream named “deeplens-dogstream”. (You don’t need to add the “deeplens” prefix yourself because the DeepLens_Kinesis_Video library adds it.) When a dog object is identified, we’ll stream 60 seconds of video to Kinesis Video Streams. You can change the code to identify other objects, but for this blog post we use a dog. Each time a dog is detected, we also publish the event and its timestamp over MQTT using AWS IoT, which gives us a video log of when the detections occurred.
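The detect-then-stream logic described above can be sketched roughly as follows. This is a minimal illustration, not the function deployed in this post: the `ClipStreamer` class, the shape of the `detections` list, and the 0.5 threshold are assumptions made for the example. The only thing it expects from the stream object is a `start()` and a `stop()` method.

```python
import time

STREAM_SECONDS = 60    # length of each clip, as described in this post
TARGET_LABEL = "dog"   # object class that triggers streaming
THRESHOLD = 0.5        # hypothetical confidence cut-off


def dog_detected(detections, label=TARGET_LABEL, threshold=THRESHOLD):
    """Return True if any detection matches the label at or above the threshold.

    `detections` is assumed to be a list of dicts such as
    {"label": "dog", "prob": 0.83}; the real parsed model output may differ.
    """
    return any(d["label"] == label and d["prob"] >= threshold for d in detections)


class ClipStreamer(object):
    """Starts a stream on detection and stops it after a fixed duration."""

    def __init__(self, stream, duration=STREAM_SECONDS, clock=time.time):
        self.stream = stream      # any object exposing start() and stop()
        self.duration = duration
        self.clock = clock
        self.stop_at = None       # None means "not streaming"

    def update(self, detections):
        """Call once per inference result; returns True while streaming."""
        now = self.clock()
        if self.stop_at is None and dog_detected(detections):
            # First sighting: start streaming and remember when to stop.
            self.stream.start()
            self.stop_at = now + self.duration
        elif self.stop_at is not None and now >= self.stop_at:
            # The clip has run for its full duration: stop streaming.
            self.stream.stop()
            self.stop_at = None
        return self.stop_at is not None
```

On the device, the `stream` argument would be the stream object obtained from the DeepLens_Kinesis_Video library, and `update()` would be called inside the inference loop. Keeping the timing logic separate from the camera and the library makes it possible to exercise it on a laptop with a stand-in stream object.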
Then make sure you use the Existing role: AWSDeepLensLambdaRole. Review the AWS DeepLens Setup guide if this is the first time that you are using it.
Save, publish and deploy the function on AWS DeepLens
In the AWS DeepLens console, on the Object-detection project page, choose Action and select Publish New Version.
Deploy the function on your AWS DeepLens
To deploy the function on your AWS DeepLens device, we’ll use the AWS DeepLens console. We’ll use the existing Object-Detection model and edit the project.
Choose Edit Project.
Edit the function. We’ll use the updated function version we published earlier.
Make sure that you set the Memory size of the function to 3 GB and save the project.
Remember that every time you register a device, AWS DeepLens creates a unique MQTT topic.
We configured our function to publish inference output to an MQTT topic. This lets you see the model output on the AWS IoT console, or enable other Lambda functions to receive the output and take actions on it. Learn more.
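To make the MQTT output concrete, here is a small sketch of how a detection event could be turned into a topic name and a JSON payload. The helper names and the payload fields (`label`, `prob`, `stream`, `timestamp`) are hypothetical and chosen for illustration; the topic layout follows the `$aws/things/.../infer` pattern used for the AWS IoT rule later in this post.

```python
import json
import time


def infer_topic(thing_name):
    """Build the per-device inference topic for a given thing name."""
    return "$aws/things/{}/infer".format(thing_name)


def detection_event(label, prob, stream_name, timestamp=None):
    """Serialize one detection as a JSON string suitable for an MQTT payload.

    The field names are illustrative; use whatever your index mapping expects.
    """
    seconds = time.time() if timestamp is None else timestamp
    event = {
        "label": label,
        "prob": round(float(prob), 4),
        "stream": stream_name,
        # Epoch milliseconds, so the event can be matched to the video timeline.
        "timestamp": int(seconds * 1000),
    }
    return json.dumps(event, sort_keys=True)
```

Inside the Lambda function on the device, a payload like this would typically be sent with the Greengrass SDK, for example `greengrasssdk.client('iot-data').publish(topic=infer_topic(thing_name), payload=detection_event(...))`.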
And the last step is to save the project.
After the project is saved, we can deploy it on the AWS DeepLens device.
Upon successful deployment of the project, we can view the messages and the video on the Kinesis Video console.
In this example I used an image of my all-time favorite dog, Bono. I use this image to trigger AWS DeepLens to send 60 seconds of video to Kinesis Video Streams.
Now we go to the AWS IoT console and subscribe to the topic ID in the Inference output action.
We should see the MQTT message triggered by the dog detection. Next, we go to the Kinesis Video Streams console, select the us-east-1 Region that we set in our Lambda function, and find the stream deeplens-dogstream.
We choose deeplens-dogstream, and in the Video preview section we can see the live video ingested from AWS DeepLens.
Step 2: Set up Amazon OpenSearch Service, Amazon Kinesis Data Firehose, an IoT rule, and create an index
We’ll launch the AWS CloudFormation template – Videolog-FS-ES.yaml. This launches Amazon OpenSearch Service and Amazon Kinesis Data Firehose. All events published to Amazon Kinesis Data Firehose will be indexed on the Amazon OpenSearch Service domain. The template will ask for the following input parameters:
- HTTPIP: this is the IP address from which the Amazon OpenSearch Service domain can be accessed. The CloudFormation template restricts access to this specific IP address, but you can further secure the domain by adding a resource-based, identity-based, or IP-based access policy. You can also authenticate Kibana users by using Amazon Cognito.
- SQL: this is the SQL statement used by the AWS IoT rule. We will write something like
SELECT * FROM '$aws/things/[YOUR-DEEPLENS-THING-ID]/infer', where [YOUR-DEEPLENS-THING-ID] is replaced by your device’s thing ID. This means that we are selecting all attributes of every message coming to the topic.
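If you manage several cameras, the SQL parameter can be generated per device instead of typed by hand. The helper below is a hypothetical convenience, not part of the CloudFormation template; it only fills the thing ID into the statement shown above.

```python
def iot_rule_sql(thing_id):
    """Return the rule query that selects every attribute from a device's topic."""
    topic = "$aws/things/{}/infer".format(thing_id)
    # The topic is wrapped in single quotes, as AWS IoT SQL expects.
    return "SELECT * FROM '{}'".format(topic)
```

For example, `iot_rule_sql("deeplens_1234")` produces the statement for a device whose thing ID is `deeplens_1234` (a made-up ID).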
After the template is launched, it provides the domain endpoint in the output section. We will create the index videolog by using the following command:
After the index is created, create an index pattern in Kibana for visualizations. Log in to Kibana (the Kibana endpoint is available as one of the template’s output values). Then, from the Management tab, add an index pattern on top of the videolog index, using timestamp as the time filter field.
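As a rough sketch of what such an index-creation request could look like, the snippet below builds the HTTP method, URL, headers, and JSON body. The field names in the mapping are assumptions that mirror a simple detection event, and the typeless `mappings` layout is the form accepted by Elasticsearch 7.x and OpenSearch; older Elasticsearch versions expect a type name under `mappings`.

```python
import json

INDEX_NAME = "videolog"


def index_body():
    """Mapping for detection events; the field names are illustrative assumptions."""
    return {
        "mappings": {
            "properties": {
                "timestamp": {"type": "date", "format": "epoch_millis"},
                "label": {"type": "keyword"},
                "prob": {"type": "float"},
                "stream": {"type": "keyword"},
            }
        }
    }


def create_index_request(endpoint, index=INDEX_NAME):
    """Return (method, url, headers, body) for the index-creation call.

    `endpoint` is the domain endpoint host name from the template output,
    without the https:// scheme.
    """
    url = "https://{}/{}".format(endpoint.rstrip("/"), index)
    headers = {"Content-Type": "application/json"}
    return "PUT", url, headers, json.dumps(index_body())
```

The resulting request can be sent with any HTTP client from the IP address allowed by the HTTPIP parameter.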
Now that everything is configured, we can enable the IoT rule by using the following command.
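One way to enable the rule programmatically is through the AWS SDK for Python. The sketch below assumes you pass in a boto3 AWS IoT client and the name of the rule created by the template; the wrapper function itself is hypothetical and exists only to keep the example easy to test.

```python
def enable_rule(iot_client, rule_name):
    """Enable an existing AWS IoT topic rule and return its name.

    `iot_client` is expected to be a boto3 IoT client, for example
    boto3.client("iot", region_name="us-east-1").
    """
    iot_client.enable_topic_rule(ruleName=rule_name)
    return rule_name
```

The AWS CLI offers the same operation as `aws iot enable-topic-rule --rule-name` followed by your rule name.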
Events will start to flow into Amazon OpenSearch Service.
Step 3: Visualize detected dog events across cameras using Kibana Dashboards
As data flows in, we can start building visualizations using Kibana. One simple visualization, shown here, tracks the presence of a dog object in Roy’s office over the last 3 days:
Step 4: Play the video that is stored and indexed in Kinesis Video Streams
As we visualize data using Kibana or query the domain directly, we may want to view the captured video for a particular timestamp. The Amazon Kinesis Video Streams console allows us to view stored videos for a specific period as follows:
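To find the right period for a given event, it helps to convert the event timestamp into a start and end time. The helper below is a small sketch that assumes events carry an epoch-milliseconds timestamp and that each clip lasts 60 seconds, as in this post; the function name is made up for the example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # naive datetime, interpreted as UTC


def clip_window(event_ms, duration_s=60):
    """Return (start, end) UTC datetimes for the clip that follows an event."""
    start = EPOCH + timedelta(milliseconds=event_ms)
    return start, start + timedelta(seconds=duration_s)
```

The two values can then be used as the start and end of the period to view. Both are in UTC, so convert them if the console displays local time.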
This blog post shows you how AWS DeepLens and Amazon Kinesis Video Streams can be used as a DIY monitor camera, how to run a machine learning inference model locally, and how to build a searchable video index so that you can use it for logging and monitoring.
If you have any questions, please use the comments after this post.
For more general information, take a look at the AWS DeepLens website or browse AWS DeepLens posts on the AWS Machine Learning blog.
About the Authors
Roy Ben-Alta is a Solutions Architect and Principal Business Development Manager at AWS. Roy leads strategic and technical business development initiatives for machine learning (ML) and data analytics, focusing on real-time video and data streaming analytics and ML. He helps customers build ML solutions, and works with multiple AWS organizations including product, marketing, sales, and support, to ensure customer success in their AI journeys.
Nehal Mehta is a Sr. Data Architect for AWS Professional Services. As part of the Professional Services team, he collaborates with sales, pre-sales, support, and product teams to enable partners and customers to benefit from big data analytics workloads, especially stream analytics workloads. He likes to spend time with friends and family, and has interests in technical singularity, politics, and, surprisingly, finance.