The Internet of Things on AWS – Official Blog

Creating Object Recognition with Espressif ESP32

By combining low-cost embedded devices like the Espressif ESP32 family with the breadth of AWS services, you can create an advanced object recognition system.

The ESP32 microcontroller is a highly integrated solution for Wi-Fi and Bluetooth IoT applications that needs only around 20 external components. In this example, we use the AI Thinker ESP32-CAM variant, which comes with an OV2640 camera module. The OV2640 is a low-voltage CMOS image sensor that provides the full functionality of a single-chip UXGA (1632×1232) camera and image processor in a small-footprint package.

As a development platform, we use PlatformIO: a cross-platform, cross-architecture, multi-framework, professional tool for embedded systems engineers and for software developers who write applications for embedded products. We use it here to program the ESP32-CAM microcontroller.

The following diagram shows the connection between an ESP32-CAM and AWS IoT Core. The connection allows publishing and subscribing to MQTT topics, which means the device can send arbitrary information to AWS IoT Core and receive commands back.

Solution overview

This post walks you through building a complete object recognition solution: first deploying a serverless project on AWS that handles communication between the cloud and the ESP32-CAM device, and then setting up the AWS IoT Device SDK for Embedded C inside a PlatformIO project.

The services we will be using are AWS IoT Core, AWS Lambda, Amazon S3, Amazon Rekognition, and Amazon CloudWatch Logs.

Required equipment:

  • AI Thinker ESP32-CAM
  • USB – TTL Serial Adapter
  • Breadboard
  • 3x LEDs
  • 3x 330 Ohm resistors
  • Jumper Wires
  • 1x button

High-level overview of the steps involved in this demo:

This solution relies on the MQTT protocol to communicate with the cloud. The data flow starts at the ESP32 embedded device, which sends an MQTT message once the button is pressed. An IoT rule forwards this message to a Lambda function that generates an S3 signed URL and sends it back. Once the device receives the URL, it takes the frame buffer data from the camera module and makes an HTTPS PUT request to upload the image to S3. On successful upload, the ESP32 sends another MQTT message with the name of the uploaded file, which a second IoT rule forwards to another Lambda function. This Lambda function calls the Amazon Rekognition API, identifies what is in the image, and sends the results back to the embedded device, which then decides which LED to turn on.

Creating an AWS IoT device

To communicate with AWS IoT Core, the ESP32 device must connect with device credentials. You must also specify the MQTT topics that it has permission to publish and subscribe to.

  1. In the AWS IoT console, choose Manage, Things, and then choose Create.
  2. Name the new thing myesp32-cam-example. Leave the remaining fields set to their defaults. Choose Next.
  3. Choose Create certificate. Only the thing certificate, private key, and Amazon Root CA 1 downloads are necessary for the ESP32 to connect. Download and save them somewhere secure, as they are used when programming the ESP32 device.
  4. Choose Activate and then choose Attach a policy.
  5. Choose Register Thing without attaching a policy at this step.
  6. In the AWS IoT console side menu, choose Secure, Policies, Create a policy.
  7. Name the policy Esp32Policy. Choose the Advanced tab.
  8. Paste in the following policy template, swapping out REGION and ACCOUNT_ID:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:Connect",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:client/myesp32-cam-example"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Subscribe",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topicfilter/esp32/sub/data"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Subscribe",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topicfilter/esp32/sub/url"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Receive",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topic/esp32/sub/url"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Receive",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topic/esp32/sub/data"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Publish",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topic/esp32/pub/data"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Publish",
      "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topic/esp32/pub/url"
    }
  ]
}
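
For the ESP32 to be authorized, the Esp32Policy also has to be attached to the device certificate. If you prefer to script that step, a minimal boto3 sketch might look like the following; the Region, account ID, and certificate ID in the ARN are placeholders you copy from the console:

import boto3

iot = boto3.client("iot", region_name="REGION")  # placeholder Region

# Placeholder: copy the certificate ARN shown in the console when you
# created the certificate for myesp32-cam-example.
certificate_arn = "arn:aws:iot:REGION:ACCOUNT_ID:cert/CERTIFICATE_ID"

# Attach the Esp32Policy defined above to the device certificate so that
# the ESP32 is authorized to connect, publish, and subscribe.
iot.attach_policy(policyName="Esp32Policy", target=certificate_arn)

# Attach the certificate to the thing itself (the console can also do this
# for you during thing creation).
iot.attach_thing_principal(
    thingName="myesp32-cam-example",
    principal=certificate_arn,
)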

Creating an S3 bucket

To store the images from the ESP32-CAM, we need to create an S3 bucket. Create a bucket called “esp32-rekognition-YOUR_ACCOUNT_ID”, using your account ID as the suffix. S3 bucket names must be globally unique, and this technique simplifies choosing a name that is not already taken. Next, make sure that the bucket is private.
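
If you prefer the AWS SDK to the console for this step, a minimal boto3 sketch for creating the private bucket could look like this (the Region and account ID are placeholders, and the LocationConstraint must be omitted in us-east-1):

import boto3

REGION = "REGION"          # placeholder: your AWS Region
ACCOUNT_ID = "ACCOUNT_ID"  # placeholder: your AWS account ID
bucket_name = "esp32-rekognition-" + ACCOUNT_ID

s3 = boto3.client("s3", region_name=REGION)

# Create the bucket (outside us-east-1 a LocationConstraint is required).
s3.create_bucket(
    Bucket=bucket_name,
    CreateBucketConfiguration={"LocationConstraint": REGION},
)

# Keep the bucket private by blocking all forms of public access.
s3.put_public_access_block(
    Bucket=bucket_name,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)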

In the bucket permissions, set the bucket policy to the following, replacing the account ID and username placeholders:

{
    "Version": "2012-10-17",
    "Id": "Policy1547200240036",
    "Statement": [
        {
            "Sid": "Stmt1547200205482",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::YOUR_ACCOUNT_ID:user/YOUR_USERNAME"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::esp32-rekognition-YOUR_ACCOUNT_ID/*"
        }
    ]
}

Also, paste the following CORS configuration to allow cross-origin requests:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedMethod>DELETE</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>
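
The same bucket policy and CORS rules can also be applied with boto3, which takes the CORS configuration as JSON rather than XML. Here is a sketch under the same assumptions, with the account ID and IAM username as placeholders:

import json

import boto3

ACCOUNT_ID = "YOUR_ACCOUNT_ID"  # placeholder: your AWS account ID
USERNAME = "YOUR_USERNAME"      # placeholder: your IAM user name
bucket_name = "esp32-rekognition-" + ACCOUNT_ID

s3 = boto3.client("s3")

# Bucket policy: allow the IAM user to read and write objects in the bucket.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::" + ACCOUNT_ID + ":user/" + USERNAME},
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::" + bucket_name + "/*",
        }
    ],
}
s3.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(bucket_policy))

# CORS configuration equivalent to the XML document above.
s3.put_bucket_cors(
    Bucket=bucket_name,
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedOrigins": ["*"],
                "AllowedMethods": ["GET", "POST", "PUT", "DELETE"],
                "AllowedHeaders": ["*"],
            }
        ]
    },
)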

Creating Lambda functions

The Lambda functions are responsible for generating S3 signed URLs and for interacting with the Amazon Rekognition API.

First, we create the Lambda function responsible for generating S3 signed URLs. Name it ‘esp32-request-url’, choose the ‘Python 3.8’ runtime, and add the following code, swapping out the REGION and YOUR_ACCOUNT_ID placeholders:

import boto3
from botocore.client import Config
import json
import uuid

def lambda_handler(event, context):
    # Bucket created earlier; note the hyphenated name used throughout this post.
    bucket_name = 'esp32-rekognition-YOUR_ACCOUNT_ID'
    # Generate a unique file name for this upload.
    file_name = str(uuid.uuid4()) + '.jpg'

    mqtt = boto3.client('iot-data', region_name='REGION')
    s3 = boto3.client('s3')

    # Signed URL, valid for 10 minutes, that allows a single PUT of the image.
    url = s3.generate_presigned_url('put_object', Params={'Bucket':bucket_name, 'Key':file_name}, ExpiresIn=600, HttpMethod='PUT')
    # command = "curl --request PUT --upload-file {} '{}'".format(file_name, url)
    # print(command)  # for local testing purposes
    # print(file_name + '/' + url[8:])  # for local testing purposes

    # Publish "<file name>/<URL without the https:// prefix>" back to the device.
    response = mqtt.publish(
        topic='esp32/sub/url',
        qos=0,
        payload=file_name + '/' + url[8:]
    )

Now modify the new Lambda function's execution role, adding the following permissions so that it can generate S3 signed URLs and publish them to the MQTT topic:

...{
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": [
                "iot:Publish"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::esp32-rekognition-YOUR_ACCOUNT_ID",
                "arn:aws:s3:::esp32-rekognition-YOUR_ACCOUNT_ID/*"
            ]
        }...

Now create the second Lambda function. Name it “esp32-request-rekognition” and follow the same structure as the first, with the Python 3.8 runtime. Paste the following code, again swapping out REGION and YOUR_ACCOUNT_ID:

import boto3
import json

def detect_labels(bucket, key, max_labels=10, min_confidence=90, region="REGION"):
    # Ask Amazon Rekognition to label the image the device uploaded to S3.
    rekognition = boto3.client("rekognition", region)
    response = rekognition.detect_labels(
        Image={
            "S3Object": {
                "Bucket": bucket,
                "Name": key,
            }
        },
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return response['Labels']

def lambda_handler(event, context):
    results = ''
    mqtt = boto3.client('iot-data', region_name='REGION')

    # Bucket created earlier; note the hyphenated name used throughout this post.
    bucket_name = 'esp32-rekognition-YOUR_ACCOUNT_ID'
    # The IoT rule delivers the uploaded file name in the "payload" field.
    file_name = str(event['payload'])

    # Build a semicolon-separated list of high-confidence label names.
    for label in detect_labels(bucket_name, file_name):
        if float(label['Confidence']) > 90:
            results += (label['Name'] + ';')

    # Publish the labels back to the device.
    response = mqtt.publish(
        topic='esp32/sub/data',
        qos=0,
        payload=results
    )

Again, we need to modify the function's execution role, adding the following permissions so that it can invoke the Amazon Rekognition detection API and publish the results to the MQTT topic:

...{
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "rekognition:DetectLabels",
                "iot:Publish"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
            ],
            "Resource": [
                "arn:aws:s3:::esp32-rekognition-YOUR_ACCOUNT_ID",
                "arn:aws:s3:::esp32-rekognition-YOUR_ACCOUNT_ID/*"
            ]
        }...

These permissions allow the Lambda function to call the Amazon Rekognition API and publish the results to another MQTT topic.

Creating IoT Core rules

To allow AWS IoT Core to invoke our Lambda functions when MQTT messages arrive, we need to create two AWS IoT Core rules.

The first rule will forward the request to the esp32-request-url Lambda function. For that, you need to specify the following query statement:

SELECT * FROM 'esp32/pub/url'

Select “Send a message to a Lambda function” as the action and point it to the esp32-request-url Lambda function. Optionally, you can add an error action and forward any errors to CloudWatch Logs.

The second rule will forward the request to the esp32-request-rekognition Lambda function. For that, you need to specify the following query statement:

SELECT * FROM 'esp32/pub/data'

Select “Send a message to a Lambda function” as the action and point it to the esp32-request-rekognition Lambda function. Here too, you can add an error action and forward any errors to CloudWatch Logs.
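
Both rules can also be created programmatically. The boto3 sketch below is one way to do it; the rule names are arbitrary, the Lambda function ARNs are built from placeholder Region and account ID values, and the add_permission calls grant AWS IoT permission to invoke each function (the console adds this permission for you when you configure the action there):

import boto3

REGION = "REGION"          # placeholder: your AWS Region
ACCOUNT_ID = "ACCOUNT_ID"  # placeholder: your AWS account ID

iot = boto3.client("iot", region_name=REGION)
lambda_client = boto3.client("lambda", region_name=REGION)

# Rule name -> (source topic, target Lambda function).
rules = {
    "esp32_request_url_rule": ("esp32/pub/url", "esp32-request-url"),
    "esp32_request_rekognition_rule": ("esp32/pub/data", "esp32-request-rekognition"),
}

for rule_name, (topic, function_name) in rules.items():
    function_arn = "arn:aws:lambda:{}:{}:function:{}".format(
        REGION, ACCOUNT_ID, function_name)

    # Forward every message on the topic to the Lambda function.
    iot.create_topic_rule(
        ruleName=rule_name,
        topicRulePayload={
            "sql": "SELECT * FROM '{}'".format(topic),
            "awsIotSqlVersion": "2016-03-23",
            "actions": [{"lambda": {"functionArn": function_arn}}],
        },
    )

    # Allow AWS IoT to invoke the function on behalf of this rule.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=rule_name + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="iot.amazonaws.com",
        SourceArn="arn:aws:iot:{}:{}:rule/{}".format(REGION, ACCOUNT_ID, rule_name),
    )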

Testing the cloud solution

Before we move on to the embedded device, let us test this solution.

First, let's test whether our Lambda function returns signed URLs for the ESP32-CAM to use as an upload location. Open the AWS IoT console and go to the Test section.

Subscribe to the esp32/sub/url MQTT topic.

Publish the following JSON to the esp32/pub/url MQTT topic:

{"payload":"virtual-esp32-device"}

On the topic that you subscribed to, you should receive a response containing the file name and the signed URL, separated by a slash and with the https:// prefix stripped.

You can test this signed URL with tools like Postman. Paste the URL (adding the https:// prefix back), select the PUT method, and set the body to ‘binary’, which allows you to select the file you want to upload. Once you submit the request, check your S3 bucket; a new file should be there.
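
If you prefer a script to Postman, the same upload can be done with a few lines of Python using the requests library; the received value below is a placeholder for the payload that arrived on the esp32/sub/url topic, and test.jpg is a local image of your choosing:

import requests  # third-party library, installable with pip

# Placeholder: paste the payload received on esp32/sub/url. It has the form
# "<file name>/<signed URL with the https:// prefix stripped>".
received = "PASTE_THE_RECEIVED_PAYLOAD_HERE"

file_name, stripped_url = received.split("/", 1)
signed_url = "https://" + stripped_url  # the Lambda function removed the scheme

# PUT the image body to the signed URL, mirroring the Postman test.
with open("test.jpg", "rb") as image:
    response = requests.put(signed_url, data=image)

print(file_name, response.status_code)  # 200 means the upload succeeded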

Second, let's upload an image called test.jpg to the S3 bucket that you created earlier. Then open the AWS IoT console and go to the Test section again.

Subscribe to the esp32/sub/data MQTT topic.

Publish the following JSON to the esp32/pub/data MQTT topic:

{"payload":"test.jpg"}

Check the response on the esp32/sub/data topic. You should see keywords associated with what the Amazon Rekognition service identified in your test image.
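
You can also drive this test from a script instead of the console, using the same iot-data publish call that the Lambda functions use (the Region is a placeholder):

import json

import boto3

mqtt = boto3.client("iot-data", region_name="REGION")  # placeholder Region

# Publish the same test message the console test publishes; the IoT rule
# forwards it to the esp32-request-rekognition Lambda function.
mqtt.publish(
    topic="esp32/pub/data",
    qos=0,
    payload=json.dumps({"payload": "test.jpg"}),
)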

Setting up ESP32-CAM

Before we start, make sure you have the right components at hand.

Use the following diagram to assemble this solution:

Download the source of this demo project.

This project has the following structure:

Extract it into your working directory, open src/certs, and replace private.pem.key and certificate.pem.crt with the private key and certificate that you downloaded during IoT device creation (keeping the same file names).

Open the PlatformIO IDE (a Visual Studio Code extension) and make sure the Espressif 32 platform is installed. Then, inside the PlatformIO terminal, run the following command:

platformio run -t menuconfig

Select the Example Configuration option and configure your Wi-Fi details as well as the AWS IoT client ID (myesp32-cam-example).

Also, make sure to select “Support for external, SPI-connected RAM” under Component config → ESP32-specific option.

Open the sdkconfig file in the root folder and set the CONFIG_AWS_IOT_MQTT_HOST variable to the endpoint shown under IoT Core – Manage – Things – myesp32-cam-example – Interact (HTTPS).

After completing these steps, you are ready to build the firmware and flash it to your device.

After a successful build and flash process, you can point the camera at a subject, press the button and wait for an LED to light up based on the results from Amazon Rekognition.

Now let us test it on a real subject, knowing the following LED mapping (an illustrative sketch of the decision logic follows the list):

  • Red LED: animal
  • Green LED: person
  • Yellow LED: everything else
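
The decision itself happens in the ESP32 firmware, but as a rough illustration, here is a small Python sketch of how the semicolon-separated label string published on esp32/sub/data could be mapped to an LED; the keyword choices are assumptions for illustration, not the firmware's actual logic:

def choose_led(payload):
    """Map a semicolon-separated Rekognition label string to an LED colour.

    Illustrative only: the keyword choices below are assumptions, not the
    firmware's actual logic.
    """
    labels = {label.strip() for label in payload.split(";") if label.strip()}

    if "Person" in labels:
        return "green"   # person detected
    if "Animal" in labels or "Pet" in labels:
        return "red"     # animal detected
    return "yellow"      # everything else


print(choose_led("Cat;Animal;Pet;"))  # -> red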

As you can see, my cat was correctly identified as an animal.

To debug any issues, connect a serial monitor to the serial port associated with your device. In addition, you can see logs associated with this solution inside CloudWatch Logs in the AWS Management Console.

Clean up

To avoid incurring future charges, delete the IoT thing, both Lambda functions, the two IoT rules, and the S3 bucket.

Conclusion

AWS supports an array of widely available IoT embedded devices to connect your workloads to the cloud. This post showed that with a sub-20 USD IoT board, you can implement object identification. Lighting up three LEDs is a very basic example, but it can easily be modified to trigger another physical device that solves your business problem.

By using AWS serverless services, you can add advanced functionality to an IoT device, offloading work to the cloud so that devices can be maintained and used in a more power-efficient way. The next step is to explore what you can do with FreeRTOS and the capabilities of AWS serverless.