Classify a Large Number of Images with Amazon Rekognition and AWS Batch
Amazon Rekognition, one of the first AWS AI services, makes it easy to quickly add sophisticated deep-learning-based visual search and image classification to your applications. With the Rekognition API, you can detect objects, scenes, and faces in images, and search and compare faces.
Many AWS customers who have images stored in Amazon S3 want to query for the objects and scenes that Rekognition detects. This post describes how to use AWS services to create an application called bucket-rekognition-backfill that cost-effectively retrieves and stores the Rekognition labels (objects, scenes, or concepts) for images stored in an S3 bucket, and that also processes images as they are added to, updated in, or deleted from the bucket.
Challenges and a solution
The first challenge is getting a list of the images in an S3 bucket. Traditionally, you do this with the synchronous Amazon S3 List API operation. Depending on the number of objects in a bucket, this can take a long time. The new Amazon S3 inventory tool produces a comma-separated values (.csv) flat-file listing of objects and their corresponding metadata on a daily or weekly basis. It can inventory all of the objects in an S3 bucket, or all objects that share a prefix (that is, objects whose names begin with a common string). Our solution uses this tool to get the list of objects stored in a specified bucket and to process only images in .png or .jpeg format, the formats that Rekognition supports.
After we have the list of images, how can we call Rekognition without being throttled for sending too many concurrent requests? (There is a per-account limit on calls per second.) The new AWS Batch service lets us process the images asynchronously while enforcing an upper bound on the number of concurrent calls to Rekognition.
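Even with AWS Batch capping concurrency, an individual call can still be throttled, so it is worth wrapping the label-detection call in a retry loop with exponential backoff. The sketch below is illustrative rather than the post's actual job code; the throttling error codes and backoff parameters are assumptions.

```python
import time

# Error codes commonly returned when Rekognition throttles a caller
# (assumed here; check the SDK documentation for the exact codes).
THROTTLE_ERRORS = {"ThrottlingException", "ProvisionedThroughputExceededException"}

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def detect_labels_with_retry(rekognition, bucket, key, max_attempts=5):
    """Call Rekognition detect_labels, retrying on throttling errors.

    `rekognition` is a boto3 Rekognition client, e.g. boto3.client("rekognition").
    """
    for attempt in range(max_attempts):
        try:
            return rekognition.detect_labels(
                Image={"S3Object": {"Bucket": bucket, "Name": key}})
        except Exception as err:  # botocore.exceptions.ClientError in practice
            code = getattr(err, "response", {}).get("Error", {}).get("Code", "")
            if code not in THROTTLE_ERRORS or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

The backoff keeps retries cheap while giving the service time to recover; the AWS SDKs also retry throttled calls on their own, so this is belt-and-braces rather than a requirement.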
To search for images, we use Amazon Elasticsearch Service (Amazon ES), which provides full-text search, to look up the labels of the images stored in S3. Amazon ES also provides Kibana, a convenient graphical user interface (GUI).
We also need the ability to process images that are uploaded to, updated, or deleted from the S3 images bucket. We do this by using the S3 event notification feature with AWS Lambda to call Rekognition and save or delete the image entries in the Amazon ES domain index.
So, the solution has two components:
- A backfill component that gets labels for all of the images in a specified S3 bucket and saves them to an Amazon ES domain index
- A component that gets labels for uploaded images in near real-time and saves them to the same Amazon ES domain index
The following figure shows the image backfill workflow component:
This is how it works:
- The Amazon S3 inventory tool produces a .csv file that lists the images stored in the images bucket and saves it to an S3 bucket called the inventory bucket. The S3 service produces this inventory file daily.
- When a new version of the gzipped S3 inventory .csv file is saved to the destination inventory bucket, a Lambda function is invoked, because the inventory bucket is configured to trigger the function for any object with a .csv.gz extension.
- The Lambda function reads the contents of the .csv file and, for each image it finds, creates a new AWS Batch job with the image's bucket and key and submits it to the AWS Batch job queue. AWS Batch processes the submitted jobs by launching EC2 instances and running the jobs on those instances.
- The Lambda function removes the S3 event trigger to prevent the AWS Batch backfill workflow from running more than once.
- The AWS Batch jobs receive an image bucket and key as input parameters and check whether the image has already been processed by querying the Amazon ES domain index. If it has not, the job calls the Amazon Rekognition detect_labels API operation.
- The AWS Batch jobs save the labels that Rekognition returns for the image into the Amazon ES domain index.
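The inventory-processing steps above can be sketched as follows. The column order follows the default S3 inventory CSV layout (bucket name first, object key second), and the queue and job-definition names in the comment are hypothetical, not the post's exact values.

```python
import csv
import gzip
import io

SUPPORTED_FORMATS = (".png", ".jpg", ".jpeg")  # formats Rekognition accepts

def parse_inventory(gzipped_bytes):
    """Decompress a .csv.gz S3 inventory file into a list of CSV rows."""
    text = gzip.decompress(gzipped_bytes).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))

def image_keys(rows):
    """Yield (bucket, key) pairs for rows whose object key is a supported image.

    Assumes the default inventory layout: bucket name, then object key.
    """
    for row in rows:
        bucket, key = row[0], row[1]
        if key.lower().endswith(SUPPORTED_FORMATS):
            yield bucket, key

# Inside the Lambda function, each pair would then become one AWS Batch job,
# roughly (queue and job-definition names are placeholders):
#
#   batch = boto3.client("batch")
#   for bucket, key in image_keys(rows):
#       batch.submit_job(
#           jobName="label-job",
#           jobQueue="rekognition-backfill-queue",
#           jobDefinition="image-labeler",
#           containerOverrides={"environment": [
#               {"name": "IMAGE_BUCKET", "value": bucket},
#               {"name": "IMAGE_KEY", "value": key},
#           ]})
```

Filtering on file extension in the Lambda function keeps unsupported objects from ever reaching the job queue, so no Batch capacity is wasted on them.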
Images that are uploaded to the bucket are processed in near real-time with an S3 event notification that is processed by a Lambda function. The following figure shows this workflow:
This is how it works:
- The user uploads an image to the images bucket.
- The images bucket is configured to invoke a Lambda function when a new image is uploaded or deleted.
- The Lambda function calls Rekognition to detect the labels for the image.
- The Lambda function saves the Rekognition labels to an Amazon ES domain index. If the image already exists, the function updates the labels in the Amazon ES domain index. If the image was deleted from the images bucket, the Lambda function removes all entries for that image from the Amazon ES domain index.
- Users can look up the labels for an image in the Elasticsearch index.
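A minimal sketch of how that Lambda function could route S3 notification records follows. The routing mirrors the steps above, but the function itself is illustrative, not the post's actual code; note that object keys arrive URL-encoded in S3 events.

```python
from urllib.parse import unquote_plus

def route_s3_event(event):
    """Map S3 notification records to (action, bucket, key) tuples.

    'index'  -> call Rekognition detect_labels and upsert into the ES index
    'delete' -> remove the image's entries from the ES index
    """
    actions = []
    for record in event.get("Records", []):
        event_name = record["eventName"]          # e.g. "ObjectCreated:Put"
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        action = "delete" if event_name.startswith("ObjectRemoved") else "index"
        actions.append((action, bucket, key))
    return actions
```

The real handler would follow each 'index' action with a detect_labels call and a document upsert against the Amazon ES domain, and each 'delete' with a delete request against the same index.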
The bucket-rekognition-backfill application is created by a single CloudFormation template. It consists of a number of resources including the following:
- A nested CloudFormation stack that creates a number of Lambda functions including the following:
- RekognitionLabelerFunction: When a new image is uploaded to the S3 images bucket, this function retrieves the Rekognition labels for the image and saves them to an Amazon ES domain index.
- S3InventoryProcessorFunction: Processes the Amazon S3 inventory file, located in the inventory bucket, with the list of images stored in the images bucket. For each image, it creates a job to be processed by AWS Batch.
- CodeBuildTriggerFunction: Triggers the Docker build of the container image used by the AWS Batch job to retrieve the Rekognition labels of an image and save them into an Amazon ES domain. CloudFormation calls this function as a custom resource.
- S3NotificationConfigFunction: When a new image is uploaded or deleted from the S3 bucket, configures the S3 bucket notification to call the RekognitionLabelerFunction Lambda function. CloudFormation calls this function as a custom resource.
- S3InventoryConfigFunction: Configures the S3 Inventory configuration for the images bucket so that the S3 service generates a .csv file daily that contains all objects present in the images bucket and saves this to the separate inventory bucket. CloudFormation calls this function as a custom resource.
- AWSBatchComputeEnvFunction: Creates and deletes the AWS Batch compute environment as a CloudFormation custom resource.
- AWSBatchJobQueueFunction: Creates and deletes the AWS Batch job queue as a CloudFormation custom resource.
- AWSBatchJobDefinitionFunction: Creates and deregisters the AWS Batch job definition as a CloudFormation custom resource.
- IAM service roles for AWS CodeBuild and AWS Batch
- IAM roles and a security group for the EC2 instances used by AWS Batch
- An S3 bucket to store the inventory csv file of images contained in the images bucket
- An EC2 Container Registry (ECR) repository, named image_labeler, to store the Docker image run by the AWS Batch jobs
- A CodeBuild project to build the Docker image used by AWS Batch to process the images. The CodeBuild project pushes the built Docker container to the ECR repository named image_labeler.
- A custom CloudFormation resource to trigger the build of the CodeBuild project
- An AWS Batch Compute Environment that launches the EC2 resources needed to run the AWS Batch jobs
- An AWS Batch Job Queue where the Lambda function submits jobs
- An AWS Batch Job Definition that specifies the Docker image and other parameters used to execute the AWS Batch jobs
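As an example of what S3InventoryConfigFunction sets up, the helper below builds the daily-CSV inventory configuration that would be passed to the S3 put_bucket_inventory_configuration API. The configuration ID and bucket ARN are placeholders; the field names follow the S3 API shape, not code from the post.

```python
def inventory_configuration(inventory_bucket_arn, config_id="images-inventory"):
    """Daily CSV inventory of all current objects, delivered to the inventory bucket."""
    return {
        "Id": config_id,
        "IsEnabled": True,
        "IncludedObjectVersions": "Current",
        "Schedule": {"Frequency": "Daily"},
        "Destination": {
            "S3BucketDestination": {
                "Bucket": inventory_bucket_arn,  # e.g. "arn:aws:s3:::my-inventory-bucket"
                "Format": "CSV",
            }
        },
    }

# The custom-resource Lambda function would then apply it, roughly:
#   s3 = boto3.client("s3")
#   s3.put_bucket_inventory_configuration(
#       Bucket=images_bucket,
#       Id="images-inventory",
#       InventoryConfiguration=inventory_configuration(inventory_bucket_arn))
```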
A diagram of the CloudFormation, CodeBuild and AWS Batch resources and their interactions is shown below:
AWS Batch explained
In the bucket-rekognition-backfill application, AWS Batch performs image backfill processing. The application creates the following AWS Batch resources using custom CloudFormation resources:
- AWS Batch Compute Environment: The AWS Batch Compute Environment manages the EC2 instances used to run the containerized batch jobs. A Compute Environment is mapped to one or more AWS Batch Job Queues. The AWS Batch scheduler takes jobs from the queue and schedules them to run on an EC2 host in the Compute Environment. The maximum number of vCPUs is set to 64 to limit the number of concurrent calls made to Rekognition and the Amazon ES domain. We start with desired and minimum vCPU counts of zero to avoid launching unnecessary EC2 instances in advance. We also provide a security group and an IAM instance profile for the EC2 instances. Once all the AWS Batch jobs in the queue have been successfully processed, the AWS Batch service terminates the EC2 resources automatically.
- AWS Batch Job Definition: An AWS Batch job definition specifies how the batch jobs are to be run. It is very similar to an ECS task definition in that you specify the following attributes:
- IAM role associated with the job
- URI of the Docker image used to execute the AWS Batch job
- vCPU and memory constraints
- Volume mount points
- Container properties
- Environment variables
In this specific application, each job requires 1 vCPU and 50 MiB of memory. The Docker image used by AWS Batch is stored in EC2 Container Registry; it is built by the CodeBuild project defined in the main CloudFormation template and triggered by a CloudFormation custom resource.
- AWS Batch Job Queue: Jobs to be run by the AWS Batch service must be submitted to an AWS Batch Job Queue, where they reside until they can be scheduled to run on a compute resource. A Compute Environment can have multiple Job Queues with different priorities, and the AWS Batch scheduler picks jobs from the higher-priority queue first. In this case there is only one Job Queue, so we set an arbitrary queue priority of 10.
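To make the job-definition attributes above concrete, here is a sketch of a register_job_definition payload matching this application's 1 vCPU / 50 MiB requirements. The definition name, image URI, and environment variable are placeholders, not values from the post.

```python
def image_labeler_job_definition(image_uri, job_role_arn):
    """AWS Batch job definition for the image-labeling container: 1 vCPU, 50 MiB."""
    return {
        "jobDefinitionName": "image-labeler",   # placeholder name
        "type": "container",
        "containerProperties": {
            "image": image_uri,                 # ECR URI of the image_labeler repository
            "vcpus": 1,
            "memory": 50,                       # MiB
            "jobRoleArn": job_role_arn,
            "environment": [
                # Hypothetical variable telling the job where the ES domain lives.
                {"name": "ES_ENDPOINT", "value": "https://example-es-endpoint"},
            ],
        },
    }

# Registered once by the custom-resource Lambda function, roughly:
#   batch = boto3.client("batch")
#   batch.register_job_definition(**image_labeler_job_definition(image_uri, role_arn))
```

Because the job role is attached in the definition, the container gets its Rekognition and Amazon ES permissions from IAM rather than from baked-in credentials.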
Application setup and prerequisites
For this application, we use the US East (N. Virginia) Region (us-east-1), because AWS Batch is not available in all regions.
You also need the following:
- An S3 bucket that contains the images you want to process
- A VPC with at least one public subnet and one private subnet
- An Amazon Elasticsearch (ES) domain
If you want to create a new VPC and an Amazon ES domain, launch the CloudFormation template. This stack, named RekognitionBackfillNet-ES, creates the VPC, subnets, NAT gateway, Internet gateway, and an Amazon ES domain with Kibana, authenticated using HTTP Basic Authentication via a proxy application deployed with AWS Elastic Beanstalk. You need to provide an HTTP user name and password as input parameters to the CloudFormation stack. Note that it can take up to 30 minutes for stack creation to complete.
Launching the application
Launch the CloudFormation template, which creates a stack named RekognitionBackfillMaster with all the AWS resources needed for the bucket-rekognition-backfill application. No further steps are needed beyond launching this stack to get the application working.
As input parameters to the CloudFormation stack, you will need to provide:
- The name of the images bucket
- The Amazon ES domain endpoint
- The VPC and private subnets where the AWS Batch compute resources will run
The images S3 bucket must be located in the US East (N. Virginia) Region (us-east-1); otherwise, the CloudFormation stack creation will fail. Also, the images bucket name parameter is NOT a URL, so do NOT prefix it with s3://.
If you launched the previous CloudFormation stack to create the VPC and Elasticsearch resources, the Elasticsearch domain endpoint, VPC ID, and subnets are provided as outputs of that stack.
Backfill processing commences only after the S3 inventory tool creates the inventory file of all images present in the images bucket. This can take up to 48 hours from the time you create the CloudFormation stack.
Note: If you need to re-run the CloudFormation stack creation because of an issue with the first run, make sure you manually remove the S3 inventory bucket and the ECR repository, because the CloudFormation stack deletion command does not remove these resources.
All of the resources were created with the CloudFormation template. When you are done with the application, tear down the resources by deleting the RekognitionBackfillMaster stack.
If you have a bucket that contains many images that you want to analyze to see what objects and scenes they contain, and want to process subsequent images uploaded to the same bucket, use Rekognition and the application described in this post.
If you have any questions or suggestions, please comment below.
About the Author
Matt McClean is a Partner Solution Architect for AWS. He works with technology partners in the EMEA region providing them guidance on developing their solutions using AWS technologies and is a specialist in Machine Learning. In his spare time, he is a passionate skier and cyclist.