AWS Machine Learning Blog

Detect manufacturing defects in real time using Amazon Lookout for Vision

In this post, we look at how we can automate the detection of anomalies in a manufactured product using Amazon Lookout for Vision. Using Amazon Lookout for Vision, you can notify operators in real time when defects are detected, provide dashboards for monitoring the workload, and get visual insights from the process for business users.

Amazon Lookout for Vision is a machine learning (ML) service that spots defects and anomalies in visual representations using computer vision (CV). With Amazon Lookout for Vision, manufacturing companies can increase quality and reduce operational costs by quickly identifying differences in images of objects at scale.

Defect and anomaly detection during manufacturing processes is a vital step to ensure the quality of the products. The timely detection of faults or defects and taking appropriate actions is important to reduce operational and quality-related costs. According to Aberdeen’s research, “Many organizations will have true quality-related costs as high as 15 to 20 percent of sales revenue, in extreme cases some going as high as 40 percent.”

Manual inspection, either in-line or end-of-line, is a time-consuming and expensive task. Firstly, you require trained human experts to perform visual inspections. Secondly, the feedback loop is slower and can cause bottlenecks in production and time-to-market timelines. Lastly, the process is subjective, and difficult and costly to scale effectively.

Therefore, a robust, effective, and scalable defect detection mechanism is necessary to provide objective decisions on visual inspection with a quick feedback loop and at low cost to maximize the quality of manufactured goods.

Overview of solution

The solution is composed of seven different building blocks (as shown in the following diagram), which we dive deep into in the following sections.

Image ingestion and storage

The following section of the architecture illustrates the components for image ingestion and storage.

The images from a manufacturing facility camera can be ingested either directly by the camera, if it supports on-board compute, or via a client application that collects images from the cameras, optionally preprocesses them so that they match the image properties used to train the model, and uploads them to Amazon Simple Storage Service (Amazon S3).

We achieve this by invoking an Amazon API Gateway URL to get a presigned URL from Amazon S3. The client invokes an API Gateway REST API endpoint with request parameters, which include metadata such as assembly line ID, camera ID, image ID, and an authorization token. The API Gateway uses a custom AWS Lambda authorizer function that uses the authorization token to determine the caller identity and grant authorization. After authorization, the request is processed by a Lambda function that gets the signed URL from Amazon S3.
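
As a rough illustration of the authorizer step described above, the following minimal sketch returns an allow or deny policy based on the token. It assumes a TOKEN-type Lambda authorizer event (with `authorizationToken` and `methodArn` fields); the principal ID and the literal `allow` token check are placeholders for illustration, not the solution's actual validation logic.

```python
def build_policy(principal_id, effect, method_arn):
    """Return the response shape API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def lambda_handler(event, context):
    # Assumption: TOKEN-type authorizer, so the token arrives in "authorizationToken".
    token = event.get("authorizationToken", "")
    # Placeholder check only; a real deployment would validate a signed token.
    effect = "Allow" if token == "allow" else "Deny"
    # "camera-client" is a hypothetical principal ID used for illustration.
    return build_policy("camera-client", effect, event["methodArn"])
```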

With the presigned URL, a client gets time-bound access to be able to upload a specific object to your S3 bucket without needing AWS security credentials or permissions.
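
The presigned URL request could be handled along the following lines. This is a sketch under stated assumptions: the query parameter names, the `SOURCE_BUCKET` environment variable, the key layout, and the 5-minute expiry are all hypothetical choices, while `generate_presigned_url` is the standard boto3 S3 client call.

```python
import json
import os


def object_key(assembly_line_id, camera_id, image_id):
    # Hypothetical key layout: group images by assembly line, then camera.
    return f"{assembly_line_id}/{camera_id}/{image_id}"


def create_upload_url(s3_client, bucket, key, expires_in=300):
    """Ask S3 for a time-bound URL that permits a single PUT of this object."""
    return s3_client.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def lambda_handler(event, context):
    import boto3  # imported lazily so the helpers above can be tested offline

    # Assumption: API Gateway proxy integration with these parameter names.
    params = event["queryStringParameters"]
    key = object_key(params["assemblyLineId"], params["cameraId"], params["imageId"])
    url = create_upload_url(boto3.client("s3"), os.environ["SOURCE_BUCKET"], key)
    return {"statusCode": 200, "body": json.dumps({"uploadUrl": url, "key": key})}
```

Passing the S3 client in as an argument keeps the URL logic independent of credentials, which makes it easier to exercise with a stub.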

This provides a secure and scalable pattern for uploading images for anomaly detection.

Defect detection workflow

The anomaly detection workflow relies on AWS Step Functions to orchestrate the process of detecting whether an image is anomalous, storing the inference result, and sending notifications. The following diagram illustrates this process.

Step Functions is a serverless function orchestrator that allows you to sequence Lambda functions and multiple AWS services into business-critical applications. Through its visual interface, you can create and run a series of checkpointed and event-driven workflows that maintain the application state. The output of one step acts as an input to the next. Each step in your application runs in order, as defined by your business logic.

In our case, when an image is uploaded to the S3 bucket, it triggers an event notification that in turn invokes a Lambda function to start the workflow or state machine run. The workflow consists of the following steps and Lambda functions:

  1. DetectAnomalies – Gets the image and its associated metadata from Amazon S3 and invokes the DetectAnomalies API for Amazon Lookout for Vision. It enriches the response with the metadata and passes it to the next step.
  2. PutResultInDynamoDB – Invokes the PutItem API of Amazon DynamoDB to store the response from the previous stage to a DynamoDB table.
  3. PublishMessageToSNS – Invokes the Publish API of Amazon Simple Notification Service (Amazon SNS) to publish a message to an SNS topic that notifies subscribers when an anomaly or a low-confidence result is detected. This enables operators to get real-time alerts and take appropriate action, for example, to manually review low-confidence results, label them correctly, and feed them back into the dataset in Amazon Lookout for Vision to retrain the model.
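
The first step of the workflow above can be sketched as follows. The function and its return shape are illustrative assumptions rather than the repository's code; `get_object` and `detect_anomalies` are the boto3 calls for Amazon S3 and Amazon Lookout for Vision, and the response fields `IsAnomalous` and `Confidence` come from the `DetectAnomalyResult` structure.

```python
def detect_anomalies(s3_client, lookout_client, bucket, key,
                     project_name, model_version, content_type="image/jpeg"):
    """Fetch an image from S3, run inference, and merge the result with metadata."""
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    image_bytes = obj["Body"].read()

    response = lookout_client.detect_anomalies(
        ProjectName=project_name,
        ModelVersion=model_version,
        Body=image_bytes,
        ContentType=content_type,
    )
    result = response["DetectAnomalyResult"]

    # Start from the S3 user-defined metadata (assumed to carry the camera and
    # assembly line IDs), then add the inference fields so they take precedence.
    enriched = dict(obj.get("Metadata", {}))
    enriched.update({
        "ImageUri": f"s3://{bucket}/{key}",
        "IsAnomalous": result["IsAnomalous"],
        "Confidence": result["Confidence"],
    })
    return enriched
```

The returned dictionary is what a later step would store and evaluate for alerting.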

ML model and management front end

At the core of the solution, we have Amazon Lookout for Vision, which enables us to train an ML model to spot defects and anomalies in images of objects at scale. It requires no specialized ML skills or high-cost machine vision systems or cameras to get started. It provides an easy, fast, and low-cost way to improve quality control processes and reduce operational and quality-related costs.

You can get started with as few as 30 images for the process you want to visually inspect. Amazon Lookout for Vision builds a model in minutes that you can then use to automate the visual inspection processes in real time or in batch and receive notifications when defects are detected. You can continuously improve and tune the model by adding more images and providing feedback on the identified product defects to improve precision and accuracy.

The following diagram illustrates these components of the architecture.

Moreover, you can set up a static website to act as a management front end to provide a secure and simple way for administrators to start and stop a model depending on usage requirements. The model startup and shutdown can also be automated using Amazon CloudWatch scheduled events and Lambda functions, though this is not covered in this post.


We use DynamoDB to store inference results from Amazon Lookout for Vision. DynamoDB is a NoSQL database service that delivers consistent, single-digit millisecond latency at any scale and lets you easily store and query data.

Each record that is stored in the table contains the basic information of the uploaded image such as S3 URI, camera ID, and assembly line ID. It also contains the result of the defect detection process, which includes whether an image is classified as anomalous or not along with its corresponding confidence value.
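
A record of this kind might be written as in the following sketch. The attribute names (`ImageUri`, `CameraID`, `AssemblyLineID`, `IsAnomalous`, `Confidence`) are assumptions chosen to match the fields described above, not the table's confirmed schema; `put_item` is the low-level boto3 DynamoDB client call, which takes typed attribute values.

```python
def to_dynamodb_item(result):
    """Convert an inference result into DynamoDB's typed attribute format."""
    return {
        "ImageUri": {"S": result["ImageUri"]},
        "CameraID": {"S": result["CameraID"]},
        "AssemblyLineID": {"S": result["AssemblyLineID"]},
        "IsAnomalous": {"BOOL": bool(result["IsAnomalous"])},
        # Numbers are sent as strings in the low-level API.
        "Confidence": {"N": str(result["Confidence"])},
    }


def store_result(dynamodb_client, table_name, result):
    dynamodb_client.put_item(TableName=table_name, Item=to_dynamodb_item(result))
```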

The following screenshot illustrates the DynamoDB table structure.


To gain insights from defect detection results stored in the DynamoDB table, you can transfer the results to a destination S3 bucket, optionally analyze the data using Amazon Athena, an interactive query service that analyzes data in Amazon S3 using standard SQL, and then easily build visualizations using Amazon QuickSight.

The following diagram demonstrates the high-level process.

We enable DynamoDB Streams on the table to capture a time-ordered sequence of item-level modifications, which the stream durably stores for up to 24 hours. For more information, see Change Data Capture for DynamoDB Streams.

When a new record is added to the table, a new record appears in the table’s stream. Lambda polls the stream and invokes a Lambda function synchronously, which transforms the record and puts it in an Amazon Kinesis Data Firehose delivery stream. This delivery stream is configured to batch the records it receives over a 1-minute window into a file in JSON format and put the file into the destination S3 bucket. We can then use QuickSight to import the data in the S3 bucket and create visualizations based on the data.
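
The transformation step described here could look roughly like the following sketch. It assumes the stream is configured to include new images and that only string, number, and Boolean attributes are present; the function names are hypothetical. `put_record_batch` is the boto3 Firehose call and accepts up to 500 records per request.

```python
import json


def flatten_image(new_image):
    """Convert DynamoDB-typed attributes ({"S": ...}, {"N": ...}, {"BOOL": ...}) to plain values."""
    flat = {}
    for name, typed in new_image.items():
        if "S" in typed:
            flat[name] = typed["S"]
        elif "N" in typed:
            flat[name] = float(typed["N"])
        elif "BOOL" in typed:
            flat[name] = typed["BOOL"]
    return flat


def to_firehose_records(event):
    """Turn INSERT events from the stream into newline-delimited JSON records."""
    records = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # ignore updates and deletes
        row = flatten_image(record["dynamodb"]["NewImage"])
        records.append({"Data": (json.dumps(row) + "\n").encode("utf-8")})
    return records


def forward(firehose_client, stream_name, event):
    records = to_firehose_records(event)
    if records:
        firehose_client.put_record_batch(DeliveryStreamName=stream_name, Records=records)
    return len(records)
```

Writing one JSON object per line keeps the batched files straightforward to query later.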


We use Amazon SNS, a fully managed notification service, to send messages reliably and securely for both application-to-application and application-to-person communication. With Amazon SNS, we can notify operators and quality managers via SMS, mobile push, or email when an image is classified as anomalous or has a low-confidence inference result. Similarly, we also use Amazon SNS to notify users when a CloudWatch alarm is triggered, such as when the number of detected anomalies exceeds a predefined threshold. In this solution, we demonstrate how relevant stakeholders can receive email notifications based on anomaly detection results.
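
The alerting rule can be sketched as follows. The comparison (alert when the image is anomalous or the confidence falls below the threshold) and the message wording are assumptions for illustration; `publish` with `TopicArn`, `Subject`, and `Message` is the standard boto3 SNS call.

```python
def needs_alert(is_anomalous, confidence, threshold=0.20):
    """Alert on any detected anomaly, or on any result the model is unsure about."""
    return bool(is_anomalous) or confidence < threshold


def publish_alert(sns_client, topic_arn, result, threshold=0.20):
    """Publish a message for results that need attention; return True if one was sent."""
    if not needs_alert(result["IsAnomalous"], result["Confidence"], threshold):
        return False
    reason = "Anomaly detected" if result["IsAnomalous"] else "Low-confidence result"
    sns_client.publish(
        TopicArn=topic_arn,
        Subject=reason,
        Message=f"{reason} for {result['ImageUri']} (confidence {result['Confidence']:.2f})",
    )
    return True
```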

The following diagram illustrates these steps in the architecture.

Monitoring and alerting

We use CloudWatch for monitoring and observability. With CloudWatch, we can monitor image processing and anomaly detection metrics for Amazon Lookout for Vision and other services. Additionally, you can access logs generated from the Lambda function and Step Functions state machine runs, create dashboards to provide you visibility on these metrics, and create alarms and get notifications when predefined thresholds are exceeded.

The following diagram illustrates these steps in the architecture.

In the next sections, we walk you through setting up the prerequisites, deploying and testing the solution, and creating visualizations with QuickSight.


To set up prerequisites related to Amazon Lookout for Vision, refer to Setting up Amazon Lookout for Vision. Specifically, you need to set up the following:

Prepare your dataset

To prepare your dataset for model training, you can use a sample dataset, prepare a custom labeled dataset, or use a publicly available dataset.

A sample dataset is available in the repository located at ../resources/circuitboard/. Run the following command after updating the details of your S3 bucket (which you created earlier) to upload the images for training the model in Amazon Lookout for Vision:

>aws s3 cp --recursive your-repository-folder/resources/circuitboard s3://your-lookout-for-vision-bucket/custom-dataset/circuitboard/

For more information, see Step 8: (Optional) Prepare example images.

After the dataset is uploaded, you can start with project creation and model training via the Amazon Lookout for Vision console.

To prepare a custom labeled dataset, you first gather and preprocess the images, then divide the dataset into training and testing data.

For this post, we use the third option, and use a dataset from a public repository. The dataset we use is a subset of the “casting product image data for quality inspection” dataset from Kaggle. It contains labeled images of anomalous and normal metal casting products. The images are 300 x 300 pixel grayscale images, and augmentation has already been applied to all of them. To follow along with this post, you can download the dataset from Kaggle, extract the data, and move the image files so that the resulting folder structure resembles the following screenshot.

The subfolders in metal-casting-defects are as follows:

  • extra_images – Images you can use to test inference
  • test – Images you can use in a test dataset
  • train – Images you can use in a training dataset

After you set up the folder structure, run the following command via the AWS CLI to upload the dataset to Amazon S3:

>aws s3 cp --recursive your-repository-folder/resources/metal-casting-defects s3://your-lookout-for-vision-bucket/custom-dataset/metal-casting-defects/

The following screenshot illustrates the folder structure in the S3 bucket after the images are successfully uploaded.

Create a project and dataset in Amazon Lookout for Vision

In this section, we walk you through the steps for setting up an Amazon Lookout for Vision project and a dataset to train a model for anomaly detection. For more information, see Getting Started with the Amazon Lookout for Vision console or watch the videos available on Amazon Lookout for Vision Resources.

  1. On the Amazon Lookout for Vision console, choose Create project.
  2. Create a project called metal-casting-defects.
  3. On the project details page, choose Create dataset.
  4. Select Create a training dataset and a test dataset.
  5. Select Import Images from S3.
  6. Choose Copy S3 URI to get the S3 URI from the bucket created previously and append the appropriate folder name (train/ or test/).
  7. Ensure that Automatically attach labels to images based on the folder name is selected.
  8. After the configuration details are complete for the training and test dataset, choose Create dataset.

Train the model

When the dataset has been imported, you can start the model training using the default settings.

  1. On your model details page, choose Train model.
  2. Choose Train model.
  3. Choose Train model.

The model training takes some time depending on the size of the dataset used. If you’re using the dataset that is part of the repository, it should take 45–60 minutes to complete the training process. You can monitor the training status on the model details page.

When the model has been trained successfully, you can use it for detecting anomalies in new images.

Evaluate the model

You can evaluate whether your model is ready to be deployed to production in a few different ways. The first is to review the performance metrics of the model; the second is to run some production tests to help you verify if the model is ready to be deployed.

We use three main performance metrics: precision, recall, and F1 score. Precision measures the percentage of the model’s anomaly predictions that are correct, and recall measures the percentage of true defects the model identified. The F1 score, which is the harmonic mean of precision and recall, summarizes overall model performance in a single number.
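
To make the relationship between these metrics concrete, the following sketch computes them from raw counts. The counts in the usage note are made-up illustrative numbers, not results from the model trained in this post.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from counts of anomaly predictions."""
    predicted = true_positives + false_positives   # images the model flagged
    actual = true_positives + false_negatives      # images that are truly defective
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, with 90 correctly flagged defects, 10 false alarms, and 30 missed defects, precision is 0.90 and recall is 0.75, which gives an F1 score of about 0.82.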

We can improve the model iteratively by retraining with new data.

To detect anomalies in the image, start your model with the StartModel operation.

After your model starts, you can use the DetectAnomalies operation to detect anomalies in an image.

Run the model using the AWS CLI

You can run the model via the AWS CLI or SDK. However, we can also set up a management front end that admin users can use to easily start and stop the model via a web user interface.

Run the following command in the terminal:

aws lookoutvision start-model --project-name "metal-casting-defects" \
  --model-version 1 \
  --min-inference-units 1

The code has the following parameters:

  • project-name – The name of the project that contains the model you want to start
  • model-version – The version of the model you want to start
  • min-inference-units – The number of anomaly detection units you want to use (1–5)

Make sure to stop the model after you complete the testing so you don’t incur any additional cost. For more details about pricing, see Amazon Lookout for Vision Pricing.
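
If you prefer the SDK over the AWS CLI, the same start and stop operations can be sketched with boto3 as follows. The polling interval, retry limit, and function names are assumptions for illustration; `start_model`, `describe_model`, and `stop_model` are the Amazon Lookout for Vision client calls, and `HOSTED` and `HOSTING_FAILED` are model status values the service reports.

```python
import time


def start_and_wait(client, project_name, model_version, min_inference_units=1,
                   poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Start a model and poll until it is hosted, fails, or the polls run out."""
    client.start_model(
        ProjectName=project_name,
        ModelVersion=model_version,
        MinInferenceUnits=min_inference_units,
    )
    for _ in range(max_polls):
        description = client.describe_model(
            ProjectName=project_name, ModelVersion=model_version
        )["ModelDescription"]
        status = description["Status"]
        if status == "HOSTED":
            return status
        if status == "HOSTING_FAILED":
            raise RuntimeError(f"Model {model_version} failed to start")
        sleep(poll_seconds)
    raise TimeoutError(f"Model {model_version} was not hosted after {max_polls} polls")


def stop(client, project_name, model_version):
    """Stop the model so it no longer accrues inference charges."""
    return client.stop_model(ProjectName=project_name, ModelVersion=model_version)["Status"]
```

A typical call would pass `boto3.client("lookoutvision")` as the client, with `"metal-casting-defects"` and `"1"` as the project name and model version.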

Start the model using a management front end

You can also set up a management front end that admin users can use to easily start and stop the model via a web user interface. You can deploy the solution located on GitHub to set up the front-end application. After it’s deployed, you can sign up and log in to the management front end and start or stop the model. To start the model, complete the following steps:

  1. Choose Start the model.
  2. Enter the minimum number of inference units to use.
  3. Choose Start the model.

You see a message that the model is starting.

Deploy the solution

Now that all prerequisites have been set up, we can proceed with the solution deployment. The solution sample code is available on GitHub.

In this section, we show you how to launch an AWS CloudFormation template, which creates the following resources:

  • S3 buckets for the source images and inference results
  • IAM roles for the Lambda functions
  • An AWS CodeDeploy application and CodeDeploy deployment groups
  • An Amazon API Gateway REST API to use for getting a signed URL from Amazon S3 to upload images
  • A CloudWatch monitoring dashboard and alarm
  • The following Lambda functions:
    • APIGatewayCustomAuthorizerFunction
    • CreateManifestFileFunction
    • DetectAnomaliesFunction
    • DynamoDBToFirehoseFunction
    • PublishAlertMessageToSnsTopicFunction
    • PutItemInDynamoDBFunction
    • StartStateMachineLambda
  • A Step Functions state machine that orchestrates defect detection, stores the results in DynamoDB, and publishes messages via Amazon SNS
  • An SNS topic for sending email notifications to the subscribed email address
  • A DynamoDB table to store inference results
  • The following custom resources (Lambda functions):
    • S3ToLambdaTrigger – Creates an event notification trigger on the source images S3 bucket to invoke a Lambda function to start running the state machine
    • CreateManifestFile – Creates a manifest file in the results S3 bucket to use with QuickSight for data import

Deploying the solution on CloudFormation

  1. Deploy the latest CloudFormation template by choosing ‘Launch on AWS’ for your preferred AWS Region:
US East (N. Virginia) (us-east-1)
US East (Ohio) (us-east-2)
US West (Oregon) (us-west-2)
EU (Ireland) (eu-west-1)
  2. If prompted, log in using your AWS account credentials.

On the Create stack page, the fields specifying the CloudFormation template are pre-populated.

  3. Choose Next.
  4. On the Specify stack details page, you can customize the following parameters:
    1. Stack Name – The name that is used to refer to this stack in AWS CloudFormation once deployed. The default is L4VServerlessApp.
    2. AlertsEmailAddress – The email address used for subscribing to email notifications.
    3. ResourcePrefix – AWS resources are named based on the value of this parameter. You must customize this if you’re launching more than one instance of the stack within the same account.
    4. LookoutProjectName – The Amazon Lookout for Vision project name.
    5. LookoutModelVersion – The model version of the specified project. The default is 1.
    6. ConfidenceThresholdForAlerts – The threshold value (0.00 – 1.00) for alerting on low-confidence inference results. The default is 0.20.
    7. ImageFileExtension – The extension type of images that are used for inference. Amazon Lookout for Vision supports JPEG, JPG, and PNG. The default is JPEG.
  5. Choose Next.
  6. Configure stack options if desired, then choose Next.
  7. On the review screen, select the check boxes for:
    1. I acknowledge that AWS CloudFormation might create IAM resources
    2. I acknowledge that AWS CloudFormation might create IAM resources with custom names
    3. I acknowledge that AWS CloudFormation might require the following capability: CAPABILITY_AUTO_EXPAND

These are required to allow AWS CloudFormation to create the IAM roles specified in the CloudFormation stack using both fixed and dynamic names.

  8. Choose Create Change Set.
  9. On the Change Set page, choose Execute to launch your stack.

You may need to wait for the Execution status of the change set to show as AVAILABLE before the Execute button becomes available.

  10. Wait for the CloudFormation stack to launch.

The stack provisioning is complete when the Stack status shows as CREATE_COMPLETE.

You can monitor the stack creation progress on the Events tab.

  11. On the Outputs tab for the stack, note the API URL value.

We use this to request the Amazon S3 signed URL to upload an image when we test the solution.

  12. Check your inbox (the AlertsEmailAddress address passed as a parameter) for an email from Amazon SNS to confirm subscription.
  13. Confirm the subscription by choosing the link in the email to receive defect detection emails.

Test the solution

Now that we have all the prerequisites and the solution deployed, we can test the model using the code in <your-repository-folder>/scripts/. This code simulates the process of a camera uploading an image as a product passes the inspection point.

  1. Open your terminal and change the directory to the repository folder.
  2. Install the Python requests module:
> pip3 install -t scripts/packages/ requests
  3. Run the following command to test an image upload from a client after making the appropriate changes to the parameters:

For this post, we use the following inputs:

  • DIRECTORY – resources/images/extra_images
  • CAMERA_ID – CAM123456
  • API_ENDPOINT – This is an output from the CloudFormation stack provisioned previously
  • AUTH_TOKEN – allow or deny

The following example code shows allowing authorization to upload:

> python3 scripts/ resources/images/extra_images CAM123456 ASM123456 allow 0

The following screenshot shows a successful upload of the images from the client.

The following code shows an example of denying authorization to upload:

> python3 scripts/ resources/images/extra_images CAM123456 ASM123456 deny 0

The following screenshot shows that an image upload has been explicitly denied because the client doesn’t have the required authorization token.

Alternatively, you can update the parameters in the file and run it via a terminal:

> sh /path/to/

As images are uploaded to Amazon S3, the state machine starts, and you get email notifications when an image has been classified as an anomaly or if a low-confidence result is returned by the model.
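
For readers who want to see what such a client might do, the following is a minimal sketch using only the Python standard library. It is not the repository's script: the query parameter names, the `Authorization` header, and the function names are all hypothetical, and the format of the API response is left to the caller.

```python
import urllib.parse
import urllib.request


def build_presign_request(api_endpoint, camera_id, assembly_line_id, image_name, auth_token):
    """Build the GET request that asks the API for a presigned upload URL."""
    # Hypothetical parameter names; adjust them to match the deployed API.
    query = urllib.parse.urlencode({
        "cameraId": camera_id,
        "assemblyLineId": assembly_line_id,
        "imageId": image_name,
    })
    return urllib.request.Request(
        f"{api_endpoint}?{query}",
        headers={"Authorization": auth_token},
        method="GET",
    )


def build_upload_request(presigned_url, image_bytes):
    """Build the PUT request that uploads the image to the presigned URL."""
    return urllib.request.Request(presigned_url, data=image_bytes, method="PUT")


def send(request, timeout=30):
    """Send a request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a token that the authorizer rejects, the first request fails with an HTTP error, which corresponds to the explicit deny shown earlier.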

Get notifications

After the email subscription has been confirmed for the SNS topic, Amazon SNS sends email messages to notify the appropriate team of detected anomalies.

The relevant stakeholders (such as operators or quality managers) can take appropriate actions based on the notifications. For example, they can choose to classify or grade the product, bin the product and ship, rework, scrap or recycle, investigate the process, or review the inference result and provide feedback to Amazon Lookout for Vision to retrain the model and improve inference results.

You can also use SNS topics to fan out messages to several subscriber systems, including Amazon Simple Queue Service (Amazon SQS), Lambda functions, HTTPS endpoints, and Kinesis Data Firehose.

The following screenshot shows an example of a defect detection alert.

The following screenshot shows an example of a low-confidence inference result email.

Monitor the model

You can use the Amazon Lookout for Vision dashboard to visualize the total images processed, anomalies detected, and anomaly ratio, as in the following screenshot.

Monitor the application

The solution also creates a CloudWatch dashboard to provide a single pane of glass to monitor the serverless application. Specifically, the dashboard provides metrics related to Amazon Lookout for Vision image processing and the metrics related to anomaly detection workflow. You can enhance the dashboard based on your requirements to add further alarms and widgets.

The following screenshots illustrate some of the widgets created as part of the solution.

Visualize insights using QuickSight

You can use QuickSight to build visualizations based on the inference results stored in Amazon S3. In this section, we walk you through setting up QuickSight as a first-time user or existing user, then creating your dataset and visuals.

First-time user

If you’re using QuickSight for the first time in your account, you’re asked to sign up for the service before being able to use it.

  1. Choose Sign up for QuickSight.
  2. Select your preferred edition (the Standard edition is sufficient for this post).
  3. Choose your preferred Region.
  4. Enter a unique QuickSight account name and an email address to receive the notifications.
  5. Select Amazon S3 so QuickSight can auto-discover S3 buckets.
  6. In the pop-up window, select the defects-results S3 bucket, then choose Finish.
  7. Choose Finish to create your QuickSight account.

You can skip the next section and proceed with creating your dataset.

Existing user

If you’re an existing QuickSight user, you need to grant QuickSight access to the destination S3 bucket.

  1. On the QuickSight console, on the user drop-down menu, choose Manage QuickSight.
  2. In the navigation pane, choose Security & Permissions.
  3. Under QuickSight access to AWS Services, choose Add or remove.
  4. Choose Select S3 buckets.
  5. Select the defects-results bucket.
  6. Choose Finish to grant QuickSight access to your S3 bucket.

Create your dataset

To get started creating visualizations, you first need to create a dataset. When you import data into a dataset, it becomes SPICE data because of how it’s stored. SPICE is the QuickSight Super-fast, Parallel, In-memory Calculation Engine. It’s engineered to rapidly perform advanced calculations and serve data.

To create a dataset from Amazon S3, you need a manifest that QuickSight can use to identify the files that you want to use and the upload settings needed to import them.
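
As an illustration of what such a manifest contains, the following sketch builds one for JSON files under an S3 prefix. The bucket and prefix names are placeholders, and this is not the custom resource's actual code; the `fileLocations`, `URIPrefixes`, and `globalUploadSettings` keys follow the QuickSight manifest file format for Amazon S3.

```python
import json


def build_manifest(bucket, prefix=""):
    """Build a QuickSight manifest that imports every JSON file under a prefix."""
    return {
        "fileLocations": [
            {"URIPrefixes": [f"s3://{bucket}/{prefix}"]}
        ],
        "globalUploadSettings": {"format": "JSON"},
    }


def manifest_json(bucket, prefix=""):
    return json.dumps(build_manifest(bucket, prefix), indent=2)
```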

In the solution that we deployed, the defects-results S3 bucket also stores a manifest file created as part of the stack provisioning using a Lambda-backed custom resource. You can use the S3 URI of this manifest file in QuickSight to import the data in the bucket. To create a dataset and specify the data in the defects-results S3 bucket as a data source, complete the following steps:

  1. On the QuickSight console, choose Datasets in the navigation pane.
  2. Choose New dataset.
  3. For Create a Dataset, choose S3.
  4. For Data source name, enter a name.
  5. For Upload a manifest file, select URL.
  6. Enter the URL of the JSON manifest file created as part of the stack provisioning.

You can locate the URL on the Outputs tab of the CloudFormation stack.

  7. Choose Connect.

Create visuals

A visual is a graphical representation of your data. You can create a wide variety of visuals in an analysis, using different datasets and visual types. The following screenshots show a few sample visuals. For more information, see Creating an Amazon QuickSight Visual.

The following bar charts show anomalous vs. non-anomalous results based on CameraID and AssemblyLineID.

The following charts show the composition of overall records based on CameraID and AssemblyLineID.

The following line charts demonstrate inference results over a period of time.

We can see the distribution of confidence results using calculated fields. For example, we calculate the confidence level field using the following formula:

ifelse({Confidence}>=0.90,"VERY HIGH",({Confidence}>0.70 AND {Confidence}<0.90),"HIGH",{Confidence}>=0.5 AND {Confidence}<=0.7,"MEDIUM",{Confidence}>=0.2 AND {Confidence}<0.5,"LOW","VERY LOW")
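
For reference, the same bucketing logic can be expressed in Python, which makes the boundary behavior easier to check. This sketch mirrors the formula above condition by condition; the function name is an illustrative choice.

```python
def confidence_level(confidence):
    """Map a confidence value to the same labels as the calculated field."""
    if confidence >= 0.90:
        return "VERY HIGH"
    if 0.70 < confidence < 0.90:
        return "HIGH"
    if 0.5 <= confidence <= 0.7:
        return "MEDIUM"
    if 0.2 <= confidence < 0.5:
        return "LOW"
    return "VERY LOW"
```

Note that a confidence of exactly 0.70 falls into MEDIUM, because the HIGH band uses a strict lower bound.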

The following charts show the confidence levels and the distribution of confidence across inference results.

Clean up

To clean up the resources provisioned as part of the solution, carry out the following steps:

  1. Make sure that the source and defects-results S3 buckets are empty. You can either empty the buckets via the Amazon S3 console or move the objects to another bucket.
  2. On the AWS CloudFormation console, choose the LookoutVisionApp project then right-click and select “Delete Stack”.

The stack takes time to delete; you can track its progress on the Events tab. When the stack deletion is complete, the status changes from DELETE_IN_PROGRESS to DELETE_COMPLETE. The stack then disappears from the list.

  3. Delete the management front-end stack: on the AWS CloudFormation console, choose the LookoutVisionDemo project then right-click and select “Delete Stack”.
  4. Delete the QuickSight dashboard, analysis, and dataset. See the following links for the steps:
    1. Deleting a Dashboard
    2. Deleting an Analysis
    3. Deleting a Dataset
Additional considerations

Amazon Lookout for Vision supports inferencing in the cloud and therefore you need to evaluate your network availability, bandwidth, and latency requirements accordingly. For inferencing at the edge, you can explore AWS Panorama and AWS IoT Greengrass.

Amazon Lookout for Vision provides a direct integration with Amazon SageMaker Ground Truth, which you can use to automate image labeling. For more information, see Automate Data Labeling and Creating a dataset using an Amazon SageMaker Ground Truth manifest file.

You can also set up human review workflows to inspect low-confidence results using Amazon Augmented AI (Amazon A2I).


In this post, we looked at how to use Amazon Lookout for Vision and combine it with other serverless services to automate defect detection for manufactured products, alert operators or quality managers in real time when a defect is detected, and generate visual insights for business users.

Amazon Lookout for Vision allows customers in the manufacturing domain to set up a low-cost solution for improving quality and reducing operational costs without any specialized ML expertise.

To learn more about Amazon Lookout for Vision, see Amazon Lookout for Vision Documentation. For pricing, refer to Amazon Lookout for Vision Pricing.

About the Authors

Mohsin Khan is a Solutions Architect at AWS, based in Manchester, UK. He is passionate about helping customers achieve success on their cloud journeys, enjoys designing solutions with serverless technologies and has a developing interest in machine learning and AI. Apart from work, he likes reading history and watching sports.



Amir Khairalomoum is a Solutions Architect at AWS, based in London, UK. He supports customers in their digital transformation and their cloud journey to AWS. He is passionate about serverless technologies. Outside of work, he loves reading, biking, and traveling.




Ibtehaj Ahmed is a Solutions Architect at AWS, based in London, UK. He loves to help early cloud adopters set up for success and utilize the right technologies. He is passionate about mobile development and purpose-built databases. Outside of work, he plays football regularly and enjoys participating in other sports.