Amazon SageMaker shadow testing

Validate the performance of new ML models against production models to prevent costly outages

Spot potential configuration errors before they impact end users by comparing new ML models against production models.

Improve inference performance by evaluating model changes, container updates, and new instance types with production traffic.

Skip the weeks of work needed to build your own testing infrastructure and release models to production faster.

How it works

SageMaker helps you run shadow tests to evaluate a new machine learning (ML) model before production release by testing its performance against the currently deployed model. Shadow testing can help you catch potential configuration errors and performance issues before they impact end users.
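Under the hood, a shadow test pairs the production variant with a shadow variant on the same endpoint. As a minimal sketch (the model names, instance type, and counts below are placeholder assumptions, not values from this page), the endpoint configuration payload might look like this; `ShadowProductionVariants` is the SageMaker API field that declares the shadow variant:

```python
def shadow_endpoint_config(prod_model, shadow_model):
    """Build a CreateEndpointConfig payload that adds a shadow variant."""
    return {
        "EndpointConfigName": f"{prod_model}-shadow-test",
        # The variant that continues to serve live responses to callers.
        "ProductionVariants": [{
            "VariantName": "production",
            "ModelName": prod_model,
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.large",
            "InitialVariantWeight": 1.0,
        }],
        # The shadow variant receives a copy of the traffic; its responses
        # are logged for comparison but never returned to callers.
        "ShadowProductionVariants": [{
            "VariantName": "shadow",
            "ModelName": shadow_model,
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.large",
            "InitialVariantWeight": 1.0,
        }],
    }

cfg = shadow_endpoint_config("my-prod-model", "my-shadow-model")
# A payload like this would be submitted with the SageMaker API, e.g.:
# boto3.client("sagemaker").create_endpoint_config(**cfg)
```

This is a sketch of the request shape only; in practice the endpoint config is created for you when you set up a shadow test through the SageMaker console.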

[Diagram: comparing the performance of a new ML model against the production model using Amazon SageMaker shadow testing]

Key features

Fully managed testing

With SageMaker shadow testing, you don’t need to invest in building your own testing infrastructure, so you can focus on model development. Just select the production model that you want to test against, and SageMaker automatically deploys the new model in a test environment. It then routes a copy of the inference requests received by the production model to the new model in real time and collects performance metrics such as latency and throughput.
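The managed flow described above maps to the SageMaker `CreateInferenceExperiment` API with a `ShadowMode` experiment type. A hedged sketch of that request, assuming placeholder endpoint and variant names and omitting some required fields (such as `RoleArn`) for brevity:

```python
from datetime import datetime, timedelta, timezone

def shadow_experiment_request(endpoint_name, prod_variant, shadow_variant):
    """Build a CreateInferenceExperiment payload for a shadow test."""
    start = datetime.now(timezone.utc)
    return {
        "Name": f"{endpoint_name}-shadow-test",
        "Type": "ShadowMode",
        "EndpointName": endpoint_name,
        # Run the shadow test for one week.
        "Schedule": {
            "StartTime": start,
            "EndTime": start + timedelta(days=7),
        },
        # Copy every production request to the shadow variant.
        "ShadowModeConfig": {
            "SourceModelVariantName": prod_variant,
            "ShadowModelVariants": [{
                "ShadowModelVariantName": shadow_variant,
                "SamplingPercentage": 100,
            }],
        },
    }

req = shadow_experiment_request("my-endpoint", "production", "shadow")
# Submit with: boto3.client("sagemaker").create_inference_experiment(**req)
```

SageMaker then handles deploying the shadow variant, mirroring the traffic, and collecting latency and throughput metrics for both variants.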

Live performance comparison dashboards

SageMaker creates a live dashboard that shows performance metrics such as latency and error rate of the new model and the production model in a side-by-side comparison. Once you have reviewed the test results and validated the model, you can promote it to production.
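The side-by-side view boils down to computing the same statistics for each variant over the collected measurements. A minimal illustration with made-up latency samples (the numbers are invented for this sketch, not real benchmark data):

```python
from statistics import median, quantiles

# Invented per-request latency samples for each variant, in milliseconds.
prod_latency_ms   = [21, 19, 23, 25, 20, 22, 24, 30, 19, 21]
shadow_latency_ms = [18, 17, 20, 22, 19, 18, 21, 26, 17, 18]

def summarize(samples):
    """Compute the percentile statistics a comparison dashboard would show."""
    return {
        "p50_ms": median(samples),
        "p90_ms": quantiles(samples, n=10)[-1],  # 90th percentile
    }

report = {
    "production": summarize(prod_latency_ms),
    "shadow": summarize(shadow_latency_ms),
}
```

In SageMaker the equivalent metrics (e.g. model latency and error rate per variant) are published to the live dashboard automatically; this sketch only shows the kind of aggregation behind a side-by-side comparison.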

Fine-grained traffic control

When running shadow tests in SageMaker, you can configure the percentage of inference requests that is copied to the shadow model. This control over the input traffic lets you start small and increase the sampled share only after you gain confidence in the new model's performance.


"Amazon SageMaker’s new testing capabilities allowed us to more rigorously and proactively test ML models in production and avoid adverse customer impact and any potential outages because of an error in deployed models. This is critical, since our customers rely on us to provide timely insights based on real-time location data that changes every minute.”

Giovanni Lanfranchi, Chief Product and Technology Officer, HERE Technologies


Blog Post

Perform Shadow Tests to Compare Inference Performance Between ML Model Variants

Developer Guide

Learn more about SageMaker support for shadow testing


AWS re:Invent 2022 - Minimizing the production impact of ML model updates with shadow testing (AIM343)