Posted On: Aug 23, 2023

We are excited to announce that customers can now update their Amazon SageMaker Endpoints using a rolling deployment strategy. Rolling deployments make it easier for you to update fully scaled endpoints that are deployed on hundreds of popular accelerated compute instances.

Amazon SageMaker makes it easy to deploy ML models to an endpoint and invoke it to make predictions (also known as inference) at the best price-performance for any use case. Previously, SageMaker supported only blue-green deployments for updating endpoints with new models. A blue-green deployment provisions a new fleet of instances with the updated model before shifting traffic from the old fleet to the new one, so updating an endpoint temporarily required twice the number of instances the endpoint normally uses. With rolling deployments, instances in the old fleet are cleaned up after each traffic shift to the new fleet, reducing the number of additional instances needed to update your endpoint. This new update strategy is part of deployment guardrails, which let you control the size of the traffic-shifting steps and specify an evaluation period to monitor the new instances for issues before terminating instances from the old fleet.
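To make the capacity difference concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that is not spelled out in this announcement: a rolling update keeps at most one extra batch of new instances alive alongside the existing fleet at any time. The function names are hypothetical and exist only for illustration.

```python
import math


def peak_instances_blue_green(fleet_size: int) -> int:
    """Blue-green: a full new fleet runs next to the old one before traffic shifts."""
    return 2 * fleet_size


def peak_instances_rolling(fleet_size: int, batch_size: int) -> int:
    """Rolling (simplified assumption): only one extra batch exists at a time."""
    if not 1 <= batch_size <= fleet_size:
        raise ValueError("batch_size must be between 1 and fleet_size")
    return fleet_size + batch_size


def rolling_steps(fleet_size: int, batch_size: int) -> int:
    """Number of traffic-shifting steps needed to replace the whole fleet."""
    if not 1 <= batch_size <= fleet_size:
        raise ValueError("batch_size must be between 1 and fleet_size")
    return math.ceil(fleet_size / batch_size)


if __name__ == "__main__":
    fleet, batch = 100, 10
    print("blue-green peak:", peak_instances_blue_green(fleet))
    print("rolling peak:", peak_instances_rolling(fleet, batch))
    print("rolling steps:", rolling_steps(fleet, batch))
```

Under these assumptions, a smaller batch lowers the peak instance count but increases the number of steps, and therefore the total update time once the evaluation period for each step is included.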

This feature is available through our APIs, SDKs, and CloudFormation in all commercial regions where Amazon SageMaker is available.
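As an illustration of the SDK route, the following is a minimal sketch using the AWS SDK for Python (boto3) and the `DeploymentConfig` parameter of `update_endpoint`. The helper names, the endpoint and alarm names, and all numeric values are placeholders chosen for this example, not recommendations; check the API reference for the allowed ranges before using them.

```python
def build_rolling_deployment_config(batch_percent, wait_seconds,
                                    alarm_names=(), timeout_seconds=3600):
    """Build a DeploymentConfig dict for a rolling update (hypothetical helper).

    batch_percent: share of fleet capacity updated per traffic-shifting step.
    wait_seconds:  evaluation period before old instances in a batch are removed.
    alarm_names:   CloudWatch alarms that should trigger an automatic rollback.
    """
    config = {
        "RollingUpdatePolicy": {
            "MaximumBatchSize": {"Type": "CAPACITY_PERCENT", "Value": batch_percent},
            "WaitIntervalInSeconds": wait_seconds,
            "MaximumExecutionTimeoutInSeconds": timeout_seconds,
        }
    }
    if alarm_names:
        # Only attach a rollback configuration when alarms are actually supplied.
        config["AutoRollbackConfiguration"] = {
            "Alarms": [{"AlarmName": name} for name in alarm_names]
        }
    return config


def update_endpoint_rolling(endpoint_name, endpoint_config_name, deployment_config):
    """Start the update; requires boto3 plus AWS credentials and a default region."""
    import boto3  # imported lazily so the config builder works without the SDK

    client = boto3.client("sagemaker")
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
        DeploymentConfig=deployment_config,
    )


if __name__ == "__main__":
    # Placeholder values: 10% batches, a 5-minute evaluation period, one alarm.
    cfg = build_rolling_deployment_config(10, 300, ["my-endpoint-5xx-alarm"])
    print(cfg)
    # update_endpoint_rolling("my-endpoint", "my-new-endpoint-config", cfg)
```

The call to `update_endpoint_rolling` is left commented out because it modifies a live endpoint; the endpoint configuration it refers to is assumed to exist already.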

To learn more about rolling deployments, including how to set them up, please see our documentation. To learn about different endpoint update strategies, see our updating models in production documentation.