Posted On: Jun 23, 2023

Amazon SageMaker Inference Recommender is a capability of Amazon SageMaker that reduces the time required to get machine learning (ML) models into production by automating load testing and model tuning across SageMaker ML instances. Today, SageMaker Inference Recommender is announcing two key features. First, you can now use Inference Recommender from the AWS console for SageMaker. Second, Inference Recommender now offers recommendations on prospective instances for deploying a model at the time of model creation.

Customers can now view the prospective list of instances for deploying their model as part of the model creation workflow. To tailor the recommendations provided at model creation time for optimal cost or performance, users can run benchmarking or load testing jobs with their own custom sample input payload. Users can view the list of recommended instances either programmatically, using the DescribeModel API, or via the SageMaker console UI.
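As a minimal sketch of the programmatic path, the snippet below parses the deployment recommendation section of a DescribeModel-style response. The sample response dict, model name, and instance types are illustrative placeholders, not real API output; with boto3, the response would come from `sagemaker.describe_model(ModelName=...)`.

```python
# Sketch: reading instance recommendations surfaced at model creation time
# from a DescribeModel-style response. The sample below is illustrative only.

def extract_recommended_instances(describe_model_response):
    """Pull recommended instance types from a DescribeModel-style response."""
    recommendation = describe_model_response.get("DeploymentRecommendation", {})
    # Recommendations are only usable once the status is COMPLETED.
    if recommendation.get("RecommendationStatus") != "COMPLETED":
        return []
    return [
        item["InstanceType"]
        for item in recommendation.get("RealTimeInferenceRecommendations", [])
    ]

# With boto3 (not run here), the response would come from:
#   import boto3
#   response = boto3.client("sagemaker").describe_model(ModelName="my-model")

# Illustrative response shape; "my-model" and the instance types are placeholders.
sample_response = {
    "ModelName": "my-model",
    "DeploymentRecommendation": {
        "RecommendationStatus": "COMPLETED",
        "RealTimeInferenceRecommendations": [
            {"RecommendationId": "my-model/abc123", "InstanceType": "ml.c5.xlarge"},
            {"RecommendationId": "my-model/def456", "InstanceType": "ml.m5.large"},
        ],
    },
}

print(extract_recommended_instances(sample_response))
```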

Additionally, customers can now access SageMaker Inference Recommender in the AWS console. Previously, customers could only run Inference Recommender jobs through the AWS SDK, AWS CLI, or SageMaker Studio. Customers who preferred the AWS console had to navigate between the SDK, Studio, and the console to get recommendations, and customers exclusively using the AWS console couldn’t benefit at all. With this launch, AWS console users can run Inference Recommender jobs in the console to get prospective instance types, and can run benchmarking jobs to get recommendations optimized for cost and performance.
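For users who still prefer the SDK path, the same jobs the console runs can be started with the CreateInferenceRecommendationsJob API. The sketch below only assembles the request body; the job name, role ARN, and model name are placeholders, and the actual boto3 call is shown in a comment rather than executed.

```python
# Sketch: assembling a CreateInferenceRecommendationsJob request. All names
# (job name, role ARN, model name) are placeholders.

def build_recommendation_job_request(job_name, role_arn, model_name,
                                     job_type="Default"):
    """Assemble a CreateInferenceRecommendationsJob request body.

    JobType "Default" returns prospective instance types quickly, while
    "Advanced" runs a fuller load test against a custom sample payload.
    """
    return {
        "JobName": job_name,
        "JobType": job_type,
        "RoleArn": role_arn,
        "InputConfig": {"ModelName": model_name},
    }

request = build_recommendation_job_request(
    job_name="demo-recommender-job",                          # placeholder
    role_arn="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder
    model_name="my-model",                                    # placeholder
)

# With boto3 (not run here):
#   import boto3
#   boto3.client("sagemaker").create_inference_recommendations_job(**request)
print(request["JobType"])
```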

To learn more about these launches, please visit the documentation here and here. To get started, log in to the Amazon SageMaker console.