Customize your Amazon SageMaker model deployment software and driver versions

Posted on: Sep 25, 2024

You can now choose the software and driver versions that best fit your needs for the instances that host your models on SageMaker. Amazon SageMaker makes it easier to deploy ML models, including foundation models (FMs), and to serve inference requests at the best price performance for any use case.

Previously, customers had to use the preset software and driver versions that SageMaker defined for the managed instances behind an endpoint. Now you can specify the “InferenceAmiVersion” parameter when configuring an endpoint to select the combination of software and driver versions (such as the NVIDIA driver and CUDA versions) that best meets your requirements. This allows you to tailor your hosting environment to the performance, compatibility, scalability, and operational requirements of your ML applications. By using this parameter, you can also upgrade or downgrade driver versions for your endpoints on your own schedule.
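
As a minimal sketch of how this might look with the AWS SDK for Python (boto3), the example below builds a CreateEndpointConfig request that sets “InferenceAmiVersion” on a production variant. The endpoint configuration name, model name, variant name, and instance type are hypothetical placeholders, and the AMI version string is assumed to be one of the values listed in the SageMaker API reference; check the documentation for the values currently supported in your Region. The helper functions are illustrative and not part of any SDK.

```python
def build_endpoint_config_request(config_name, model_name, instance_type,
                                  inference_ami_version, instance_count=1):
    """Build keyword arguments for SageMaker's CreateEndpointConfig API.

    InferenceAmiVersion is set per production variant and selects the
    software and driver combination used by the hosting instances.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # hypothetical variant name
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
                "InferenceAmiVersion": inference_ami_version,
            }
        ],
    }


def create_endpoint_config(request):
    """Send the request with boto3 (requires AWS credentials and a Region)."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("sagemaker")
    return client.create_endpoint_config(**request)


# All values below are placeholders; the AMI version string is an assumption
# to verify against the current API reference before use.
request = build_endpoint_config_request(
    config_name="my-endpoint-config",
    model_name="my-model",
    instance_type="ml.g5.xlarge",
    inference_ami_version="al2-ami-sagemaker-inference-gpu-2",
)
```

Passing `request` to `create_endpoint_config` would create the configuration. To change driver versions later, one approach is to create a new endpoint configuration with a different “InferenceAmiVersion” value and update the endpoint to use it.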

This feature is available in all AWS Regions where SageMaker is available. You can learn more about deploying models on SageMaker here and more about this feature in our documentation.