Posted On: Mar 4, 2022

Amazon SageMaker Serverless Inference and Asynchronous Inference now support the Amazon SageMaker Python SDK, which abstracts the steps required for deployment and thereby simplifies the model deployment workflow. The SageMaker Python SDK is an open-source library for deploying machine learning models on Amazon SageMaker. You can deploy models built with any of the optimized machine learning frameworks, use SageMaker-supported first-party algorithms, or bring your own model, all through the Python SDK.

SageMaker offers multiple inference options, such as Real-Time Inference, Serverless Inference (in preview), Asynchronous Inference, and Batch Transform, so you can pick the option that best suits your workload. The SageMaker Python SDK already supports Real-Time Inference and Batch Transform. With the addition of Serverless Inference (in preview) and Asynchronous Inference, you can now use the same Python SDK API methods across all inference options. For model deployment, you can choose among the AWS Management Console, the AWS SDK for Python (Boto3), the AWS CLI, and the SageMaker Python SDK.
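As an illustration of this unified flow, a serverless endpoint can be deployed through the Python SDK much like a real-time one, by passing a `ServerlessInferenceConfig` to `deploy()`. This is a minimal sketch: the model artifact path, IAM role, framework versions, and capacity settings below are hypothetical placeholders, not values from this announcement.

```python
from sagemaker.pytorch import PyTorchModel  # other framework Model classes work the same way
from sagemaker.serverless import ServerlessInferenceConfig

# Hypothetical model artifact and IAM role; replace with your own.
model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    framework_version="1.10",
    py_version="py38",
    entry_point="inference.py",
)

# Serverless capacity settings: memory per worker and maximum concurrent invocations.
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=2048,
    max_concurrency=5,
)

# Passing serverless_inference_config (instead of instance_type and
# initial_instance_count) creates a serverless endpoint.
predictor = model.deploy(serverless_inference_config=serverless_config)
```

The same `deploy()` call accepts an `AsyncInferenceConfig` for asynchronous endpoints, which is what lets one set of SDK methods cover all the inference options.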

You can invoke an asynchronous inference endpoint through the Python SDK by passing the payload inline with the request. The SageMaker Python SDK uploads the payload to your S3 bucket and invokes the endpoint on your behalf. The Python SDK can also periodically check for the inference result and return it upon completion.
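The invocation flow described above can be sketched with the SDK's `AsyncPredictor`; the endpoint name and payload here are hypothetical, and the example assumes an asynchronous endpoint has already been deployed.

```python
from sagemaker.predictor import Predictor
from sagemaker.predictor_async import AsyncPredictor
from sagemaker.async_inference.waiter_config import WaiterConfig

# Hypothetical endpoint name; assumes an asynchronous inference endpoint exists.
async_predictor = AsyncPredictor(Predictor(endpoint_name="my-async-endpoint"))

# Passing the payload inline: the SDK uploads it to your S3 bucket
# and invokes the endpoint on your behalf.
response = async_predictor.predict_async(data=b'{"inputs": [1, 2, 3]}')

# Periodically poll for the inference result until it is available
# (here, up to 60 attempts, 15 seconds apart).
result = response.get_result(WaiterConfig(max_attempts=60, delay=15))
```

For large payloads already in S3, `predict_async` also accepts an `input_path` pointing at the object, skipping the upload step.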

To get started, read the Python SDK documentation for Serverless Inference and Asynchronous Inference.