Posted On: Oct 17, 2023

We are pleased to announce the preview of ml.p4d, ml.trn1, and ml.g5 instances in new regions for asynchronous and real-time inference of machine learning (ML) models on Amazon SageMaker. These instances are already generally available for inference in other regions.

  • ml.p4d.24xlarge instances, now available as a preview in the AWS GovCloud (US-West), Europe (Ireland), Asia Pacific (Tokyo), and Asia Pacific (Singapore) regions, deliver high performance for deep learning models. With 40 GB of memory per NVIDIA A100 GPU, P4d instances enable high-performance machine learning inference on large models and generative AI use cases.
  • ml.trn1 instances, now available as a preview in US West (Oregon), offer support for high-performance inference workloads on 100B+ parameter deep learning and generative AI models, spanning applications such as text summarization, code generation, and question answering.
  • ml.g5 instances, now available as a preview in Asia Pacific (Seoul) and South America (São Paulo), are ideal for use cases like recommendations, chatbots, smart assistants, and image recognition.

To access these previews, request limit increases through AWS Service Quotas. For pricing information on these instances, please visit our pricing page. For more information on deploying models with SageMaker, see the overview here and the documentation here. To learn more about the instances in preview, see the G5 product page, Trn1 product page, and P4 product page.
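
As a rough illustration of how the regional availability above might be checked before deploying, the following Python sketch maps each instance family to the preview regions named in this announcement and builds the arguments for an endpoint configuration. The region codes are the standard AWS codes for the regions listed; the helper names (`instance_family`, `is_preview_region`, `build_endpoint_config`) and the example model and config names are hypothetical and not part of any AWS SDK. The returned dictionary is intended to be passed to the boto3 SageMaker client's `create_endpoint_config` call once the quota increase has been granted.

```python
# Preview regions as listed in this announcement, keyed by instance family.
# These instances are also generally available in other regions, so a False
# result below only means "not one of the newly announced preview regions".
PREVIEW_REGIONS = {
    "ml.p4d": {"us-gov-west-1", "eu-west-1", "ap-northeast-1", "ap-southeast-1"},
    "ml.trn1": {"us-west-2"},
    "ml.g5": {"ap-northeast-2", "sa-east-1"},
}


def instance_family(instance_type: str) -> str:
    """Return the family of an instance type, e.g. 'ml.g5' for 'ml.g5.xlarge'."""
    return instance_type.rsplit(".", 1)[0]


def is_preview_region(instance_type: str, region: str) -> bool:
    """True if the announcement lists `region` as a preview region for this family."""
    return region in PREVIEW_REGIONS.get(instance_family(instance_type), set())


def build_endpoint_config(config_name: str, model_name: str,
                          instance_type: str, instance_count: int = 1) -> dict:
    """Build keyword arguments for the SageMaker create_endpoint_config API.

    The model named by `model_name` is assumed to exist already in SageMaker.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    instance_type, region = "ml.p4d.24xlarge", "eu-west-1"
    if is_preview_region(instance_type, region):
        kwargs = build_endpoint_config("demo-config", "demo-model", instance_type)
        print(kwargs)
        # With credentials and an approved quota, this could then be sent with:
        #   boto3.client("sagemaker", region_name=region).create_endpoint_config(**kwargs)
```

Keeping the region check separate from the request builder means the lookup table can be updated as availability changes without touching the deployment code.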