Posted On: Aug 24, 2023

Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instances are now generally available in the US East (Ohio) Region. Trn1 instances deliver high-performance training of popular generative AI models on AWS, while offering up to 50% lower cost-to-train than comparable Amazon EC2 instances.

You can use EC2 Trn1 instances to train popular large language models such as GPT and LLaMA, vision models such as Stable Diffusion, and a variety of other deep learning models for recommendation, fraud detection, forecasting, and more. Trn1 instances are enabled by the AWS Neuron SDK, which is integrated with leading ML frameworks such as PyTorch and TensorFlow, and libraries such as Megatron-LM, NeMo, Neuron Distributed, and Hugging Face, so you can continue using your existing frameworks and run your application with minimal code changes. Developers can run deep learning training workloads on Trn1 instances using AWS Deep Learning AMIs, AWS Deep Learning Containers, or managed services such as AWS ParallelCluster, Amazon Elastic Kubernetes Service (Amazon EKS), Amazon SageMaker, and AWS Batch.

Amazon EC2 Trn1 instances are available in two sizes: trn1.2xlarge, for experimenting with a single accelerator and training small models cost-effectively, and trn1.32xlarge, for training large-scale models. They are available in the following AWS Regions as On-Demand Instances, Reserved Instances, and Spot Instances, or as part of a Savings Plan: US East (N. Virginia), US West (Oregon), and US East (Ohio).
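
To check which Trn1 sizes a given Region offers to your account, one option is the EC2 DescribeInstanceTypeOfferings API. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are configured. The Region codes (us-east-1, us-west-2, us-east-2) correspond to the Region names listed above, and the helper names `build_offering_request` and `offered_sizes` are hypothetical, introduced here for illustration only.

```python
from typing import Dict, List

# Sizes and Regions named in this announcement.
TRN1_SIZES = ["trn1.2xlarge", "trn1.32xlarge"]
TRN1_REGIONS = {
    "us-east-1": "US East (N. Virginia)",
    "us-west-2": "US West (Oregon)",
    "us-east-2": "US East (Ohio)",
}


def build_offering_request(sizes: List[str]) -> Dict:
    """Build keyword arguments for EC2 DescribeInstanceTypeOfferings."""
    return {
        "LocationType": "region",
        "Filters": [{"Name": "instance-type", "Values": list(sizes)}],
    }


def offered_sizes(region: str, sizes: List[str] = TRN1_SIZES) -> List[str]:
    """Return which of the requested sizes the Region currently offers.

    Assumption: boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instance_type_offerings(
        **build_offering_request(sizes)
    )
    return sorted(o["InstanceType"] for o in response["InstanceTypeOfferings"])


if __name__ == "__main__":
    # Makes live AWS API calls, one per Region.
    for code, name in TRN1_REGIONS.items():
        print(name, offered_sizes(code))
```

The request builder is kept separate from the API call so the filter can be inspected without network access. An offering returned by this call indicates that the instance type exists in the Region; it does not guarantee capacity or account quota at launch time.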

To learn more about Trn1 instances, see the Amazon EC2 Trn1 Instances webpage and the AWS Neuron Documentation.