AWS Trainium is the second custom machine learning (ML) chip designed by AWS, offering the best price performance for training ML models in the cloud. In addition to delivering the most cost-effective ML training, Trainium offers the highest performance, with the most teraflops (TFLOPS) of compute power for ML in the cloud, and enables a broader set of ML applications. The Trainium chip is optimized for deep learning training workloads in applications including image classification, semantic search, translation, voice recognition, natural language processing, and recommendation engines.
As the use of ML accelerates, there is a pressing need to improve performance and reduce the infrastructure costs of both training and inference. Last year, AWS launched AWS Inferentia, a custom chip that provides customers with high-performance ML inference at the lowest cost in the cloud. While Inferentia addressed the cost of inference, which constitutes up to 90% of ML infrastructure costs, many development teams are also limited by fixed ML training budgets. This caps the scope and frequency of the training needed to improve their models and applications. AWS Trainium addresses this challenge by providing the highest performance and lowest cost for ML training in the cloud. With both Trainium and Inferentia, customers will have an end-to-end flow of ML compute, from scaling training workloads to deploying accelerated inference.
AWS Trainium shares the same AWS Neuron SDK as AWS Inferentia, making it easy for developers already using Inferentia to get started with Trainium. Because the Neuron SDK is integrated with popular ML frameworks, including TensorFlow, PyTorch, and MXNet, developers can easily migrate to AWS Trainium from GPU-based instances with minimal code changes. AWS Trainium will be available via Amazon EC2 instances and AWS Deep Learning AMIs, as well as through managed services including Amazon SageMaker, Amazon ECS, Amazon EKS, and AWS Batch.
Sign up for early access
AWS Trainium will be available in 2021. To be notified about early access to AWS Trainium, sign up here, and we’ll contact you when more information becomes available.
AWS Inferentia is an ML inference chip custom-built by AWS to deliver high-performance, lowest-cost machine learning inference in the cloud. AWS Inferentia enables up to 30% higher throughput and up to 45% lower cost per inference than Amazon EC2 G4 instances, which were already the lowest-cost instances for ML inference in the cloud.