AWS Trainium Customers
See how customers are using AWS Trainium to build, train, and fine-tune deep learning models.
Anthropic
At Anthropic, millions of people rely on Claude daily for their work. We're announcing two major advances with AWS: first, a new "latency-optimized mode" for Claude 3.5 Haiku, which runs 60% faster on Trainium2 via Amazon Bedrock; and second, Project Rainier, a new cluster with hundreds of thousands of Trainium2 chips delivering hundreds of exaflops, over five times the size of our previous cluster. Project Rainier will help power both our research and our next generation of scaling. For our customers, this means more intelligence, lower prices, and faster speeds. We're not just building faster AI, we're building trustworthy AI that scales.

Databricks
Databricks’ Mosaic AI enables organizations to build and deploy quality Agent Systems. It is built natively on top of the data lakehouse, enabling customers to easily and securely customize their models with enterprise data and deliver more accurate and domain-specific outputs. Thanks to Trainium's high performance and cost-effectiveness, customers can scale model training on Mosaic AI at a low cost. Trainium2’s availability will be a major benefit to Databricks and its customers as demand for Mosaic AI continues to scale across all customer segments and around the world. Databricks, one of the largest data and AI companies in the world, plans to use Trn2 to deliver better results and lower TCO by up to 30% for its customers.

poolside
At poolside, we are setting out to build a world where AI drives the majority of economically valuable work and scientific progress. We believe that software development will be the first major capability in neural networks to reach human-level intelligence, because it's the domain where we can best combine Search and Learning approaches. To enable that, we're building foundation models, an API, and an Assistant to bring the power of generative AI to your developers' hands (or keyboards). A major key to enabling this technology is the infrastructure we use to build and run our products. With AWS Trainium2, our customers will be able to scale their usage of poolside at a price-performance ratio unmatched by other AI accelerators. In addition, we plan to train future models with Trainium2 UltraServers, with expected savings of 40% compared to EC2 P5 instances.

Itaú Unibanco
We have tested AWS Trainium and Inferentia across various tasks, ranging from standard inference to fine-tuned applications. The performance of these AI chips has enabled us to achieve significant milestones in our research and development. For both batch and online inference tasks, we have seen a 7x improvement in throughput compared to GPUs. This enhanced performance is driving the expansion of more use cases across the organization. The latest generation of Trainium2 chips unlocks groundbreaking features for GenAI and opens the door for innovation at Itaú.

NinjaTech AI
We are extremely excited for the launch of AWS Trn2 because we believe it will offer the best cost-per-token performance and the fastest speeds currently possible for our core model, Ninja LLM, which is based on Llama 3.1 405B. It's amazing to see Trn2's low latency coupled with competitive pricing and on-demand availability; we couldn't be more excited about Trn2's arrival!

Ricoh
The migration to Trn1 instances was easy and straightforward. We were able to pretrain our 13B-parameter LLM in just 8 days, utilizing a cluster of 4,096 Trainium chips! After the success we saw with our smaller model, we fine-tuned a new, larger LLM based on Llama-3-Swallow-70B, and by leveraging Trainium we were able to reduce our training costs by 50% and improve energy efficiency by 25% compared to the latest GPU machines on AWS. We are excited to leverage the latest generation of AWS AI chips, Trainium2, to continue providing our customers with the best performance at the lowest cost.

PyTorch
What I liked most about the AWS Neuron NxD Inference library is how seamlessly it integrates with PyTorch models. NxD's approach is straightforward and user-friendly. Our team was able to onboard Hugging Face PyTorch models with minimal code changes in a short time frame. Enabling advanced features like continuous batching and speculative decoding was straightforward. This ease of use enhances developer productivity, allowing teams to focus more on innovation and less on integration challenges.
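To give a flavor of the "minimal code changes" workflow described above: the testimonial refers to AWS's NxD Inference library, but a loosely analogous entry point for running Hugging Face PyTorch models on Trainium/Inferentia is the separate optimum-neuron package. The sketch below uses that package; the model ID, batch size, sequence length, and core count are illustrative assumptions, not values from the testimonial.

    # Hypothetical sketch: compiling and running a Hugging Face causal LM on
    # AWS Neuron devices with the optimum-neuron package (a simpler entry
    # point than the NxD Inference library named in the testimonial).
    from optimum.neuron import NeuronModelForCausalLM
    from transformers import AutoTokenizer

    model_id = "meta-llama/Meta-Llama-3-8B"  # illustrative model choice

    # export=True compiles the PyTorch model for Neuron cores; input shapes
    # (batch_size, sequence_length) are fixed at compile time.
    model = NeuronModelForCausalLM.from_pretrained(
        model_id,
        export=True,
        batch_size=1,
        sequence_length=2048,
        num_cores=2,            # tensor-parallel degree across Neuron cores
        auto_cast_type="bf16",  # compile-time precision choice
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    inputs = tokenizer("Trainium makes it easy to", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

After compilation, the model object exposes the familiar generate() interface, which is the sense in which existing PyTorch/Hugging Face code carries over with few changes.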

Refact.ai
Customers have seen up to 20% higher performance and 1.5x more tokens per dollar with EC2 Inf2 instances compared to EC2 G5 instances. Refact.ai’s fine-tuning capabilities further enhance our customers’ ability to understand and adapt to their organizations’ unique codebases and environments. We are also excited to offer the capabilities of Trainium2, which will bring even faster, more efficient processing to our workflows. This advanced technology will enable our customers to accelerate their software development process by boosting developer productivity while maintaining strict security standards for their codebase.

Karakuri Inc.
KARAKURI builds AI tools to improve the efficiency of web-based customer support and simplify customer experiences. These tools include AI chatbots equipped with generative AI functions, FAQ centralization tools, and an email response tool, all of which improve the efficiency and quality of customer support. Utilizing AWS Trainium, we succeeded in training KARAKURI LM 8x7B Chat v0.1. As a startup, we need to optimize the time and cost required to build and train LLMs. With the support of AWS Trainium and the AWS team, we were able to develop a practical-level LLM in a short period of time. Also, by adopting AWS Inferentia, we were able to build a fast and cost-effective inference service. We're energized about Trainium2 because it will revolutionize our training process, reducing our training time by 2x and driving efficiency to new heights!

Stockmark Inc.
With the mission of “reinventing the mechanism of value creation and advancing humanity,” Stockmark helps many companies create and build innovative businesses by providing cutting-edge natural language processing technology. Stockmark’s new offerings, Anews, a data analysis and gathering service, and SAT, a data structuring service that dramatically improves generative AI use by organizing all forms of information stored in an organization, required us to rethink how we built and deployed models to support these products. With 256 Trainium accelerators, we developed and released stockmark-13b, a large language model with 13 billion parameters, pre-trained from scratch on a Japanese corpus of 220B tokens. Trn1 instances helped us reduce our training costs by 20%. Leveraging Trainium, we successfully developed an LLM that can answer business-critical questions for professionals with unprecedented accuracy and speed. This achievement is particularly noteworthy given the widespread challenge companies face in securing adequate computational resources for model development. With the impressive speed and cost reduction of Trn1 instances, we are excited to see the additional benefits that Trainium2 will bring to our workflows and customers.
