
2023

AI21 Labs Accelerates Generative AI Model Adoption Using Amazon SageMaker

Learn how AI21 Labs, a leader in generative AI and large language models, rapidly pretrained and released a 17-billion-parameter model using Amazon SageMaker.

Less than 2 months

from project initiation to completion

Pretrained a generative model

with 17 billion parameters efficiently
 

Saved engineers time

to focus on core tasks rather than on infrastructure setup

Two-thirds of traffic

served by the Grande model within months of release

Achieved low-latency inference

that improves customers’ user satisfaction

Overview

AI21 Labs (AI21), a leader in generative artificial intelligence (AI) and large language models (LLMs), aims to empower businesses to build generative AI solutions with state-of-the-art LLMs and AI applications. Initially, AI21 released two models: one with 7 billion parameters and another with 178 billion parameters. However, the company saw an opportunity to offer customers a midsize model of 17 billion parameters that bridged the gap between the existing sizes. The new pretrained language model would preserve text-generation quality nearly on par with the largest model at a much lower inference cost for AI21 and its customers.

To build that model efficiently, AI21 looked to Amazon Web Services (AWS) and trained the foundation model in under 20 days using Amazon SageMaker, which builds, trains, and deploys machine learning (ML) models for nearly any use case with fully managed infrastructure, tools, and workflows.

Opportunity | Using Amazon SageMaker to Pretrain an LLM with 17 Billion Parameters Efficiently for AI21

Founded in 2017, AI21 offers businesses access to its proprietary language models with AI21 Studio, which over 30,000 developers use to build their own generative AI applications. The company also offers its AI-powered writing-and-reading assistant, Wordtune, which helps tens of millions of users worldwide engage with written language.

In August 2021, AI21 released its Jurassic-1 language model in two sizes: the Large model is fast and cost effective with 7.5 billion parameters, and the Jumbo model offers higher-quality text output at a higher cost with 178 billion parameters. Although bigger models offer the highest quality, they can be costly to run at scale and are less nimble to operate. To help its customers optimize the trade-off between cost and quality when operating at scale, AI21 pretrained and released its third model, Grande, with 17 billion parameters using Amazon SageMaker in December 2022.

AI21 rapidly completed the project in under 2 months from initiation, taking less than 20 days to pretrain the model. Because LLMs are huge neural networks with billions of parameters, training is a challenging and time-consuming project, requiring massive compute resources. Using Amazon SageMaker, AI21 experienced a simpler and more efficient model training process, and the company could scale the distributed training jobs across as many GPUs as needed. “The solutions architects at AWS were responsive and interactive, and we were able to troubleshoot and get the project done on time,” says Dan Padnos, vice president of Platform at AI21.

The company already had experience using AWS and chose Amazon SageMaker because it is cost effective, simple to use, and fully managed. AI21 could also continue using its existing training software stack and get up and running quickly, which was important while the company was building its business. To pretrain the Grande model in less than 20 days, AI21 needed 256 NVIDIA A100 GPUs spread across 32 instances. Large-scale training required a tool that could orchestrate the allocation of the nodes, make logging available in a central location, and reduce manual oversight. “When you’re running a distributed training job of this scale, all sorts of technical challenges that might seem trivial or mundane can become a headache,” says Padnos. “Amazon SageMaker has features that you can use to manage that complexity and reduce the amount of effort that your team needs to invest in the details.” For example, Amazon SageMaker features such as instance health checks and centralized logging helped the team work more efficiently.
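The scale described above, 256 A100 GPUs across 32 instances, corresponds to 32 ml.p4d.24xlarge instances with 8 GPUs each. As a rough, hypothetical sketch (not AI21's actual code), a distributed training job of this shape could be launched with the SageMaker Python SDK; the entry script, IAM role, and data path below are placeholders:

```python
# Hypothetical sketch of a large distributed training job on Amazon SageMaker.
# The instance counts mirror the setup described in the article:
# 32 ml.p4d.24xlarge instances x 8 NVIDIA A100 GPUs each = 256 GPUs.

INSTANCE_COUNT = 32
GPUS_PER_INSTANCE = 8  # each ml.p4d.24xlarge carries 8 A100 GPUs
TOTAL_GPUS = INSTANCE_COUNT * GPUS_PER_INSTANCE  # 256


def launch_pretraining_job(role_arn: str, data_s3_uri: str):
    """Sketch of a SageMaker launch; needs AWS credentials to actually run."""
    from sagemaker.pytorch import PyTorch  # pip install sagemaker

    estimator = PyTorch(
        entry_point="pretrain.py",        # placeholder training script
        role=role_arn,                    # an IAM role with SageMaker access
        framework_version="1.13",
        py_version="py39",
        instance_count=INSTANCE_COUNT,
        instance_type="ml.p4d.24xlarge",
        # SageMaker orchestrates the distributed run across all nodes
        distribution={"torch_distributed": {"enabled": True}},
    )
    estimator.fit({"train": data_s3_uri})  # e.g. an S3 URI to training data
    return estimator
```

SageMaker handles node allocation, health checks, and centralized logging for a job launched this way, which is the complexity reduction Padnos describes.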


“Because Amazon SageMaker handles node failures, restarts elegantly, and orchestrates large distributed runs, the team working on pretraining the model can focus on core tasks.”

Dan Padnos
Vice President of Platform, AI21 Labs

Solution | Reducing Latency and Facilitating Growth with a Model Pretrained Using Amazon SageMaker

Using Amazon SageMaker, AI21 released the new model quickly. The company estimates saving several weeks of time compared with its previous training methods. “Because Amazon SageMaker handles node failures, restarts elegantly, and orchestrates large distributed runs, the team working on pretraining the model can focus on core tasks,” says Padnos. “Instead of addressing technical challenges, they can assess how the model is performing and how training is progressing.”

The accelerated timeline was important because the capabilities of the Grande model better meet the needs of the majority of AI21’s customers. Customers with consumer-facing use cases, such as automatic email drafting, valued migrating from the Jumbo model to the Grande model because operating at their scale requires cost efficiency. Only a few months after the Grande model’s introduction, it accounted for roughly two-thirds of the company’s traffic. “We’ve seen rapid adoption and are very pleased with the result,” says Padnos. “Our experience using Amazon SageMaker was very positive. We achieved the outcome we were hoping for—on time, on budget, and without unexpected challenges.”

A key consideration for generative AI applications is low inference latency because the user experience needs to be smooth. When users draft content using a tool like Wordtune, they want the AI to serve as a quick reference without slowing down their thought process. Using Amazon SageMaker, AI21 achieved low inference latency with the Grande model to best meet customer needs, reducing latency fourfold for one of its large clients. As a result, AI21’s customers can serve millions of users on a daily basis in near real time without detracting from the user experience. “One of our large-scale clients has seen a significant improvement in user satisfaction metrics, which it attributes to the massive reduction in latency when migrating from the Jumbo model to the Grande model,” says Padnos.

The release of the Grande model has also contributed to growth for both AI21 and its customers. “After releasing the Grande model, which was trained using Amazon SageMaker, we’ve seen growth to our overall traffic,” says Padnos. “Individual clients who have migrated to the Grande model have also grown their traffic.”

Outcome | Building the Next Generation of LLMs Using Amazon SageMaker

The Grande model (now called Mid) is available on Amazon SageMaker JumpStart, an ML hub with built-in algorithms, foundation models, and prebuilt ML solutions that Amazon SageMaker users can deploy with a few clicks. The data life cycle is contained within a user’s environment to maintain privacy, and an organization can apply the language model to its data without writing code or needing a code playground. AI21’s next-generation series of foundation models, Jurassic-2, as well as task-specific models are also available on Amazon SageMaker JumpStart.

AI21 is enthusiastic about the increasing adoption of generative AI around the world in the coming months and years. Using AWS services, the company is actively working on LLMs that will be faster as well as more accurate, reliable, and cost effective. “We have a really good relationship with the AWS team,” says Padnos. “Team members have gone deep into the technical details with us and collaborated on challenging tasks. Throughout the process, the AWS team has been creative and has had awareness about our challenges and goals.”

To learn more, visit https://aws.amazon.com/sagemaker.

About AI21 Labs

Software company AI21 Labs offers access to its proprietary language models for developers to create generative artificial intelligence applications as well as its writing-and-reading assistant, Wordtune, which is powered by artificial intelligence.

AWS Services Used

Amazon SageMaker

Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML.

Learn more »

Amazon SageMaker JumpStart

Amazon SageMaker JumpStart is a machine learning (ML) hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks.

Learn more »

More Generative AI Customer Stories


Get Started

Organizations of all sizes across all industries are transforming their businesses and delivering on their missions every day using AWS. Contact our experts and start your own AWS journey today.