AWS Machine Learning Blog
Using task-specific models from AI21 Labs on AWS
In this blog post, we will show you how to leverage AI21 Labs’ Task-Specific Models (TSMs) on AWS to enhance your business operations. You will learn the steps to subscribe to AI21 Labs in the AWS Marketplace, set up a domain in Amazon SageMaker, and utilize AI21 TSMs via SageMaker JumpStart.
AI21 Labs is a foundation model (FM) provider focusing on building state-of-the-art language models. AI21 Task-Specific Models (TSMs) are built for tasks such as answering questions, summarizing lengthy texts, and paraphrasing. AI21 TSMs are available in Amazon SageMaker JumpStart.
Here are the AI21 TSMs that can be accessed and customized in SageMaker JumpStart: AI21 Contextual Answers, AI21 Summarize, AI21 Paraphrase, and AI21 Grammatical Error Correction.
AI21 FMs (Jamba-Instruct, AI21 Jurassic-2 Ultra, AI21 Jurassic-2 Mid) are available in Amazon Bedrock and can be used for large language model (LLM) use cases. We used AI21 TSMs available in SageMaker JumpStart for this post. SageMaker JumpStart enables you to select, compare, and evaluate available AI21 TSMs.
AI21’s TSMs
Foundation models can solve many tasks, but not every task is unique. Some commercial tasks are common across many applications. AI21 Labs’ TSMs are specialized models built to solve a particular problem. They’re built to deliver out-of-the-box value, cost effectiveness, and higher accuracy for the common tasks behind many commercial use cases. In this post, we will explore three of AI21 Labs’ TSMs and their unique capabilities.
Foundation models are built and trained on massive datasets to perform a variety of tasks. Unlike FMs, TSMs are trained to perform specific tasks.
When your use case is supported by a TSM, you quickly realize benefits such as improved refusal rates when you don’t want the model to provide answers unless they’re grounded in actual document content. This post covers the following three TSMs:
- Paraphrase: This model is used to enhance content creation and communication by generating varied versions of text while maintaining a consistent tone and style. This model is ideal for creating multiple product descriptions, marketing materials, and customer support responses, improving clarity and engagement. It also simplifies complex documents, making information more accessible.
- Summarize: This model is used to condense lengthy texts into concise summaries while preserving the original meaning. This model is particularly useful for processing large documents, such as financial reports, legal documents, and technical papers, making critical information more accessible and comprehensible.
- Contextual answers: This model is used to significantly enhance information retrieval and customer support processes. This model excels at providing accurate and relevant answers based on specific document contexts, making it particularly useful in customer service, legal, finance, and educational sectors. It streamlines workflows by quickly accessing relevant information from extensive databases, reducing response times and improving customer satisfaction.
Prerequisites
To follow the steps in this post, you must have the following prerequisites in place:
AWS account setup
Completing the labs in this post requires an AWS account and SageMaker environments set up. If you don’t have an AWS account, see Complete your AWS registration for the steps to create one.
AWS Marketplace opt-in
AI21 TSMs can also be accessed through AWS Marketplace for subscription. Using AWS Marketplace, you can subscribe to AI21 TSMs and deploy SageMaker endpoints.
To do these exercises, you must subscribe to the following offerings in AWS Marketplace: AI21 Contextual Answers, AI21 Summarize, and AI21 Paraphrase.
Service quota limits
To use some of the GPU instances required to run AI21’s task-specific models, you must have sufficient service quota limits. You can request a service quota limit increase in the AWS Management Console. Limits are account and resource specific.
To create a service request, search for service quotas in the console search bar. Select the service to go to the dashboard and enter the name of the instance type (for example, ml.g5.48xlarge). Ensure the quota is for endpoint usage.
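If you prefer to check the quota from code rather than the console, the following is a minimal sketch. It assumes boto3 is installed with valid credentials and that the quota you need is named after the instance type followed by "for endpoint usage"; the helper names are hypothetical and not part of the AI21 notebooks.

```python
from typing import Iterable, Optional


def find_endpoint_quota(quotas: Iterable[dict], instance_type: str) -> Optional[dict]:
    """Return the first quota whose name mentions the instance type and endpoint usage."""
    for quota in quotas:
        name = quota.get("QuotaName", "").lower()
        if instance_type.lower() in name and "endpoint usage" in name:
            return quota
    return None


def list_sagemaker_quotas(client=None) -> list:
    """Collect all SageMaker quotas; needs AWS credentials when no client is passed."""
    if client is None:
        import boto3  # imported lazily so find_endpoint_quota works without boto3

        client = boto3.client("service-quotas")
    quotas = []
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])
    return quotas


# Example usage (requires AWS credentials):
# quota = find_endpoint_quota(list_sagemaker_quotas(), "ml.g5.48xlarge")
# print(quota["QuotaName"], quota["Value"]) if quota else print("quota not found")
```

A value of 0 for the matching quota means you need to request an increase before deploying the endpoint.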
Estimated cost
The following is the estimated cost to walk through the solution in this post.
Contextual Answers notebook
- We used an ml.g5.48xlarge GPU instance.
- By default, AWS accounts don’t have access to this instance type. You must request a service quota limit increase (see the previous section: Service quota limits).
- The notebook runtime was approximately 15 minutes.
- The cost was $20.41 (billed on an hourly basis).
Summarize notebook
- We used an ml.g4dn.12xlarge GPU instance.
- You must request a service quota limit increase (see the previous section: Service Quota Limits).
- The notebook runtime was approximately 10 minutes.
- The cost was $4.94 (billed on an hourly basis).
Paraphrase notebook
- We used an ml.g4dn.12xlarge GPU instance.
- You must request a service quota limit increase (see the previous section: Service Quota Limits).
- The notebook runtime was approximately 10 minutes.
- The cost was $4.94 (billed on an hourly basis).
Total cost: $30.29 (1 hour charge for each deployed endpoint)
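The total can be reproduced with a few lines of arithmetic. This sketch uses the hourly charges and runtimes reported above (not live AWS pricing) and follows this post's assumption that each endpoint is charged for whole hours, so any partial hour rounds up.

```python
import math
from decimal import Decimal

# (hourly charge in USD, notebook runtime in minutes), as reported in this post.
ENDPOINTS = {
    "Contextual Answers (ml.g5.48xlarge)": (Decimal("20.41"), 15),
    "Summarize (ml.g4dn.12xlarge)": (Decimal("4.94"), 10),
    "Paraphrase (ml.g4dn.12xlarge)": (Decimal("4.94"), 10),
}


def billed_cost(hourly_rate: Decimal, runtime_minutes: int) -> Decimal:
    """Charge whole hours, rounding any partial hour up (this post's assumption)."""
    billed_hours = max(1, math.ceil(runtime_minutes / 60))
    return hourly_rate * billed_hours


def total_cost(endpoints: dict) -> Decimal:
    """Sum the billed cost of every endpoint."""
    costs = (billed_cost(rate, minutes) for rate, minutes in endpoints.values())
    return sum(costs, Decimal("0"))


if __name__ == "__main__":
    print(total_cost(ENDPOINTS))  # 30.29
```

Leaving an endpoint running past the first hour adds another full hourly charge under this assumption, which is why the cleanup step matters.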
Using AI21 models on AWS
Getting started
In this section, you will access AI21 TSMs in SageMaker JumpStart. These interactive notebooks contain code to deploy TSM endpoints and also provide example code blocks to run inference. These first few steps are prerequisites to deploying the notebooks. If you already have a SageMaker domain and username set up, you may skip to Step 7.
- Use the search bar in the AWS Management Console to navigate to Amazon SageMaker, as shown in the following figure.
If you don’t already have one set up, you must create a SageMaker domain. A domain consists of an associated Amazon Elastic File System (Amazon EFS) volume; a list of authorized users; and a variety of security, application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations.
Users within a domain can share notebook files and other artifacts with each other. For more information, see Learn about Amazon SageMaker domain entities and statuses. For today’s exercises, you will use Quick setup to deploy an environment.
- Choose Create a SageMaker domain as shown in the following figure.
- Select Quick setup. After you choose Set up, domain creation begins.
- After a moment, your domain will be created.
- Choose Add user.
- You can keep the default user profile values.
- Launch Studio by choosing the Launch button and then selecting Studio.
- Choose JumpStart in the navigation pane as shown in the following figure.
Here you can see the model providers for our JumpStart notebooks.
- Select AI21 Labs to see their available models.
Each of AI21’s models has an associated model card. A model card provides key information about the model such as its intended use cases, training, and evaluation details. For this example, you will use the Summarize, Paraphrase, and Contextual Answers TSMs.
- Start with Contextual Answers. Select the AI21 Contextual Answers model card.
A sample notebook is included as part of the model. Jupyter Notebooks are a popular way to interact with code and LLMs.
- Choose Notebooks to explore the notebook.
- To run the notebook’s code blocks, choose Open in JupyterLab.
- If you do not already have an existing space, choose Create new space and enter an appropriate name. When ready, choose Create space and open notebook.
It can take up to 5 minutes to open your notebook.
SageMaker Spaces are used to manage the storage and resource needs of some SageMaker Studio applications. Each space has a 1:1 relationship with an instance of an application.
- After the notebook opens, you will be prompted to select a kernel. Ensure Python 3 is selected and choose Select.
Navigating the notebook exercises
Repeat the preceding process to import the remaining notebooks.
Each AI21 notebook demonstrates required code imports, version checks, model selection, endpoint creation, and inferences showcasing the TSM’s unique strengths through code blocks and example prompts.
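To give a sense of what the endpoint creation step involves, here is a minimal sketch using the low-level boto3 SageMaker client. It is not the notebooks' own code: the function name, the resource naming scheme, and the default instance type are assumptions, and the model package ARN must come from your own AWS Marketplace subscription.

```python
def deploy_model_package(sm_client, *, model_package_arn, role_arn, endpoint_name,
                         instance_type="ml.g4dn.12xlarge"):
    """Create a model, endpoint config, and endpoint from a subscribed model package."""
    model_name = f"{endpoint_name}-model"      # hypothetical naming scheme
    config_name = f"{endpoint_name}-config"    # hypothetical naming scheme

    sm_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role_arn,
        PrimaryContainer={"ModelPackageName": model_package_arn},
        # Assumption: Marketplace model packages are deployed with network isolation.
        EnableNetworkIsolation=True,
    )
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    )
    sm_client.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=config_name)
    # Block until the endpoint reports InService.
    sm_client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return endpoint_name


# Example usage (requires AWS credentials and an active subscription):
# import boto3
# deploy_model_package(
#     boto3.client("sagemaker"),
#     model_package_arn="<ARN from your subscription>",
#     role_arn="<SageMaker execution role ARN>",
#     endpoint_name="contextual-answers-demo",
# )
```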
Each notebook will have a clean up step at the end to delete your deployed endpoints. It’s important to terminate any running endpoints to avoid additional costs.
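If you want to clean up outside the notebook, a sketch along the following lines may help. It assumes a boto3 SageMaker client and removes the endpoint together with its endpoint configuration and backing models; the function name is hypothetical.

```python
def delete_endpoint_resources(sm_client, endpoint_name: str) -> list:
    """Delete an endpoint, its endpoint config, and the models behind it."""
    # Look up the related resources before deleting anything.
    config_name = sm_client.describe_endpoint(
        EndpointName=endpoint_name
    )["EndpointConfigName"]
    variants = sm_client.describe_endpoint_config(
        EndpointConfigName=config_name
    )["ProductionVariants"]
    model_names = [variant["ModelName"] for variant in variants]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)

    # Return the names of everything that was deleted, for logging.
    return [endpoint_name, config_name, *model_names]


# Example usage (requires AWS credentials):
# import boto3
# print(delete_endpoint_resources(boto3.client("sagemaker"), "contextual-answers-demo"))
```

Deleting the endpoint is what stops the hourly charge; removing the configuration and model keeps the account tidy.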
Contextual Answers JumpStart Notebook
AWS customers and partners can use AI21 Labs’ Contextual Answers model to significantly enhance their information retrieval and customer support processes. This model excels at providing accurate and relevant answers based on specific context, making it useful in customer service, legal, finance, and educational sectors.
The following are code snippets from AI21’s Contextual Answers TSM through JumpStart. Notice that there is no prompt engineering required. The only input is the question and the context provided.
Input:
Output:
As mentioned in our introduction, AI21’s Contextual Answers model does not provide answers to questions outside of the context provided. If the prompt includes a question unrelated to the 2020/2021 economy, you will get a response as shown in the following example.
Input:
Output:
None
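As a rough illustration of this question-plus-context pattern (not the notebook's own code), the following sketch calls a deployed endpoint through the boto3 SageMaker runtime client. The request fields `context` and `question` and the response field `answer` are assumptions about the payload schema, and the helper names are hypothetical; check the model card for the exact format.

```python
import json


def build_answer_request(context: str, question: str) -> str:
    """Serialize the two inputs; the field names are an assumed schema."""
    return json.dumps({"context": context, "question": question})


def parse_answer(raw_body: bytes):
    """Return the answer text, or None when the model declines to answer."""
    return json.loads(raw_body).get("answer")  # "answer" key is an assumption


def ask(runtime_client, endpoint_name: str, context: str, question: str):
    """Send a question and its context to the endpoint and return the answer."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=build_answer_request(context, question),
    )
    return parse_answer(response["Body"].read())


# Example usage (requires AWS credentials and a deployed endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# answer = ask(runtime, "contextual-answers-demo", financial_context, "What is the capital of France?")
# print(answer if answer is not None else "Not answerable from the provided context")
```

Treating `None` as an explicit "not in the context" signal lets your application fall back to a default message instead of showing an ungrounded answer.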
When finished, you can delete your deployed endpoint by running the final two cells of the notebook.
You can import the other notebooks by navigating to SageMaker JumpStart and repeating the same process you used to import this first notebook.
Summarize JumpStart Notebook
AWS customers and partners can use AI21 Labs’ Summarize model to condense lengthy texts into concise summaries while preserving the original meaning. This model is particularly useful for processing large documents, such as financial reports, legal documents, and technical papers, making critical information more accessible and comprehensible.
The following are highlighted code snippets from AI21’s Summarize TSM using JumpStart. Notice that the input must include the full text that the user wants to summarize.
Input:
Paraphrase JumpStart Notebook
AWS customers and partners can use AI21 Labs’ Paraphrase TSM through JumpStart to enhance content creation and communication by generating varied versions of text.
The following are highlighted code snippets from AI21’s Paraphrase TSM using JumpStart. Notice that there is no extensive prompt engineering required. The only input required is the full text that the user wants to paraphrase and a chosen style, for example casual, formal, and so on.
Input:
Input:
Output:
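The text-plus-style input described above could be sketched as follows. This is an illustration rather than the notebook's code: the request fields `text` and `style`, the response shape with a `suggestions` list, and the set of style names are all assumptions, and the helper names are hypothetical.

```python
import json

# Style names are illustrative; check the model card for the supported values.
ASSUMED_STYLES = {"general", "casual", "formal"}


def build_paraphrase_request(text: str, style: str = "general") -> str:
    """Validate the inputs and serialize them using an assumed schema."""
    if not text.strip():
        raise ValueError("text to paraphrase must not be empty")
    if style not in ASSUMED_STYLES:
        raise ValueError(f"unsupported style: {style!r}")
    return json.dumps({"text": text, "style": style})


def parse_suggestions(raw_body: bytes) -> list:
    """Extract suggestion strings from an assumed {'suggestions': [{'text': ...}]} body."""
    payload = json.loads(raw_body)
    return [item["text"] for item in payload.get("suggestions", [])]


def paraphrase(runtime_client, endpoint_name: str, text: str, style: str = "general") -> list:
    """Send the text and style to the endpoint and return the suggested rewrites."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=build_paraphrase_request(text, style),
    )
    return parse_suggestions(response["Body"].read())
```

Validating the style before the call keeps a typo from turning into a billed request that the endpoint rejects.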
Less prompt engineering
A key advantage of AI21’s task-specific models is the reduced need for complex prompt engineering compared to foundation models. Let’s consider how you might approach a summarization task using a foundation model compared to using AI21’s specialized Summarize TSM.
For a foundation model, you might need to craft an elaborate prompt template with detailed instructions:
That’s it! With the Summarize TSM, you pass the input text directly to the model; there’s no need for an intricate prompt template.
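The contrast described in this section can be sketched as two request builders. Both are hypothetical: the prompt template is an invented example of the kind of instructions a general-purpose model might need, and the `prompt`, `source`, and `sourceType` field names are assumptions rather than a documented schema.

```python
import json

# Hypothetical prompt template for a general-purpose foundation model.
FM_PROMPT_TEMPLATE = (
    "You are an expert editor. Summarize the text between the <text> tags.\n"
    "Rules:\n"
    "- Keep the summary under {max_sentences} sentences.\n"
    "- Preserve the original meaning; do not add new facts.\n"
    "- Use a neutral tone.\n"
    "<text>\n{text}\n</text>\n"
    "Summary:"
)


def build_fm_request(text: str, max_sentences: int = 3) -> str:
    """Wrap the text in detailed instructions before sending it to an FM."""
    prompt = FM_PROMPT_TEMPLATE.format(text=text, max_sentences=max_sentences)
    return json.dumps({"prompt": prompt})  # request shape is an assumption


def build_tsm_request(text: str) -> str:
    """Pass the text through as-is; no instructions are needed."""
    return json.dumps({"source": text, "sourceType": "TEXT"})  # assumed field names


if __name__ == "__main__":
    document = "Quarterly revenue rose 12% while operating costs fell."
    print(build_fm_request(document))
    print(build_tsm_request(document))
```

The TSM request carries only the document, so there is no template to tune, version, or test when the wording of the instructions changes.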
Lower cost and higher accuracy
By using TSMs, you can achieve lower costs and higher accuracy. As demonstrated previously in the Contextual Answers notebook, TSMs have a higher refusal rate than most mainstream models, which can lead to higher accuracy. This characteristic of TSMs is beneficial in use cases where wrong answers are less acceptable.
Conclusion
In this post, we explored AI21 Labs’ approach to generative AI using task-specific models (TSMs). Through guided exercises, you walked through the process of setting up a SageMaker domain and importing sample JumpStart notebooks to experiment with AI21’s TSMs, including Contextual Answers, Paraphrase, and Summarize.
Throughout the exercises, you saw the potential benefits of task-specific models compared to foundation models. When asked questions outside the context of the intended use case, the AI21 TSMs refused to answer, making them less prone to hallucinating or generating nonsensical outputs beyond their intended domain, a critical factor for applications that require precision and safety. Lastly, we highlighted how task-specific models are designed from the outset to excel at specific tasks, streamlining development and reducing the need for extensive prompt engineering and fine-tuning, which can make them a more cost-effective solution.
Whether you’re a data scientist, machine learning practitioner, or someone curious about AI advancements, we hope this post has provided valuable insights into the advantages of AI21 Labs’ task-specific approach. As the field of generative AI continues to evolve rapidly, we encourage you to stay curious, experiment with various approaches, and ultimately choose the one that best aligns with your project’s unique requirements and goals. Visit AWS GitHub for other example use cases and code to experiment with in your own environment.
Additional resources
- AI21 Labs
- Transform your business with generative AI
- Amazon SageMaker
- Getting started with Amazon SageMaker JumpStart
- Amazon Bedrock
- Task-Specific Models overview
About the Authors
Joe Wilson is a Solutions Architect at Amazon Web Services supporting nonprofit organizations. He has core competencies in data analytics, AI/ML and GenAI. Joe’s background is in data science and international development. He is passionate about leveraging data and technology for social good.
Pat Wilson is a Solutions Architect at Amazon Web Services with a focus on AI/ML workloads and security. He currently supports Federal Partners. Outside of work Pat enjoys learning, working out, and spending time with family/friends.
Josh Famestad is a Solutions Architect at Amazon Web Services. Josh works with public sector customers to build and execute cloud-based approaches to deliver on business priorities.