Get Started with Your Machine Learning Project Quickly Using Amazon SageMaker JumpStart

TUTORIAL

Overview

In this tutorial, you will learn how to fast-track your machine learning (ML) project using pretrained models and prebuilt solutions offered by Amazon SageMaker JumpStart. You will then deploy the selected model through Amazon SageMaker Studio notebooks.

Amazon SageMaker JumpStart helps you quickly and easily get started with ML by providing access to hundreds of built-in algorithms with pretrained models from popular model hubs through the user interface. Using the SageMaker Python SDK, you can select a prebuilt model from the model zoo to train on custom data or deploy to a SageMaker endpoint for inference. To make it easier to get started, SageMaker JumpStart provides a set of solutions for the most common use cases that can be deployed readily with just a few clicks. The solutions are fully customizable and showcase the use of AWS CloudFormation templates and reference architectures so you can accelerate your ML journey.

What you will accomplish

In this guide, you will:

  • Deploy a SageMaker JumpStart pretrained model
  • Run inferences using the endpoint deployed from SageMaker JumpStart

Prerequisites

Before starting this guide, you will need an AWS account. You must be logged into it to follow the steps.

  • AWS experience: Beginner
  • Time to complete: 15 minutes
  • Cost to complete: See Amazon SageMaker pricing to estimate cost for this tutorial.
  • Services used: Amazon SageMaker JumpStart
  • Last updated: June 28, 2022

Implementation

For this tutorial, you will deploy a model called BERT Base Cased that has been pretrained on English text using Wikipedia and that performs well on text classification use cases.

Step 1: Set up Amazon SageMaker Studio domain

An AWS account can have only one SageMaker Studio domain per AWS Region. If you already have a SageMaker Studio domain in the US East (N. Virginia) Region, follow the SageMaker Studio setup guide to attach the required AWS IAM policies to your SageMaker Studio account, then skip Step 1, and proceed directly to Step 2. 

If you don't have an existing SageMaker Studio domain, continue with Step 1 to run an AWS CloudFormation template that creates a SageMaker Studio domain and adds the permissions required for the rest of this tutorial.

Choose the AWS CloudFormation stack link. This link opens the AWS CloudFormation console and creates your SageMaker Studio domain and a user named studio-user. It also adds the required permissions to your SageMaker Studio account. In the CloudFormation console, confirm that US East (N. Virginia) is the Region displayed in the upper right corner. The stack name should be CFN-SM-IM-Lambda-catalog; do not change it. The stack takes about 10 minutes to create all the resources.

This stack assumes that you already have a public VPC set up in your account. If you do not have a public VPC, see VPC with a single public subnet to learn how to create a public VPC.

Select I acknowledge that AWS CloudFormation might create IAM resources, and then choose Create stack.

On the CloudFormation pane, choose Stacks. When the stack is created, the status of the stack should change from CREATE_IN_PROGRESS to CREATE_COMPLETE.
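
The same status check can be scripted instead of watched in the console. The following is a minimal sketch, not part of the original tutorial: the helper name wait_for_stack and its polling defaults are hypothetical, and it assumes a boto3 CloudFormation client is passed in. It relies only on the describe_stacks call and the StackStatus field of its response.

```python
import time


def wait_for_stack(cfn_client, stack_name, poll_seconds=30, timeout_seconds=1200):
    """Poll CloudFormation until the stack leaves CREATE_IN_PROGRESS.

    cfn_client is expected to behave like boto3.client("cloudformation").
    Returns the final status string, for example CREATE_COMPLETE.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = cfn_client.describe_stacks(StackName=stack_name)
        status = response["Stacks"][0]["StackStatus"]
        if status != "CREATE_IN_PROGRESS":
            # Either CREATE_COMPLETE or a failure/rollback status.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{stack_name} is still being created")
        time.sleep(poll_seconds)


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   cfn = boto3.client("cloudformation", region_name="us-east-1")
#   print(wait_for_stack(cfn, "CFN-SM-IM-Lambda-catalog"))
```

Any status other than CREATE_COMPLETE returned by this helper means the stack did not finish cleanly and should be inspected in the console.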

Enter SageMaker Studio into the CloudFormation console search bar, and then choose SageMaker Studio.

Choose US East (N. Virginia) from the Region dropdown list in the upper right corner of the SageMaker console. For Launch app, select Studio to open SageMaker Studio using the studio-user profile.

Step 2: Create a new launcher window and start JumpStart

Getting started with machine learning can be challenging, from knowing which models suit which use case to knowing where to start. Amazon SageMaker JumpStart addresses this by providing a set of solutions for the most common use cases that can be deployed with just a few clicks. With JumpStart, pretrained models and solutions can be deployed to production-capable endpoints in minutes.

To get started, you need to open a new launcher window by clicking the + icon at the top of the file window view.

In the top left of the launcher view, choose the JumpStart models, algorithms, and solutions button. This will start SageMaker JumpStart and you will see a new window with a wide variety of featured content, solutions, models, problem types, and more. For this tutorial, you will be running the BERT Base Cased Text pretrained model.

 

To find the pretrained model, type BERT in the search bar in the upper right to list the available BERT models. Choose the model titled BERT Base Cased Text - Text Classification. Alternatively, you can browse the available models to find the correct one.

The BERT model includes the option to either deploy the pretrained model as is or to retrain the model. For this tutorial, you’ll deploy the pretrained model as is. To start, choose the dropdown next to Deployment Configuration. Next, choose the dropdown SageMaker hosting instance. You’ll see a number of instance types, which correspond to the resources that will be used to host the endpoint. Select ml.m5.large. The second box sets the endpoint name. Keep the default value; you could rename the endpoint if needed.

Choose the arrow on the next dropdown labeled Security Settings. You can set up execution roles, VPC connection, and encryption. For this tutorial these steps won’t be necessary, but note that these options exist and you would likely want to change them for a production deployment. Choose the Deploy button to begin setting up the model endpoint.

Next, you’ll see a dialog box that shows the model deployment status. This part of the process can take 5 to 10 minutes. The dialog box will change to show metadata around the model type, task, endpoint identifier, endpoint name, instance type, number of instances, and model data location as the process progresses. Once the endpoint deployment completes, the service status should update to In Service.
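
The deployment status can also be checked from code rather than from the dialog box. The sketch below is an illustration under stated assumptions, not the tutorial's own method: the helper name wait_until_in_service and its delay defaults are hypothetical, and a boto3 SageMaker client is assumed to be passed in. It uses the client's endpoint_in_service waiter and then reads EndpointStatus from describe_endpoint.

```python
def wait_until_in_service(sm_client, endpoint_name, delay_seconds=30, max_attempts=40):
    """Block until the endpoint is in service, then return its status.

    sm_client is expected to behave like boto3.client("sagemaker").
    The waiter raises an error if the endpoint fails or the attempts run out.
    """
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# replace the placeholder with the endpoint name shown in the dialog box):
#   import boto3
#   sm = boto3.client("sagemaker", region_name="us-east-1")
#   print(wait_until_in_service(sm, "<your-endpoint-name>"))
```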

Step 3: Use the demo notebook provided to query the new JumpStart endpoint

Now that you have a model endpoint deployed, you can run inferences against it to retrieve predictions. In this part of the tutorial, you’ll run a short notebook to query the endpoint created in the prior step.

Choose the Open Notebook button to open the provided demo notebook. The notebook contains Python code that runs two text examples through the endpoint and displays the model outputs. For each example, the model returns a predicted sentiment label and the associated probabilities.

To advance through the notebook, choose the Play icon in the notebook toolbar. Alternatively, hold Shift and press Return to run each cell in turn. The predicted labels and associated probabilities are printed below the cell.
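
For readers who want to see what such a query looks like outside the notebook, here is a rough sketch. It is not the notebook's code: the helper name classify_text is hypothetical, and the content type, the Accept header, and the response keys (predicted_label, probabilities) are assumptions about how this endpoint is typically queried, so check them against the demo notebook. The only boto3 call used is invoke_endpoint on the SageMaker runtime client.

```python
import json


def classify_text(runtime_client, endpoint_name, text):
    """Send one text example to the endpoint and return (label, probabilities).

    runtime_client is expected to behave like boto3.client("sagemaker-runtime").
    ContentType, Accept, and the response keys below are assumptions.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/x-text",
        Accept="application/json;verbose",
        Body=text.encode("utf-8"),
    )
    # The response body is a stream; read it once and parse the JSON payload.
    predictions = json.loads(response["Body"].read())
    return predictions["predicted_label"], predictions["probabilities"]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# replace the placeholder with your endpoint name):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")
#   label, probs = classify_text(runtime, "<your-endpoint-name>", "A great movie.")
```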

You have now deployed a model endpoint using the pretrained BERT Base Cased Text - Text Classification model with minimal manual effort. Congratulations!

Step 4: Clean up the resources

It is a best practice to delete resources that you are no longer using so that you don't incur unintended charges.

If you used an existing SageMaker Studio domain in Step 1, skip the rest of Step 4 and proceed directly to the conclusion section. 

If you ran the CloudFormation template in Step 1 to create a new SageMaker Studio domain, continue with the following steps to delete the domain, user, and the resources created by the CloudFormation template.

To open the CloudFormation console, enter CloudFormation into the AWS console search bar, and choose CloudFormation from the search results.

In the CloudFormation pane, choose Stacks. From the status dropdown list, select Active. Under Stack name, choose CFN-SM-IM-Lambda-catalog to open the stack details page.

On the CFN-SM-IM-Lambda-catalog stack details page, choose Delete to delete the stack along with the resources it created in Step 1.
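
The stack can also be deleted programmatically. The following is a minimal sketch under stated assumptions: the helper name delete_tutorial_stack is hypothetical, and a boto3 CloudFormation client is assumed to be passed in. It calls delete_stack and then blocks on the stack_delete_complete waiter. Deletion is irreversible, so confirm the stack name before running it.

```python
def delete_tutorial_stack(cfn_client, stack_name="CFN-SM-IM-Lambda-catalog"):
    """Delete the tutorial stack and wait until CloudFormation reports it gone.

    cfn_client is expected to behave like boto3.client("cloudformation").
    The waiter raises an error if the deletion fails.
    """
    cfn_client.delete_stack(StackName=stack_name)
    waiter = cfn_client.get_waiter("stack_delete_complete")
    waiter.wait(StackName=stack_name)
    return stack_name


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   cfn = boto3.client("cloudformation", region_name="us-east-1")
#   delete_tutorial_stack(cfn)
```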

Conclusion

Congratulations! You have completed the Get Started with Your Machine Learning Project Quickly Using Amazon SageMaker JumpStart tutorial.
 

You have learned how to select and deploy an Amazon SageMaker JumpStart pretrained model and make predictions.

Next steps