AWS Marketplace

Using TorchServe to list PyTorch models at scale in AWS Marketplace

Recently, AWS announced the release of TorchServe, an open-source model serving library for PyTorch developed in collaboration with Facebook. PyTorch is an open-source machine learning framework created by Facebook that is popular among ML researchers and data scientists. Despite its ease of use and “Pythonic” interface, deploying and managing models in production remains difficult because it requires data scientists to write prediction APIs and scale them. With TorchServe, data scientists and data engineers can deploy and host their machine learning models without writing custom code.

TorchServe provides a convenient framework for AWS Marketplace sellers to list their products without writing their own endpoint controllers and handlers. Before the release of TorchServe, if you wanted to list a PyTorch machine learning model, you needed to develop custom handlers and build your own Docker image. You also had to figure out how to make correct API calls in and out of the container network and solve other one-time problems in developing the model server. TorchServe simplifies all of this, and the whole listing process can take fewer than 10 minutes.

We previously published a blog post about how to host a PyTorch machine learning model in Amazon SageMaker. In this blog post, I further explore TorchServe with SageMaker and show how to use TorchServe to list PyTorch models at scale in AWS Marketplace.

Prerequisites

This solution has the following prerequisites:

  • An AWS account with access to Amazon SageMaker, Amazon ECR, and Amazon S3
  • Registration as a seller in AWS Marketplace

Solution overview

I use an end-to-end notebook to demonstrate the listing process. The notebook is available at https://github.com/aws-samples/aws-marketplace-machine-learning. You can clone the repository and use the notebook listing-torchserve-models.ipynb to create your PyTorch products. The repository also includes a Dockerfile and other configuration and script files for building the TorchServe Docker image. For more information, see Docker image and Dockerfile reference on the Docker Docs website.
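To follow along, you can clone the repository in a terminal window or a notebook cell:

!git clone https://github.com/aws-samples/aws-marketplace-machine-learning.git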

The following steps show you how to:

  • Install TorchServe
  • Create a Docker image of TorchServe
  • Create a model archive format (.mar) file from a PyTorch data format (.pth) file
  • Create a SageMaker model package with the TorchServe Docker image and model archive file
  • Validate the model package and list your product in AWS Marketplace

You can run the steps on a SageMaker notebook instance, an Amazon EC2 instance, or your own computer in a terminal window. If you’re running in your local environment, install and configure the AWS CLI, the AWS SDK for Python (Boto3), and the Amazon SageMaker Python SDK.

I recommend running the steps on a SageMaker notebook instance so that you can use the provided sample notebook. For more information, see Use Amazon SageMaker Notebook Instances in the Amazon SageMaker Developer Guide.

Solution walkthrough

Step 0: Update the AWS CLI, AWS SDK, and Amazon SageMaker SDK

Before you begin, update the AWS CLI, the AWS SDK, and the SageMaker SDK. You can run the following commands in a terminal window or the notebook:

pip install --upgrade pip 
pip -q install sagemaker awscli boto3 --upgrade

Step 1: Git clone TorchServe and install the model archiver

As the first step, download and install TorchServe and its model archiver. The model archiver provides the torch-model-archiver tool, which converts model data from a PyTorch data format (.pth) file to a model archive format (.mar) file. TorchServe uses the .mar file to host the model server.

Run the following command in the notebook to clone the serve repository and install model-archiver. The serve/examples/ directory contains different default handlers; use them to launch your model:

!git clone https://github.com/pytorch/serve.git 
!pip install serve/model-archiver/
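To confirm that the model archiver installed correctly, you can print the tool’s help text (a quick sanity check, not part of the original notebook):

!torch-model-archiver --help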


Step 2: Build a TorchServe Docker image and push it to Amazon ECR

AWS Marketplace and SageMaker require the seller to provide the machine learning product in a container for better data privacy and intellectual property (IP) protection. In this step, you create your own Docker image of TorchServe and push it to Amazon ECR. You can reuse the same Docker image for various models and listings.

a. Create a boto3 session and get the account ID

Run the following code in the notebook:

import boto3, time, json
sess    = boto3.Session()
sm      = sess.client('sagemaker')
region  = sess.region_name
account = boto3.client('sts').get_caller_identity().get('Account')

b. Create an Amazon ECR repository through the AWS CLI

Name the repository torchserve-base because you can reuse it for different PyTorch model listings. To do that, run the following in the notebook:

registry_name = 'torchserve-base' 
!aws ecr create-repository --repository-name {registry_name}

c. Build the Docker image and push it to Amazon ECR

i. Name your image label v1 and push the image to the repository by running the following commands in the notebook:

image_label = 'v1'
image = f'{account}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}'

!docker build -t {registry_name}:{image_label} .
!$(aws ecr get-login --no-include-email --region {region})
!docker tag {registry_name}:{image_label} {image}
!docker push {image}
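Note: get-login is an AWS CLI version 1 command. If you’re using AWS CLI version 2, authenticate with get-login-password instead:

!aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account}.dkr.ecr.{region}.amazonaws.com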

ii. After you push the image, scan it in Amazon ECR. To do that, follow these steps in the console (or use the CLI snippet after this list):

      • Sign in to the AWS Management Console and navigate to Amazon ECR.
      • Choose the repository that you created in step 2.b, select the image, and then choose Scan.
      • The scan status shows as Complete when the scan finishes. The following screenshot shows the TorchServe image with the tag v1 and a scan status of Complete.
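If you prefer the command line, you can also trigger and inspect the scan with the standard Amazon ECR CLI operations (an optional alternative to the console steps):

!aws ecr start-image-scan --repository-name {registry_name} --image-id imageTag={image_label}
!aws ecr describe-image-scan-findings --repository-name {registry_name} --image-id imageTag={image_label}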

Congratulations! You have created a TorchServe Docker image.


Step 3: Create a TorchServe model archive with a PyTorch model and upload it to Amazon S3

TorchServe needs a specific model archive format (.mar) file to serve the model. The torch-model-archiver tool converts a .pth model file to a .mar file. You don’t need to create a custom handler; instead, you can specify a built-in one with an option such as --handler image_classifier, and torch-model-archiver sets it up for you. TorchServe provides default handlers for image classification, image segmentation, object detection, and text classification, and most machine learning models fall into these categories. You can also write your own custom handler to serve your model.
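For illustration, here is a minimal sketch of what a custom handler might look like. The walkthrough itself uses the built-in image_classifier handler, so this file name, class name, and the JSON input format are all hypothetical:

# custom_handler.py - a hypothetical sketch of a TorchServe custom handler
import json

import torch
from ts.torch_handler.base_handler import BaseHandler


class MyCustomHandler(BaseHandler):
    """Adapt TorchServe to a model that the default handlers don't cover."""

    def preprocess(self, data):
        # Each request record exposes its raw payload under "body" or "data"
        payload = data[0].get("body") or data[0].get("data")
        if isinstance(payload, (bytes, bytearray)):
            payload = json.loads(payload)
        # Assumes the client sends JSON like {"inputs": [[...], ...]}
        return torch.tensor(payload["inputs"], dtype=torch.float32)

    def postprocess(self, inference_output):
        # Return one JSON-serializable result per request in the batch
        return inference_output.tolist()

You would then pass --handler custom_handler.py to torch-model-archiver instead of a built-in handler name.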

After converting the model file, you must package the .mar file into a compressed tar archive (tar.gz) and upload it to an Amazon S3 bucket.

a. Create a TorchServe archive with a PyTorch model (your own model or a downloaded version)

In this notebook, I recommend downloading a DenseNet-161 model for demonstration.

i. First, download the model file from PyTorch.org and set the model name to densenet161. To do this, run the following commands in the notebook:

!wget -q https://download.pytorch.org/models/densenet161-8d451a50.pth 
model_file_name = 'densenet161'

The model file is in .pth format. You can also use your own trained model here instead of the downloaded one, if you prefer.

ii. Second, convert the .pth model file to .mar by running the following code in the notebook:

!torch-model-archiver --model-name {model_file_name} \
--version 1.0 --model-file serve/examples/image_classifier/densenet_161/model.py \
--serialized-file densenet161-8d451a50.pth \
--extra-files serve/examples/image_classifier/index_to_name.json \
--handler image_classifier

iii. To view the created .mar files, run the following command in the notebook:

!ls *.mar

b. Upload the generated .mar archive file to Amazon S3

SageMaker expects models to be packaged in a tar.gz file. You need to wrap the .mar file in a tar.gz archive and then upload the model to your default SageMaker S3 bucket under the models directory.

To do this, run the following code in the notebook:

import sagemaker
sagemaker_session = sagemaker.Session(boto_session=sess)
bucket_name = sagemaker_session.default_bucket()
prefix = 'torchserve'

!tar cvfz {model_file_name}.tar.gz densenet161.mar
!aws s3 cp {model_file_name}.tar.gz s3://{bucket_name}/{prefix}/models/

Step 4: Deploy an endpoint and make a prediction using Amazon SageMaker SDK

Now you can create a SageMaker model with the TorchServe Docker image and the PyTorch model file that you created. You can then create a real-time endpoint based on the SageMaker model through the SageMaker SDK. This lets you verify that TorchServe actually works as the base image for your model files.

a. Create a SageMaker model with the TorchServe Docker image and model file

Run the following code in the notebook to get the SageMaker execution role and create a model:

from sagemaker.model import Model
from sagemaker.predictor import Predictor

role = sagemaker.get_execution_role()
model_data = f's3://{bucket_name}/{prefix}/models/{model_file_name}.tar.gz'
sm_model_name = 'torchserve-densenet161'

torchserve_model = Model(model_data=model_data,
                         image_uri=image,
                         role=role,
                         predictor_cls=Predictor,
                         name=sm_model_name)

b. Deploy an endpoint with the SageMaker model that you created

In your notebook, specify the endpoint name and use the SageMaker SDK to deploy an endpoint by running the following code:

endpoint_name = 'torchserve-endpoint-' + sm_model_name + '-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

predictor = torchserve_model.deploy(instance_type='ml.m4.xlarge',
                                    initial_instance_count=1,
                                    endpoint_name = endpoint_name)

c. Test the TorchServe hosted endpoint

Use a public image from Amazon S3 to test the endpoint. To do that, enter the following into the notebook:

!wget -q https://s3.amazonaws.com/model-server/inputs/kitten.jpg
file_name = 'kitten.jpg'
with open(file_name, 'rb') as f:
    payload = f.read()

response = predictor.predict(data=payload)
print(*json.loads(response), sep = '\n')

You should receive the following response:

[
  {
    "tiger_cat": 0.46933549642562866
  },
  {
    "tabby": 0.4633878469467163
  },
  {
    "Egyptian_cat": 0.06456148624420166
  },
  {
    "lynx": 0.0012828214094042778
  },
  {
    "plastic_bag": 0.00023323034110944718
  }
]
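The SageMaker Python SDK isn’t the only way to call the endpoint. As an optional check (not part of the original notebook), you can invoke it through the low-level SageMaker runtime API, which is also how most client applications call a hosted model:

import boto3

runtime = boto3.client('sagemaker-runtime')
response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType='image/jpeg',
                                   Body=payload)
print(response['Body'].read().decode())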

d. Delete the endpoint

To avoid unnecessary billing, delete the endpoint that you created by running the following command in the notebook:

predictor.delete_endpoint()
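Note that predictor.delete_endpoint() removes only the endpoint and its configuration; the SageMaker model itself remains. Keep the model for now, because the batch transform in the next step references it by name. When you’re completely done, you can remove it as well:

predictor.delete_model()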


Step 5: Test the batch transform on SageMaker before listing the PyTorch model in AWS Marketplace

After you make sure that the inference endpoint works, you can test the batch transform job on the SageMaker model that you created. SageMaker also conducts batch transform validation as part of creating the model package before you can list your models in AWS Marketplace.

This step helps you get familiar with the batch transform process and detect potential bugs in the model. First, prepare your transform input data and upload it to an Amazon S3 bucket. Then create the batch transform job through the SageMaker SDK.

a. Create the batch transform input folder

To create the batch transform input folder, in the notebook, enter the following:

sm_model_name = 'torchserve-densenet161'
batch_inference_input_prefix = "batch-inference-input-data"
TRANSFORM_WORKDIR = "transform"

Use two public images from Amazon S3 to test the transform job. To do that, enter the following in the notebook:

%%sh
mkdir transform
cd transform
wget https://s3.amazonaws.com/model-server/inputs/kitten.jpg
wget https://s3.amazonaws.com/model-server/inputs/flower.jpg

b. Upload the batch transform input folder to an Amazon S3 bucket

To upload the batch transform input folder to an Amazon S3 bucket, enter the following in the notebook:

transform_input = sagemaker_session.upload_data(TRANSFORM_WORKDIR, key_prefix=batch_inference_input_prefix)
print("Transform input uploaded to " + transform_input)

c. Create the batch transform job in SageMaker

Use the SageMaker SDK to create the transform job. Wait for the transform job to end. To do that, enter the following in the notebook:

transformer = sagemaker.transformer.Transformer(model_name=sm_model_name,
                                                instance_count=1,
                                                instance_type='ml.m4.xlarge',
                                                strategy=None,
                                                assemble_with=None,
                                                output_path=None,
                                                sagemaker_session=sagemaker_session)

transformer.transform(transform_input, content_type='image/jpeg')
transformer.wait()

print("Batch Transform output saved to " + transformer.output_path)

Congratulations! The batch transform succeeded.


Step 6: Create the model package

Now you can start creating your own model package for listing. Before you create it, you need to specify several fields for the inference specification and the model package validation specification. Creating the model package also runs the validation job.

a. Create the model package inference specification

To create the model package inference specification, specify several fields in the pre-defined inference specification template that I provide. To do that, enter the following in the notebook:

from src.inference_specification import InferenceSpecification
import json

modelpackage_inference_specification = InferenceSpecification().get_inference_specification_dict(
    ecr_image=image,
    supports_gpu=True,
    supported_content_types=["image/jpeg", "image/png"],
    supported_mime_types=["application/json"])

# Specify the model data resulting from the previously completed training job
modelpackage_inference_specification["InferenceSpecification"]["Containers"][0]["ModelDataUrl"]= model_data
print(json.dumps(modelpackage_inference_specification, indent=4, sort_keys=True))

b. Create the model package validation specification

Specify several fields in the pre-defined model package validation specification template that I provide. To do that, enter the following in the notebook:

from src.modelpackage_validation_specification import ModelPackageValidationSpecification
import time

modelpackage_validation_specification = ModelPackageValidationSpecification().get_validation_specification_dict(
    validation_role = role,
    batch_transform_input = transform_input,
    input_content_type = "image/jpeg",
    output_content_type = "application/json",
    instance_type = "ml.c4.xlarge",
    output_s3_location = 's3://{}/{}'.format(sagemaker_session.default_bucket(), "batch-inference-output-data"))

print(json.dumps(modelpackage_validation_specification, indent=4, sort_keys=True))

c. Create the model package with inference specification and validation specification

To create a model package using SageMaker SDK, run the following code in the notebook:

model_package_name = sm_model_name + "-" + str(round(time.time()))
create_model_package_input_dict = {
    "ModelPackageName" : model_package_name,
    "ModelPackageDescription" : "Model of pre-trained DenseNet161",
    "CertifyForMarketplace" : True
}
create_model_package_input_dict.update(modelpackage_inference_specification)
create_model_package_input_dict.update(modelpackage_validation_specification)
print(json.dumps(create_model_package_input_dict, indent=4, sort_keys=True))

sm.create_model_package(**create_model_package_input_dict)

Creating the model package is an asynchronous process. To check its status, run the following command in the notebook:

while True:
    response = sm.describe_model_package(ModelPackageName=model_package_name)
    status = response["ModelPackageStatus"]
    print(status)
    if status in ("Completed", "Failed"):
        print(response["ModelPackageStatusDetails"])
        break
    time.sleep(100)

Congratulations! You have created a model package for listing in AWS Marketplace.


Step 7: Create another model package

This step demonstrates that the same TorchServe Docker image works for different PyTorch model packages. Download a VGG-11 model and create a corresponding model package. You can also use your own model.

a. Create the .mar file and upload it to Amazon S3

To do this, run the following code in the notebook:

!wget -q https://download.pytorch.org/models/vgg11-bbd30ac9.pth
model_file_name_vgg11 = 'vgg11'

!torch-model-archiver --model-name {model_file_name_vgg11} \
--version 1.0 --model-file serve/examples/image_classifier/vgg_11/model.py \
--serialized-file vgg11-bbd30ac9.pth \
--extra-files serve/examples/image_classifier/index_to_name.json \
--handler image_classifier
!ls *.mar
prefix = 'torchserve'
!tar cvfz {model_file_name_vgg11}.tar.gz vgg11.mar
!aws s3 cp {model_file_name_vgg11}.tar.gz s3://{bucket_name}/{prefix}/models/

b. Create a new model and its corresponding inference specification and validation specification

To create the model and the specifications, run the following code in the notebook:

model_data = f's3://{bucket_name}/{prefix}/models/{model_file_name_vgg11}.tar.gz'
sm_model_name_vgg11 = 'torchserve-vgg11'

modelpackage_inference_specification = InferenceSpecification().get_inference_specification_dict(
    ecr_image=image,
    supports_gpu=True,
    supported_content_types=["image/jpeg", "image/png"],
    supported_mime_types=["application/json"])

# Specify the model data resulting from the previously completed training job
modelpackage_inference_specification["InferenceSpecification"]["Containers"][0]["ModelDataUrl"]= model_data
print(json.dumps(modelpackage_inference_specification, indent=4, sort_keys=True))

modelpackage_validation_specification = ModelPackageValidationSpecification().get_validation_specification_dict(
    validation_role = role,
    batch_transform_input = transform_input,
    input_content_type = "image/jpeg",
    output_content_type = "application/json",
    instance_type = "ml.c4.xlarge",
    output_s3_location = 's3://{}/{}'.format(sagemaker_session.default_bucket(), "batch-inference-output-data"))

print(json.dumps(modelpackage_validation_specification, indent=4, sort_keys=True))

c. Create the model package with inference specification and validation specification

To create the model package using SageMaker SDK, run the following code in the notebook:

model_package_name = sm_model_name_vgg11 + "-" + str(round(time.time()))
create_model_package_input_dict = {
    "ModelPackageName" : model_package_name,
    "ModelPackageDescription" : "Model of pre-trained VGG11",
    "CertifyForMarketplace" : True
}
create_model_package_input_dict.update(modelpackage_inference_specification)
create_model_package_input_dict.update(modelpackage_validation_specification)
print(json.dumps(create_model_package_input_dict, indent=4, sort_keys=True))

sm.create_model_package(**create_model_package_input_dict)

To check the creation status, run the following code in the notebook:

while True:
    response = sm.describe_model_package(ModelPackageName=model_package_name)
    status = response["ModelPackageStatus"]
    print(status)
    if status in ("Completed", "Failed"):
        print(response["ModelPackageStatusDetails"])
        break
    time.sleep(100)

Congratulations! You have created another model package for listing.


Step 8: List your model package in the AWS Marketplace Management Portal

After you create your model packages, they appear on the SageMaker console. Open the SageMaker console, choose Model packages in the left navigation pane, and find the model packages that you just created. Select a model package and choose Publish new ML Marketplace listing. In the following screenshot, I selected the torchserve-vgg11 model and chose Publish new ML Marketplace listing from the Actions menu in the upper right.

You’re redirected to the AWS Marketplace Management Portal. To start publishing your first PyTorch model, follow the instructions in the Management Portal. For more information, see Machine learning products in the AWS Marketplace Seller Guide.


Conclusion

In this blog post, I showed you how to use TorchServe to create two machine learning model package listings for AWS Marketplace with the same TorchServe Docker image. TorchServe provides a convenient and flexible way to host an endpoint for PyTorch machine learning models, and it supports a variety of default handlers. You can also write your own handler to better support your unique model and then update your Docker image.

About the Author

Rick Cao is a machine learning engineer with the AWS Marketplace machine learning group. He enjoys applying cutting-edge technologies and building AI/ML solutions with cloud architecture to solve business needs. Before joining AWS, Rick gained more than five years of experience in the financial industry. He has a master’s degree in computer science (machine learning major) and a master’s degree in financial engineering.