AWS Machine Learning Blog

Deploy trained Keras or TensorFlow models using Amazon SageMaker

Amazon SageMaker makes it easier for any developer or data scientist to build, train, and deploy machine learning (ML) models. While it’s designed to alleviate the undifferentiated heavy lifting from the full life cycle of ML models, Amazon SageMaker’s capabilities can also be used independently of one another; that is, models trained in Amazon SageMaker can be optimized and deployed outside of Amazon SageMaker (or even out of the cloud on mobile or IoT devices at the edge). Conversely, Amazon SageMaker can deploy and host pre-trained models from model zoos, or other members of your team.

In this blog post, we’ll demonstrate how to deploy a trained Keras (TensorFlow or MXNet backend) or TensorFlow model using Amazon SageMaker, taking advantage of Amazon SageMaker deployment capabilities, such as selecting the type and number of instances, performing A/B testing, and Auto Scaling.  Auto Scaling clusters are spread across multiple Availability Zones to deliver high performance and high availability.

Your trained model will need to be saved in either the Keras (JSON and weights hdf5) format or the TensorFlow Protobuf format. If you’d like to begin from a sample notebook that supports this blog post, download it here.

For more on training the model on SageMaker and deploying, refer to this notebook on Github.

Step 1. Set up

In the AWS Management Console, go to the Amazon SageMaker console. Choose Notebook Instances, and create a new notebook instance. Upload the current notebook and set the kernel to conda_tensorflow_p36.

The get_execution_role function retrieves the AWS Identity and Access Management (IAM) role you created at the time of creating your notebook instance.

import boto3, re
from sagemaker import get_execution_role

role = get_execution_role()

Step 2. Load the Keras model using the JSON and weights file

If you saved your model in the TensorFlow ProtoBuf format, skip to “Step 4. Convert the TensorFlow model to an Amazon SageMaker-readable format.”

import keras
from keras.models import model_from_json

!mkdir keras_model

Navigate to keras_model from the Jupyter notebook home, and upload your model.json and model-weights.h5 files (using the “Upload” menu on the Jupyter notebook home). To use a sample model for this exercise download and unzip the files found here, then upload them to keras_model.

!ls keras_model
json_file = open('/home/ec2-user/SageMaker/keras_model/'+'model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights('/home/ec2-user/SageMaker/keras_model/model-weights.h5')
print("Loaded model from disk")

Step 3. Export the Keras model to the TensorFlow ProtoBuf format

from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants
# Note: This directory structure will need to be followed - see notes for the next section
model_version = '1'
export_dir = 'export/Servo/' + model_version
# Build the Protocol Buffer SavedModel at 'export_dir'
builder = builder.SavedModelBuilder(export_dir)
# Create prediction signature to be used by TensorFlow Serving Predict API
signature = predict_signature_def(
    inputs={"inputs": loaded_model.input}, outputs={"score": loaded_model.output})
from keras import backend as K

with K.get_session() as sess:
    # Save the meta graph and variables
    builder.add_meta_graph_and_variables(
        sess=sess, tags=[tag_constants.SERVING], signature_def_map={"serving_default": signature})
    builder.save()

Step 4. Convert TensorFlow model to an Amazon SageMaker-readable format

Move the TensorFlow exported model into a directory export\Servo. Amazon SageMaker will recognize this as a loadable TensorFlow model. Your directory and file structure should look like this:

!ls export

!ls export/Servo

!ls export/Servo/1

!ls export/Servo/1/variables

Tar the entire directory and upload to Amazon S3

import tarfile
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)
import sagemaker

sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')

Step 5. Deploy the trained model

The entry_point file train.py can be an empty Python file. The requirement will be removed at a later date.

!touch train.py
from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '1.12',
                                  entry_point = 'train.py')
%%time
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')

Note: You need to update the endpoint in the following command with the endpoint name from the output of the previous cell (INFO:sagemaker:Creating endpoint with name sagemaker-tensorflow-2019-01-29-17-36-55-987).

endpoint_name = 'sagemaker-tensorflow-2019-01-29-17-36-55-987'
import sagemaker
from sagemaker.tensorflow.model import TensorFlowModel
predictor=sagemaker.tensorflow.model.TensorFlowPredictor(endpoint_name, sagemaker_session)

Step 6. Invoke the endpoint

Invoke the Amazon SageMaker endpoint from the notebook

import numpy as np

# The sample model expects an input of shape [1,50]
data = np.random.randn(1, 50)
predictor.predict(data)

Invoke the Amazon SageMaker endpoint using a boto3 client

import json
import boto3
import numpy as np
import io
 
client = boto3.client('runtime.sagemaker')
# The sample model expects an input of shape [1,50]
data = np.random.randn(1, 50).tolist()
response = client.invoke_endpoint(EndpointName=endpoint_name, Body=json.dumps(data))
response_body = response['Body']
print(response_body.read())

Step 7. Clean up

To avoid incurring unnecessary charges, use the AWS Management Console to delete the resources that you created for this exercise: https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html

Conclusion

In this blog post, we demonstrated deploying a trained Keras or TensorFlow model at scale using Amazon SageMaker, independent of the computing resource used for model training. This gives you the flexibility to use your existing workflows for model training, while easily deploying the trained models to production with all the benefits offered by a managed platform. These benefits include the ability to select the optimal type and number of deployment instances, perform A/B testing, and auto scale. The Auto Scaling clusters of Amazon SageMaker ML instances can be spread across multiple Availability Zones to deliver both high performance and high availability.


About the author

Priya Ponnapalli is a principal data scientist at Amazon ML Solutions Lab, where she helps AWS customers across different industries accelerate their AI and cloud adoption.