AWS Machine Learning Blog

Call an Amazon SageMaker model endpoint using Amazon API Gateway and AWS Lambda

At AWS Machine Learning workshops, customers often ask, “After I deploy an endpoint, where do I go from there?” You can deploy an Amazon SageMaker trained and validated machine learning model as an endpoint in production. Alternatively, you can choose which Amazon SageMaker functionality to use. For example, you could choose just to train a model or to host one. Whether you choose one Amazon SageMaker functionality or use them all, you will invoke the model as an endpoint deployed somewhere.

The following diagram shows how the deployed model is called using serverless architecture. Starting from the client side, a client script calls an Amazon API Gateway API action and passes parameter values. API Gateway is a layer that provides API to the client. In addition, it seals the backend so that AWS Lambda stays and executes in a protected private network. API Gateway passes the parameter values to the Lambda function. The Lambda function parses the value and sends it to the SageMaker model endpoint. The model performs the prediction and returns the predicted value to AWS Lambda. The Lambda function parses the returned value and sends it back to API Gateway. API Gateway responds to the client with that value.

In this blog post, I’ll show you how to invoke a model endpoint deployed by Amazon SageMaker using API Gateway and AWS Lambda. For testing purposes, we will use Postman.

Using the breast cancer model notebook

We are using a sample notebook provided by Amazon SageMaker, called Breast Cancer Prediction.ipynb. You have access to this notebook in the SageMaker Examples tab shown in the following image. Choosing the Use button creates a folder and loads the notebook. The breast cancer prediction model predicts whether the breast mass is a malignant turmor or benign by looking at features computed from a digitized image of a fine needle aspirate of a breast mass. The data used to train the model consists of the diagnosis as well as the ten real-valued features that are computed for each cell nucleus. Such features include radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension. The prediction returned by the model will be either 0 or 1; 0 being benign and 1 being malignant tumor. The Lambda function will convert this value to be either “B” for benign or “M” for malignant turmor.

Amazon SageMaker has managed built-in Jupyter Notebooks that allow you to write code in Python, Julia, R, or Scala to explore, analyze, and do some modeling with small set of data. The sample Jupyter notebooks get loaded onto a notebook instance when the notebook instance boots up. Each sample notebook consists of markdown-based comments that explain each step, from downloading training data, performing the training, to deploying a model endpoint. After the model is trained and deployed, you can invoke the model endpoint using the Amazon SageMaker runtime API. In order to make it free from server and infrastructure management, we will encapsulate this invocation using Amazon API Gateway and AWS Lambda.

Create a SageMaker model endpoint

The first step is to execute the Breast Cancer Prediction.ipynb sample notebook. After opening the notebook, in the Setup section you fill in your Amazon S3 bucket name as shown in the following image.

Before moving on, comment out the last cell by inserting a # as it deletes the endpoint that is created in the previous cell.

After filling in the S3 bucket name and commenting out the endpoint deletion, you can execute the entire notebook by choosing Run All from the Cell menu. Alternatively, you can execute each cell one by one by pressing the Shift and Enter keys at the same time. If execute each cell, you will learn what each step is doing. For the purpose of this blog post, I chose Run All to deploy the model as an endpoint after the model training was complete.

Upon creation, you can view this endpoint from the Amazon SageMaker console. The default endpoint name looks like this “linear-endpoint-201803211721,” but you can make it more meaningful. I called mine “linear-learner-breast-cancer-prediction-endpoint.”

Create a Lambda function that calls the SageMaker Runtime Invoke_Endpoint

Now we have a SageMaker model endpoint. Let’s look at how we call it from Lambda. There is an API action called SageMaker Runtime and. We use the boto3 sagemaker-runtime.invoke_endpoint(). From the AWS Lambda console, choose Create function to display the following screen.

Name your Lambda function and choose an IAM execution role that includes the following policy, which gives your Lambda function permission to invoke a model endpoint.

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "*"

The sample Lambda function code follows.

import os
import io
import boto3
import json
import csv

# grab environment variables
runtime= boto3.client('runtime.sagemaker')

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    data = json.loads(json.dumps(event))
    payload = data['data']
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
    result = json.loads(response['Body'].read().decode())
    pred = int(result['predictions'][0]['predicted_label'])
    predicted_label = 'M' if pred == 1 else 'B'
    return predicted_label

ENDPOINT_NAME is an environment variable that holds the name of the SageMaker model endpoint you just deployed using the sample notebook as shown in the following screenshot. Replace the value “linear-learner-breast-cancer-prediction-endpoint” with the endpoint name you created, if it is different. The event that invokes the Lambda function is triggered by API Gateway. API Gateway simply passes the test data through an event.   

Create an API Gateway – Integration request setup

You can create an API Gateway by following these steps.

  1. Open the Amazon API Gateway console. Choose the Create API Choose an API name, which should be something like BreastCancerPrediction. Leave the Endpoint Type as Regional. Choose Create API.
  1. Next, create a Resource choosing from the Actions drop-down list, giving it a name like “predictbreastcancer.” When the resource is created, from the same drop-down list, choose Create Method to create a POST method.
  1. On the screen that appears, do the following:
    1. For Integration type, choose Lambda Function.
    2. For Lambda Function, enter the function you created in step 2.
  1. After the setup is complete, deploy the API to a stage. From the Actions drop-down list, choose Deploy API.
  2. On the page that appears, create a new stage. Call it “test,” and choose the Deploy This step will give you the Invoke URL.

For more detailed information on how to create an API with API Gateway, refer to the documentation. In addition, you can make the API more secure using various methods.

Now that you have an API Gateway and a Lambda function in place, let’s look at the test data next. 

Test data

When the sample notebook loads the data set to the Amazon S3 bucket that you specified, CSV files with data get loaded. The sample notebook separates the data set into two files. The first file is for training data, and the second file is for validation data. Each file is saved in it’s respective folder as shown in the following screenshot. The Amazon S3 path to the file looks like <your bucket name>/sagemaker/DEMO-breast-cancer-prediction/validation.

For example, see one row of the validation data from the file called in the validation folder. A single row of the data follows. The label for each was discussed earlier in the Using the Breast Cancer Model notebook section .


Test with Postman

Now that we have the Lambda function, an API Gateway,  and the test data, let’s test it using Postman, which is an HTTP client for testing web services. You can download the latest version of Postman here.

When you deployed your API Gateway, it provided the invoke URL that looks like:


It follows the format:


Documentation on Invoking an API in API Gateway is found here.

Place the Invoke URL into Postman as shown in the following screenshot and choose POST as method. In the Body tab, place the test data as shown in the following screenshot. Choose the Send button and you will see the returned result as “B” for the case of the test data I picked earlier.


That is it. You have created a model endpoint deployed and hosted by Amazon SageMaker. Then you created serverless components (an API Gateway and a Lambda function) that invoke the endpoint. Now you know how to call a machine learning model endpoint hosted by Amazon SageMaker using serverless technology.

If you have feedback about this blog post, please add it to the following Comments section. If you have questions about implementing the example used in this post, please comment below or open a thread on the Developer Tools forum.

About the Author

Rumi Olsen is a Solutions Architect in the AWS Partner Program. She specializes in serverless and machine learning solutions in her current role, and has a background in natural language processing technologies. She spends most of her spare time with her daughter exploring the nature of Pacific Northwest.