AWS Big Data Blog

Serving Real-Time Machine Learning Predictions on Amazon EMR

by Derek Graeber and Guy Ernest | in Amazon EMR

The typical progression for creating and using a trained model for recommendations falls into two general areas: training the model and hosting the model. Model training has become a well-known standard practice. We want to highlight one of many ways to host those recommendations (for example, see the Analyzing Genomics Data at Scale using R, AWS Lambda, and Amazon API Gateway post).

In this post, we look at one possible way to host a trained ALS model on Amazon EMR using Apache Spark to serve movie predictions in real time. It is a continuation of two recent posts that are prerequisites: the Building a Recommendation Engine post and the Installing and Running JobServer post.

In future posts we will cover other alternatives for serving real-time machine-learning predictions, namely AWS Lambda and Amazon EC2 Container Service, by running the prediction functions locally and loading the saved models from S3 to the local execution environments.

Walkthrough: Trained ALS model

For this walkthrough, you use the MovieLens dataset as set forth in the Building a Recommendation Engine post; the data model should have already been generated and persisted to Amazon S3. That post uses the Alternating Least Squares (ALS) algorithm to train the model on the ratings data.
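
To make the role of the trained model concrete: an ALS model factorizes the ratings matrix into one latent factor vector per user and one per movie, and a predicted rating is the dot product of the two. The following is a minimal sketch in plain Python, not the Spark implementation used in this post. The factor values and movie IDs are invented for illustration.

```python
# Toy illustration of how an ALS-style model scores movies.
# All factor values and IDs below are invented for this sketch.

def predict_rating(user_factors, movie_factors):
    """Predicted rating = dot product of the two latent factor vectors."""
    return sum(u * m for u, m in zip(user_factors, movie_factors))


def recommend(user_factors, movie_factor_table, top_n=2):
    """Return the top_n (movie_id, score) pairs, highest score first."""
    scored = [
        (movie_id, predict_rating(user_factors, factors))
        for movie_id, factors in movie_factor_table.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical rank-2 factors for one user and three movies.
user_100 = [1.0, 2.0]
movies = {
    "movie_a": [0.5, 1.5],
    "movie_b": [2.0, 0.0],
    "movie_c": [1.0, 2.0],
}

top = recommend(user_100, movies)
```

Serving recommendations in real time then amounts to keeping these factors in memory and running this scoring step on request, which is what persisting the model in JobServer gives you.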

Using JobServer, you take that model and persist it in memory in JobServer on Amazon EMR. After it’s persisted, you can expose RESTful endpoints to AWS Lambda, which in turn can be invoked from a static UI page hosted on S3, securing access with Amazon Cognito.

Here are the steps that you follow:

  1. Create the infrastructure, including EMR with JobServer and Lambda.
  2. Load the trained model into Spark on EMR via JobServer.
  3. Stage a static HTML page on S3.
  4. Access the AWS Lambda endpoints via the static HTML page authenticated with Amazon Cognito.

The following diagram shows the infrastructure architecture.


Create the hosting environment

Because you are already familiar with JobServer on EMR from the prerequisite Installing and Running JobServer post, this post won’t review most of that work. You use EMR 4.7.1, as does the earlier post. At the time of this writing, JobServer does not have a version for Spark 2.0, so you use JobServer v0.6.2.

Be sure to get the updated aws-blog-jobserver-emr project. The project structure has not changed, but code and configuration have been added. Specifically, under the <project> directory, reference the following:

  • <project>/BA – the bootstrap actions to install JobServer on EMR
  • <project>/cfn – the location of the necessary AWS CloudFormation templates
  • <project>/html – the static HTML page to be hosted in S3
  • <project>/jobserver_configs – the configurations for JobServer
  • <project>/ml_commands – sample invocation commands using curl (for reference only)
  • <project>/policy – the role policies for artifacts
  • <project>/python_lambda – the source code for the AWS Lambda Python code
  • <project>/src – the Scala source code for JobServer

You expose three endpoints in JobServer (and Lambda):

  1. LoadModelAndData – This takes both the model and data stored on S3 and loads them into JobServer. You MUST invoke this endpoint first.
  2. MoviesRec – The recommendation endpoint for all movies for a user (in this case, userId=100).
  3. MoviesRecByGenre – The recommendation endpoint for movies of a specific genre for a specific user (in this case, userId=100).
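
As a rough sketch of what invoking one of these endpoints looks like over REST: spark-jobserver accepts job submissions as a POST to /jobs, with the appName, classPath, context, and sync query parameters, and listens on port 8090 by default. The helper below only builds that URL. The host name, app name, context name, and class path are hypothetical placeholders, not the values used in the project; see <project>/ml_commands for the real invocations. The sketch is written for Python 3.

```python
from urllib.parse import urlencode


def job_url(host, app_name, class_path, context, port=8090, sync=True):
    """Build the spark-jobserver URL for submitting a job via POST /jobs."""
    query = urlencode({
        "appName": app_name,      # name the .jar was uploaded under
        "classPath": class_path,  # fully qualified job class
        "context": context,       # long-running context holding the model
        "sync": "true" if sync else "false",
    })
    return "http://{}:{}/jobs?{}".format(host, port, query)


# All argument values here are hypothetical placeholders.
url = job_url(
    host="emr-master.example.com",
    app_name="ml",
    class_path="com.example.MoviesRec",
    context="ml-context",
)
```

Submitting to the same named context each time is what lets the later recommendation jobs reuse the model that LoadModelAndData placed in memory.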

Use AWS CloudFormation to create the EMR cluster that JobServer runs on and the Lambda functions that access JobServer.

To get started, prep the deployment and review the assumptions.

Prep the deployment and stage configurations for JobServer

You use S3 as the landing point for all necessary configurations for JobServer and Lambda. You MUST be familiar with the JobServer on EMR post and understand the configurations discussed in detail. We touch on them only briefly in this post as we walk you through the following steps. For each step, be sure to note the full S3 object path.

  1. Compile the JobServer code with Maven and put the output .jar file in the S3 bucket (jobserverEmr-1.0.jar).
  2. Put the file <project>/jobserver_configs/emr_contexts_ml.conf in the S3 bucket.
  3. Put the file <project>/jobserver_configs/emr_v1.6.2.conf in the S3 bucket.
  4. Open the BA at <project>/BA/ and update the following properties:
    • EMR_SH_FILE property – the path to the object on S3 to emr_v1.6.2.conf
    • EMR_CONF_FILE property – the path to the object on S3 to emr_contexts_ml.conf
  5. Put the BA in the S3 bucket.
  6. Open the BA at <project>/BA/ and update the properties:
    • EMR_JOBSERVER_FILE – object on S3 to jobserverEmr-1.0.jar
  7. Put the BA in the S3 bucket.

At this point, JobServer is configured, and you should have noted the following:

  • The full S3 object path to the bootstrap action staged in step 5
  • The full S3 object path to the bootstrap action staged in step 7
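
Because every later step depends on these full S3 object paths, it can help to build them in one place rather than by hand. The following is a small sketch that does so for the three staged files named above. The bucket name and key prefix are hypothetical placeholders.

```python
def s3_uri(bucket, *key_parts):
    """Join a bucket and key segments into a full s3:// object path."""
    key = "/".join(part.strip("/") for part in key_parts if part.strip("/"))
    return "s3://{}/{}".format(bucket, key)


BUCKET = "my-staging-bucket"  # hypothetical bucket name
PREFIX = "jobserver"          # hypothetical key prefix

# File names are the ones staged in steps 1-3 above.
staged = {
    name: s3_uri(BUCKET, PREFIX, name)
    for name in (
        "jobserverEmr-1.0.jar",
        "emr_contexts_ml.conf",
        "emr_v1.6.2.conf",
    )
}
```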

Now, you can create the Python code for Lambda.

Prep the deployment and stage configurations for Lambda

Create a zip file with the Python code and its dependency, using Python 2.7 for the runtime. Under <project>/python_lambda, the file is the source code. It uses the sjs-client API (the python-sjsclient package). Install that package locally so that it can be included in the deployment package zip file. For more information, see Creating a Deployment Package (Python).

pip install python-sjsclient -t <project>

Zip the script together with the sjs-client dependency libraries (make sure that the script is at the root of the archive and not in a subdirectory). Give the output .zip file a descriptive name. Place this zip file in the S3 bucket and be sure to make a note of the full S3 object path.
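
The packaging step can also be scripted. The following is a minimal sketch using only the Python standard library; it writes the handler script at the root of the archive and adds everything under the dependency directory with paths relative to that directory. The function name and all file and directory names are hypothetical.

```python
import os
import zipfile


def build_package(script_path, deps_dir, out_zip):
    """Zip a Lambda handler script plus a directory of installed dependencies."""
    skip = {os.path.abspath(script_path), os.path.abspath(out_zip)}
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # The handler must sit at the archive root, not inside a folder.
        zf.write(script_path, arcname=os.path.basename(script_path))
        for folder, _subdirs, files in os.walk(deps_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                # Avoid re-adding the script or the archive itself when
                # they live inside the dependency directory.
                if os.path.abspath(full_path) in skip:
                    continue
                zf.write(full_path, arcname=os.path.relpath(full_path, deps_dir))
    return out_zip
```

Listing the archive contents afterwards (for example, with zipfile.ZipFile(path).namelist()) is a quick way to confirm that the script landed at the root.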

Now that you have all the dependencies, you can move on to creating the environment with CloudFormation.

Create the JobServer environment

To create the JobServer environment, use CloudFormation and the items you staged on S3. The CloudFormation template is located at <project>/cfn/ml-jobserver.template.

It assumes the following:

  1. The user creating the environment has access to IAM, Route 53, Lambda, and EMR.
  2. It is being run in the default VPC of the account, in a public subnet.
  3. The account already has the EMR_DefaultRole and EMR_EC2_DefaultRole roles defined.

Get the VPCID value of the default VPC and the SubnetID value for the subnet in which the artifacts are created. Provide the following seven parameters as inputs to the CloudFormation template:

  • VPCID – the default VPC
    • vpc-XXXXXXXX
  • SubnetID – the subnet in the VPC to use
    • subnet-XXXXXXXX
  • EMRKeyPair – the name of an EC2 key to be installed on the EMR master node
  • BaJobServerLoc – the S3 location of the BA that installs JobServer
    • s3://<S3folder>/jobserver/BA/
  • StepStartJobServerLoc – the S3 location of the BA
    • s3://<S3folder>/jobserver/BA/
  • LambdaS3BucketName – the name of the bucket where the Python code is located
    • <S3folder>
  • LambdaS3BucketKey – the key (without the bucket) where your Python code is located
    • ml/lambda/
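
If you prefer to launch the stack from a script rather than the console, the seven inputs need to be in the list-of-dictionaries shape that the CloudFormation API expects for its Parameters argument (each entry has a ParameterKey and a ParameterValue). The following sketch checks that all seven are present and converts them. The helper names are hypothetical, the key pair name is a placeholder, and the other values are the placeholders shown in the list above.

```python
REQUIRED = (
    "VPCID",
    "SubnetID",
    "EMRKeyPair",
    "BaJobServerLoc",
    "StepStartJobServerLoc",
    "LambdaS3BucketName",
    "LambdaS3BucketKey",
)


def check_parameters(values):
    """Raise ValueError if any of the seven template inputs is missing or empty."""
    missing = [name for name in REQUIRED if not values.get(name)]
    if missing:
        raise ValueError("missing template parameters: " + ", ".join(missing))


def to_cfn_parameters(values):
    """Convert a plain dict into the CloudFormation Parameters list format."""
    return [
        {"ParameterKey": name, "ParameterValue": values[name]}
        for name in REQUIRED
    ]


# Placeholder values; replace each one with your own.
stack_inputs = {
    "VPCID": "vpc-XXXXXXXX",
    "SubnetID": "subnet-XXXXXXXX",
    "EMRKeyPair": "my-key-pair",
    "BaJobServerLoc": "s3://<S3folder>/jobserver/BA/",
    "StepStartJobServerLoc": "s3://<S3folder>/jobserver/BA/",
    "LambdaS3BucketName": "<S3folder>",
    "LambdaS3BucketKey": "ml/lambda/",
}

check_parameters(stack_inputs)
params = to_cfn_parameters(stack_inputs)
```

The resulting list could then be passed as the Parameters argument when creating the stack with an AWS SDK.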

Browse to the CloudFormation console and execute this template, being sure to supply the required parameters. The template takes approximately 10-12 minutes to execute.

When the template has completed, the EMR cluster is running JobServer with your ML-JobServer code, and the Lambda endpoints are deployed. You can find the names of the Lambda endpoints in the output section. You need these later on to host a static webpage in S3 and invoke the endpoints. The three endpoints created correspond to the three access points you deployed in JobServer:

  • LoadDataLambdaARN
  • RecommendationLambdaARN
  • GenreLambdaARN


Using the following template as the test event for each of the Lambda functions created, you can test each of the Lambda endpoints (be sure to test LoadDataLambda first). In this use case, the model you created with Zeppelin was persisted to S3 as denoted and the MovieLens data is on S3:

{
  "userId": 100,
  "genre": "Comedy",
  "s3DataLoc": "dgraeberaws-blogs/ml/data/movielens/small/",
  "s3ModelLoc": "dgraeberaws-blogs/ml/models/movielens/recommendations/"
}

With the server deployed and the model loaded and tested via Lambda test events, you can move on to creating a static HTML page to access these endpoints.
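
Before moving on, note that the test event can also be built and serialized programmatically, which avoids hand-editing JSON when you switch users or genres. The following is a minimal sketch; the function name is hypothetical, and the field names and sample values are the ones from the test event template in this post.

```python
import json


def make_test_event(user_id, genre, s3_data_loc, s3_model_loc):
    """Build the test event dictionary used for the Lambda endpoints."""
    return {
        "userId": user_id,
        "genre": genre,
        "s3DataLoc": s3_data_loc,
        "s3ModelLoc": s3_model_loc,
    }


event = make_test_event(
    100,
    "Comedy",
    "dgraeberaws-blogs/ml/data/movielens/small/",
    "dgraeberaws-blogs/ml/models/movielens/recommendations/",
)

# Serialized form, suitable for pasting into a console test event
# or for sending as the payload of a Lambda invocation.
payload = json.dumps(event, indent=2)
```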

Create the static UI

You host a static HTML page on S3 and access Lambda as an unauthenticated identity through Amazon Cognito. In this use case, you use a federated identity pool, created using the console. You can either create a new IAM role manually or let the Amazon Cognito console create one for you. The unauthenticated role must have a policy attached that allows access to Lambda, as well as the proper Amazon Cognito trust policy (see <project>/policy for samples).
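
For orientation, the two policy documents generally look like the following sketch, which builds them as Python dictionaries: a trust policy that lets unauthenticated identities from one identity pool assume the role, and a permissions policy that allows invoking specific Lambda functions. This shows the common shape of such policies, not a copy of the project's files; treat the samples in <project>/policy as authoritative. The function names, pool ID, and ARN below are hypothetical placeholders.

```python
import json


def cognito_unauth_trust_policy(identity_pool_id):
    """Trust policy allowing unauthenticated identities from one pool to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": "cognito-identity.amazonaws.com"},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "cognito-identity.amazonaws.com:aud": identity_pool_id
                },
                "ForAnyValue:StringLike": {
                    "cognito-identity.amazonaws.com:amr": "unauthenticated"
                },
            },
        }],
    }


def lambda_invoke_policy(function_arns):
    """Permissions policy allowing invocation of the listed Lambda functions only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": list(function_arns),
        }],
    }


# Hypothetical placeholder identifiers.
trust = cognito_unauth_trust_policy("us-east-1:00000000-0000-0000-0000-000000000000")
invoke = lambda_invoke_policy(
    ["arn:aws:lambda:us-east-1:123456789012:function:LoadDataLambda"]
)
trust_json = json.dumps(trust, indent=2)
```

Restricting the Resource list to the three function ARNs from the CloudFormation output keeps the unauthenticated role from invoking anything else in the account.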

After the identity pool is created, make note of the pool ID. Now, edit the mltest.html code located in <project>/html. Make four updates to give this page access to the Lambda code:

  • The Amazon Cognito identity pool ID
  • The ARN of the LoadDataLambdaARN function (from the CloudFormation output)
  • The ARN of the GenreLambdaARN function (from the CloudFormation output)
  • The ARN of the RecommendationLambdaARN function (from the CloudFormation output)

Edit the mltest.html page with these values, as indicated below:

Create an S3 bucket and configure it as a static website. After it’s created, place the mltest.html page in the bucket (with proper permissions for public read access) and browse to that object.



We have shown you one possible way to host a real-time recommendation engine on Amazon EMR with JobServer. You can quickly create infrastructure and load the model data. Also, using RESTful endpoints and AWS Lambda, you can expose the model to end users to provide recommendations in real time. This is only one example of a way to provide real-time predictions using standard technologies. Look for future posts related to Machine Learning on the AWS Big Data Blog.

If you have questions or suggestions, please comment below.

About the Authors

Derek Graeber is a Senior Consultant in Big Data & Analytics for AWS Professional Services. He works with enterprise customers to provide leadership on big data projects, helping them realize their business goals when running on AWS. In his spare time, he enjoys spending time with his wife and family, occasionally getting out to reaffirm that he will never be a good golfer.


Guy Ernest is a principal business development manager for machine learning in AWS. He has the exciting opportunity to help shape and deliver on a strategy to build mind share and broad use of Amazon’s cloud computing platform for AI, machine learning, and deep learning use cases. In his spare time, he enjoys spending time with his wife and family, gathering embarrassing stories to share in talks about Amazon and the future of AI.

