AWS Marketplace

Integrating machine learning models into your Java-based microservices

Machine learning (ML) enables you to deliver more value to your customers by using your data to automate decisions and transform your business. Pre-trained ML models can speed outcomes for real-time object and person detection, optical character recognition, and other use cases. By performing inferences on an ML model in the application’s workflow, you can implement intelligent features that improve the user’s experience, such as Know Your Customer (KYC) checks, credit scoring automation, and object or people detection.

In my customer interactions, I come across teams who want to integrate these ML models into their Java-based microservices’ real-time inference workflow. In this blog post, I share how to perform a real-time inference on a general-purpose ML model in AWS Marketplace by using the AWS SDK for Java.

ML model background concepts

An ML model is a mathematical model that generates predictions by finding patterns in your data. Performing an inference refers to the activity of sending input data and getting a prediction back from the ML model. ML models are deployed to perform two types of inferences:

  • Real-time Inference to synchronously generate predictions for individual data observations.
  • Batch Inference to asynchronously generate predictions for multiple input data observations.
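The difference between the two modes can be sketched in plain Java. The following snippet is purely illustrative: the class, method names, and the trivial stand-in "model" are my own and are not part of any AWS SDK. A real deployment would call an actual endpoint instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

public class InferenceModes {
    // Stand-in "model": a deployed ML endpoint would run real inference instead.
    static final Function<String, String> MODEL =
            input -> input.length() > 3 ? "object" : "background";

    // Real-time inference: one observation in, one prediction back, synchronously.
    static String realTime(String observation) {
        return MODEL.apply(observation);
    }

    // Batch inference: many observations submitted together; predictions are
    // collected asynchronously once the whole batch has been processed.
    static CompletableFuture<List<String>> batch(List<String> observations) {
        return CompletableFuture.supplyAsync(
                () -> observations.stream().map(MODEL).collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(realTime("street"));                      // prints: object
        System.out.println(batch(Arrays.asList("street", "cat")).join());
    }
}
```

This post focuses on the first mode: a synchronous call per observation inside the request path of a microservice.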

Pre-trained Machine Learning (ML) models from AWS Marketplace are ready-to-use models that can be quickly deployed on Amazon SageMaker, a fully managed cloud machine learning service.

In this blog post, I will deploy a general-purpose ML model that accepts an image and returns a list of objects found in the image as part of the inference. After deploying the model, I will show you how to write Java code that integrates it into your application workflow.


Prerequisites

To walk through this solution, you need the following prerequisites:

  • Java 8 and Apache Maven installed in your development environment.
  • An Integrated Development Environment (IDE) for writing your code.

Solution overview

The solution consists of the following steps.

  1. Choose and deploy an ML model. In this step, you first identify an ML model listing and create an AWS Marketplace subscription to the ML model. You then use AWS CloudFormation to deploy the ML model in the form of an Amazon SageMaker endpoint.
  2. Configure the credentials that your application uses to make the inference call. In this step, you create an AWS Identity and Access Management (IAM) user, following the principle of least privilege, and configure AWS access keys in your development environment.
  3. Write the Java code to perform an inference.

Step 1: Choose and deploy an ML model

First, choose the ML model to integrate in your Java-based microservice. You can browse hundreds of ML models in AWS Marketplace. For this post, I use GluonCV YOLOv3 Object Detector.

If you’re using Private Marketplace, you must add the model to your Private Marketplace before you can subscribe to it. If you’re deploying an ML model that was made available to you via a private offer, the seller gives you a link to the product detail page where you can subscribe. To subscribe to the private offer product from an account in an AWS Organizations organization, first subscribe to the offer from the management account, and then subscribe from the account in which you want to deploy the product.

A. Subscribe to the model package

  1. Log in to your AWS account.
  2. Open the GluonCV YOLOv3 Object Detector listing in AWS Marketplace.
  3. Review the product overview, pricing, highlights, usage information, instance types that the listing is compatible with, and additional resources.
  4. Choose Continue to Subscribe.
  5. Review the end user license agreement and software pricing.
  6. After your organization agrees to the licensing and pricing, choose Accept offer.

B. Configure the model to use AWS CloudFormation for deployment

  1. To open the model’s configuration page, choose Continue to Configuration.
  2. On the Configure and Launch page, choose AWS CloudFormation as the launch method.
  3. In the Configure for CloudFormation section, choose the latest version and the US East (N. Virginia) Region.
  4. Under Service access, do one of the following:
    • Choose an IAM role that has the AmazonSageMakerFullAccess IAM policy attached.
    • If no such IAM role exists, choose Create and use a new service role.
  5. Choose Launch CloudFormation Template.

This opens the AWS CloudFormation console’s Quick create stack page.

C. Deploy the ML model

  1. For EndPointName, enter gluon-endpoint.
  2. For the rest of the parameters, keep the default values.
  3. Choose Create Stack.

The AWS CloudFormation template deploys the model behind an Amazon SageMaker endpoint. After you have created the stack, wait for its status to change to CREATE_COMPLETE.
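Before your application sends traffic to the endpoint, it is worth waiting until the endpoint reports that it is ready. The following sketch shows the polling pattern with a pluggable status source; the helper class and its simulated statuses are my own, not part of the SDK. In a real application the supplier would wrap a call to SageMaker's DescribeEndpoint API, or you could rely on the SDK's built-in waiters instead of hand-rolling the loop.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class EndpointWaiter {
    /**
     * Polls a status source until it reports "InService" or attempts run out.
     * Returns true if the endpoint became ready within maxAttempts polls.
     */
    public static boolean waitUntilInService(Supplier<String> status,
                                             int maxAttempts, long pauseMillis) {
        for (int i = 0; i < maxAttempts; i++) {
            if ("InService".equals(status.get())) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated status sequence: still creating twice, then in service.
        Iterator<String> statuses =
                Arrays.asList("Creating", "Creating", "InService").iterator();
        System.out.println(waitUntilInService(statuses::next, 5, 10)); // prints: true
    }
}
```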

Step 2: Create an IAM user

If you don’t have a user with the necessary credentials, you must create one. To do so, follow these steps.

A. Create an IAM user

  1. In the IAM console, open the Users page.
  2. Choose Add user.
  3. For User name, enter TestSDK and select Programmatic access.
  4. Choose Next: Permissions.
  5. Choose Next: Tags.
  6. Choose Next: Review.
  7. Choose Create User.

B. Download the credentials

  1. On the Add user Success page, choose Download .csv.
  2. Save the file in a trusted location. The downloaded file contains the access key ID and the secret access key. Treat the secret access key as a password and don’t share it. This is your only opportunity to download or copy the secret access key.

C. Configure the credentials in your development environment

Next, configure the credentials in your development environment. The AWS SDK for Java uses the access key ID and secret access key as credentials when your application makes requests to AWS.

  1. If a file exists at the following location, open it. Otherwise, create a file at the location and open it with any text editor.

     Operating system       File name
     Windows                C:\Users\<yourUserName>\.aws\credentials
     Linux, macOS, Unix     ~/.aws/credentials
  2. Append the following profile to the file. The profile name in brackets must match the profile name that your Java code passes to the SDK.
    [TestSDKProfile]
    aws_access_key_id = YOUR_AWS_ACCESS_KEY_ID
    aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY
  3. Replace YOUR_AWS_ACCESS_KEY_ID with the value corresponding to Access key ID and replace YOUR_AWS_SECRET_ACCESS_KEY with the value corresponding to Secret access key from the file that you downloaded.
  4. Save and close the file.

D. Add permissions to your IAM user

The user you created in Step 2A doesn’t have IAM permissions to invoke the endpoint yet. To add the necessary permissions, follow these steps.

  1. In the IAM console, open the TestSDK user.
  2. Choose Add inline policy.
  3. On the Create policy page, choose the JSON tab.
  4. Replace the contents of the text area with the following policy.
    			"Version": "2012-10-17",
    			"Statement": [{
    				"Sid": "stmt",
    				"Effect": "Allow",
    				"Action": "sagemaker:InvokeEndpoint",
    				"Resource": "arn:aws:sagemaker:us-east-1:*:endpoint/gluon-endpoint"
  5. Choose Review policy.
  6. On the Review policy page, for Name, specify SageMakerInvokeEndpoint.
  7. Choose Create policy.

Congratulations, you have configured an IAM user’s credentials for performing inference using the AWS SDK for Java.

Step 3: Perform the inference via the AWS SDK for Java in the Eclipse IDE

For this blog post, I use the Eclipse IDE, version 2021-03 (4.19.0), build 20210312-0638.

A. Create a project in your Eclipse workspace

To set up your Eclipse workspace, you can either decompress and import this Maven project and skip to Step 3.C.4 or create a project by following these steps.

  1. Open the Eclipse IDE. On the main menu, choose File, then New, then Project.
  2. Under Maven, choose Maven Project. Choose Next.
  3. Select the Create a simple project (skip archetype selection) check box. Choose Next.
  4. For Group Id, enter example.myapp.
  5. For Artifact Id, enter performinference.
  6. Keep the remaining default values and choose Finish.

The new project appears in Package Explorer.

B. Declare dependencies

Next, declare dependencies for the AWS SDK for Java and its SageMaker Runtime library in pom.xml so that the required libraries are included in the build path at runtime. To do so, in the Eclipse IDE, replace the contents of performinference/pom.xml with the following configuration, and then save and close the file.

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>example.myapp</groupId>
  <artifactId>performinference</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <!-- SageMaker Runtime client from the AWS SDK for Java 2.x;
         pin the version to the latest 2.x release available to you -->
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>sagemakerruntime</artifactId>
      <version>2.17.100</version>
    </dependency>
  </dependencies>
</project>

C. Create a Java package in Eclipse

  1. In the Package Explorer of the Eclipse IDE, create a Java package named example.myapp and a Java class named Test.
  2. Specify the following imports in the Java file. These import statements import the relevant AWS SDK for Java classes. You will use these classes to write the code to perform an inference on the ML model you deployed in Step 1 of this blog post.
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.ByteBuffer;

    import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
    import software.amazon.awssdk.core.SdkBytes;
    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.sagemakerruntime.SageMakerRuntimeClient;
    import software.amazon.awssdk.services.sagemakerruntime.model.InvokeEndpointRequest;
    import software.amazon.awssdk.services.sagemakerruntime.model.InvokeEndpointResponse;
  3.  To construct an InvokeEndpointRequest and use the SageMaker Runtime library to perform an inference via the invokeEndpoint function call, add the following code to the class. After the inference has been performed, the code extracts and prints the inference result. If you skipped Step 2, update the value of the awsCredentialsProfileName variable to the profile name that you have configured in your development environment.
    public static void main(String[] args) throws IOException {
    	// Set variables
    	String endpointName = "gluon-endpoint";
    	String contentType = "image/jpeg";
    	String fileName = "/car.jpg";
    	String awsCredentialsProfileName = "TestSDKProfile";
    	// Read the payload into a variable
    	InputStream fs = Test.class.getResourceAsStream(fileName);
    	SdkBytes body = SdkBytes.fromByteBuffer(ByteBuffer.wrap(fs.readAllBytes()));
    	// Build an invocation request object
    	InvokeEndpointRequest request = InvokeEndpointRequest.builder()
    			.endpointName(endpointName)
    			.contentType(contentType)
    			.body(body)
    			.build();
    	// Load credentials from the named profile
    	ProfileCredentialsProvider profile = ProfileCredentialsProvider.builder()
    			.profileName(awsCredentialsProfileName)
    			.build();
    	// Build the SageMaker Runtime client
    	SageMakerRuntimeClient runtime = SageMakerRuntimeClient.builder()
    			.region(Region.US_EAST_1)
    			.credentialsProvider(profile)
    			.build();
    	// Perform an inference
    	InvokeEndpointResponse result = runtime.invokeEndpoint(request);
    	// Print the inference result
    	System.out.println(result.body().asUtf8String());
    }
    As you can see, you must call SageMakerRuntimeClient‘s invokeEndpoint with the image encapsulated in an InvokeEndpointRequest object. The invokeEndpoint method returns the prediction in an InvokeEndpointResponse instance. For more information, see tutorials, quick start guides, and Java documentation at Build Java applications on AWS. You can find more information on how to pass credentials to your environment in this documentation.

  4. Upload this image payload to <eclipse_workspace>/<project_name>/src/main/resources and rename it car.jpg. You will send this image to the ML model to perform real-time inference.
  5. Build and run the project.
    • In Package Explorer, open the context menu (right-click) for the project and choose Run As, then Java Application.
    • On the Select Java Application panel, choose Test as the main class.
    • Choose OK.

The following output appears in the console (truncated here for brevity).

[{
	"right": 1223,
	"bottom": 1146,
	"top": 540,
	"score": 0.9871261715888977,
	"id": "car",
	"left": 174
}, {
	"right": 1793,
	"bottom": 791,
	"top": 723,
	"score": 0.4036855697631836,
	"id": "fire hydrant",
	"left": 1758
}, {
	"right": 778,
	"bottom": 721,
	"top": 592,
	"score": 0.15938007831573486,
	"id": "person",
	"left": 628
}]

Congratulations, you have successfully invoked a SageMaker endpoint hosting an ML model using the AWS SDK for Java. You can now integrate the model with your Java-based microservice.
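In a microservice you will usually post-process the response before acting on it, for example by discarding low-confidence detections such as the 0.16-score person above. The following sketch extracts the labels of detections above a threshold; the class and method names are my own, not part of the SDK, and the regex-based extraction uses only the JDK. In production you would use a proper JSON parser such as Jackson instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DetectionFilter {
    // Matches the "score" and "id" fields inside each detection object of the
    // response JSON; [^}]*? keeps the match within a single detection object.
    private static final Pattern DETECTION = Pattern.compile(
            "\"score\":\\s*([0-9.]+)[^}]*?\"id\":\\s*\"([^\"]+)\"");

    /** Returns the labels of all detections whose score meets the threshold. */
    public static List<String> confidentLabels(String responseJson, double threshold) {
        List<String> labels = new ArrayList<>();
        Matcher m = DETECTION.matcher(responseJson);
        while (m.find()) {
            if (Double.parseDouble(m.group(1)) >= threshold) {
                labels.add(m.group(2));
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        String json = "[{\"right\": 1223, \"score\": 0.987, \"id\": \"car\", \"left\": 174},"
                + " {\"score\": 0.159, \"id\": \"person\"}]";
        System.out.println(confidentLabels(json, 0.5)); // prints: [car]
    }
}
```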

Cleaning up

To avoid incurring costs from this solution, follow these steps.

  1. Open the AWS CloudFormation console.
  2. Choose the stack named Stack-GluonCV-YOLOv3-Object-Detector-1.
  3. Choose Delete.
  4. When prompted to confirm, choose Delete stack.

You can also cancel the subscription to the model you subscribed to in Step 1 from the Manage subscriptions page.


Conclusion

In this post, I showed how to invoke a SageMaker endpoint hosting a general-purpose ML model by using the AWS SDK for Java. I also showed how to integrate the ML model with your Java-based microservice to perform real-time inferences in your application workflow. By performing inferences on an ML model directly in your application’s workflow, you can implement intelligent features, such as KYC checks, credit scoring automation, and object or people detection, that improve your user’s experience.

Next steps

If you need a customization of a model to meet your specific requirements, contact the AWS Marketplace team for assistance. If you’re interested in selling an ML algorithm or a pre-trained model package, see Sell Amazon SageMaker Algorithms and Model Packages.

For more information, see the domain-specific videos in the AWS Marketplace for Machine Learning video playlist, which use models and algorithms available in AWS Marketplace.

In AWS Marketplace, check out:

  • ML models to directly deploy and perform real-time and batch inference.
  • AWS Marketplace Professional Services, which enable you to request ML model customization from the ML vendor to match your model interface, latency, and metric requirements.
  • Private offers to procure third-party ML models.

About the author

Kanchan Waikar is a Senior Specialist Solutions Architect at Amazon Web Services with the AWS Marketplace for machine learning group. She has over 14 years of experience building, architecting, and managing natural language processing (NLP) and software development projects. She has a master’s degree in computer science (data science major) and enjoys helping customers build solutions backed by AI/ML-based AWS services and partner solutions.