AWS Compute Blog
Amazon SageMaker AI now hosts NVIDIA Evo-2 NIM microservices
This post is co-written with Neel Patel, Abdullahi Olaoye, Kristopher Kersten, Aniket Deshpande from NVIDIA.
Today, we’re excited to announce that the NVIDIA Evo-2 NIM microservice is now listed in Amazon SageMaker JumpStart. You can use this launch to deploy accelerated, specialized NIM microservices to build, experiment, and responsibly scale your drug discovery workflows on Amazon Web Services (AWS).
In this post, we demonstrate how to get started with these models using Amazon SageMaker Studio.
NVIDIA NIM microservices on AWS
NVIDIA NIM integrates closely with AWS managed services such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Kubernetes Service (Amazon EKS), and Amazon SageMaker AI to support deployment of generative AI models at scale. As part of NVIDIA AI Enterprise, which is available in AWS Marketplace, NVIDIA NIM is a set of microservices designed to accelerate the deployment of generative AI. These prebuilt containers support a broad spectrum of generative AI models, from open source community models to NVIDIA Nemotron and custom models. NIM microservices can be deployed with just a few lines of code, or with a few actions in the SageMaker Studio console. Engineered for seamless generative AI inferencing at scale, NIM helps ensure that generative AI applications can be deployed across AWS services.
NVIDIA BioNeMo Evo 2 overview
NVIDIA BioNeMo is a platform of NIM microservices, developer tools, and AI models that accelerate building, adapting, and deploying biomolecular AI models for drug discovery. It packages curated training recipes, data loaders, and domain-optimized pretrained models for DNA, RNA, and proteins, alongside NVIDIA CUDA-X libraries such as NVIDIA cuEquivariance. These components power tasks such as 3D structure prediction, de novo design, virtual screening, docking, and property prediction with GPU-accelerated performance.
NVIDIA NIM microservices provide optimized, API-first inference that integrates directly into enterprise pipelines across on-premises and the cloud, providing scalable and secure deployment with faster time-to-market and lower Total Cost of Ownership (TCO). The Evo 2 NIM delivers a 40-billion parameter foundation model (FM) trained on a vast dataset of genomes that can be used to predict protein function, identify mutations, and accelerate bioengineering research. Furthermore, the Evo 2 NIM can be chained with other NIM microservices such as ESMFold to create end-to-end, containerized workflows that cut time-to-insight while streamlining deployment through consistent APIs.
SageMaker Studio overview
SageMaker Studio is a web-based integrated development environment (IDE) for machine learning (ML) that provides a unified visual interface for all of the tools that you need to complete each step of the ML development lifecycle. SageMaker Studio provides complete access, control, and visibility into each step of the ML workflow, from data preparation to model building, training, and deployment.
The key features of SageMaker Studio include:
- Unified interface: Access all SageMaker capabilities through a single, web-based visual interface
- Jupyter notebooks: Fully managed Jupyter notebooks with pre-configured kernels for popular ML frameworks
- Model management: Browse, deploy, and manage models from AWS Marketplace and other sources through an intuitive interface
- Collaboration: Share notebooks, experiments, and models with your team members
- Built-in security: Integrated with AWS Identity and Access Management (IAM) for secure access control
- Cost management: Monitor and control costs with built-in usage tracking and resource management tools
Amazon SageMaker JumpStart overview
SageMaker JumpStart is a fully managed service that offers state-of-the-art foundation models for use cases such as content writing, code generation, question answering, copywriting, summarization, classification, and information retrieval. It provides a collection of pre-trained models that you can deploy quickly, accelerating the development and deployment of ML applications. A key component of SageMaker JumpStart is its model hubs, which offer a vast catalog of pre-trained models, such as Mistral, for a variety of tasks. You can now discover and deploy the Evo 2 NIM in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK, and apply model performance and MLOps controls with Amazon SageMaker AI features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, and container logs. The model is deployed in a secure AWS environment and in your VPC, helping to support enterprise data security needs.
Prerequisites
Before getting started with deployment, make sure that your IAM service role for SageMaker AI has the AmazonSageMakerFullAccess permission policy attached. To deploy the NVIDIA NIM microservice successfully, also make sure that your IAM role has the following AWS Marketplace permissions and that you have the authority to create AWS Marketplace subscriptions in the AWS account that you use:
- aws-marketplace:ViewSubscriptions
- aws-marketplace:Unsubscribe
- aws-marketplace:Subscribe
If your account is already subscribed to the model, you can skip to the deployment sections that follow. Otherwise, start by subscribing to the model package first.
Subscribe to the model package
To subscribe to the model package, complete the following steps:
- Open the SageMaker JumpStart portal from the SageMaker AI page.
- Search for Evo 2 NIM.
- Choose View model, and on the Model details page, choose Subscribe. This takes you to the AWS Marketplace listing for the Evo 2 NIM.
- On the AWS Marketplace listing page, choose View purchase options, review the purchase terms, and choose Subscribe if you and your organization agree with the EULA, pricing, and support terms.
- Choose Continue to configuration and choose an AWS Region where you have the service quota for the desired instance type.
A product Amazon Resource Name (ARN) is displayed. This is the model package ARN that you need to specify when creating a deployable model using the SageMaker SDK.
Option 1: Deploy the Evo 2 NIM using SageMaker Studio
The following section outlines how to deploy the Evo 2 NIM using SageMaker Studio.
Getting started with SageMaker Studio
Begin by accessing the AWS Management Console and navigating to the SageMaker AI service. When you’re in the SageMaker AI console, locate Studio in the left navigation panel and choose Open Studio next to your user profile. If you haven’t set up a SageMaker Studio domain yet, then you must create a new domain and user profile first. This launches the web-based SageMaker Studio interface where you can manage all aspects of your ML workflow.
Navigating to model packages
Within SageMaker Studio, choose Models in the left sidebar, then choose the JumpStart base models tab within the Models interface. This section contains all available model packages in SageMaker JumpStart, including those from AWS Marketplace.
Locating the Evo-2 NIM model
Use the search functionality to find the NVIDIA Evo-2 NIM model by searching for terms such as “Evo-2” or “NVIDIA”. When you locate the model package in the filtered results, choose it to view the Model overview page. This page provides an overview of the model and might include a Notebooks tab with a sample notebook that shows how to use the NIM. You can choose Open in JupyterLab to open the notebook and use it as a starting point for working with the NIM.
Configuring the model deployment
On the model package overview page, choose Deploy at the top right to begin the deployment process. You must configure several important settings: provide a unique endpoint name (such as “Evo-2-nim-endpoint”), choose an appropriate instance type (ml.g6e.12xlarge is recommended for optimal performance), set the initial instance count (typically 1 for testing), and specify an endpoint configuration name. Review these settings carefully before proceeding.
Initiating and monitoring the deployment
After verifying your configuration settings, choose Deploy to start creating a real-time inference endpoint. Navigate to the Deployments section and then the Endpoints section in the left sidebar to monitor the deployment progress. The endpoint status initially shows Creating, and deployment typically takes 5–10 minutes to complete. You can track the progress and should see the status change to InService when the deployment is successful.
Testing and validation
When your endpoint is deployed and shows the InService status, you can optionally test it directly through the SageMaker Studio interface. Choose your deployed endpoint from the endpoints list to access the Endpoint summary page, then scroll down and select the Playground tab. If available, you will see two options: Test the sample request and Use Python SDK example code. You can use either option to validate the deployment with a sample protein sequence. This confirms that the endpoint is working correctly before you integrate it into your applications.
Option 2: Deploy Evo 2 using the SageMaker SDK
In this section, we walk through deploying the Evo-2 NIM with the SageMaker SDK. Make sure that you have an account-level service quota of one or more ml.g6e.12xlarge instances for endpoint usage. NVIDIA also provides a list of instance types that support deployment; refer to the AWS Marketplace listing for the model to see the supported instance types. To request a service quota increase, use the Service Quotas console.
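Assuming you have subscribed and have the model package ARN from the subscription step, a deployment with the SageMaker Python SDK might look like the following sketch. The ARN and endpoint name below are placeholders, and the instance type follows the guidance above:

```python
def deploy_evo2(model_package_arn: str,
                endpoint_name: str = "Evo-2-nim-endpoint",
                instance_type: str = "ml.g6e.12xlarge") -> str:
    """Create a deployable model from the Marketplace package and host it
    on a real-time SageMaker endpoint."""
    # Imported inside the helper so the sketch stays self-contained
    import sagemaker
    from sagemaker import ModelPackage

    session = sagemaker.Session()
    model = ModelPackage(
        role=sagemaker.get_execution_role(),
        model_package_arn=model_package_arn,
        sagemaker_session=session,
    )
    # Provisioning the endpoint typically takes 5-10 minutes
    model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        endpoint_name=endpoint_name,
    )
    return endpoint_name

# Example (replace with the product ARN from your subscription):
# deploy_evo2("arn:aws:sagemaker:<region>:<account-id>:model-package/<evo2-package-id>")
```

The endpoint name and instance type can be overridden per call; the defaults mirror the values recommended in the Studio walkthrough above.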
Run inference with the Evo 2 SageMaker endpoint
When the model is deployed, you can send a sample inference request. NIM on SageMaker supports the OpenAI API inference request format. For an explanation of the supported parameters, see the Evo-2 API documentation.
Real-time inference example
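As a sketch of a real-time request against the deployed endpoint: the endpoint name is the one chosen at deployment, and the `sequence` and `num_tokens` payload fields are assumptions based on the Evo-2 generate API, so verify the exact parameter and response field names against the Evo-2 API documentation:

```python
import json
import time

def build_payload(sequence: str, num_tokens: int = 100) -> dict:
    # Field names assumed from the Evo-2 generate API; see the
    # Evo-2 API documentation for the full parameter list.
    return {"sequence": sequence, "num_tokens": num_tokens}

def generate_dna(endpoint_name: str, sequence: str, num_tokens: int = 100) -> str:
    import boto3  # imported here so the helper is self-contained
    runtime = boto3.client("sagemaker-runtime")
    start = time.time()
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(build_payload(sequence, num_tokens)),
    )
    result = json.loads(response["Body"].read())
    print(f"Elapsed (ms): {int((time.time() - start) * 1000)}")
    # The response field name may differ; inspect the raw response
    # if this key is missing.
    return result.get("sequence", "")

# Example (hypothetical endpoint name from the deployment step):
# print("Generated DNA:", generate_dna("Evo-2-nim-endpoint", "ACGT", num_tokens=100))
```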
Example output:
Generated DNA: ACGTACATATGTTCGTACATTCGCACAGACGCCATTTTGAAAAATGCTTTAAATGGATTCAGAATTGGTCAAAATGCATAAATCCATCAAAATTTTTTTC
Elapsed (ms): 10770
Cleaning up
To avoid unwanted charges, complete the steps in this section to clean up your resources.
Deleting the endpoint from SageMaker Studio
In SageMaker Studio, navigate to the Endpoints section under Inference in the left sidebar to view all your active endpoints. Locate your Evo-2 NIM endpoint in the list and select it to open the endpoint details page. Choose Delete and confirm the deletion when prompted. The endpoint status changes to Deleting, and the endpoint disappears from the list when the deletion is complete. This process typically takes a few minutes, and the endpoint stops incurring charges as soon as it's deleted.
Delete the SageMaker endpoint
The SageMaker endpoint that you deployed incurs costs if you leave it running. Use the following code to delete the endpoint if you want to stop incurring charges. For more details, go to Delete endpoints and resources.
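As a sketch with boto3 (the endpoint name is the hypothetical one from the deployment step), deleting the endpoint and its endpoint configuration stops the charges:

```python
def delete_endpoint_resources(endpoint_name: str) -> None:
    """Delete the endpoint and its endpoint configuration to stop charges."""
    import boto3  # imported here so the helper is self-contained
    sm = boto3.client("sagemaker")
    # Look up the endpoint config name before the endpoint is gone
    config_name = sm.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    sm.delete_endpoint(EndpointName=endpoint_name)
    sm.delete_endpoint_config(EndpointConfigName=config_name)

# Example (hypothetical endpoint name from the deployment step):
# delete_endpoint_resources("Evo-2-nim-endpoint")
```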
Conclusion
The availability of the NVIDIA Evo-2 NIM microservice in Amazon SageMaker JumpStart represents a significant advancement for researchers and organizations working in drug discovery. This solution provides GPU-accelerated genomic sequence generation and prediction, dramatically speeding up pipelines that are critical for protein design and bioengineering research. You can use the flexible deployment options, through SageMaker Studio or the SageMaker SDK, to choose the approach that best fits your workflow and technical expertise. The optimized performance of these NIM microservices, combined with the scalability and security of SageMaker, enables faster time-to-insight while streamlining the deployment of complex biomolecular AI models. We encourage you to try the Evo-2 NIM today and look out for future releases of the MSA-search and Boltz-2 NIMs to accelerate your drug discovery workflows using the power of NVIDIA’s specialized microservices on AWS infrastructure.