AWS Partner Network (APN) Blog

Automating Cloud Cost Optimization on AWS with nOps Compute Copilot and Karpenter

By Hayk Harutyunyan, Sr. Software Engineer – nOps
By Jason Janiak, Partner Solutions Architect – AWS


Many Amazon Web Services (AWS) customers are running workloads on Amazon Elastic Kubernetes Service (Amazon EKS) for its extensive feature set, flexibility, scalability, and capacity for optimizing resource use. nOps Compute Copilot enables platform engineering teams to further optimize and automate the scheduling and scaling of their workloads.

nOps Compute Copilot is built on open-source Karpenter, which was developed by AWS and offers advanced scheduling and auto-scaling capabilities. Karpenter can improve application availability and cluster efficiency by rapidly launching right-sized compute resources in response to changing application load.

nOps is an AWS Specialization Partner and AWS Marketplace Seller that helps companies automatically optimize compute-based workloads. Its mission is to make it faster and easier for engineers to take action on cloud cost optimization, so they can focus on building and innovating.

This post will discuss how nOps adds the following to Karpenter: Amazon EC2 Spot Instance guidance, pricing insights, and usage optimization of your existing Amazon EC2 Reserved Instances and Savings Plan commitments.

With nOps and Karpenter, users can run mission-critical workloads with peace of mind that they are scheduled for cost-optimization, reliability, and stability.

What is Karpenter?

Karpenter is an open-source, flexible, and high-performance Kubernetes cluster autoscaler that enhances pod placement for improved instance utilization and lower compute costs. It removes the constraints of predefined node groups, enabling finer-grained control over resource utilization.

This is accomplished via NodePools, which allow users to specify a wide variety of constraints on nodes. These include instance groups, families and/or sizes, Availability Zones, architectures, and capacity types, which allow Karpenter to make optimal decisions on what instances to start or terminate.

Multiple NodePools can be configured on the same cluster, enabling different workloads running on the same cluster to have separate capacities. This helps to isolate nodes for billing, specify different node constraints, and set disruption settings.
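As a rough illustration of the constraints described above, the sketch below builds two NodePool manifests as plain Python dictionaries and prints them as JSON. It assumes the Karpenter `karpenter.sh/v1` API and its well-known requirement keys. The pool names, instance families, zones, limits, and the `default` NodeClass name are made up for the example and are not taken from nOps.

```python
import json


def make_nodepool(name, capacity_types, instance_families, zones, cpu_limit, weight):
    """Build a Karpenter NodePool manifest as a plain dict (illustrative only)."""
    return {
        "apiVersion": "karpenter.sh/v1",  # assumes a cluster running the v1 API
        "kind": "NodePool",
        "metadata": {"name": name},
        "spec": {
            "weight": weight,  # NodePools with a higher weight are tried first
            "limits": {"cpu": str(cpu_limit)},  # cap on total vCPUs for this pool
            "template": {
                "spec": {
                    # Hypothetical NodeClass name; it must exist in the cluster
                    "nodeClassRef": {
                        "group": "karpenter.k8s.aws",
                        "kind": "EC2NodeClass",
                        "name": "default",
                    },
                    "requirements": [
                        {"key": "karpenter.sh/capacity-type",
                         "operator": "In", "values": capacity_types},
                        {"key": "karpenter.k8s.aws/instance-family",
                         "operator": "In", "values": instance_families},
                        {"key": "topology.kubernetes.io/zone",
                         "operator": "In", "values": zones},
                        {"key": "kubernetes.io/arch",
                         "operator": "In", "values": ["amd64"]},
                    ],
                }
            },
        },
    }


# Two pools on the same cluster with separate capacity and constraints
batch_pool = make_nodepool(
    "batch-spot", ["spot"], ["c6i", "m6i"], ["us-east-1a", "us-east-1b"], 200, 10
)
web_pool = make_nodepool(
    "web-on-demand", ["on-demand"], ["m6i"], ["us-east-1a"], 64, 50
)

print(json.dumps([batch_pool, web_pool], indent=2))
```

Each manifest could be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML.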

This advanced node scheduling and scaling technology can be leveraged in conjunction with Amazon EKS. Users also benefit from Karpenter’s native ability to handle Amazon EC2 Spot interruptions, as Karpenter receives Spot interruption notifications and gracefully drains and provisions new nodes.

What is nOps Compute Copilot?

nOps complements Karpenter with the following capabilities, helping customers to cost-optimize while freeing time from managing resources:

  • Awareness of your EC2 Reserved Instances and Savings Plan commitments
  • Advanced analysis of EC2 Spot pricing data
  • Advanced EC2 Spot Instance termination prediction

The solution cost-optimizes EC2 Spot Instance selection and intelligently analyzes your organizational utilization and commitments in near real-time. This moves the optimal amount of your workload onto EC2 Spot Instances, freeing Savings Plans to apply to uncovered usage elsewhere.

Additionally, it leverages Spot pricing and termination data to continuously evaluate the likelihood of a Spot interruption for each instance type across all AWS regions. This allows Compute Copilot to select the most stable and reliable compute resources for EKS clusters.

Figure 1 – Increase Karpenter savings with added Spot and commitment management capabilities.

How it Works

The benefits of nOps Compute Copilot are accomplished in two key ways. Firstly, machine learning (ML) algorithms continually ingest and analyze utilization data to generate recommendations. This includes the AWS Cost and Usage Report, EC2 Spot Instance pricing data, Spot Instance termination data, and EC2 Reserved Instances metadata.

A variety of algorithms are leveraged to detect savings opportunities, such as underutilized Reserved Instances or Savings Plans. Algorithms also determine the interruption risk of all Spot Instance types across applicable AWS Regions.

Secondly, nOps automatically manages your Karpenter configurations to schedule cost-optimized Amazon EKS workloads. Compute Copilot includes a backend cost consideration engine that generates actionable recommendations to maximize stability and savings. For example, a recommendation based on Spot data from the AWS API might limit the use of a specific Spot Instance type for a period of time.

nOps consumes new recommendations and translates them to corresponding NodePool configurations without manual input. It updates Karpenter via options such as NodePool CPU limits or NodePool weights. This automates the complex process of monitoring Spot pricing and Reserved Instance and Savings Plan commitments.
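To make the idea of translating a recommendation into a NodePool change more concrete, here is a minimal sketch. The recommendation format (`weight`, `cpu_limit`) and the function name are hypothetical and do not reflect the actual nOps backend. Only the `spec.weight` and `spec.limits.cpu` fields come from the Karpenter NodePool API.

```python
import copy


def apply_recommendation(nodepool, recommendation):
    """Return a copy of a NodePool manifest with a recommended weight
    and/or CPU limit applied. The input manifest is left untouched."""
    updated = copy.deepcopy(nodepool)
    spec = updated.setdefault("spec", {})
    if "weight" in recommendation:
        spec["weight"] = recommendation["weight"]
    if "cpu_limit" in recommendation:
        spec.setdefault("limits", {})["cpu"] = str(recommendation["cpu_limit"])
    return updated


current = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "on-demand-committed"},
    "spec": {"weight": 10, "limits": {"cpu": "100"}},
}

# Hypothetical recommendation: 32 vCPUs of commitment are unused, so prefer
# this pool (high weight) but cap it at the unused amount (CPU limit).
recommendation = {"weight": 90, "cpu_limit": 32}

updated = apply_recommendation(current, recommendation)
print(updated["spec"])
```

Returning a modified copy rather than mutating the input keeps the previous configuration available for comparison or rollback.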


Figure 2 – AWS and nOps work together to automatically cost-optimize your workloads.

Configuring nOps requires the installation of the nOps Karpenter agent Helm chart, a lightweight component of Compute Copilot used to synchronize NodePools and NodeClasses with the nOps backend. Due to this minimal design, the agent does not cause any adverse effect on Karpenter or the cluster in the event of nOps downtime.

Users onboard a cluster to the platform via a simplified user interface (UI). Configuration steps follow those of creating NodeClasses and NodePools, so the experience matches that of using Karpenter directly.


Figure 3 – Savings opportunities are summarized in a unified dashboard.

To make the process easy, the onboarding UI limits configuration options to only the most salient. Alternatively, users who prefer full control can directly input YAML configurations in the UI, or programmatically by leveraging nOps’ API.

The core components of configuring nOps include creating at least one NodeClass and NodePool. NodeClass configuration includes specifying subnets, security groups, AWS Identity and Access Management (IAM) roles, Amazon Machine Images (AMIs), custom user data, Amazon Elastic Block Store (Amazon EBS) volumes, and other core parameters for instances launched by Karpenter.
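For readers who prefer to see the parameters side by side, the sketch below assembles a Karpenter `EC2NodeClass` manifest as a Python dictionary. It assumes the `karpenter.k8s.aws/v1` API. The cluster name, IAM role name, discovery tag, and volume size are placeholders, and custom user data is left out for brevity. It shows the Karpenter resource itself, not the nOps onboarding UI.

```python
import json


def make_node_class(name, cluster_name, node_role, volume_size_gib):
    """Build a Karpenter EC2NodeClass manifest as a plain dict (illustrative only)."""
    # Assumes subnets and security groups carry this discovery tag
    discovery_tag = {"karpenter.sh/discovery": cluster_name}
    return {
        "apiVersion": "karpenter.k8s.aws/v1",
        "kind": "EC2NodeClass",
        "metadata": {"name": name},
        "spec": {
            "role": node_role,  # IAM role assumed by launched nodes
            "amiSelectorTerms": [{"alias": "al2023@latest"}],  # AMI selection
            "subnetSelectorTerms": [{"tags": discovery_tag}],
            "securityGroupSelectorTerms": [{"tags": discovery_tag}],
            "blockDeviceMappings": [
                {
                    "deviceName": "/dev/xvda",
                    "ebs": {
                        "volumeSize": f"{volume_size_gib}Gi",
                        "volumeType": "gp3",
                        "encrypted": True,
                    },
                }
            ],
        },
    }


# All argument values below are placeholders
node_class = make_node_class("default", "my-cluster", "KarpenterNodeRole-my-cluster", 100)
print(json.dumps(node_class, indent=2))
```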

NodePool creation allows users to specify intended behavior, including Availability Zones, instance families and sizes, maximum total CPU and RAM, the use of EC2 Spot Instances and/or On-Demand Instances, and taints and labels.


Figure 4 – Configuration allows for a wide range of options by creating NodeClasses and NodePools.

After the cluster is onboarded successfully and connectivity with the agent is established, nOps is fully functional without any further user involvement, other than monitoring Karpenter as usual and tracking delivered savings.

Awareness of EC2 Reserved Instances and Savings Plans

By monitoring your pre-existing EC2 Reserved Instances and Savings Plan commitments, nOps allows users to save more with Karpenter by holistically optimizing their EKS spending across all linked AWS accounts.

Through frequent metadata ingestion and predictive machine learning, nOps identifies underutilized resources. When possible, it deploys high-priority NodePools to shift connected cluster nodes to instances that would not otherwise be covered. Spot Instance prices are set by long-term trends in supply and demand for Spot capacity. A custom agent deployed on configured clusters works with the nOps backend API to keep NodePools in sync with the latest pricing updates.

nOps continually right-sizes, reconsiders, and re-evaluates your workload placement to Reserved Instances, AWS Savings Plans, or Spot Instances, so you can take full advantage of these savings tools.

EC2 Spot Pricing Awareness for Cost Savings and Reliability

Amazon EKS workloads can exhibit variability, creating potential challenges for coverage with Reserved Instances and Savings Plans. Spot Instances provide an opportunity for managing dynamic workloads by leveraging available spare capacity for less cost when needed.

As interruptions can occur and impact workloads, nOps Compute Copilot uses ML based on Spot recommendations and best practices to identify optimal instance types with a lower chance of interruption. Every 10 minutes, nOps analyzes the Spot API to develop risk scoring for each instance type. A list of instances to use or avoid, valid for 60 minutes, is propagated to all Spot NodePools in the configured cluster.
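The refresh-and-expire cycle described above can be sketched roughly as follows. The risk scores, threshold, and function names are invented for illustration and say nothing about how nOps actually computes risk. Only the 10-minute refresh and 60-minute validity come from this post. The resulting requirement uses the standard `node.kubernetes.io/instance-type` label with Karpenter's `NotIn` operator.

```python
from datetime import datetime, timedelta, timezone

REFRESH_INTERVAL = timedelta(minutes=10)  # how often scores are recomputed
LIST_TTL = timedelta(minutes=60)          # how long a published list stays valid


def build_avoid_list(interruption_risk, threshold, now):
    """Pick instance types whose (hypothetical) risk score meets the threshold."""
    avoid = sorted(t for t, risk in interruption_risk.items() if risk >= threshold)
    return {"avoid": avoid, "expires_at": now + LIST_TTL}


def to_requirement(avoid_list, now):
    """Turn a still-valid avoid list into a NodePool requirement, else None."""
    if now >= avoid_list["expires_at"] or not avoid_list["avoid"]:
        return None
    return {
        "key": "node.kubernetes.io/instance-type",
        "operator": "NotIn",
        "values": avoid_list["avoid"],
    }


# Made-up risk scores between 0 and 1, for illustration only
risk = {"m5.large": 0.04, "c5.xlarge": 0.22, "r5.large": 0.11}
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

avoid_list = build_avoid_list(risk, threshold=0.10, now=now)
requirement = to_requirement(avoid_list, now)
stale = to_requirement(avoid_list, now + timedelta(minutes=61))

print(requirement)
print(stale)
```

Because the list expires, a NodePool that stops receiving updates falls back to its unrestricted set of instance types instead of acting on stale risk data.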

Copilot leverages this advance notice of Spot termination and Karpenter to continually move you onto optimal and cost-effective instance types. As a result, you can automatically provision and scale workloads to benefit from Spot discounts while maintaining the stability of compute resources in EKS clusters.

Extend Karpenter’s Cost Savings to Other Accounts

A significant advantage of most Reserved Instances and Savings Plans commitments is flexible usage across any account of a given organization, regardless of the account in which they were purchased. By monitoring your usage and commitments across your ecosystem, nOps can find unused resources in one account and leverage Karpenter to use them in another account.

When there’s more usage than commitments, Karpenter diverts usage to Spot Instances to free up these commitments for other, potentially more costly uses.

In other cases, such as for long-running workloads, Savings Plans may be a better fit. When usage is lower and commitments free up, nOps will automatically adjust your Spot usage.
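The arithmetic behind this balancing act can be sketched in a few lines. This is a deliberately simplified model: it measures both usage and commitments in vCPUs, whereas real Savings Plans are expressed as an hourly spend commitment. The account names and numbers are hypothetical, and the function illustrates the idea rather than the nOps algorithm.

```python
def plan_capacity(usage_by_account, committed_vcpus):
    """Split organization-wide usage into commitment-covered and Spot portions."""
    total_usage = sum(usage_by_account.values())
    covered = min(total_usage, committed_vcpus)
    return {
        "total_usage": total_usage,
        "covered_by_commitments": covered,
        # Usage beyond the commitment is a candidate for Spot Instances
        "shift_to_spot": total_usage - covered,
        # Commitment left over can absorb usage in any linked account
        "unused_commitment": committed_vcpus - covered,
    }


# More usage than commitments: the excess is diverted to Spot
busy = plan_capacity({"prod": 120, "staging": 40, "data": 60}, committed_vcpus=180)

# Usage drops and commitments free up: Spot usage shrinks to zero
quiet = plan_capacity({"prod": 90, "staging": 20, "data": 40}, committed_vcpus=180)

print(busy)
print(quiet)
```

In the first case, 40 vCPUs of usage exceed the commitment and would move to Spot. In the second, 30 vCPUs of commitment sit unused, so shifting workloads back from Spot avoids paying for idle commitments.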

Conclusion

nOps Compute Copilot, built on Karpenter, is designed to make it simple for customers to maintain stability and optimize resources efficiently. nOps optimizes your cost savings on autopilot and frees up your team to focus on innovating.

Learn more about nOps in AWS Marketplace and book a demo call today.



nOps – AWS Partner Spotlight

nOps is an AWS Specialization Partner that helps companies automatically optimize compute-based workloads. Its mission is to make it faster and easier for engineers to take action on cloud cost optimization, so they can focus on building and innovating.

Contact nOps | Partner Overview | AWS Marketplace