Using Amazon EC2 Spot Instances with Karpenter

Update: Starting with Karpenter version 0.19.3, it is recommended to use Karpenter's native interruption handling rather than a standalone Node Termination Handler. For more information, refer to the Karpenter FAQ.

Overview

Karpenter is a dynamic, high-performance cluster autoscaling solution for Kubernetes introduced at re:Invent 2021. Customers choose an autoscaling solution for a number of reasons, including improving the availability and reliability of their workloads while reducing costs. With Amazon EC2 Spot Instances, customers can reduce costs by up to 90% compared to On-Demand prices. By combining a high-performing cluster autoscaler like Karpenter with EC2 Spot Instances, EKS clusters can acquire compute capacity within minutes while keeping costs low.

In this blog post, we will look at how to use Karpenter with EC2 Spot Instances and handle Spot Instance interruptions.

Getting started

To get started with Karpenter in AWS, you need a Kubernetes cluster. We will be using an EKS cluster throughout this blog post. To provision an Amazon Elastic Kubernetes Service (Amazon EKS) cluster and install Karpenter, please follow the getting started docs from the Karpenter documentation.

Karpenter’s single responsibility is to provision compute capacity for your Kubernetes clusters, and it is configured through a custom resource called NodePool. When a pod is newly created, kube-scheduler is responsible for finding the best feasible node so that kubelet can run it. If none of the scheduling criteria are met, the pod stays in a Pending state and remains unscheduled. Karpenter relies on kube-scheduler: it watches for pods marked unschedulable and then provisions new node(s) to accommodate them. The following code snippet shows an example of a Spot NodePool configuration specifying instance types, Availability Zones, and capacity type. Diversification and flexibility are important when using Spot Instances so that Karpenter can draw from the wide and continuously growing range of available EC2 instances. Don’t rely on just a few specific instance types or sizes; instead, configure a diverse set of instance types and sizes to maximize the capacity and availability of Spot within the cluster.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: spot-fleet
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        - key: karpenter.k8s.aws/instance-cpu
          operator: In
          values: ["4", "8", "16", "32"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r", "i", "d"]
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"]
        - key: karpenter.sh/capacity-type # defaults to on-demand if omitted
          operator: In
          values: ["spot"] # ["spot", "on-demand"]
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenEmpty # WhenEmpty | WhenUnderutilized
    consolidateAfter: 30s
    expireAfter: 72h

Node selection

Karpenter will automatically launch compute resources when there are unschedulable pods, following the constraints we have defined in the NodePool. If we don’t specify any EC2 instance type and size constraints, Karpenter will choose the most suitable instance for the unscheduled pods from the full range of instance types available in AWS. However, it is recommended to specify which instances Karpenter should launch, such as processor types, amounts of vCPU, and preferred Availability Zones, to adapt the provisioned capacity to the workloads.

Once constraints have been defined in the NodePool, Karpenter needs to decide which instance pools to launch from. For EC2 Spot Instances, it uses the price-capacity-optimized allocation strategy, which selects instance pools by weighing both the lowest price and the lowest chance of being interrupted. For On-Demand Instances, Karpenter uses the lowest-price allocation strategy. To add capacity to the cluster, Karpenter does not rely on Auto Scaling groups; instead, it calls the EC2 Fleet API directly. This increases the heterogeneity of nodes within the cluster by design, which reduces the need to set up different Amazon EKS managed node groups or Amazon EC2 Auto Scaling groups.

Capacity Type

When creating a NodePool, we can use Spot Instances, On-Demand Instances, or both. When you specify both, and the pod does not explicitly specify whether it needs Spot or On-Demand, Karpenter opts for Spot when provisioning a node.
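For example, a workload that must not run on Spot can pin itself to On-Demand capacity through a standard node selector on the well-known karpenter.sh/capacity-type label. The following is a minimal sketch with a hypothetical pod name:

apiVersion: v1
kind: Pod
metadata:
  name: critical-batch # hypothetical pod that must not be interrupted
spec:
  nodeSelector:
    karpenter.sh/capacity-type: on-demand # overrides Karpenter's Spot preference
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:latest
      command: ["sleep", "3600"]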

If Spot capacity is not available, Karpenter falls back to On-Demand Instances to schedule the pods. An EC2 Spot best practice is that diversification and flexibility are key when using Spot Instances within the cluster: across instance types, sizes, CPU architectures, Availability Zones, and, if possible, AWS Regions. Also take into account that if a Spot quota limit has been reached at the account level, you might get a MaxSpotInstanceCountExceeded exception. In this case, Karpenter won’t perform a fallback. You have to implement adequate monitoring for quotas and exceptions, create the necessary alerts, and reach out to AWS Support for any quota increases.
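As a starting point for that monitoring, you can inspect the current account-level Spot vCPU quota with the AWS CLI. This is a sketch; the quota code below corresponds to "All Standard (A, C, D, H, I, M, R, T, Z) Spot Instance Requests" at the time of writing and should be verified for your account:

aws service-quotas get-service-quota \
    --service-code ec2 \
    --quota-code L-34B43A08 # All Standard Spot Instance Requests (vCPUs)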

This is an example of a good diversification of Amazon EC2 instances, where Karpenter can choose from around 250 instance types out of the 750+ available today in AWS:

requirements:
  - key: kubernetes.io/arch
    operator: In
    values: ["amd64", "arm64"]
  - key: karpenter.k8s.aws/instance-cpu
    operator: In
    values: ["4", "8", "16", "32"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r", "i", "d"]
  - key: topology.kubernetes.io/zone
    operator: In
    values: ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"]
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand"]

Resiliency

Karpenter handles Spot Instance interruptions natively: it automatically taints, drains, and terminates the node ahead of the interruption event. Karpenter launches a replacement node as soon as it sees the Spot interruption warning, which informs it that Amazon EC2 will reclaim the instance in two minutes, and it again uses the price-capacity-optimized strategy to select the Spot instance to launch.

To enable this interruption-handling function, Karpenter needs an SQS queue to watch for interruption events, plus EventBridge rules and targets that forward interruption events from AWS services to the SQS queue. Karpenter provides details for provisioning this infrastructure in the CloudFormation template in the Getting Started Guide. Then, configure the --interruption-queue-name CLI argument with the name of the provisioned interruption queue to handle interruption events.
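As a minimal sketch, with recent Karpenter Helm charts the same queue name is typically passed through the chart's settings; the exact key can vary between chart versions, so treat these values as assumptions to verify against your installed chart:

# values.yaml snippet for the Karpenter Helm chart (hypothetical cluster name)
settings:
  clusterName: my-cluster # name of your EKS cluster
  interruptionQueue: my-cluster # SQS queue created by the Getting Started CloudFormation template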

Another useful Karpenter feature for Spot Instances is consolidation. You can define the consolidation policy in the disruption block of the NodePool, as shown earlier. Karpenter automatically detects nodes that can be deleted or replaced when they are empty or underutilized: it identifies whether the pods running on a node can be scheduled on other nodes and then deletes it, or whether the node can be replaced by a smaller, cheaper one. For Spot nodes, Karpenter only enables deletion consolidation by default, so it won’t replace a Spot node with a cheaper Spot node.

One important event worth mentioning when using Spot Instances is the rebalance recommendation signal. It arrives either ahead of or along with the Spot interruption notice. When a rebalance recommendation arrives ahead of an interruption notice, it doesn’t mean Spot is interrupting the node; it is just a recommendation that gives you an opportunity to proactively manage your capacity needs. Karpenter does not currently support rebalance recommendations and doesn’t recommend reacting to them, but they can be handled with the AWS Node Termination Handler (NTH). However, the two components may conflict, as they do not share information with each other: both will handle the same events and could cause more node churn in the cluster than interruptions alone. More information here.

To make use of Spot Instances and cost optimize your EKS workloads with Karpenter while keeping those workloads running reliably, you can use Kubernetes Pod Disruption Budgets (PDBs) or PodTopologySpreadConstraints, as shown in the sketch below. Karpenter respects these and other Kubernetes native scheduling constraints, such as NodeSelectors, NodeAffinity, and Taints and Tolerations; however, pod scheduling constraints must fall within a NodePool’s constraints.
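As an illustration, the following sketch defines a PodDisruptionBudget for a hypothetical web Deployment. When Karpenter drains a node ahead of a Spot interruption, pod evictions respect this budget, keeping at least two replicas available throughout the disruption:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb # hypothetical name
spec:
  minAvailable: 2 # never evict below two ready replicas
  selector:
    matchLabels:
      app: web # hypothetical label on the workload's pods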

Monitoring

Spot interruptions can occur at any time. Monitoring Kubernetes cluster metrics and logs can help you create notifications when Karpenter fails to acquire capacity. Set up adequate monitoring at the cluster level for all Kubernetes objects, and monitor the Karpenter NodePool. We will use Prometheus and Grafana to collect metrics for the Kubernetes cluster and Karpenter, and CloudWatch Logs to collect the logs.

To get started with Prometheus and Grafana on Amazon EKS, please follow the Prometheus and Grafana installation instructions in the Karpenter getting started guide. Grafana comes preinstalled with dashboards for controller metrics, node metrics, and pod metrics.

Using the Pod Phase panel included in the prebuilt Grafana dashboard named Karpenter Capacity, you can check for pods that have been in the Pending state for more than a predefined period (for example, three minutes). This helps you understand whether any workloads can’t be scheduled.
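If you prefer an alert over a dashboard panel, a Prometheus alerting rule along these lines can notify you when pods stay Pending. This is a sketch; it assumes kube-state-metrics is installed (as in the monitoring setup above), and it fires when any namespace has had Pending pods continuously for three minutes:

# Prometheus alerting rule sketch: pods stuck Pending for over 3 minutes
groups:
  - name: karpenter-capacity
    rules:
      - alert: PodsPendingTooLong
        expr: sum by (namespace) (kube_pod_status_phase{phase="Pending"}) > 0
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: Pods are stuck in Pending; check Karpenter provisioning and Spot capacity.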

Karpenter controller logs can be sent to CloudWatch Logs using either Fluent Bit or FluentD. (Here’s information on how to get started with CloudWatch Logs for Amazon EKS.) To view the Karpenter controller logs, go to the log group /aws/containerinsights/cluster-name/application and search for Karpenter.

In the log stream, search for “Provisioning failed” messages in the Karpenter controller logs to find provisioning failures. The example below shows a provisioning failure caused by reaching the account limit for Spot Instances.

2021-12-03T23:45:29.257Z        ERROR   controller.provisioning Provisioning failed, launching capacity, launching instances, with fleet error(s), UnfulfillableCapacity: Unable to fulfill capacity due to your request configuration. Please adjust your request and try again.; MaxSpotInstanceCountExceeded: Max spot instance count exceeded; launching instances, with fleet error(s), MaxSpotInstanceCountExceeded: Max spot instance count exceeded   {"commit": "6984094", "provisioner": "default"
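To surface these errors without scrolling through the stream, a CloudWatch Logs Insights query along these lines works against the same log group (a sketch; adjust the filter pattern to your log format):

fields @timestamp, @message
| filter @message like /Provisioning failed/
| sort @timestamp desc
| limit 20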

Conclusion

In this blog post, we gave a quick overview of Karpenter and how to use EC2 Spot Instances with Karpenter to scale the compute needs of an Amazon EKS cluster. We encourage you to check out the Further Reading section below to discover more about Karpenter.

Further reading:

Raja Ganesan

Raja Ganesan is a Cloud Architect at AWS, helping customers with their cloud journey on AWS. He has been helping companies navigate their digital and DevOps transformation journeys. He is passionate about containers, serverless, microservices, observability, open source technologies, and community building. He tweets at @zeagord.

Aldred Halim

Aldred Halim is a Solutions Architect with the AWS Worldwide Specialist Solutions Architect organization. He works closely with customers in designing architectures and building components to ensure success in running blockchain workloads on AWS.