Containers
How to upgrade Amazon EKS worker nodes with Karpenter Drift
[May, 2024 – This blog has been updated to reflect Karpenter v1beta1 API changes]
Introduction
Karpenter is an open-source cluster autoscaler that provisions right-sized nodes in response to unschedulable pods, based on aggregated CPU, memory, and volume requests and other Kubernetes scheduling constraints (e.g., affinities and pod topology spread constraints), which simplifies infrastructure management. When using Cluster Autoscaler as an alternative autoscaler, all Kubernetes nodes in a node group must have the same capacity (vCPU and memory) for autoscaling to work effectively. As a result, customers end up with many node groups of different instance sizes, each backed by an Amazon EC2 Auto Scaling group, to meet the requirements of their workloads. As a workload continually evolves over time, its changing resource requirements mean that picking right-sized Amazon Elastic Compute Cloud (Amazon EC2) instances can be challenging. In addition, because Karpenter doesn't orchestrate capacity through external infrastructure such as node groups and Amazon EC2 Auto Scaling groups, it introduces a different perspective on the operational processes used to keep worker node components and operating systems up to date with the latest security patches and features.
In this post, we’ll describe the mechanism for patching Kubernetes worker nodes provisioned with Karpenter through a Karpenter feature called Drift. If you have many worker nodes across multiple Amazon EKS clusters, then this mechanism can help you continuously patch at scale.
Solution overview
Karpenter node patching mechanisms
When Amazon EKS supports a new Kubernetes version, you can upgrade your Amazon Elastic Kubernetes Service (Amazon EKS) cluster control plane to the next version with a single API call. Upgrading the Kubernetes data plane involves updating the Amazon Machine Image (AMI) for the Kubernetes worker nodes. AWS releases AMIs for new Kubernetes versions, as well as patches for CVEs (Common Vulnerabilities and Exposures). You can choose from a wide variety of Amazon EKS-optimized AMIs. Alternatively, you can also use your own custom AMIs. Currently, the Karpenter EC2NodeClass resource supports the amiFamily values AL2, AL2023, Bottlerocket, Ubuntu, Windows2019, Windows2022, and Custom. When an amiFamily of Custom is chosen, amiSelectorTerms must be specified to inform Karpenter which custom AMIs to use.
Karpenter uses Drift to upgrade Kubernetes nodes through a rolling deployment. As nodes are de-provisioned, they are cordoned to prevent new pods from being scheduled, and running pods are evicted using the Kubernetes Eviction API. The Drift mechanism works as follows:
Drift
For Kubernetes nodes provisioned with Karpenter that have drifted from their desired specification, Karpenter provisions new nodes first, evicts pods from the old nodes, and then terminates them. At the time of writing this post, the Drift check interval is set to 5 minutes; however, if the NodePool or EC2NodeClass is updated, the Drift check is triggered immediately. In the EC2NodeClass, amiFamily is a required field, and you can use either your own AMI values or Amazon EKS-optimized AMIs. Drift behaves differently in these two cases, as detailed below.
Drift with specified AMI values
You may consider this approach to control the promotion of AMIs through application environments for consistency. If you change the AMI(s) in the EC2NodeClass for a NodePool or associate a different EC2NodeClass with the NodePool, Karpenter detects that the existing worker nodes have drifted from the desired setting.
To trigger the upgrade, associate the new AMI(s) with the EC2NodeClass, and Karpenter upgrades the worker nodes via a rolling deployment. AMIs can be specified explicitly by AMI ID, by AMI name, or even by specific tags. If multiple AMIs satisfy the criteria, the latest AMI is chosen. You can track which AMIs are discovered by the EC2NodeClass from the AMI value(s) under its status field; one way to view the status is to run kubectl describe on the EC2NodeClass. If both the old and the new AMIs are discovered by the EC2NodeClass, the running nodes with the old AMIs are marked as drifted, de-provisioned, and replaced with worker nodes provisioned from the new AMI. To learn more about selecting AMIs in the EC2NodeClass, refer here.
Example 1 – Select AMIs by IDs
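A minimal EC2NodeClass sketch for this case, assuming the Karpenter v1beta1 API described in this post; the AMI IDs, node role name, and resource name are placeholders:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: select-by-id          # placeholder name
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-my-cluster"   # placeholder IAM role
  # Terms are ORed: a node can use any AMI matching any term.
  amiSelectorTerms:
    - id: "ami-0123456789abcdef0"
    - id: "ami-0fedcba9876543210"
```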
Example 2 – Select AMIs where Name tag has the value appA-ami, in the application account 0123456789
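A corresponding sketch for tag-based selection, again assuming the v1beta1 API; fields within a single term are ANDed, so this term matches AMIs owned by account 0123456789 whose Name tag is appA-ami:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: select-by-tag         # placeholder name
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-my-cluster"   # placeholder IAM role
  amiSelectorTerms:
    - tags:
        Name: appA-ami
      owner: "0123456789"
```

If several AMIs match, Karpenter picks the latest one, as noted above.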
Drift with Amazon EKS optimized AMIs
If no amiSelectorTerms are specified in the EC2NodeClass, Karpenter monitors the SSM parameters published for the Amazon EKS-optimized AMIs. You can specify AL2, AL2023, Bottlerocket, Ubuntu, Windows2019, or Windows2022 in the amiFamily field to tell Karpenter which Amazon EKS-optimized AMI it should use. Karpenter provisions nodes with the latest Amazon EKS-optimized AMI for the specified amiFamily and the Kubernetes version the cluster is running. Karpenter detects when a new AMI is released for that Kubernetes version and marks the existing nodes as drifted; the AMI values under the EC2NodeClass status field reflect the newly discovered AMI. Those nodes are then de-provisioned and replaced with worker nodes running the latest AMI. With this approach, nodes with older AMIs are recycled automatically (e.g., when a new AMI becomes available or after a Kubernetes control plane upgrade). With the previous approach of using amiSelectorTerms, you have more control over when nodes are upgraded. Consider the difference and select the approach that suits your application. Karpenter currently doesn't support custom SSM parameters.
Walkthrough
We’ll walk through the following scenarios:
- Enabling the Karpenter Drift feature gate
- Automation of node upgrade with Drift
- Node upgrade with controlling promotion of AMIs
Prerequisites
You’ll need the following to complete the steps in this post:
- An existing Amazon EKS cluster. If you don’t have one, please follow any one method described here to create a cluster.
- An existing Karpenter deployment on a recent version. Please follow the getting started with Karpenter guide listed here to install Karpenter.
We'll first export the Amazon EKS cluster name to proceed with the walkthrough.
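For example (replace the placeholder with your own cluster name):

```bash
export CLUSTER_NAME=<your-cluster-name>
```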
Step 1. Enabling the Karpenter Drift feature gate
Since Karpenter version 0.33, Drift is enabled by default. You can disable the Drift feature by specifying --feature-gates Drift=false in the command line arguments to Karpenter.
Step 2. Automate the worker node upgrade with Drift
In this example, we're specifying the amiFamily field with a value of AL2 to target the Amazon EKS-optimized Amazon Linux 2 AMIs.
Note: Select your own subnets and security groups if your Amazon EKS cluster isn't provisioned by eksctl. Refer to this page for more details on discovering subnets and security groups with the Karpenter EC2NodeClass.
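A sketch of the manifest for this step, saved as basic.yml (the file name the cleanup section deletes). It assumes the v1beta1 API, a node role named after the cluster, and subnets/security groups tagged with karpenter.sh/discovery, which is the eksctl convention; the team: my-team label matches the node selector used later in the walkthrough. Substitute your own cluster name:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2                     # use EKS-optimized Amazon Linux 2 AMIs
  role: "KarpenterNodeRole-${CLUSTER_NAME}"   # placeholder IAM role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        team: my-team                # used to filter nodes with kubectl later
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  limits:
    cpu: 100
```

Apply it with kubectl apply -f basic.yml.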
Let’s deploy a sample deployment, named inflate to scale the worker nodes:
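A minimal sketch of such a deployment, saved as sample-deploy.yaml (the file name deleted during cleanup); the pause image and resource request follow the pattern commonly used in the Karpenter getting-started guide:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1        # each replica requests a full vCPU to force scale-out
```

Apply it with kubectl apply -f sample-deploy.yaml.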
You can check the Karpenter logs to see that Karpenter found unschedulable (i.e., provisionable) pods and created new nodes to accommodate the pending pods:
Next, check the AMI version of a newly deployed node. In this demonstration environment, the AMI version is v1.28:
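One way to check this, assuming the team: my-team NodePool label from the earlier manifest; the VERSION column shows the kubelet version of each node:

```bash
kubectl get nodes -l team=my-team
```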
Now let’s check the Amazon EKS control plane version. We’re assuming the control plane version is equivalent to the node version:
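For example, you can query the control plane version with the AWS CLI:

```bash
aws eks describe-cluster \
  --name $CLUSTER_NAME \
  --query 'cluster.version' \
  --output text
```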
We'll now upgrade the Amazon EKS control plane and validate that the worker node(s) are automatically updated to the new version that matches the control plane version. You can use your own preferred way to upgrade it, but we'll use the AWS Command Line Interface (AWS CLI) as an example here. Replace region-code with your own. Replace 1.29 with the Amazon EKS-supported version number that you want to upgrade your cluster to. For best practices on Amazon EKS cluster upgrades, see the cluster upgrades section of the Amazon EKS best practices guide.
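For example, upgrading from 1.28 to 1.29:

```bash
aws eks update-cluster-version \
  --region region-code \
  --name $CLUSTER_NAME \
  --kubernetes-version 1.29
```

The command returns an update object whose ID you'll use in the next step.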
Monitor the status of your cluster update with the following command. Use the update ID that the previous command returned, replacing <update-id> with that value. When a Successful status is displayed, the upgrade is complete.
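```bash
aws eks describe-update \
  --region region-code \
  --name $CLUSTER_NAME \
  --update-id <update-id>
```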
After the status changes to Active, let's check the Karpenter logs. You can see that Karpenter detected the drift, started de-provisioning the drifted node, and replaced it with a new node.
Let’s check the AMI version of the node:
You'll see that the v1.28 node's status is Ready,SchedulingDisabled and the newly deployed v1.29 node is not yet Ready.
After a few seconds, you can run kubectl get nodes -l team=my-team again to check that the new v1.29 node is Ready and the previous v1.28 node has been terminated.
Note: The actual amount of time for node upgrade varies by the environment.
Step 3. Node upgrade with controlling promotion of AMIs
As we just saw, Karpenter Drift automatically upgrades the node AMI version when the Amazon EKS control plane is upgraded with an Amazon EKS-optimized Amazon Linux AMI. However, there are use cases (e.g., promoting AMIs through environments) where you want more control over when to initiate the AMI update and with which specific AMI. For that, if you specify the AMI in amiSelectorTerms (under EC2NodeClass), nodes are only updated when you explicitly change the AMI, independently of control plane updates.
For this example, we're using Bottlerocket OS for running containers. Bottlerocket is a Linux-based open-source operating system that is purpose-built by Amazon Web Services for running containers. For more details on the benefits of using Bottlerocket OS, please refer to https://aws.amazon.com/bottlerocket/.
Note – In the example below, we're using a Bottlerocket AMI. Karpenter automatically queries for the appropriate Amazon EKS-optimized AMI via AWS Systems Manager (SSM). In the case of the Custom amiFamily, no default AMIs are defined, so amiSelectorTerms must be specified to inform Karpenter which custom AMIs to use.
Note: Select your own subnets and security groups if your Amazon EKS cluster isn't provisioned by eksctl. Refer to this page for more details on discovering subnets and security groups with the Karpenter EC2NodeClass.
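A sketch of the EC2NodeClass for this step, saved as bottlerocket.yaml (the file name the cleanup section deletes); as before, the role name and karpenter.sh/discovery tags are assumptions based on the eksctl convention:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: bottlerocket
spec:
  amiFamily: Bottlerocket            # use EKS-optimized Bottlerocket AMIs
  role: "KarpenterNodeRole-${CLUSTER_NAME}"   # placeholder IAM role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
```

Apply it with kubectl apply -f bottlerocket.yaml.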
Now, let’s edit the default NodePool to use this newly created EC2NodeClass, bottlerocket.
Search nodeClassRef under specifications and change the name value from default to bottlerocket:
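After the edit (e.g., via kubectl edit nodepool default), the relevant portion of the NodePool spec should look like this:

```yaml
spec:
  template:
    spec:
      nodeClassRef:
        name: bottlerocket   # was: default
```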
Let's check the Karpenter logs. You can see that Karpenter detected the drift, de-provisioned the drifted node, and replaced it with a new node:
Let’s check the AMI version of the node:
You'll see that the existing Amazon EKS-optimized Linux v1.29 node (v1.29.0-eks-5e0fdde) has a status of Ready,SchedulingDisabled and the newly deployed Bottlerocket node (v1.28.2) is not yet Ready.
After a few seconds, you can check that the new Bottlerocket node is Ready and the previous Amazon EKS-optimized Linux v1.29 node has been terminated.
Considerations
When using Karpenter, there are some additional design considerations that can help you achieve continuous operations:
- Use Pod Topology Spread Constraints to spread workloads across fault domains for high availability – Similar to pod anti-affinity rules, pod topology spread constraints allow you to make your application available across different failure (or topology) domains like hosts or availability zones.
- Consider Pod Readiness Gates – For workloads that ingress via an Elastic Load Balancer (ELB) to validate whether workloads are successfully registered to target groups, consider using Pod readiness gates. See the Amazon EKS best practices guide for more information.
- Consider Pod Disruptions Budgets – Use Pod disruption budgets to control the termination of pods during voluntary disruptions. Karpenter respects Pod disruption budgets (PDBs) by using a backoff retry eviction strategy.
- Consider whether automatic AMI selection is the right approach – It is recommended to use the latest Amazon EKS-optimized AMIs; however, if you would like to control the rollout of AMIs across environments, then consider whether you'd let Karpenter pick the latest AMI or specify your own. By specifying your own AMI, you can control the promotion of AMIs through application environments.
- Consider setting karpenter.sh/do-not-disrupt: "true" – For workloads that might not be interruptible (e.g., long-running batch jobs without checkpointing), consider annotating pods with the do-not-disrupt annotation. By opting pods out of disruption, you are telling Karpenter that it shouldn't voluntarily remove nodes containing those pods. You can also set the karpenter.sh/do-not-disrupt annotation on a node, which prevents disruption actions on that node.
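A minimal sketch of the pod-level annotation; the pod name, image, and command are illustrative placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: long-running-job       # hypothetical workload
  annotations:
    karpenter.sh/do-not-disrupt: "true"   # opt this pod out of voluntary disruption
spec:
  containers:
    - name: main
      image: public.ecr.aws/docker/library/busybox:stable
      command: ["sleep", "3600"]
```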
Cleaning up
To clean up the resources created, you can execute the following steps:
- Delete the Karpenter NodePool to deprovision nodes, cleanup the EC2NodeClass, and sample application:
kubectl delete -f basic.yml
kubectl delete -f bottlerocket.yaml
kubectl delete -f sample-deploy.yaml
- If you created a new Amazon EKS cluster for the walkthrough, then don't forget to clean up the cluster and any associated resources so that you don't incur costs.
Conclusion
For customers with many Kubernetes clusters and node groups, adopting Karpenter simplifies infrastructure management. In this post, we described approaches for upgrading and patching Kubernetes nodes using a Karpenter feature called Drift. These patching strategies can reduce your undifferentiated heavy lifting and help you patch worker nodes at scale by moving from a point-in-time strategy to a continuous mechanism. The Karpenter Drift feature is still evolving; for the latest information, check out the Karpenter documentation.
If you would like to learn more, then come and discuss Karpenter in the #karpenter channel in the Kubernetes Slack or join the Karpenter working group calls.
For hands-on experience, check out the Karpenter workshop.