Deploying Karpenter Nodes with Multus on Amazon EKS

Container based Telco workloads use Multus CNI primarily for traffic or network segmentation. Amazon Elastic Kubernetes Service (Amazon EKS) supports Multus CNI enabling users to attach multiple network interfaces, apply advanced network configuration and segmentation to Kubernetes-based applications running on AWS. One of the many benefits of running applications on AWS is resource elasticity (scaling out and scaling in). Node elasticity can be made possible by a cluster autoscaler such as Karpenter. Karpenter automatically launches the right compute resources to handle application demand. It is designed to use the cloud with fast and simple compute provisioning for Kubernetes clusters. In addition, nodes provisioned by Karpenter are group-less and provisioned outside of an Amazon EKS node group.

Purpose

This post demonstrates a deployment model of an EKS cluster with Karpenter provisioned nodepools with Multus interfaces. The deployment model is recommended for – but not limited to – Telco workloads on AWS. It aims to leverage Karpenter as a group-less nodepool and as an autoscaling solution. The group-less workers (nodepools) are for application pods that need Multus CNI networking, while the Amazon EKS managed nodegroup workers are running pods that do not need Multus CNI, such as plugins, add-ons, and Karpenter itself. This post also illustrates how Karpenter manages just-in-time scaling of workers with multiple elastic network interfaces (ENI), thus meeting stringent requirements for scalability and network separation.

Why this type of deployment?

The main use case of Karpenter is to provision worker nodes in a Kubernetes cluster during deployment and scale out events. In this post we introduce an additional use case where Karpenter is used for provisioning nodepools hosting applications that use Multus CNI. This deployment model decouples your application worker nodes from the Amazon EKS-managed nodegroups. This approach provides benefits such as 1/ your application workers are not limited to a specific instance type and size, and you have a wide selection of instance types and family to choose from; 2/ your application scale-out capabilities are not tied to an Amazon Elastic Compute Cloud (Amazon EC2) autoscaling group, providing the benefit to scale with more flexibility than with an Amazon EKS nodegroup autoscaling group. Furthermore, by not relying on EC2 Autoscaling group, Karpenter is quick to provision worker nodes by directly using Amazon EC2 fleet APIs, which can be critical for application scale-out scenarios. Standard Karpenter based node provisioning does not support Multus CNI, as it creates nodes attached to a single VPC subnet. This post provides the solution for Multus via Karpenter by introducing user-data based ENI management via EC2NodeClass.

Deployment

The steps detailed in this section use the GitHub repository.

This deployment showcases the flexibility of Karpenter as a means to run application pods with Multus CNI. It applies the approach of creating and attaching ENIs for Multus through Karpenter and at the same time demonstrates the scaling capabilities of Karpenter.

Prerequisites

The following prerequisites are necessary to continue with this post:

An AWS CloudShell

Environment setup

In this setup we create a VPC, EKS clusters, EKS Managed Node Group, Security Groups, and associated VPC components.

We use the CloudShell environment from here to configure the EKS cluster and deploy a sample application. Go to your CloudShell console, download this GitHub repository, and start by executing the following script to install the necessary tools (awscli, kubectl, eksctl, helm, etc.).

sudo chmod +x tools/installTools.sh
./tools/installTools.sh

2. Create an AWS CloudFormation stack using template vpc-infra-mng.yaml. Select two Availability zones (e.g., us-west-2a & us-west-2b). Name your stack karpenterwithmultus (you need this stack name later in your CloudShell environment). You can keep the other parameters default. You can use the following AWS Command Line Interface (AWS CLI) command or use the AWS Management Console using the CloudFormation menu.

aws cloudformation create-stack --stack-name karpenterwithmultus --template-body file://cfn-templates/vpc-infra-mng.yaml --parameters ParameterKey=AvailabilityZones,ParameterValue=us-west-2a\\,us-west-2b --region us-west-2 --capabilities CAPABILITY_NAMED_IAM

Before executing the next steps make sure the first CFN stack creation is completed. Note that the stack creation may take ~ 15 minutes as it builds the EKS cluster and worker nodes.

The resulting architecture looks like the following image:

On your CloudShell, create environment variables to define the Karpenter version, EKS cluster name, and AWS Region. Make sure you change the parameter values according to your environment. Then, go to the CloudShell console and execute the following commands:

export KARPENTER_NAMESPACE=karpenter
export K8S_VERSION=1.29
export KARPENTER_VERSION=v0.32.3
export AWS_PARTITION="aws"
export VPC_STACK_NAME="karpenterwithmultus"
export CLUSTER_NAME="eks-${VPC_STACK_NAME}"
export AWS_DEFAULT_REGION=us-west-2 
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT=$(mktemp)

NOTE:

VPC_STACK_NAME is the name you gave for your CloudFormation template vpc-infra-mng.yaml.

Pay special attention to the default Region value. In later steps your nodepool.yaml config is configured with Availability Zones (AZs).

This example uses CloudShell as your admin node. Environment variables are lost when CloudShell times out. Feel free to use your own admin node.

Update your kubeconfig and test Amazon EKS control plane access.

aws eks update-kubeconfig --name $CLUSTER_NAME
kubectl get nodes
kubectl get pods -A

If you don’t get an error, then it means you have access to the K8S cluster.

Plugin setup

4. Install the Multus CNI Multus CNI is a container network interface plugin for Kubernetes that enables you to attach multiple network interfaces to pods. If you want to understand Multus CNI and VPC CNI and how they work together on Amazon EKS, then refer to this Amazon Container post.

kubectl apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/multus/v4.0.2-eksbuild.1/multus-daemonset-thick.yml
kubectl get daemonsets.apps -n kube-system

5. We need to have CIDR reservations on the Multus subnets since we dedicate a portion of the IPs for Multus pod IPs. CIDR reservation with the explicit flag tells the VPC not to touch these CIDR blocks when creating VPC resources such as ENI. Execute the following CIDR reservation commands on the Multus subnets. We are going to reserve /27.

Multus1Az1subnetId=$(aws ec2 describe-subnets --filters "Name=tag:Name,Values=Multus1Az1-${VPC_STACK_NAME}" --query "Subnets[*].SubnetId" --output text)
aws ec2 create-subnet-cidr-reservation --subnet-id ${Multus1Az1subnetId} --cidr 10.0.4.32/27 --reservation-type explicit --region ${AWS_DEFAULT_REGION}
Multus1Az2subnetId=$(aws ec2 describe-subnets --filters "Name=tag:Name,Values=Multus1Az2-${VPC_STACK_NAME}" --query "Subnets[*].SubnetId" --output text)
aws ec2 create-subnet-cidr-reservation --subnet-id ${Multus1Az2subnetId} --cidr 10.0.5.32/27 --reservation-type explicit --region ${AWS_DEFAULT_REGION}
Multus2Az1subnetId=$(aws ec2 describe-subnets --filters "Name=tag:Name,Values=Multus2Az1-${VPC_STACK_NAME}" --query "Subnets[*].SubnetId" --output text)
aws ec2 create-subnet-cidr-reservation --subnet-id ${Multus2Az1subnetId} --cidr 10.0.6.32/27 --reservation-type explicit --region ${AWS_DEFAULT_REGION}
Multus2Az2subnetId=$(aws ec2 describe-subnets --filters "Name=tag:Name,Values=Multus2Az2-${VPC_STACK_NAME}" --query "Subnets[*].SubnetId" --output text)
aws ec2 create-subnet-cidr-reservation --subnet-id ${Multus2Az2subnetId} --cidr 10.0.7.32/27 --reservation-type explicit --region ${AWS_DEFAULT_REGION}

NOTE: If you are getting an error – subnet ID doesn’t exist – then check if you have defined the correct AWS_DEFAULT_REGION environment variable.

6. Install the Whereabouts plugin, Whereabouts is an IPAM CNI plugin that we use to managed the Multus pod IP addresses that we set aside from the previous step. Multus pod IPs through whereabouts are defined via Custom Resource Definition (CRD) Network-Attachment-Definitions (NAD).

kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/daemonset-install.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/whereabouts.cni.cncf.io_ippools.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/whereabouts.cni.cncf.io_overlappingrangeipreservations.yaml
kubectl get daemonsets.apps -n kube-system

7. Apply NetworkAttachmentDefinitions on the cluster. This configures the multus interfaces on the pods when we create the application pods later. If you inspect the file, then you notice that the range we defined is the CIDR reservation prefixes we set aside in the previous step.

kubectl apply -f  sample-application/multus-nad-az1.yaml

kubectl apply -f  sample-application/multus-nad-az2.yaml

IAM role setup

8.Create the Karpenter IAM role and other prerequisites needed for Karpenter. You should see “Successfully created/updated stack – <Karpenter-${CLUSTER_NAME}>” at the end of this step.

curl -fsSL https://raw.githubusercontent.com/aws/karpenter/"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml  > $TEMPOUT \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

eksctl create iamidentitymapping \
  --username system:node:{{EC2PrivateDNSName}} \
  --cluster "${CLUSTER_NAME}" \
  --arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}" \
  --group system:bootstrappers \
  --group system:nodes
  

eksctl utils associate-iam-oidc-provider \
    --cluster "${CLUSTER_NAME}" \
    --approve

eksctl create iamserviceaccount \
  --cluster "${CLUSTER_NAME}" --name karpenter --namespace karpenter \
  --role-name "${CLUSTER_NAME}-karpenter" \
  --attach-policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}" \
  --role-only \
  --approve

export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"

echo $KARPENTER_IAM_ROLE_ARN

9. Create an AWS Identity and Access Management (IAM) policy for additional actions needed for Multus, and attach it to the Karpenter Node Role. The userdata section of EC2NodeClass contains a script that creates, attaches, and configures ENIs to Karpenter provisioned nodes. The nodes need the right policies to be attached to the Karpenter node role

aws iam create-policy --policy-name karpenter-multus-policy --policy-document file://config/multus-policy.json | jq -r '.Policy.Arn'
karpentermultuspolicyarn=$(aws iam list-policies | jq -r '.Policies[] | select(.PolicyName=="karpenter-multus-policy") | .Arn')
aws iam attach-role-policy --policy-arn $karpentermultuspolicyarn --role-name KarpenterNodeRole-${CLUSTER_NAME}

10.(Optional) Run the following command to create a role to allow the use of spot instances. If the role has already been successfully created, then you see the following: # An error occurred (InvalidInput) when calling the CreateServiceLinkedRole operation: Service role name AWSServiceRoleForEC2Spot has been taken in this account, please try a different suffix.

aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true

NOTE: This step is optional and needed only if you want to use spot instances on your Karpenter nodepool.

Installation of Karpenter

11.Install Karpenter.

helm registry logout public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace karpenter --create-namespace \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
  --set settings.aws.clusterName=${CLUSTER_NAME} \
  --set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
  --set settings.aws.interruptionQueueName=${CLUSTER_NAME} \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

12. Execute the following command to check if the Karpenter pods are in the Running state.

kubectl get pods -n karpenter -o wide

13.Update the nodepool.yaml files with the Multus subnet tag name, AZ, and security group tag names. For further reading, more details of Karpenter nodepool configuration can be found in this Karpenter concepts post.

sed -i "s/##Multus1SubnetAZ1##/Multus1Az1-${VPC_STACK_NAME}/g" config/nodepool.yaml
sed -i "s/##Multus2SubnetAZ1##/Multus2Az1-${VPC_STACK_NAME}/g" config/nodepool.yaml
sed -i "s/##Multus1SubnetAZ2##/Multus1Az2-${VPC_STACK_NAME}/g" config/nodepool.yaml
sed -i "s/##Multus2SubnetAZ2##/Multus2Az2-${VPC_STACK_NAME}/g" config/nodepool.yaml
sed -i "s/##Multus1SecGrpAZ1##/Vpc1SecurityGroup/g" config/nodepool.yaml
sed -i "s/##Multus2SecGrpAZ1##/Vpc1SecurityGroup/g" config/nodepool.yaml
sed -i "s/##Multus1SecGrpAZ2##/Vpc1SecurityGroup/g" config/nodepool.yaml
sed -i "s/##Multus2SecGrpAZ2##/Vpc1SecurityGroup/g" config/nodepool.yaml

Change the AZs in the nodepool.yaml, and run the following command using the correct AZ names that you used on your deployment. In this example we are using us-west-2a and us-west-2b.

AZ1='us-west-2a'
AZ2='us-west-2b'

sed -i "s/##AVAILABILITY_ZONE1##/${AZ1}/g" config/nodepool.yaml
sed -i "s/##AVAILABILITY_ZONE2##/${AZ2}/g" config/nodepool.yaml

Update the EKS cluster name using the following commands:

sed -i "s/##CLUSTER_NAME##/${CLUSTER_NAME}/g" config/nodepool.yaml

Apply the Karpenter nodepool configuration.

kubectl apply -f config/nodepool.yaml

NOTE: If you inspect the config/nodepool.yaml file, then you notice a customized userdata section of EC2NodeClass. The script inside the userdata provisions, attaches, and configures the MULTUS ENIs during Amazon EC2 nodepool creation. This is where we are configuring the MULTUS ENI LCM on the nodepools. If you need additional node tuning, then you can also do so in the userdata section.

You can check Karpenter controller if your nodepool config has errors.

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

14. To address a race condition that occurs between Multus daemonset pods and application pods both being scheduled at the same time on nodepool nodes, Karpenter needs to be configured with StartupTaints for the nodepool. The StartupTaint prevents the application pods from being scheduled on the new nodes until Multus is ready and the taint is removed. To automate the removal of taint on the nodes, a DaemonSet based solution is used here.

First let’s create a namespace for the daemonset that clears the taint.

kubectl create namespace cleartaints

We have to provide RBAC controls on the daemonset-clear-taints so that it can clear the taint on the node. A service account limited to the namespace cleartaints, role and rolebinding between the service account for cleartaints namespace, and the role it is permitted to use, is created with the following command.

kubectl apply -f config/sa-role-rolebinding.yaml

Create and apply the daemonset called daemonset-clear-taints under the namespace called cleartaints.

kubectl apply -f config/cleartaints.yaml

Check the daemonset pods. At this point you won’t be seeing a running daemonset since cleartaints only runs on the Karpenter nodepools.

kubectl get ds -n cleartaints

Deployment of Node-Latency-For-K8s

15. (Optional) To collect data about the scale-out speed of our nodepools, we can deploy Node-Latency-For-K8s solution. Node-Latency-For-K8s is an opensource tool used to analyze the node launch latency. The tool measures the different phases a node goes through during instantiation up to running the application pod. Included are the containered start/finish time, kubelet timing, and Node Readiness time. In the later steps of this post, you retrieve some examples.

export VERSION="v0.1.10"
   SCRIPTS_PATH="https://raw.githubusercontent.com/awslabs/node-latency-for-k8s/${VERSION}/scripts"
   TEMP_DIR=$(mktemp -d)
   curl -Lo ${TEMP_DIR}/01-create-iam-policy.sh ${SCRIPTS_PATH}/01-create-iam-policy.sh
   curl -Lo ${TEMP_DIR}/02-create-service-account.sh ${SCRIPTS_PATH}/02-create-service-account.sh
   curl -Lo ${TEMP_DIR}/cloudformation.yaml ${SCRIPTS_PATH}/cloudformation.yaml
   chmod +x ${TEMP_DIR}/01-create-iam-policy.sh ${TEMP_DIR}/02-create-service-account.sh
   ${TEMP_DIR}/01-create-iam-policy.sh && ${TEMP_DIR}/02-create-service-account.sh
   export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
   export KNL_IAM_ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-node-latency-for-k8s"
   docker logout public.ecr.aws
   helm upgrade --install node-latency-for-k8s oci://public.ecr.aws/g4k0u1s2/node-latency-for-k8s-chart \
	--create-namespace \
 	--version ${VERSION} \
  	--namespace node-latency-for-k8s \
	--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KNL_IAM_ROLE_ARN} \
	--set env[0].name="PROMETHEUS_METRICS" \
	--set-string env[0].value=true \
	--set env[1].name="CLOUDWATCH_METRICS" \
	--set-string env[1].value=false \
	--set env[2].name="OUTPUT" \
	--set-string env[2].value=markdown \
	--set env[3].name="NO_COMMENTS" \
	--set-string env[3].value=false \
	--set env[4].name="TIMEOUT" \
	--set-string env[4].value=250 \
	--set env[5].name="POD_NAMESPACE"\
	--set-string env[5].value=default \
	--set env[6].name="NODE_NAME" \
	--set-string env[6].value=\(v1\:spec\.nodeName\)\
	--wait

Deployment of a sample application

16. Deploy a sample application. The following commands would install a deployment with one app replica per AZ. This also triggers Karpenter to create nodepools to be able to schedule the application.

Update the AZ with the correct AZ name on each file following multitool-deployment-az1.yaml, multitool-deployment-az2.yaml

sed -i "s/##AVAILABILITY_ZONE1##/${AZ1}/g" sample-application/multitool-deployment-az1.yaml
sed -i "s/##AVAILABILITY_ZONE2##/${AZ2}/g" sample-application/multitool-deployment-az2.yaml

kubectl apply -f  sample-application/multitool-deployment-az1.yaml
kubectl apply -f  sample-application/multitool-deployment-az2.yaml

Each of the deployments has the affinity key “karpenter-node”, thus these application pods are scheduled only on the worker nodes with the label “karpenter-node”. As the Karpenter nodepool configuration assigns a label when the deployment is created, Karpenter scales a node to schedule/run the pods of these deployments.

Watch for Karpenter scaling of a new Amazon EKS worker node using the following command:

kubectl get nodes -o wide

Check Karpenter Logs.

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

Once Karpenter launches new EC2 instances and joins the EKS cluster in the Ready state, Pods that were in the Pending state go to the Running state.

kubectl get pods -o wide

The resulting architecture now look like the following image with additional Karpenter workers with Multus interfaces.

17.Validate that the application pods are in the Running state using the following command.

$ kubectl get node
NAME                         STATUS   ROLES    AGE   VERSION
NAME                                       STATUS   ROLES    AGE     VERSION
ip-10-0-2-110.us-west-2.compute.internal   Ready    <none>   8m25s   v1.28.5-eks-5e0fdde
ip-10-0-2-156.us-west-2.compute.internal   Ready    <none>   29m     v1.28.5-eks-5e0fdde
ip-10-0-3-117.us-west-2.compute.internal   Ready    <none>   8m20s   v1.28.5-eks-5e0fdde
ip-10-0-3-146.us-west-2.compute.internal   Ready    <none>   29m     v1.28.5-eks-5e0fdde

$ kubectl get pod
NAME                                  READY   STATUS    RESTARTS   AGE
scaleouttestappaz1-6695b7f878-xjh66   1/1     Running   0          10m
scaleouttestappaz2-7f88f964b6-8sz9r   1/1     Running   0          10m

You can inspect the pods using kubectl describe pod or kubectl exec to verify the Multus interfaces and addresses.

kubectl describe pod|grep "multus   "
  Normal   AddedInterface    42s   multus             Add eth0 [10.0.2.73/32] from aws-cni
  Normal   AddedInterface    42s   multus             Add net1 [10.0.4.32/24] from default/nad-multussubnet1az1-ipv4
  Normal   AddedInterface    42s   multus             Add net2 [10.0.6.32/24] from default/nad-multussubnet2az1-ipv4
  Normal   AddedInterface    45s   multus             Add eth0 [10.0.3.212/32] from aws-cni
  Normal   AddedInterface    44s   multus             Add net1 [10.0.5.32/24] from default/nad-multussubnet1az2-ipv4
  Normal   AddedInterface    44s   multus             Add net2 [10.0.7.32/24] from default/nad-multussubnet2az2-ipv4

Choose one of the pods and do the following exec command to see the Multus pod IPs. You should see net1 and net2 interfaces as your Multus interfaces. On the Amazon EC2 worker the Multus ENIs belongs to the same subnet as eth1 and eth2 respectively.

$ kubectl exec scaleouttestappaz1-6695b7f878-xjh66  -- ifconfig net1
net1      Link encap:Ethernet  HWaddr 02:11:D3:5B:D6:6B  
          inet addr:10.0.4.35  Bcast:10.0.4.255  Mask:255.255.255.0
          inet6 addr: fe80::211:d300:15b:d66b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:450 (450.0 B)  TX bytes:978 (978.0 B)

$ kubectl exec scaleouttestappaz1-6695b7f878-xjh66  -- ifconfig net2
net2      Link encap:Ethernet  HWaddr 02:8B:1B:57:0D:35  
          inet addr:10.0.6.35  Bcast:10.0.6.255  Mask:255.255.255.0
          inet6 addr: fe80::28b:1b00:157:d35/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:51 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:5070 (4.9 KiB)  TX bytes:558 (558.0 B)

NOTE: Observe the Multus pod IPs. These IPs belong to the range we defined in the NetworkAttachmentDefinitions file (Step 6).

(Optional) You can examine the time it took to provision the Karpenter nodepool by examining the Node-Latency-For-K8s log on the newly created nodes. Retrieve the logs of node-latency-for-k8s-node-latency-for-k8s-chart-xxxxx pods running on the nodepool worker. For example, logs might take five minutes to populate.

$ kubectl -n node-latency-for-k8s logs node-latency-for-k8s-node-latency-for-k8s-chart-7bzvs
2023/12/22 16:57:09 unable to measure terminal events: [Pod Ready]
### i-0cb4baf2a1aa35077 (10.0.3.7) | c6i.2xlarge | x86_64 | us-east-1d | ami-03eaa1eb8976e21a9
|          EVENT           |      TIMESTAMP       |  T  | COMMENT |
|--------------------------|----------------------|-----|---------|
| Pod Created              | 2023-12-22T16:46:30Z | 0s  |         |
| Instance Pending         | 2023-12-22T16:46:41Z | 11s |         |
| VM Initialized           | 2023-12-22T16:46:51Z | 21s |         |
| Network Start            | 2023-12-22T16:46:53Z | 23s |         |
| Network Ready            | 2023-12-22T16:46:54Z | 24s |         |
| Cloud-Init Initial Start | 2023-12-22T16:46:54Z | 24s |         |
| Containerd Start         | 2023-12-22T16:46:54Z | 24s |         |
| Containerd Initialized   | 2023-12-22T16:46:55Z | 25s |         |
| Cloud-Init Config Start  | 2023-12-22T16:46:55Z | 25s |         |
| Cloud-Init Final Start   | 2023-12-22T16:46:55Z | 25s |         |
| Cloud-Init Final Finish  | 2023-12-22T16:47:16Z | 46s |         |
| Kubelet Start            | 2023-12-22T16:47:16Z | 46s |         |
| Kubelet Initialized      | 2023-12-22T16:47:17Z | 47s |         |
| Kubelet Registered       | 2023-12-22T16:47:17Z | 47s |         |
| Kube-Proxy Start         | 2023-12-22T16:47:20Z | 50s |         |
| VPC CNI Init Start       | 2023-12-22T16:47:20Z | 50s |         |
| AWS Node Start           | 2023-12-22T16:47:25Z | 55s |         |
| Node Ready               | 2023-12-22T16:47:27Z | 57s |         |

NOTE: The execution of the userdata script (approx. 20-25s) during bootup contributes to node ready time. Otherwise Karpenter node ready time would be around 30s.

(Optional) Automatic assignments of Multus Pod IPs

For Multus connectivity to the pods, it is essential to assign the pod Multus IPs as secondary IPs on their corresponding Multus ENIs on the worker node. You can automatically assign the Multus IP as secondary IPs on Multus ENIs by following this GitHub link and using either InitContainer based IP management Solution, or Sidecar based IP management Solution within the application pod.

Scaling action

18. Use the following command to perform the scale out to test the additional scaling of nodes using Karpenter, and monitor the nodes scaling out.

kubectl scale --replicas=5 deployment/scaleouttestappaz1

kubectl scale --replicas=5 deployment/scaleouttestappaz2

kubectl get nodes -o wide -w

kubectl get pods

19. (Optional) Let’s collect one more data point on the Karpenter scale-out speed: retrieve the logs of node-latency-for-k8s-node-latency-for-k8s-chart-xxxxx pods running on the newly created nodepool workers.

$ kubectl -n node-latency-for-k8s logs node-latency-for-k8s-node-latency-for-k8s-chart-28wsr 
### i-005dfff5abc7be596 (10.0.2.176) | c6i.2xlarge | x86_64 | us-east-1c | ami-03eaa1eb8976e21a9
2023/12/22 17:14:55 unable to measure terminal events: [Pod Ready]
|          EVENT           |      TIMESTAMP       |  T  | COMMENT |
|--------------------------|----------------------|-----|---------|
| Pod Created              | 2023-12-22T17:04:16Z | 0s  |         |
| Instance Pending         | 2023-12-22T17:04:20Z | 4s  |         |
| VM Initialized           | 2023-12-22T17:04:29Z | 13s |         |
| Network Start            | 2023-12-22T17:04:33Z | 17s |         |
| Network Ready            | 2023-12-22T17:04:33Z | 17s |         |
| Cloud-Init Initial Start | 2023-12-22T17:04:33Z | 17s |         |
| Containerd Start         | 2023-12-22T17:04:33Z | 17s |         |
| Cloud-Init Config Start  | 2023-12-22T17:04:34Z | 18s |         |
| Cloud-Init Final Start   | 2023-12-22T17:04:34Z | 18s |         |
| Containerd Initialized   | 2023-12-22T17:04:35Z | 19s |         |
| Cloud-Init Final Finish  | 2023-12-22T17:05:00Z | 44s |         |
| Kubelet Start            | 2023-12-22T17:05:00Z | 44s |         |
| Kubelet Initialized      | 2023-12-22T17:05:00Z | 44s |         |
| Kubelet Registered       | 2023-12-22T17:05:03Z | 47s |         |
| Kube-Proxy Start         | 2023-12-22T17:05:05Z | 49s |         |
| VPC CNI Init Start       | 2023-12-22T17:05:05Z | 49s |         |
| AWS Node Start           | 2023-12-22T17:05:09Z | 53s |         |
| Node Ready               | 2023-12-22T17:05:11Z | 55s |         |
2023/12/22 17:14:55 Serving Prometheus metrics on :2112

20. Karpenter is flexible in the sense that it automatically provisions the needed node/s depending on the number of pending pods. You can do a scale in and scale out again but this time increase the number of replicas. Observe the type and size of instance that Karpenter provisioned.

Scale in:

kubectl scale --replicas=1 deployment/scaleouttestappaz1
kubectl scale --replicas=1 deployment/scaleouttestappaz2

Wait for Karpenter to terminate the existing nodepool, once terminated, scale out again. In this example you can only scale your pods up to the limit of the number of IP addresses available on your Network address definition.

kubectl scale --replicas=10 deployment/scaleouttestappaz1
kubectl scale --replicas=10 deployment/scaleouttestappaz2

Cleaning up

kubectl delete -f sample-application/multitool-deployment-az1.yaml
kubectl delete -f sample-application/multitool-deployment-az2.yaml
sleep 120 # to ensure all the karpenter nodes are terminated
kubectl delete -f config/nodepool.yaml 
kubectl delete -f config/cleartaints.yaml
kubectl delete -f config/sa-role-rolebinding.yaml
helm uninstall karpenter -n karpenter
helm uninstall -n node-latency-for-k8s node-latency-for-k8s
aws iam detach-role-policy --policy-arn $karpentermultuspolicyarn --role-name KarpenterNodeRole-${CLUSTER_NAME}
karpentermultuspolicyarn=$(aws iam list-policies | jq -r '.Policies[] | select(.PolicyName=="karpenter-multus-policy") | .Arn')
aws iam delete-policy --policy-arn $karpentermultuspolicyarn

Delete all CloudFormation stacks in the reverse order.

Conclusion

In this post, we showed how Karpenter can be used in conjunction with Multus CNI and how it manages the lifecycle management of the ENIs used by Multus. The workers in the Karpenter provisioned nodepool have Multus-capable interfaces. It also demonstrated the benefits of using Karpenter as an autoscaling solution. Karpenter improves the efficiency and cost of running workloads on Kubernetes clusters by:

1/ Watching for pods that the Kubernetes scheduler has marked as unscheduled;

2/ Evaluating scheduling constraints (resource requests, nodeselectors, affinities, toleration, and topology spread constraints) requested by the pods;

3/ Provisioning nodes that meet the requirements of the pods; and

4/ Removing the nodes when the nodes are no longer needed. It’s a flexible tool to address the scaling requirements of your application.

You can read more about Karpenter and best practices in this AWS GitHub link.

Containers