How do I troubleshoot kubelet or CNI plugin issues for Amazon EKS?

Last updated: 2020-09-11

I want to resolve issues with my kubelet or CNI plugin for Amazon Elastic Kubernetes Service (Amazon EKS).

Short description

For your CNI plugin to assign an IP address to a pod and run that pod on your worker node, you must have the following:

  • AWS Identity and Access Management (IAM) permissions, including an Amazon EKS CNI policy attached to the IAM role of your worker node or provided through IAM roles for service accounts
  • An Amazon EKS API server endpoint that can be reached from the worker node
  • Network access to API endpoints for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Registry (Amazon ECR), and Amazon Simple Storage Service (Amazon S3)
  • Enough IP addresses available in your subnet

Resolution

Verify that the aws-node pod is in Running status on each worker node

To verify that the aws-node pod is in Running status on a worker node, run the following command:

kubectl get pods -n kube-system -l k8s-app=aws-node -o wide

If the command output shows a STATUS of Running and a RESTARTS count of 0 for each aws-node pod, then the pods are healthy. Continue with the troubleshooting steps in the Verify that your subnet has enough free IP addresses available section.
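As a rough sketch, the same check can be scripted by filtering the command output. The function name flag_unhealthy_pods is hypothetical, and the column positions assume the default kubectl table layout (NAME, READY, STATUS, RESTARTS).

```shell
# Reads `kubectl get pods` table output on stdin and prints the name, status,
# and restart count of every pod that is not Running or has restarted.
# Assumes the default column order: NAME READY STATUS RESTARTS ...
flag_unhealthy_pods() {
  awk 'NR > 1 && ($3 != "Running" || ($4 + 0) > 0) { print $1, $3, $4 }'
}

# Example usage:
#   kubectl get pods -n kube-system -l k8s-app=aws-node -o wide | flag_unhealthy_pods
```

No output means that every listed pod is Running with zero restarts.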

If the command output shows that the RESTARTS count is any value greater than 0, then try the following steps:

To verify that the API server endpoint of your Amazon EKS cluster can be reached by the worker node, run the following command:

curl -vk https://eks-api-server-endpoint-url
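When the curl command fails, its exit code hints at where the connection broke. The following is a minimal sketch that maps a few common curl exit codes to the checks described in this article. The function name explain_curl_exit is hypothetical, and the hints are suggestions rather than a complete diagnosis.

```shell
# Maps a curl exit code to a short troubleshooting hint.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host: check the endpoint URL and DNS resolution" ;;
    7)  echo "connection failed: check security group and network ACL rules for port 443" ;;
    28) echo "timed out: check security group and network ACL rules for port 443" ;;
    *)  echo "curl exit code $1: see the EXIT CODES section of the curl manual" ;;
  esac
}

# Example usage on a worker node (the URL is a placeholder):
#   curl -sk --max-time 10 -o /dev/null https://eks-api-server-endpoint-url
#   explain_curl_exit "$?"
```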

If you can't connect to your Amazon EKS cluster, then try the following:

1.    Verify that your worker node's security group settings for Amazon EKS are correctly configured.

2.    Verify that your worker node's network access control list (network ACL) rules for your subnet allow communication with the Amazon EKS API server endpoint.

Important: Allow inbound and outbound traffic on port 443.

3.    To verify that the kube-proxy pod is in Running status on each worker node, run the following command:

kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide

4.    Verify that your worker node can access API endpoints for Amazon EC2, Amazon ECR, and Amazon S3.

Note: You can configure these services through public endpoints or AWS PrivateLink.
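One way to test step 4 from a worker node is to attempt an HTTPS connection to each regional service endpoint. The sketch below assumes the standard amazonaws.com hostnames for the commercial AWS partition. The function names are hypothetical. Any HTTP response, including an error page, counts as reachable because only network connectivity is being tested.

```shell
# Prints the regional API hostnames for Amazon EC2, Amazon ECR, and Amazon S3.
# Assumes the standard commercial partition (amazonaws.com).
service_endpoints() {
  region="$1"
  printf '%s\n' \
    "ec2.${region}.amazonaws.com" \
    "api.ecr.${region}.amazonaws.com" \
    "s3.${region}.amazonaws.com"
}

# Tries an HTTPS connection to each endpoint and reports the result.
check_service_endpoints() {
  for host in $(service_endpoints "$1"); do
    if curl -s -o /dev/null --max-time 10 "https://${host}"; then
      echo "${host}: reachable"
    else
      echo "${host}: NOT reachable"
    fi
  done
}

# Example usage on a worker node (the Region is a placeholder):
#   check_service_endpoints us-east-1
```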

Verify that your subnet has enough free IP addresses available

To list the number of available IP addresses in each subnet of your Amazon Virtual Private Cloud (Amazon VPC), run the following command. Replace <VPCID> with the ID of your VPC:

aws ec2 describe-subnets --filters "Name=vpc-id,Values=<VPCID>" | jq '.Subnets[] | .SubnetId + "=" + "\(.AvailableIpAddressCount)"'

The AvailableIpAddressCount value should be greater than 0 for each subnet where pods are launched.
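To pick out only the exhausted subnets, the count can be filtered with a small script. This is a minimal sketch that uses the AWS CLI --query option instead of jq. The function name low_ip_subnets and its threshold argument are hypothetical.

```shell
# Reads "<subnet-id> <available-ip-count>" lines on stdin and prints the
# subnets whose count is at or below the given threshold (default: 0).
low_ip_subnets() {
  awk -v max="${1:-0}" '($2 + 0) <= (max + 0) { print $1, $2 }'
}

# Example usage (replace <VPCID> with the ID of your VPC):
#   aws ec2 describe-subnets --filters "Name=vpc-id,Values=<VPCID>" \
#     --query 'Subnets[].[SubnetId,AvailableIpAddressCount]' --output text \
#     | low_ip_subnets
```

Passing a larger threshold, for example low_ip_subnets 10, also reports subnets that are close to running out of addresses.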

Check if your security group limits have been reached

Your pod networking configuration can fail if you reach the quota for security groups per elastic network interface.

For more information, see Amazon VPC quotas.
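To see how many security groups are attached to each network interface of a worker node, the counts can be listed and compared against the quota. This is a sketch only. The function name enis_at_sg_limit is hypothetical, and the default of 5 is an assumption about the default quota, so confirm the value that applies to your account.

```shell
# Reads "<eni-id> <security-group-count>" lines on stdin and prints the
# interfaces whose count has reached the limit given as the first argument.
# The default of 5 is an assumed quota value; check your account's quota.
enis_at_sg_limit() {
  awk -v limit="${1:-5}" '($2 + 0) >= (limit + 0) { print $1, $2 }'
}

# Example usage (replace <INSTANCEID> with the instance ID of a worker node):
#   aws ec2 describe-network-interfaces \
#     --filters "Name=attachment.instance-id,Values=<INSTANCEID>" \
#     --query 'NetworkInterfaces[].[NetworkInterfaceId,length(Groups)]' \
#     --output text | enis_at_sg_limit
```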

Verify that the instance type used by your worker nodes is supported by the CNI plugin

For more information, see the list of supported instance types for the CNI plugin.

Verify that you're running the latest stable version of the CNI plugin

To confirm that you have the latest version of the CNI plugin, see Amazon VPC CNI plugin for Kubernetes upgrades.
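To find the version that is currently deployed, you can read the image tag of the aws-node DaemonSet. The following is a minimal sketch. The function name image_tag is hypothetical, and the sketch assumes that the CNI container is the first container in the DaemonSet pod template.

```shell
# Extracts the tag (version) from a container image reference by removing
# everything up to and including the last colon.
image_tag() {
  printf '%s\n' "${1##*:}"
}

# Example usage:
#   image=$(kubectl get daemonset aws-node -n kube-system \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')
#   image_tag "$image"
```

Compare the printed tag with the latest version listed in the upgrade documentation.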

For additional troubleshooting, see the AWS GitHub issues page and release notes for the CNI plugin.

