How do I resolve managed node group errors in an Amazon EKS cluster?
Last updated: 2021-11-15
I have issues with my managed node group in my Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
You receive an error when you register a node with the API server.
If you use an incorrect DHCP option in your custom DNS, then you receive the following error:
Node "ip-x-x-x-x.eu-region.compute.internal" is invalid: metadata.labels: Invalid value
To resolve the issue, complete the steps in the Check your DHCP options section of Resolution.
You receive an error when you launch an Amazon Elastic Compute Cloud (Amazon EC2) instance in an Auto Scaling group with an Amazon Elastic Block Store (Amazon EBS) volume that's encrypted with a KMS key.
AccessDeniedException: User: arn:aws:sts::xxxxxxxxxxxx:assumed-role/AWSServiceRoleForAutoScaling/AutoScaling is not authorized to perform: kms:GenerateDataKeyWithoutPlaintext on resource: ARN of KMS key
If a managed node uses an Amazon EBS volume that's encrypted with a KMS key, then the Auto Scaling group service-linked role doesn't have access to the key by default. To set up a key policy, complete the steps in the Configure a key policy for your EBS volume encryption section of Resolution.
Your managed node group is in Degraded status because the EC2 launch template version doesn't match the version that Amazon EKS created.
If you manually update the launch template directly from the Auto Scaling group, then you receive the Ec2LaunchTemplateVersionMismatch error.
To resolve the issue, complete the steps in the Update the launch template version section of Resolution.
For other resolutions of failed nodes in managed node groups, see How can I get my worker nodes to join my Amazon EKS cluster?
Check your DHCP options
Verify that your hostname contains no more than 63 characters. To review your DHCP options, see Work with DHCP option sets.
Specify your hostname to match the AWS Region. For an AmazonProvidedDNS server in us-east-1, specify ec2.internal. For an AmazonProvidedDNS server in all other AWS Regions, specify region name.compute.internal.
Example of a DHCP option set in us-east-1:
domain-name: ec2.internal
Example of a DHCP option set in other Regions:
domain-name: region name.compute.internal
Example of a DHCP option set from a custom DNS:
domain-name: custom DNS name
domain-name-servers: domain name server
Note: Replace region name with your Region, custom DNS name with your DNS name, and domain name server with your domain name server.
For more information, see the domain-name section of Overview of DHCP option sets.
Note: If your DHCP options set is associated with a VPC that has instances with multiple operating systems, it's a best practice to specify only one domain name.
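To check these values, you can look up the DHCP options set that's associated with the VPC that hosts your worker nodes. The following AWS CLI sketch uses the placeholder IDs vpc-0abc123 and dopt-0abc123; replace them with your own VPC ID and the DHCP options set ID that the first command returns.

```shell
# Find the DHCP options set that's associated with the VPC.
# vpc-0abc123 is a placeholder; replace it with your VPC ID.
aws ec2 describe-vpcs \
  --vpc-ids vpc-0abc123 \
  --query 'Vpcs[0].DhcpOptionsId' \
  --output text

# Inspect the domain-name and domain-name-servers values.
# dopt-0abc123 is a placeholder; use the ID from the previous command.
aws ec2 describe-dhcp-options \
  --dhcp-options-ids dopt-0abc123 \
  --query 'DhcpOptions[0].DhcpConfigurations'
```

Verify that the domain-name value in the output matches the Region format described above.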
Configure a key policy for your EBS volume encryption
The Auto Scaling group service-linked role must have the following AWS KMS permissions to work with encrypted EBS volumes:
- kms:Encrypt
- kms:Decrypt
- kms:ReEncrypt*
- kms:GenerateDataKey*
- kms:DescribeKey
- kms:CreateGrant
To configure the correct KMS key policy, see Required AWS KMS key policy for use with encrypted volumes.
To allow more IAM roles to work with encrypted EBS volumes, you can modify the key policies. For more information, see Allows key users to use the KMS key.
For more information on KMS key access management, see Managing access to KMS keys.
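A key policy along the following lines grants the Auto Scaling service-linked role the required permissions. This is a sketch based on the standard AWS KMS key policy statements for encrypted volumes; replace the example account ID 111122223333 with your own account ID.

```json
{
  "Sid": "Allow service-linked role use of the KMS key",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::111122223333:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey"
  ],
  "Resource": "*"
},
{
  "Sid": "Allow attachment of persistent resources",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::111122223333:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
  },
  "Action": "kms:CreateGrant",
  "Resource": "*",
  "Condition": {
    "Bool": {"kms:GrantIsForAWSResource": "true"}
  }
}
```

Add these statements to the key policy of the KMS key that encrypts the EBS volumes. The kms:GrantIsForAWSResource condition restricts grant creation to AWS services that act on your behalf.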
Update the launch template version
Note: Before you update your EC2 launch template from the managed node group, create a new version. For more information, see Create a new launch template using parameters you define.
To update your EC2 launch template from the managed node group, complete the following steps:
- Open the Amazon EKS console.
- Select the cluster that contains the node group that you want to update.
- Choose the Configuration tab, and then choose the Compute tab.
- On the Node groups page, under Launch template, choose Change version.
- Select the launch template version to apply to your node group. Make sure that the update strategy is set to Rolling update.
- Choose Update.
Note: It's a best practice to update the node group itself with the new version of the EC2 launch template instead of updating the launch template directly from the Auto Scaling group.
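You can also update the node group from the AWS CLI with the update-nodegroup-version command. In the following sketch, my-cluster, my-nodegroup, my-template, and version 2 are placeholders; replace them with your cluster name, node group name, launch template name, and the new launch template version.

```shell
# Update the managed node group to a new launch template version.
# All names and the version number are placeholders.
aws eks update-nodegroup-version \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --launch-template name=my-template,version=2
```

This starts a rolling update of the node group, the same as choosing Update in the console.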
If you haven't used a custom launch template and you get the Ec2LaunchTemplateVersionMismatch error, then your worker nodes aren't using the same version as the EKS node group. To resolve this issue, go to the Auto Scaling console to revert to the version that EKS created. For more information, see Managed node group errors.
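To confirm the mismatch, you can compare the launch template version that Amazon EKS expects with the version that the Auto Scaling group uses. The following sketch uses the placeholder names my-cluster, my-nodegroup, my-asg, and lt-0abc123; replace them with your own values.

```shell
# Show the node group's health issues and the launch template
# that Amazon EKS created for it.
aws eks describe-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --query 'nodegroup.{health: health.issues, launchTemplate: launchTemplate}'

# Revert the Auto Scaling group to the version that Amazon EKS created.
# Use the launch template ID and version from the previous command.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --launch-template LaunchTemplateId=lt-0abc123,Version='2'
```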