How do I resolve cluster creation errors in Amazon EKS?

Last updated: 2020-02-12

I get service errors when I provision an Amazon Elastic Kubernetes Service (Amazon EKS) cluster using AWS CloudFormation or eksctl.

Short Description

Consider the following troubleshooting options:

  • If you receive an error message stating that your targeted Availability Zone doesn't have sufficient capacity to support the cluster, then complete the steps in the Recreate the cluster in a different Availability Zone section.
  • If you receive an error message stating that resource creation failed, then complete the steps in the Confirm that you have the correct IAM permissions to create a cluster section, or in the Monitor your Amazon VPC resources section.
  • If you receive an error message stating that the creation timed out waiting for worker nodes, then complete the steps in the Confirm that your worker nodes can reach the control plane API endpoint section.

Resolution

Recreate the cluster in a different Availability Zone

If you launch control plane instances in an Availability Zone with limited capacity, you could receive an error similar to the following:

Cannot create cluster 'sample-cluster' because us-east-1d, the targeted availability zone, does not currently have sufficient capacity to support the cluster. Retry and choose from these availability zones: us-east-1a, us-east-1b, us-east-1c

To resolve this error, create the cluster again using the recommended Availability Zones from the error message.

If you're provisioning the cluster using AWS CloudFormation, then pass in values for the Subnets parameter for subnets that match the Availability Zones.

--or--

If you're using eksctl, then use the --zones flag to pass in the values for the different Availability Zones. For example, if you receive the preceding error, then run the following command:

$ eksctl create cluster 'sample-cluster' --zones us-east-1a,us-east-1b,us-east-1c

Note: Replace sample-cluster with your cluster name. Replace us-east-1a, us-east-1b, and us-east-1c with your Availability Zones.
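
As a rough sketch, you can extract the recommended Availability Zones from the error text and build the --zones value with standard shell tools. The recommended_zones function name is hypothetical, and the parsing assumes that the error message keeps the wording shown in the preceding example.

```shell
# Hypothetical helper: extract the recommended zones from the capacity error text.
recommended_zones() {
  # Keep only the text after "availability zones:", then remove the spaces.
  printf '%s\n' "$1" |
    sed -n 's/.*choose from these availability zones: *//p' |
    tr -d ' '
}

error_message="Cannot create cluster 'sample-cluster' because us-east-1d, the targeted availability zone, does not currently have sufficient capacity to support the cluster. Retry and choose from these availability zones: us-east-1a, us-east-1b, us-east-1c"

zones=$(recommended_zones "$error_message")

# Print the command for review instead of running it directly.
echo "eksctl create cluster 'sample-cluster' --zones $zones"
```

The sketch only prints the command, so you can confirm the zone list before you create the cluster.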

Confirm that you have the correct IAM permissions to create a cluster

Verify that you have the correct AWS Identity and Access Management (IAM) permissions when you create a cluster, including the correct policies for the Amazon EKS service IAM role.

You can use eksctl to create the prerequisite resources for your cluster, such as the IAM roles and security groups. The minimum permissions required depend on the eksctl configuration that you're launching. For more information, review troubleshooting solutions from the eksctl GitHub community.

If your cluster has issues with IAM permissions, you could receive an error similar to the following in eksctl:

API: iam:CreateRole User: arn:aws:iam::your-account-id:user/your-user-name is not authorized to perform: iam:CreateRole on resource: arn:aws:iam::your-account-id:role/eksctl-newtest22-cluster-ServiceRole-10NXBYLSN4ULP

Tip: For an easier-to-read error message, review the error in the AWS CloudFormation console.

To resolve the error, review the IAM guidelines for Amazon EKS, or troubleshoot the IAM policies associated with your user or role.
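
One possible way to troubleshoot the policies is to ask the IAM policy simulator whether your user or role is allowed to call the action named in the error. The following is a minimal sketch that wraps the AWS CLI simulate-principal-policy command. The check_iam_actions function name is hypothetical, and the sketch assumes that your credentials are allowed to call iam:SimulatePrincipalPolicy.

```shell
# Hypothetical helper: ask IAM whether a principal is allowed to call the given actions.
# $1 is the user or role ARN. The remaining arguments are IAM action names.
check_iam_actions() {
  principal_arn=$1
  shift
  aws iam simulate-principal-policy \
    --policy-source-arn "$principal_arn" \
    --action-names "$@" \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text
}

# Example call (requires configured AWS CLI credentials):
# check_iam_actions arn:aws:iam::your-account-id:user/your-user-name iam:CreateRole
```

Each output line pairs an action with its decision: allowed, implicitDeny, or explicitDeny. The simulation is a guide only, so the result might not match every condition that applies to a real request.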

Monitor your Amazon VPC resources

By default, eksctl creates a new Amazon Virtual Private Cloud (Amazon VPC) when you create a cluster, unless you specify your own custom Amazon VPC and subnets in the configuration file.

If your cluster has issues with your Amazon VPC limits, then you could receive the following error message:

The maximum number of VPCs has been reached. (Service: AmazonEC2; Status Code: 400; Error Code: VpcLimitExceeded; Request ID: a12b34cd-567e-890-123f-ghi4j56k7lmn)

To resolve this error, check your resource usage against the quotas in the AWS Region where you create the cluster, such as the number of Amazon VPCs or internet gateways per Region. For more information, see Amazon VPC Quotas.
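
The following sketch shows one way to compare the number of Amazon VPCs in a Region with the applied quota by using the AWS CLI. The function names are hypothetical. The quota code L-F678F1CE is assumed to identify the "VPCs per Region" quota, so confirm it in your account before you rely on it.

```shell
# Count the VPCs that currently exist in a Region.
vpc_count() {
  aws ec2 describe-vpcs --region "$1" --query 'length(Vpcs)' --output text
}

# Look up the applied "VPCs per Region" quota.
# L-F678F1CE is an assumed quota code. Verify it with:
#   aws service-quotas list-service-quotas --service-code vpc
vpc_quota() {
  aws service-quotas get-service-quota --region "$1" \
    --service-code vpc --quota-code L-F678F1CE \
    --query 'Quota.Value' --output text
}

# Succeed when usage has reached the quota.
# Quota values are reported as decimals such as 5.0, so drop the fraction.
at_vpc_limit() {
  used=$1
  quota=${2%.*}
  [ "$used" -ge "$quota" ]
}

# Example (requires configured AWS CLI credentials):
# region=us-east-1
# if at_vpc_limit "$(vpc_count "$region")" "$(vpc_quota "$region")"; then
#   echo "VPC quota reached in $region"
# fi
```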

If you have an issue regarding resource constraints on the number of Amazon VPC resources in your Region, consider one of the following options:

(Option 1) Use an existing Amazon VPC to overcome resource constraints

Create a configuration file that specifies the Amazon VPC and the subnets where you want your cluster's worker nodes to be provisioned. Then, to create the cluster from that file, run the following command:

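$ eksctl create cluster -f cluster.yaml

Note: Set the cluster name in the metadata.name field of cluster.yaml. eksctl doesn't accept a cluster name argument together with the -f flag.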
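
The configuration file might look like the following minimal sketch, which writes cluster.yaml to the current directory. The VPC ID, subnet IDs, Region, node group name, and instance type are placeholders and assumptions, not values from your account. Replace them with your own before you create the cluster.

```shell
# Write a minimal eksctl configuration that reuses an existing VPC.
# All IDs below are placeholders for illustration only.
cat > cluster.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: sample-cluster
  region: us-east-1

vpc:
  id: "vpc-0123456789abcdef0"
  subnets:
    private:
      us-east-1a:
        id: "subnet-0aaaaaaaaaaaaaaaa"
      us-east-1b:
        id: "subnet-0bbbbbbbbbbbbbbbb"

nodeGroups:
  - name: ng-1
    instanceType: m5.large
    desiredCapacity: 2
    privateNetworking: true
EOF

echo "Wrote cluster.yaml"
```

Because the file names an existing Amazon VPC, eksctl doesn't need to create a new one, which avoids the VpcLimitExceeded error.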

--or--

(Option 2) Request a service quota increase to overcome resource constraints

Review the AWS CloudFormation stack events for the cluster that eksctl provisioned to identify the resource that reached its quota. Then, request a service quota increase for that resource.
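
To find the resource that acts as a bottleneck, you can list the failed events of the cluster stack with the AWS CLI. The following is a sketch only. The failed_stack_events function name is hypothetical, and the stack name assumes the eksctl-cluster-name-cluster naming pattern that eksctl typically uses.

```shell
# Hypothetical helper: list the resources that failed to create in the cluster stack.
# $1 is the cluster name. The stack name pattern is an assumption.
failed_stack_events() {
  aws cloudformation describe-stack-events \
    --stack-name "eksctl-$1-cluster" \
    --query "StackEvents[?ResourceStatus=='CREATE_FAILED'].[LogicalResourceId,ResourceStatusReason]" \
    --output text
}

# Example (requires configured AWS CLI credentials):
# failed_stack_events sample-cluster
#
# After you identify the quota, you can request an increase, for example:
# aws service-quotas request-service-quota-increase \
#   --service-code vpc --quota-code <quota-code> --desired-value <new-value>
```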

Confirm that your worker nodes can reach the control plane API endpoint

When eksctl deploys your cluster, it waits for the worker nodes that are launched to join the cluster and reach Ready status. If your worker nodes can't reach the control plane or have an invalid IAM role, then you could receive the following error:

timed out (after 25m0s) waiting for at least 4 nodes to join the cluster and become ready in "eksfbots-ng1"
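
As a starting point for diagnosis, the following sketch checks how the cluster API endpoint is exposed and whether any nodes registered. The function names are hypothetical. The sketch assumes that the AWS CLI and kubectl are configured for the cluster, and that the worker node IAM role is mapped in the aws-auth ConfigMap in the kube-system namespace.

```shell
# Show whether the cluster API endpoint allows public and private access.
# $1 is the cluster name.
endpoint_access() {
  aws eks describe-cluster --name "$1" \
    --query 'cluster.resourcesVpcConfig.[endpointPublicAccess,endpointPrivateAccess]' \
    --output text
}

# List the nodes that have joined the cluster, then show the aws-auth ConfigMap
# so that you can check that the worker node IAM role is mapped.
node_checks() {
  kubectl get nodes
  kubectl describe configmap aws-auth --namespace kube-system
}

# Example (requires configured AWS CLI and kubectl access):
# endpoint_access sample-cluster
# node_checks
```

If only private endpoint access is turned on, then worker nodes must be able to reach the endpoint from inside the Amazon VPC.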
