Why won't my pods connect to other pods in Amazon EKS?

Last updated: 2020-02-21

My pods won't connect to other pods in Amazon Elastic Kubernetes Service (Amazon EKS).

Short Description

If your pods can't connect to other pods, then you might receive one of the following errors, depending on your application.

If the security group from a worker node isn't allowing interworker communication:

curl: (7) Failed to connect to XXX.XXX.XX.XXX port XX: Connection timed out

If the DNS isn't working:

Error: RequestError: send request failed caused by: Post  dial tcp: i/o timeout

If the DNS is working, but there's still a pod connectivity issue:

Error: RequestError: send request failed caused by: Post  dial tcp 1.2.3.4:443: i/o timeout

If the pod wasn't exposed through a service and you try to resolve DNS for the pod:

kubectl exec -it busybox -- nslookup nginx 
Server:	  10.100.0.10
Address:  10.100.0.10:53
** server can't find nginx.default.svc.cluster.local: NXDOMAIN
*** Can't find nginx.svc.cluster.local: No answer
*** Can't find nginx.cluster.local: No answer
*** Can't find nginx.ap-southeast-2.compute.internal: No answer
*** Can't find nginx.default.svc.cluster.local: No answer
*** Can't find nginx.svc.cluster.local: No answer
*** Can't find nginx.cluster.local: No answer
*** Can't find nginx.ap-southeast-2.compute.internal: No answer

To resolve these errors, check if your environment is set up correctly by confirming the following:

  • You meet the networking requirements for Kubernetes (excluding any intentional NetworkPolicy)
  • Your pods are correctly using DNS to communicate with each other
  • Your security groups meet Amazon EKS guidelines
  • The network access control list (ACL) isn't denying the connection
  • Your subnet has a local route for communicating within your Amazon Virtual Private Cloud (Amazon VPC)
  • There are enough IP addresses available in the subnet
  • Your pods are scheduled and in the RUNNING state
  • You have the recommended version of the Amazon VPC CNI plugin for Kubernetes

Resolution

You meet the networking requirements for Kubernetes (excluding any intentional NetworkPolicy)

Confirm that you meet the networking requirements for Kubernetes.

By default, pods are not isolated. Pods accept traffic from any source. Pods become isolated by having a NetworkPolicy that selects them.

Note: For NetworkPolicy configurations, see Installing Calico on Amazon EKS.
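For reference, the following is a minimal sketch of a NetworkPolicy. The labels (app: db, app: web) and the default namespace are hypothetical; once this policy selects pods labeled app=db, those pods become isolated and accept ingress traffic only from pods labeled app=web.

```yaml
# Hypothetical example: isolates app=db pods, allowing ingress
# only from app=web pods in the same namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-db
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web
```

If pod-to-pod traffic that you expect is being dropped, check whether a policy like this one selects the destination pods.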

Your pods are correctly using DNS to communicate with each other

You must expose your pods through a service first. If you don't, then your pods won't get DNS names and can be reached only by their specific IP addresses.

See the following example. (In kubectl version 1.18 and later, kubectl run no longer creates a Deployment; use kubectl create deployment nginx --image=nginx --replicas=5 -n web instead.)

$ kubectl run nginx --image=nginx --replicas=5 -n web
deployment.apps/nginx created

$ kubectl expose deployment nginx --port=80 -n web
service/nginx exposed

$ kubectl get svc -n web
NAME    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
nginx   ClusterIP   10.100.94.70   <none>        80/TCP    2s

$ kubectl exec -ti busybox -n web -- nslookup nginx
Server:    10.100.0.10
Address 1: 10.100.0.10 ip-10-100-0-10.ap-southeast-2.compute.internal
Name:      nginx
Address 1: 10.100.94.70 ip-10-100-94-70.ap-southeast-2.compute.internal

The output shows that the ClusterIP 10.100.94.70 was returned when resolving the DNS name for the nginx service.

If your pods still fail to resolve DNS, see How do I troubleshoot DNS failures with Amazon EKS?

Note: For more information, see Pods, Service, and Headless Services.
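The busybox pod that runs nslookup in the preceding example isn't created automatically. As a sketch, you can launch a short-lived test pod like the following (the pod name, image, and sleep duration are arbitrary choices for testing):

```shell
# Launch a throwaway busybox pod in the web namespace for DNS testing
kubectl run busybox --image=busybox --restart=Never -n web -- sleep 3600
```

When you're finished testing, delete the pod with kubectl delete pod busybox -n web.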

Your security groups meet Amazon EKS guidelines

If you want to restrict the traffic that's allowed on your worker node security group, then create inbound rules for the protocols and ports that your worker nodes use for interworker communication.

It's a best practice for the worker node security group to allow all traffic from itself (a self-referencing rule). That way, you don't need to change the security group rules every time a pod that uses a new port is created.

For more information, see Amazon EKS Security Group Considerations.
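As a sketch, you can add a self-referencing rule with the AWS CLI. The security group ID below is a placeholder for your worker node security group:

```shell
# Allow all traffic between worker nodes that share this security group
# (sg-0123456789abcdef0 is a placeholder for your worker node security group)
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol all \
    --source-group sg-0123456789abcdef0
```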

The network ACL isn't denying the connection

1.    Confirm that your network ACLs allow traffic to flow freely between your Amazon EKS cluster and your VPC CIDR.

2.    (Optional) To add an additional layer of security to your VPC, consider setting up network ACLs with rules similar to your security groups.
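As a sketch, you can list the network ACL entries that apply to a worker node subnet with the AWS CLI (the subnet ID is a placeholder). Look for any DENY rule that matches your VPC CIDR before the matching ALLOW rule:

```shell
# List the network ACL entries associated with a subnet
# (subnet-0123456789abcdef0 is a placeholder for your subnet ID)
aws ec2 describe-network-acls \
    --filters Name=association.subnet-id,Values=subnet-0123456789abcdef0 \
    --query 'NetworkAcls[].Entries'
```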

Your subnet has a local route for communicating within your VPC

Confirm that your subnets have the default route for communication within the VPC.
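As a sketch, you can inspect a subnet's routes with the AWS CLI (the subnet ID is a placeholder). Expect a route whose target is "local" and whose destination covers your VPC CIDR:

```shell
# Show the routes for the route table associated with a subnet
# (subnet-0123456789abcdef0 is a placeholder for your subnet ID)
aws ec2 describe-route-tables \
    --filters Name=association.subnet-id,Values=subnet-0123456789abcdef0 \
    --query 'RouteTables[].Routes[]'
```

Note: If the subnet isn't explicitly associated with a route table, then it uses the VPC's main route table, and the filter above returns nothing; check the main route table instead.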

There are enough IP addresses available in the subnet

Confirm that your specified subnets have enough available IP addresses for the cross-account elastic network interfaces and your pods.

For more information, see VPC IP Addressing.

To check for available IP addresses, run the following AWS Command Line Interface (AWS CLI) command:

$ aws ec2 describe-subnets --subnet-id YOUR-SUBNET-ID --query 'Subnets[0].AvailableIpAddressCount'
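As a sketch, you can compare the count returned by the preceding command against a minimum that you consider safe. The threshold of 20 here is an arbitrary assumption, not an Amazon EKS requirement, and the sample count is hard-coded for illustration:

```shell
# AVAILABLE would normally come from the describe-subnets call above;
# a sample value is hard-coded here for illustration
AVAILABLE=12
THRESHOLD=20
if [ "$AVAILABLE" -lt "$THRESHOLD" ]; then
    echo "Warning: only $AVAILABLE IP addresses left in the subnet"
fi
```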

Your pods are scheduled and in the RUNNING state

Confirm that your pods are scheduled and in the RUNNING state.

To troubleshoot your pod status, see How can I troubleshoot pod status in Amazon EKS?
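As a quick check, list your pods along with the nodes that they're scheduled on (the web namespace is carried over from the earlier example):

```shell
# STATUS should show Running, and NODE should show a worker node
kubectl get pods -o wide -n web
```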

You have the recommended version of the Amazon VPC CNI plugin for Kubernetes

If you're not running the recommended version of the Amazon VPC CNI plugin for Kubernetes, consider upgrading to the latest version.

If you have the recommended version, but are experiencing issues with it, then see How do I troubleshoot CNI plugin issues for Amazon EKS?
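You can print the VPC CNI plugin version that's running in your cluster by inspecting the image tag of the aws-node daemonset:

```shell
# The image tag (for example, v1.5.x) is the CNI plugin version
kubectl describe daemonset aws-node -n kube-system | grep Image | cut -d "/" -f 2
```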

