Implement a central ingress Application Load Balancer supporting private Amazon Elastic Kubernetes Service VPCs
Many organizations deploy Amazon Elastic Kubernetes Service (Amazon EKS) clusters into Amazon Virtual Private Cloud (VPC) environments with direct access to the internet and to other VPCs. Connectivity between the VPC hosting the Amazon EKS cluster and other VPCs is typically created using routed networking services, such as VPC Peering or AWS Transit Gateway. As an alternative, in this blog post, we show how to set up connectivity between these VPCs using AWS PrivateLink. We also extend this connectivity discussion by showing how to integrate an Amazon EKS VPC with a VPC hosting a centrally managed Application Load Balancer (ALB) that terminates HTTPS traffic from the internet and forwards it to HTTP services running in the private Amazon EKS cluster.
AWS PrivateLink is used in situations where you choose to operate private Amazon EKS clusters inside VPC environments that prohibit inbound and outbound traffic. You would consider this architecture when you want to isolate compute environments in order to improve security, or to mitigate overlapping IP ranges between VPCs. Along with the specific configuration steps required to deploy private Amazon EKS clusters, administrators must also understand that services running within the isolated Amazon EKS cluster are only available to other resources initiating connections from within the Amazon EKS VPC.
AWS PrivateLink creates private connections between services running in a private Amazon EKS cluster and consumers operating in other VPCs, helping secure the end-to-end network traffic. It connects services running in an Amazon EKS VPC to other VPCs while exposing only the services explicitly defined in the VPC Endpoint Service configuration. Because PrivateLink powers VPC Endpoint Services, it also eliminates the need to open a fully routed network path.
After you create a VPC Endpoint Service in the private Amazon EKS VPC, you share it with other VPCs by creating a VPC Endpoint within the VPCs that consume the service. This makes it possible for users to access the Amazon EKS services from both the hosting VPC and from any VPC environments that use the VPC Endpoint that is bound to the VPC Endpoint Service you created.
(If you are interested in how this works, the documentation entry “Access container applications privately on Amazon EKS using AWS PrivateLink and a Network Load Balancer” provides more details.)
However, what if you want to connect a VPC that hosts a centrally operated Application Load Balancer (ALB) to handle inbound internet access to the services running within our private Amazon EKS cluster? Our solution shows you how to accomplish this.
We deployed our solution into two VPCs connected using AWS PrivateLink. One VPC hosts the Amazon EKS cluster, and the second hosts the ALB. We refer to these as the “Amazon EKS VPC” and the “Internet VPC” respectively.
Amazon EKS VPC
We first created the Amazon EKS VPC to host the clusters that run the HTTP services we access from the internet VPC. We then deployed an Amazon EKS cluster behind a Network Load Balancer (NLB) that targets the HTTP services running within the cluster. An NGINX ingress controller handles the final routing of traffic to the HTTP services within the Amazon EKS cluster.
We defined a VPC Endpoint Service, powered by PrivateLink, within our Amazon EKS VPC and associated it with our NLB. This is the VPC Endpoint Service that we reference from other VPCs that need access to the services hosted in the Amazon EKS VPC.
An Application Load Balancer (ALB) manages inbound, internet-based connections to the Amazon EKS service. We deployed the ALB in the internet VPC with internet access enabled by an Internet Gateway attached to the internet VPC. We also associated the ALB with an AWS Web Application Firewall (WAF) to help protect our web services against common web exploits and bots.
Using these services, we can centrally manage:
- Public Listeners/Endpoints (ALB): Creates the HTTPS endpoints that our internet traffic will connect to.
- Certificate Management (ALB): Controls the certificates used to terminate the HTTPS/SSL traffic associated with our listeners.
- Control Access using Security Groups (ALB): Use Security Groups on public facing interfaces to control inbound IP traffic.
- Control WAF Rules and Policies: Protect web applications against common web exploits and bots that may affect availability, compromise security, or consume excessive resources within our EKS hosting environment.
Next, we configured VPC endpoints within the internet VPC. We mapped these VPC endpoints directly to the VPC Endpoint Services we are hosting in the isolated Amazon EKS VPC. In order to improve service resiliency, we deployed VPC Endpoints across three Availability Zones with one Elastic Network Interface (ENI) placed into each of the Availability Zones. (We followed the approach also defined in the AWS white paper, Securely Access Services Over AWS PrivateLink.)
We deployed the ALB with a listener on the public (internet-facing) interface that is configured to receive requests on the HTTPS port (443). Once it receives traffic, the ALB decrypts the HTTPS traffic using the certificate stored in AWS Certificate Manager. The Web Application Firewall (WAF) associated with our ALB then inspects the traffic for known web exploits and passes approved traffic on to the ALB targets. The ALB uses the ENIs associated with the VPC Endpoint as load-balanced targets for the service defined by the public listener. Traffic that arrives at the VPC Endpoint ENIs is then forwarded to the Amazon EKS VPC using the PrivateLink service.
Solution diagram and data flow
The diagram that follows (figure 1) shows how traffic from the internet flows to the services hosted in the Amazon EKS VPC, following these steps:
- HTTPS requests coming from the internet are resolved by Amazon Route 53 DNS and then directed to the public address of the ALB.
- The ALB terminates the HTTPS traffic using a certificate managed by AWS Certificate Manager.
- Traffic that is permitted to pass through the ALB is then logged in S3.
- Traffic is then reviewed and screened by the AWS Web Application Firewall (WAF) to catch common security exploits. Assuming the HTTP request is permitted by the WAF, traffic is then routed to the ALB targets (VPC Endpoint ENIs).
- The ALB load balances the traffic against the ENIs defined for the VPC Endpoint.
- Once traffic arrives at the ENIs associated with the VPC Endpoint, the PrivateLink service carries the traffic to the Amazon EKS VPC, where the VPC Endpoint Service maps the traffic to the NLB.
- The NLB, with HTTP targets defined in the Amazon EKS cluster, delivers the HTTP traffic to the service hosted on the pods running in the Amazon EKS cluster.
- All ALB and Amazon EKS logs are logged in Amazon CloudWatch.
Figure 1: Diagram of our solution and data flow
- This solution separates duties between 1) the governance of ALB security, certificate management, and integration with WAF services in the internet VPC, and 2) application management in the Amazon EKS VPC.
- Application teams in the Amazon EKS VPCs keep their services isolated inside their VPC until the services are configured in PrivateLink and made accessible to external parties.
- With one centrally managed internet VPC, it is simpler to perform security audits on inbound internet access.
- Network access between all VPCs is controlled by a single networking approach using PrivateLink.
- While this solution is beneficial for connecting multiple workload VPCs to a central inbound internet solution, it becomes difficult to manage at scale. This is because configuring new services and connecting additional Amazon EKS VPCs requires additional IP space within the internet VPC.
- By default, the quota for VPC Endpoints defined within a VPC is 50 (you can increase this quota with a service quota increase request). Alternatively, we could create another internet VPC along with another ALB to shard some of the inbound traffic and mitigate the concerns of the VPC Endpoint limits.
- For this solution, we require access to at least one AWS account.
- We based the codebase for this solution on Terraform version 0.12 or later.
- The codebase for this solution requires the AWS CLI v2.
- Before running Terraform, double-check that the AWS provider role definitions and the AWS account ID are correctly configured.
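The scaling limits noted earlier (IP space in the internet VPC and the VPC Endpoint quota) can be estimated with simple arithmetic before deploying. The following is a minimal sketch, assuming one interface VPC Endpoint per connected service and one ENI (one private IP address) per Availability Zone for each endpoint, as in the three-AZ layout described above. The function names are illustrative and are not part of the solution's codebase.

```shell
#!/usr/bin/env bash
# Rough capacity estimates for the central internet VPC.
# Assumption: each connected service needs one interface VPC Endpoint,
# and each endpoint places one ENI (one private IP) in every AZ it spans.

# Private IP addresses consumed by endpoint ENIs.
# Usage: endpoint_ips_needed <endpoints> [availability_zones, default 3]
endpoint_ips_needed() {
  local endpoints=$1 azs=${2:-3}
  echo $(( endpoints * azs ))
}

# Internet VPCs needed to stay within the per-VPC endpoint quota
# (rounds up; the quota defaults to 50 unless it has been raised).
# Usage: internet_vpcs_needed <endpoints> [quota, default 50]
internet_vpcs_needed() {
  local endpoints=$1 quota=${2:-50}
  echo $(( (endpoints + quota - 1) / quota ))
}
```

For example, `endpoint_ips_needed 50` reports the addresses consumed when the default quota is fully used across three Availability Zones, and `internet_vpcs_needed 120` reports how many internet VPCs a sharded design would need for 120 endpoints.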
As a starting point, download the contents of the following GitHub repository
A description of the structure of the code follows:
- README.md
- central-internet-acc-setup
- eks-service-account
The central-internet-acc-setup directory contains the Terraform code for the central Internet VPC.
The eks-service-account directory contains the Terraform code for the EKS VPC.
Complete the following steps, in order, to set up eks-service-account:
- Create the Terraform state bucket for this account by running the following commands in the terminal. The S3 bucket name used in the codebase is eks-service-account.
$ cd eks-service-account
$ sh s3_state_buckets.sh <S3_BUCKET_NAME_as_input> <REGION_as_input> default
- Update the Terraform backends (terraform/backend.tf) in the roles, networking-setup, and eks-cluster-setup directories with the name of the S3 bucket created in step 1
- Create IAM Roles
Create the IAM roles with the required permissions:
$ cd roles/terraform
$ terraform init
$ terraform plan
$ terraform apply -auto-approve
- Networking setup
Create the following resources:
- VPC (default cidr: 10.11.0.0/16)
- 3 completely private subnets (one in each Availability Zone with no NAT/IGW connectivity)
- VPC Endpoints (ecr, s3, ec2, lb, sts and autoscaling)
- ECR repository (with name: eks-service-account-ecr)
$ cd networking-setup/terraform
$ terraform init
$ terraform plan
$ terraform apply -auto-approve
- Upload the Nginx ingress controller image to the ECR repository
Because the EKS data plane has no internet connectivity, you must push the Docker image of the Nginx ingress controller to the private ECR repository.
- Pull the Nginx ingress controller image
$ docker pull k8s.gcr.io/ingress-nginx/controller:v0.46.0@sha256:52f0058bed0a17ab0fb35628ba97e8d52b5d32299fbc03cc0f6c7b9ff036b61a
- List images and note the IMAGE ID of the downloaded image
$ docker images
- Tag the Docker image. Replace <IMAGE-ID> and <your-aws-account-id> in the following command with the correct values.
$ docker tag <IMAGE-ID> <your-aws-account-id>.dkr.ecr.eu-central-1.amazonaws.com/eks-service-account-ecr
- Log in to ECR and push the image
$ aws ecr get-login-password --region eu-central-1 | docker login --username AWS --password-stdin <your-aws-account-id>.dkr.ecr.eu-central-1.amazonaws.com/eks-service-account-ecr
$ docker push <your-aws-account-id>.dkr.ecr.eu-central-1.amazonaws.com/eks-service-account-ecr
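The tag, login, and push commands above repeat the same registry address three times. As a hedged convenience sketch (not part of the solution's codebase), the helper functions below build that address once and reuse it. The function names are hypothetical, and the address format assumes a standard commercial AWS Region, where ECR registries follow the pattern <account-id>.dkr.ecr.<region>.amazonaws.com.

```shell
#!/usr/bin/env bash
# Build the ECR registry host and repository URI once, then reuse them
# for the tag, login, and push steps. Function names are illustrative.

# Registry host for an account and Region.
ecr_registry() {
  local account_id=$1 region=$2
  echo "${account_id}.dkr.ecr.${region}.amazonaws.com"
}

# Full repository URI: registry host plus repository name.
ecr_repo_uri() {
  local account_id=$1 region=$2 repo=$3
  echo "$(ecr_registry "$account_id" "$region")/${repo}"
}

# Tag a local image, log Docker in to the registry, and push the image.
# Stops at the first command that fails.
push_to_ecr() {
  local image_id=$1 account_id=$2 region=$3 repo=$4
  local registry uri
  registry=$(ecr_registry "$account_id" "$region")
  uri=$(ecr_repo_uri "$account_id" "$region" "$repo")

  docker tag "$image_id" "$uri" &&
  aws ecr get-login-password --region "$region" |
    docker login --username AWS --password-stdin "$registry" &&
  docker push "$uri"
}
```

With the values used in this walkthrough, the call would look like `push_to_ecr <IMAGE-ID> <your-aws-account-id> eu-central-1 eks-service-account-ecr`.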
- EKS Cluster Setup
This step creates:
- EKS cluster control plane
- EKS cluster data plane with managed worker nodes
- Nginx ingress controller in a separate namespace, with an AWS NLB
- VPC endpoint service associated with the NLB created in the previous step
$ cd eks-cluster-setup/eks/terraform
$ terraform init
$ terraform plan
$ terraform apply -auto-approve
- Deploy nginx-ingress-controller
$ cd eks-cluster-setup/ingress-plugin-module/terraform
$ terraform init
$ terraform plan
$ terraform apply -auto-approve
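Each stage above runs the same init, plan, and apply sequence inside a different directory. The following is a small sketch of a wrapper for that pattern. The tf_deploy name is hypothetical and not part of the repository. It runs in a subshell so the caller's working directory is left unchanged, and it stops at the first failing command.

```shell
#!/usr/bin/env bash
# Run "terraform init", "plan", and "apply -auto-approve" inside one
# Terraform directory. The subshell keeps the caller's working directory
# unchanged; the && chain stops at the first failure.
# Usage: tf_deploy <path-to-terraform-directory>
tf_deploy() {
  local dir=$1
  (
    cd "$dir" || exit 1
    terraform init &&
    terraform plan &&
    terraform apply -auto-approve
  )
}
```

Run from the eks-service-account directory, the stages in this walkthrough would map to calls such as `tf_deploy roles/terraform` and `tf_deploy networking-setup/terraform`, with the image upload to ECR still performed before the EKS cluster stages.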
- Create the Terraform state bucket for this account by running these commands in the terminal:
$ cd central-internet-account-setup
$ sh s3_state_buckets.sh <S3_BUCKET_NAME_as_input> <REGION_as_input> central-acc
- Update the Terraform backends (terraform/backend.tf) in the roles, networking-setup and eks-cluster-setup directories with the name of the S3 bucket created in the preceding step
- Networking Setup
This step creates:
- VPC (default cidr: 10.12.0.0/16)
- 3 completely private subnets for the VPC endpoints
- 3 public subnets for the Application Load Balancer that serves as the entry point for traffic
- 1 VPC Interface Endpoint that connects to the VPC Endpoint Service created in the eks-service-account
- ALB target groups pointing to the private IP addresses of the VPC endpoint ENIs
$ cd networking-setup/terraform
$ terraform init
$ terraform plan
$ terraform apply -auto-approve
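The target groups created in this step point at the private IP addresses of the VPC Endpoint ENIs. The solution's Terraform code handles this for you. As a rough manual equivalent, the sketch below looks up those addresses with the AWS CLI and registers them with an IP-type target group. The function names are hypothetical, and the default port of 80 is an assumption about the listener on the NLB behind the endpoint service.

```shell
#!/usr/bin/env bash
# Look up the private IPs of a VPC Endpoint's ENIs and register them as
# targets of an IP-type target group. Names and the port are assumptions.

# Private IPs of the ENIs that belong to one interface VPC Endpoint.
endpoint_eni_ips() {
  local endpoint_id=$1 eni_ids
  eni_ids=$(aws ec2 describe-vpc-endpoints \
    --vpc-endpoint-ids "$endpoint_id" \
    --query 'VpcEndpoints[0].NetworkInterfaceIds[]' --output text)
  # Word splitting of the whitespace-separated ENI ID list is intended.
  aws ec2 describe-network-interfaces \
    --network-interface-ids $eni_ids \
    --query 'NetworkInterfaces[].PrivateIpAddress' --output text
}

# Turn a port and a list of IPs into elbv2 target arguments, one per line.
format_targets() {
  local port=$1; shift
  local ip
  for ip in "$@"; do
    printf 'Id=%s,Port=%s\n' "$ip" "$port"
  done
}

# Register every endpoint ENI IP with the given target group.
# Usage: register_endpoint_targets <target-group-arn> <vpc-endpoint-id> [port]
register_endpoint_targets() {
  local target_group_arn=$1 endpoint_id=$2 port=${3:-80}
  # Word splitting of both command substitutions is intended.
  aws elbv2 register-targets \
    --target-group-arn "$target_group_arn" \
    --targets $(format_targets "$port" $(endpoint_eni_ips "$endpoint_id"))
}
```

Keeping format_targets separate from the AWS calls makes the argument formatting easy to check on its own before anything is registered.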
Adding a new VPC Connection/Account
In order to connect additional VPCs to the centralized internet-ingress account, the eks-service-account code is replicated/extended to provision the additional subscriber VPC or AWS accounts. During the creation of the VPC Endpoint Service in the eks-service-account, you must define a unique name for each service that is referenced in the central internet VPC, in order to create the associated VPC endpoints.
We use AWS Systems Manager Parameter Store to share the VPC Endpoint Service names with the central internet VPC. Parameter Store elements can be secrets or plain-text values that you reference in your scripts, commands, SSM documents, configurations, and automation workflows by using the unique name that you specified when you created the parameter. In our case, as part of the ingress-plugin-module setup, the VPC endpoint service name is automatically written to Parameter Store, and Terraform then reads it natively at the time of each endpoint creation.
The Parameter Store name must be unique so that, for every new VPC endpoint, the Terraform automation does not overwrite a previously deployed or active VPC endpoint service. In our scenario, this is handled by a Terraform variable called connection_name. Hence, the connection_name variable must be unique for every additional eks-service-account setup.
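Because a duplicate connection_name would overwrite an existing parameter, it can help to check the planned names before deploying. The following is a minimal sketch with hypothetical helper names. The first function reports any name that appears more than once. The second reads a published value back with the AWS CLI, given the full parameter name, which you must supply yourself because the naming scheme is defined in the solution's Terraform code. It assumes the parameter is stored as a plain String.

```shell
#!/usr/bin/env bash
# Helpers for working with connection_name values. Names are illustrative.

# Succeed only when every connection_name passed as an argument is unique;
# otherwise print the duplicates to stderr and return 1.
check_unique_connection_names() {
  local dupes
  dupes=$(printf '%s\n' "$@" | sort | uniq -d)
  if [ -n "$dupes" ]; then
    echo "duplicate connection_name values:" >&2
    echo "$dupes" >&2
    return 1
  fi
}

# Print the value stored under a Parameter Store name.
# Assumption: the parameter type is String (no decryption needed).
read_endpoint_service_name() {
  local parameter_name=$1
  aws ssm get-parameter --name "$parameter_name" \
    --query 'Parameter.Value' --output text
}
```

For instance, `check_unique_connection_names team-a team-b team-a` fails and reports team-a as the repeated name.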
- The connection_name variable is handled in the file: eks-service-account/eks-cluster-setup/ingress-plugin-module/nginx.tf:
- When you add a new connection, declare a new endpoint module and target group reference in the central-internet-account-setup/networking-setup/terraform/main.tf:
- To deploy the new configuration, run the following Terraform commands (terraform init installs the newly declared module before planning):
$ cd central-internet-account-setup/networking-setup/terraform
$ terraform init
$ terraform plan
$ terraform apply -auto-approve
When finished, you can clean up your account by following these steps in order:
- Switch to this working directory:
$ cd central-eks-internet-ingress
- Clean up central internet account services:
$ (cd central-internet-account-setup/networking-setup/terraform && terraform destroy -auto-approve)
$ (cd central-internet-account-setup/roles/terraform && terraform destroy -auto-approve)
- Clean up the EKS service account services:
$ (cd eks-service-account/eks-cluster-setup/ingress-plugin-module/terraform && terraform destroy -auto-approve)
$ (cd eks-service-account/eks-cluster-setup/eks/terraform && terraform destroy -auto-approve)
$ (cd eks-service-account/networking-setup/terraform && terraform destroy -auto-approve)
$ (cd eks-service-account/roles/terraform && terraform destroy -auto-approve)
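Cleanup order matters here: resources are destroyed in the reverse of the order in which they were created. As a hedged sketch, the hypothetical helper below (not part of the repository) destroys a list of Terraform directories in the order given and stops at the first failure, so that a later directory is never destroyed while something that depends on it still exists.

```shell
#!/usr/bin/env bash
# Destroy Terraform directories in the order given, stopping at the first
# failure. Each destroy runs in a subshell, so the caller's working
# directory is unchanged.
# Usage: tf_destroy_in_order <dir> [<dir> ...]
tf_destroy_in_order() {
  local dir
  for dir in "$@"; do
    ( cd "$dir" && terraform destroy -auto-approve ) || return 1
  done
}
```

Called from the central-eks-internet-ingress directory, the arguments would be the same directories, in the same order, as in the cleanup commands above.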
By following the steps in this document, you have completed the implementation of a central inbound connectivity VPC, enabling internet-sourced traffic to be routed by a WAF-protected Application Load Balancer to an Amazon EKS cluster using an inter-VPC communication path provided by AWS PrivateLink.