Enhanced VPC flexibility: modify subnets and security groups in Amazon EKS
With Amazon Elastic Kubernetes Service (Amazon EKS), users can modify the configuration of a cluster both at creation time and after the cluster is running, without having to create a new cluster. Before provisioning the cluster, users can define specific parameters like the Kubernetes version, VPC and subnets, and logging preferences. Post-creation, they can dynamically adjust various settings, such as:
- Modifying which control plane logs are sent to Amazon CloudWatch.
- Adjusting the public endpoint access by amending CIDR (Classless Inter-Domain Routing) blocks and toggling the visibility of the public API server.
- Switching the accessibility of the private endpoint within the VPC (Virtual Private Cloud).
- Refining AWS Identity and Access Management (AWS IAM) roles and policies via the aws-auth ConfigMap.
- Managing add-ons like the VPC CNI (Container Network Interface) and altering cluster labels.
- Activating AWS Key Management Service (AWS KMS) encryption for Kubernetes secrets if not done initially.
- Upgrading the Kubernetes version to a later supported version.
- Adjusting endpoint policies for the Amazon EKS API (Application Programming Interface) server.
- Enabling an OIDC (OpenID Connect) provider.
While Amazon EKS is highly configurable, certain parameters set at the time of cluster creation previously couldn’t be altered without creating a new cluster. Notably, the subnets and security groups associated with a cluster were fixed at creation time. Now, Amazon EKS has increased its level of post-creation flexibility and supports changing cluster subnets and security groups.
What is a cluster subnet?
A cluster subnet in the context of Amazon EKS refers to the specific subnets you choose when creating your Amazon EKS cluster. While the underlying nodes of this control plane are managed by AWS and are invisible to the user, there are certain networking components, like the Cross-account Elastic Network Interfaces (ENIs) that are deployed automatically onto the cluster subnets.
When you create a cluster, you specify a VPC and at least two subnets that are in different Availability Zones (AZs). The control plane is provisioned in a VPC managed by AWS, while nodes (i.e., Amazon Elastic Compute Cloud [Amazon EC2] instances) run in the VPC in your account. To achieve this cross-account communication, ENIs from the Amazon EKS service account are placed into your specified cluster subnets. These ENIs facilitate the necessary bi-directional communication between the control plane in the AWS-managed account and the nodes in your account. These network interfaces also enable Kubernetes features such as kubectl exec and kubectl logs. Each Amazon EKS-created network interface has the text Amazon EKS cluster-name in its description.
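As a quick way to see these interfaces, you can filter on that description with the AWS CLI. This is a sketch; the cluster name my-cluster is a placeholder for your own:

```shell
# List the cross-account ENIs that Amazon EKS created for a cluster
# by filtering on the description text "Amazon EKS <cluster-name>".
aws ec2 describe-network-interfaces \
  --filters "Name=description,Values=Amazon EKS my-cluster" \
  --query "NetworkInterfaces[].{Id:NetworkInterfaceId,Subnet:SubnetId,AZ:AvailabilityZone}" \
  --output table
```

The output shows which of your cluster subnets currently host the Amazon EKS-managed network interfaces.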
Amazon EKS can create its network interfaces in any of the subnets that you specify when you create a cluster. Previously, you couldn’t change which subnets Amazon EKS creates its network interfaces in after your cluster was created, so the only way to control their placement was to limit the subnets you specified at cluster creation to exactly the two you wanted used.
One common misconception is that the cluster subnets chosen when creating an Amazon EKS cluster serve as the primary targets for nodes, and that users can only use these subnets for creating nodes (i.e., Kubernetes nodes). Instead of being the designated subnets for nodes, cluster subnets have the distinct role of hosting cross-account ENIs, as described above. So, while it may be common to treat cluster subnets and node subnets as synonymous, recognizing their differences is essential.
While placing cross-account ENIs for node to control plane communication is a primary function of cluster subnets, they can also serve other roles based on your configuration:
- If you don’t specify separate subnets for nodes, then nodes may be deployed in the same subnets as your cluster subnets. Nodes and Kubernetes resources can run in the cluster subnets, but it isn’t recommended. During cluster upgrades, Amazon EKS provisions additional ENIs in the cluster subnets, and when your cluster scales out, nodes and pods may consume the available IPs in the cluster subnets. Hence, to make sure there are enough available IPs, you might want to consider using dedicated cluster subnets with a /28 netmask.
- With the AWS Load Balancer Controller, you can choose the specific subnets where load balancers can be deployed, or you can use the auto-discovery feature by tagging the subnets. Cluster subnets can still be used for load balancers, but this is not a best practice, as it can lead to IP exhaustion, similar to the previous case.
Nodes aren’t restricted to cluster subnets. If the worker nodes are in a different subnet, then they need appropriate routing rules and security groups to make sure they can reach the control plane’s endpoint. This communication is critical for the nodes to join the cluster and for ongoing management operations; however, with this adaptability comes a note of caution.
- Amazon EKS doesn’t automatically create new ENIs in subnets that weren’t designated as cluster subnets during the initial cluster setup. If you have worker nodes in subnets other than your original cluster subnets (i.e., where the cross-account ENIs are located), then they can still communicate with the Amazon EKS control plane if there are local routes in place within the VPC that allow this traffic. Essentially, the worker nodes need to be able to resolve and reach the Amazon EKS API server endpoint. This setup might involve transit through the subnets with the ENIs, but it’s the VPC’s internal routing that makes this possible.
The following is an example of what the route table for your private subnet might look like. Here, 10.0.0.0/16 represents the CIDR block for your entire VPC, which varies based on your actual VPC configuration. The subnets should have direct or routed communication with the cluster subnets where the cross-account ENIs are hosted.
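A minimal illustration of such a route table follows. The NAT gateway ID and the 10.0.0.0/16 CIDR are placeholders; the key point is the local route, which covers the cluster subnets hosting the cross-account ENIs:

| Destination | Target | Purpose |
| --- | --- | --- |
| 10.0.0.0/16 | local | Traffic within the VPC, including to the cluster subnets and cross-account ENIs |
| 0.0.0.0/0 | nat-0abc123 (NAT gateway) | Outbound internet access for private subnets, if needed |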
The following is an example of security groups for the worker nodes allowing traffic from Amazon EKS managed control plane and traffic between all the subnets within a cluster VPC.
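As an illustration, a permissive worker node security group along these lines would satisfy that requirement. The group ID and CIDR are placeholders, and you can tighten these rules to the minimum ports your workloads require:

| Direction | Type | Protocol | Port range | Source/Destination |
| --- | --- | --- | --- | --- |
| Inbound | All traffic | All | All | eks-cluster-sg-my-cluster-uniqueID (cluster security group) |
| Inbound | All traffic | All | All | 10.0.0.0/16 (cluster VPC CIDR) |
| Outbound | All traffic | All | All | 0.0.0.0/0 |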
- AWS allows you to associate additional CIDR blocks with your existing cluster VPC, which effectively increases the pool of IP addresses at your disposal. This expansion can be done by adding more private IP ranges (e.g., RFC 1918) or, if necessary, public (non-RFC 1918) ranges. When you add new CIDR blocks to your VPC, there’s a propagation period before these CIDR ranges are recognized and usable by Amazon EKS. It can take up to five hours for a CIDR block association to be recognized by Amazon EKS. This delay means that even after successfully adding the new CIDR block, you should wait up to five hours before expecting the services within the VPC to smoothly create and operate new nodes, pods, or services using IP addresses from the new range. So, if you’re considering an infrastructure expansion or introducing new nodes into subnets with new CIDR blocks, it’s crucial to account for this delay and plan your deployments accordingly.
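A sketch of associating an additional CIDR block with the AWS CLI follows. The VPC ID and CIDR range are placeholders:

```shell
# Associate an additional CIDR block with the cluster VPC.
# After this call, allow up to five hours for Amazon EKS to
# recognize the new range before placing nodes or pods in it.
aws ec2 associate-vpc-cidr-block \
  --vpc-id vpc-0abc123 \
  --cidr-block 100.64.0.0/16
```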
What is an Amazon EKS-managed cluster security group?
When you create a cluster, Amazon EKS creates a security group that’s named eks-cluster-sg-&lt;cluster-name&gt;-uniqueID. Amazon EKS associates this managed cluster security group with the managed cross-account ENIs. Amazon EKS also attaches this security group by default to nodes created by managed node groups and Fargate (if you don’t specify custom security groups with those features). The default rules allow all traffic to flow freely between your cluster and nodes, and allow all outbound traffic to any destination. This security group is different from the default security group that comes with your VPC.
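You can look up this managed security group for an existing cluster with the AWS CLI; the cluster name my-cluster is a placeholder:

```shell
# Retrieve the ID of the Amazon EKS-managed cluster security group.
aws eks describe-cluster \
  --name my-cluster \
  --query "cluster.resourcesVpcConfig.clusterSecurityGroupId" \
  --output text
```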
The cluster security group information can be found in the Networking section of the Amazon EKS Console, as shown in the previous image. The following diagram illustrates the cluster subnet:
When Amazon EKS creates the default security group for the cluster control plane, it pre-configures necessary inbound and outbound rules to ensure that the Amazon EKS control plane can operate correctly and communicate with necessary AWS services, as well as the nodes.
The default inbound rules allow all access from within the security group and the shared node security group, which enables bi-directional communication between the control plane and the nodes. Today, these rules can’t be permanently deleted or modified: if you remove a default inbound rule, then Amazon EKS recreates it whenever the cluster is updated.
The default outbound rule of the cluster security group allows all traffic. Optionally, you can remove this egress rule and add the minimum rules required for communication between the cluster and nodes, limiting the open ports.
However, while you cannot change the default inbound rules, AWS does allow you to associate additional security groups with the Amazon EKS control plane. These additional customer-managed security groups provide a level of flexibility that lets you customize access further if you have specific requirements beyond the default configuration. It’s worth noting that while AWS manages the default security group for the control plane, the security groups for nodes are typically under your control and should be configured to ensure secure and functional communication with the control plane and other resources. This isn’t necessary when using managed node groups and Fargate because, by default, the cluster security group is applied to nodes to enable them to successfully join the cluster. For more information, refer to the AWS documentation on Amazon EKS security group requirements and considerations.
AWS is aware of the feature request to allow users to skip the creation of an Amazon EKS-managed cluster security group so that they can have full control over their cluster security groups. We’re researching this request, and it is possible that we’ll add this feature in the future. If we do add this feature, then we’ll provide guidance on the required inbound and outbound rules needed to enable connectivity from the nodes in your account to the managed control plane.
New: dynamically adding and removing cluster subnets
Cluster administrators can now adjust their cluster subnets and security groups with ease. What does this mean? Well, whenever there are changes to the VPC resources—be it for expanding the VPC, improving security, or planning for unforeseen events—administrators don’t need to rebuild clusters from scratch.
With the new feature, you can conveniently adjust an existing cluster to align with any changes to the base VPC resources and security groups associated with the cluster. The option to tweak cluster subnets and security groups, even after the initial creation of your Amazon EKS cluster, offers a greater degree of flexibility. This is particularly helpful in scenarios where a VPC resource (be it a subnet or a security group) linked to a cluster gets deleted while no child objects, such as nodes or Kubernetes resources, are linked to it. In the past, such a deletion could prevent a cluster from being upgraded to newer Kubernetes versions: the upgrade process deletes the original network interfaces and creates new ones, and Amazon EKS would fail to find the cluster subnets in which to create them. Another common scenario that this feature addresses is unexpectedly running out of IP addresses within your cluster subnets due to suboptimal initial planning. This not only saves time but also ensures smoother operations.
This flexibility in adjusting cluster subnets and security groups is exposed through the Amazon EKS UpdateClusterConfig API. When updating, customers pass the subnet and security group IDs into the VPC resources segment of the API. One important thing to note is that during an update, the subnets must still be consistent with the original set of AZs from the cluster’s creation, and the security groups must be part of the cluster VPC.
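As a sketch of what this looks like with the AWS CLI, the same operation maps to the update-cluster-config command. The cluster name, subnet IDs, and security group ID below are placeholders:

```shell
# Update the subnets and additional security groups of an existing cluster.
# The subnets must be in the cluster's original set of AZs, and the
# security groups must belong to the cluster VPC.
aws eks update-cluster-config \
  --name my-cluster \
  --resources-vpc-config '{"subnetIds":["subnet-0abc123","subnet-0def456","subnet-0ghi789"],"securityGroupIds":["sg-0abc123"]}'
```

The call returns an update object whose ID you can use to track progress.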
The following example shows how users can use the new feature with the Amazon EKS Console and the AWS Command Line Interface (AWS CLI). The configuration is also supported with AWS CloudFormation and eksctl.
Users can now use the Manage VPC resources option available in the Networking section of the Amazon EKS Console to add or delete subnets. The following sample cluster was created using two subnets from two different AZs (i.e., minimum requirement to create an Amazon EKS cluster).
When two subnets are initially provided during cluster creation, Amazon EKS creates two cross-account ENIs in the selected cluster subnets.
The Networking section shows the current subnets and security groups associated with the cluster. Along with the Manage endpoint access option, a new option called Manage VPC resources has been added to the console.
The subnets selection field provides a list of subnets available to be added to the cluster subnets. The considerations for the subnet selection are:
- Subnets provided should belong to the same set of AZs that were selected during cluster creation.
- Subnets provided should belong to the same VPC provided during cluster creation.
While using the console, these restrictions are accounted for. For example, the drop-down only lists the subnets that meet the criteria. If a user is using the API, AWS CLI, or any other supported method, then the aforementioned requirements need to be considered.
Users can select one or more subnets that meet the criteria from the drop-down menu.
Once the subnets are selected, users need to confirm the selection.
Clusters remain fully accessible during the update. Cluster updates are asynchronous and should finish within a few minutes. During an update, the cluster status moves to UPDATING (this status transition is eventually consistent). When the update is complete (either FAILED or SUCCESSFUL), the cluster status moves back to ACTIVE. Optionally, users can check the status through the Update history section, which also provides the details of any errors.
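If you prefer the CLI, a sketch of checking the update status follows; the cluster name and update ID are placeholders:

```shell
# List recent updates for the cluster, then inspect a specific one
# for its status (InProgress, Successful, Failed) and any errors.
aws eks list-updates --name my-cluster
aws eks describe-update \
  --name my-cluster \
  --update-id aaaabbbb-1111-2222-3333-ccccddddeeee
```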
Once the update is complete, users can use the Networking section to validate the addition or deletion of the subnets.
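The same validation can be done from the CLI by reading back the cluster’s VPC configuration; my-cluster is a placeholder:

```shell
# Confirm the cluster's current subnets and security groups after the update.
aws eks describe-cluster \
  --name my-cluster \
  --query "cluster.resourcesVpcConfig.{Subnets:subnetIds,SecurityGroups:securityGroupIds}"
```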
Cross-account ENIs are auto-configured based on the added subnets without any manual intervention.
Adding or removing security groups associated with the cluster (Amazon EKS Console)
Similar to subnet addition and deletion, users can also add or remove the security groups associated with the cluster from the Manage VPC resources section, as shown in the previous diagram.
The security group selection field provides a list of security groups available to be added to the cluster as additional security groups. The considerations for the security group selection are:
- Security groups provided should belong to the same VPC provided during cluster creation.
- The Amazon EKS cluster security group can’t be added as an additional security group, and it can’t be removed.
While using the console, these requirements are already accounted for in that the drop-down list only displays the security groups that meet these requirements. If a user is using API, AWS CLI, or any other supported method, then the aforementioned requirements should be considered.
The addition and deletion of security groups is also performed as an update action. Once the update is complete, users can use the Networking section and the Additional security groups section to validate the addition or deletion of the security groups.
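For reference, a minimal sketch of the equivalent AWS CLI call for changing only the additional security groups, assuming the security group IDs shown are placeholders belonging to the cluster VPC:

```shell
# Replace the set of additional security groups associated with the cluster.
# The Amazon EKS-managed cluster security group is not listed here; it
# cannot be added or removed through this call.
aws eks update-cluster-config \
  --name my-cluster \
  --resources-vpc-config '{"securityGroupIds":["sg-0abc123","sg-0def456"]}'
```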
In this post, we showed you how to update the subnets and security groups associated with your existing Amazon EKS clusters. This feature allows for greater flexibility in managing network configurations. Users with the appropriate permissions can seamlessly update subnets and security groups, which facilitates efficient changes to the network setup. This minimizes disruption and eliminates the need to re-create clusters, which safeguards consistent application performance. Moreover, the ability to modify cluster network resources prevents downtime, even in events where cluster subnets and security groups need to be reconfigured or restored, ensuring smooth, ongoing operations.