Containers

Leveraging Amazon EKS managed node group with placement group for low latency critical applications

Our customers have been asking how to host low-latency, high-throughput applications, such as stock-trading applications and financial market workloads, on Amazon Elastic Kubernetes Service (Amazon EKS), particularly with the EKS managed node group offering.

In this blog post, we introduce the concept of Amazon Elastic Compute Cloud (Amazon EC2) placement groups and demonstrate how to set up an EKS managed node group with a launch template that enables a placement group. The post provides two ways to implement the solution: AWS Cloud Development Kit (AWS CDK) and Terraform, two very popular infrastructure as code (IaC) tools. Last but not least, we run a performance test to compare Amazon EKS worker nodes in an Amazon EC2 placement group against normal worker nodes.

Time to read 8 minutes
Time to complete 45 minutes
Cost to complete $3
Learning Level Advanced (300)
Services used Amazon EKS, Amazon EC2, AWS CDK


What is low latency?

Latency is the time that passes between a user action and the resulting response. Many processing workloads (e.g., stock-trading applications, financial market workloads) require low-latency response times of less than one millisecond. Low-latency computing usually requires very fast inter-process communication (IPC) and inter-computer communication. To achieve quick response times, the application needs high throughput as well as strong computing capability.

Introduction to placement groups

What is a placement group?

When you launch a new EC2 instance, the Amazon EC2 service attempts to place the instance in such a way that all of your instances are spread out across underlying hardware to minimize correlated failures. You can use placement groups to influence the placement of a group of interdependent instances to meet the needs of your workload.

Types of placement groups

There are three types of placement groups:

  • Cluster—packs instances close together inside an Availability Zone. The following image shows instances that are placed into a cluster placement group. This strategy enables workloads to achieve the low-latency network performance necessary for tightly coupled node-to-node communication that is typical of high performance computing (HPC) applications. (Image: cluster placement group inside an Availability Zone)
  • Partition—spreads your instances across logical partitions, such that groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions. When using partition placement groups, EC2 divides each group into logical segments called logical partitions. EC2 ensures that each partition within a placement group has its own set of racks. Each rack has its own network and power source. No two partitions within a placement group share the same racks, allowing you to isolate the impact of a hardware failure within your application. The following image shows instances that are placed into a partition placement group with three partitions: Partition 1, Partition 2, and Partition 3. Each partition comprises multiple instances which do not share racks with the instances in the other partitions. This strategy is typically used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka. (Image: partition placement group inside an Availability Zone)
  • Spread—strictly places a small group of instances across distinct underlying hardware to reduce correlated failures. A spread placement group is a group of instances that are each placed on distinct racks, with each rack having its own network and power source. The following image shows seven instances in a single Availability Zone that are placed into a spread placement group. The seven instances are placed on seven different racks. This strategy is recommended for applications that have a small number of critical instances that should be kept separate from each other. (Image: spread placement group inside an Availability Zone)

For the use case mentioned above, we choose a cluster placement group for our demo solution. A cluster placement group can span peered VPCs in the same Region. Instances in the same cluster placement group enjoy a higher per-flow throughput limit for TCP/IP traffic and are placed in the same high-bisection bandwidth segment of the network to ensure high inter-computer performance. Besides applications that require low network latency and high network throughput, a cluster placement group is also recommended when the majority of the network traffic flows between the EC2 instances in the group, which in the Kubernetes context means pod-to-pod communication.
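For reference, the following is a minimal AWS CDK (TypeScript) sketch that declares one placement group per strategy, using the same CfnPlacementGroup construct the demo code relies on later in this post. The stack and construct names are illustrative only, and the demo solution itself uses only the cluster strategy.

import * as cdk from '@aws-cdk/core';
import * as ec2 from '@aws-cdk/aws-ec2';

// Illustrative stack declaring one placement group per strategy.
class PlacementGroupStrategiesStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);
    // Cluster: pack instances close together for low latency and high throughput.
    new ec2.CfnPlacementGroup(this, 'ClusterPg', { strategy: 'cluster' });
    // Partition: spread groups of instances across logical partitions on separate racks.
    new ec2.CfnPlacementGroup(this, 'PartitionPg', { strategy: 'partition' });
    // Spread: place each instance on a distinct rack to reduce correlated failures.
    new ec2.CfnPlacementGroup(this, 'SpreadPg', { strategy: 'spread' });
  }
}

new PlacementGroupStrategiesStack(new cdk.App(), 'PlacementGroupStrategiesStack');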

To provide the lowest latency and the highest packet-per-second network performance for your placement group, we also recommend choosing an EC2 instance type that supports enhanced networking for your EKS cluster worker nodes. Enhanced networking provides higher bandwidth, higher packet-per-second (PPS) performance, and consistently lower inter-instance latencies. Refer to our documentation for more details on enhanced networking.
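If you want to confirm programmatically that a candidate worker node instance type supports enhanced networking (ENA), a quick check such as the following sketch can help. It uses the AWS SDK for JavaScript v3 (it is not part of the demo code) and assumes Node.js with top-level await and the @aws-sdk/client-ec2 package installed.

import { EC2Client, DescribeInstanceTypesCommand } from '@aws-sdk/client-ec2';

// Look up ENA (enhanced networking) support for the instance type used as worker nodes in this post.
const ec2 = new EC2Client({});
const resp = await ec2.send(
  new DescribeInstanceTypesCommand({ InstanceTypes: ['c5.large'] })
);
const enaSupport = resp.InstanceTypes?.[0]?.NetworkInfo?.EnaSupport;
console.log(`c5.large ENA support: ${enaSupport}`); // expected: "required" for current Nitro-based types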

Introduction to EKS managed node groups with custom launch templates

Amazon EKS managed node groups automate the provisioning and lifecycle management of nodes (EC2 instances) for Amazon EKS Kubernetes clusters. With EKS managed node groups, you don’t need to separately provision or register the EC2 instances that provide compute capacity to run your Kubernetes applications. You can create, automatically update, or terminate nodes for your cluster with a single operation. Node updates and terminations automatically and gracefully drain nodes to ensure that your applications stay available. In short, AWS manages the EKS node groups for you instead of you managing them yourself.

In August 2020, Amazon EKS began supporting EC2 launch templates and custom AMIs for managed node groups. This enabled our customers to leverage the simplicity of EKS managed node provisioning and lifecycle management features while adhering to any level of customization, compliance, or security requirements. Because placement groups are a supported launch template feature, they are now an available option for EKS managed node groups.

Solution Overview

In this blog post, we create an Amazon EKS cluster with two managed node groups (one with placement group enabled and the other without). Each node group contains two c5.large instances. The EKS cluster is attached to a newly created VPC. All application workloads run in the VPC’s public subnets for demo purposes; for production workloads, we recommend hosting them in private subnets. When you create a new cluster, Amazon EKS creates an endpoint for the managed Kubernetes API server that you use to communicate with your cluster. For your convenience in this blog, we make the Amazon EKS control plane Kubernetes API server endpoint public so that it’s easier for you to validate the solution in your AWS account. For production workloads, we recommend using a private-only endpoint for your EKS control plane. For more information, see our Best Practices Guide.
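As an illustration of the private-only endpoint recommendation (this snippet is not part of the demo repository, and the stack and construct names are hypothetical), the same @aws-cdk/aws-eks Cluster construct used later in this post accepts an endpointAccess property:

import * as cdk from '@aws-cdk/core';
import * as eks from '@aws-cdk/aws-eks';

// Illustrative stack with a private-only Kubernetes API server endpoint.
class PrivateEndpointEksStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);
    new eks.Cluster(this, 'PrivateEndpointEKS', {
      version: eks.KubernetesVersion.V1_21,
      defaultCapacity: 0,
      // The control plane endpoint is reachable only from within the cluster's VPC.
      endpointAccess: eks.EndpointAccess.PRIVATE,
    });
  }
}

new PrivateEndpointEksStack(new cdk.App(), 'PrivateEndpointEksStack');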

For the performance testing, we create two iperf3 deployments in the two different node groups and test the throughput and latency between the two nodes within each node group. iperf3 is a popular tool for measuring network bandwidth and throughput. The following diagram shows the high-level architecture.

As shown in the preceding diagram, pod01 and pod02 are created under the deployment cluster-placementgroup-enabled and are hosted on the two nodes with placement group enabled, respectively. pod03 and pod04 are created under the deployment cluster-placementgroup-disabled and are hosted on the two nodes (VM3, VM4) with placement group disabled, respectively. This is achieved with the podAntiAffinity rules and nodeSelector described in the “Performance testing” section. The performance tests (iperf3 and ping) run between pod01 and pod02, and between pod03 and pod04, to measure inter-node pod-to-pod throughput and latency.

Walkthrough

Here are the high-level deployment steps:

  1. Clone the code from the GitHub repo.
  2. If you are using the AWS CDK, please continue with steps 3, 4, and 6.
    Terraform users, please continue with steps 5 and 6.
  3. Run the npm commands to compile the code if you are using the AWS CDK.
  4. Run the cdk deploy command to deploy all components, including the AWS resources and Kubernetes workloads.
  5. If you are using Terraform, run terraform init, terraform plan, and terraform apply.
  6. Conduct performance testing.

Prerequisites

To deploy with AWS CDK/Terraform code, you need the following:

  • A good understanding of Amazon EKS and Kubernetes. You also need basic knowledge of Amazon EC2, the AWS CDK, TypeScript, or Terraform.
  • An AWS account with the permissions required to create and manage the EKS cluster and Amazon EC2 resources. All of these resources are created automatically by the AWS CDK/Terraform code.
  • The AWS Command Line Interface (AWS CLI) configured. For information about installing and configuring the AWS CLI, see Installing, updating, and uninstalling the AWS CLI version 2.
  • A current version of Node/Terraform; in this blog post, we use npm version 8.0.0 and Terraform version 1.0.8.
  • The Kubernetes command-line tool, kubectl. For installation and setup instructions, see Install and Set Up kubectl.

AWS CDK Deployment Steps

The AWS CDK Toolkit, the cdk command-line tool, is the primary tool for interacting with your AWS CDK app. The code will create a new VPC with two public subnets and an EKS cluster with two managed node groups: one with placement group enabled and the other without placement group enabled.

  1. For information about installing the cdk command, see AWS CDK Toolkit (cdk command). In the example, we are using AWS CDK 1.25.0 or above.
    $ cd ~
    $ npm install -g aws-cdk
  2. Use the git clone command to clone the repository that contains all the AWS CDK code used in this blog.
    $ git clone https://github.com/aws-samples/eks-manage-node-groups-placement-group.git
  3. Install the required npm modules and then use npm commands to compile the AWS CDK code.
    $ cd eks-manage-node-groups-placement-group 
    $ cd cdk 
    $ npm ci 
    $ npm run build
  4. Use the cdk deploy command to deploy the AWS resources and Kubernetes workloads. It takes approximately 30 minutes to provision the cluster.
    $ cdk deploy 
    Type "y" to agree the question of "Do you wish to deploy these changes"
    << Take around 30 mins>>
    << Once finishes, CDK print outputs to the CLI as below: >>
    
    ✅  PlacementGroupDemoEksStack
    
    Outputs:
    PlacementGroupDemoEksStack.PlacementGroupDemoEKSConfigCommand53CB05BE = aws eks update-kubeconfig --name PlacementGroupDemoEKS22DD6187-a70c9951934944a184be53e652c2ab98 --region <region> --role-arn arn:aws:iam::<AWS-Account>:role/PlacementGroupDemoEksStac-PlacementGroupDemoEKSMas-XXXXXXXXXX
    PlacementGroupDemoEksStack.PlacementGroupDemoEKSGetTokenCommand9E638807 = aws eks get-token --cluster-name PlacementGroupDemoEKS22DD6187-a70c9951934944a184be53e652c2ab98 --region <region> --role-arn arn:aws:iam::<AWS-Account>:role/PlacementGroupDemoEksStac-PlacementGroupDemoEKSMas-XXXXXXXXXX
    
    Stack ARN:
    arn:aws:cloudformation:<region>:<AWS-Account>:stack/PlacementGroupDemoEksStack/84f25bb0-2334-11ec-8e60-XXXXXXXXXX
  5. To set up kubectl access to the cluster, copy and run the PlacementGroupDemoEKSConfigCommand53CB05BE output from step 4:
    $ aws eks update-kubeconfig --name PlacementGroupDemoEKS22DD6187-a70c9951934944a184be53e652c2ab98 --region <region> --role-arn arn:aws:iam::<AWS-Account>:role/PlacementGroupDemoEksStac-PlacementGroupDemoEKSMas-XXXXXXXXXX
     
Now that the deployment is complete, you can log on to the AWS Management Console to inspect the EKS cluster and run kubectl commands to check the pod status.

AWS CDK code deep dive

  • In lib/cdk-stack.ts, an EKS cluster is created using the @aws-cdk/aws-eks library. It also creates a VPC with two public subnets and private subnets by default. For this demo, all worker nodes are placed in the public subnets.
    // Add EKS cluster
    const cluster = new eks.Cluster(this, 'PlacementGroupDemoEKS', {
      version: eks.KubernetesVersion.V1_21,
      defaultCapacity: 0,
    }); 
  • Following the EKS cluster creation, two managed EKS worker node groups are provisioned. One node group will utilize launch template support for Amazon EKS to place worker nodes in a placement group with strategy set to cluster.
    const pg = new ec2.CfnPlacementGroup(this, 'PlacementGroup', {
      strategy: 'cluster',
    });
    
    const lt = new ec2.LaunchTemplate(this, 'PlacementGroupLaunchTemplate');
    const cfnLt = lt.node.defaultChild as ec2.CfnLaunchTemplate;
    cfnLt.addOverride('Properties.LaunchTemplateData.Placement.GroupName', pg.ref);

    cluster.addNodegroupCapacity('NgTruePlacementGroup', {
      ...sharedNgConfig,
      labels: {
        placementGroup: 'true',
      },
      launchTemplateSpec: {
        id: lt.launchTemplateId!,
        version: lt.latestVersionNumber,
      },
    }); 
  • Both node groups apply a Kubernetes node label, placementGroup, with a value of either true or false to identify whether the node is placed in the cluster placement group. This label is later used to schedule the performance-testing pods onto the relevant nodes.
    cluster.addNodegroupCapacity('NgTruePlacementGroup', {
      ...sharedNgConfig,
      labels: {
        placementGroup: 'true',
      },
      launchTemplateSpec: {
        id: lt.launchTemplateId!,
        version: lt.latestVersionNumber,
      },
    });

    /**
     * Add another node group with same configurations except not in a placement group but in same AZ
     */
    cluster.addNodegroupCapacity('NgFalsePlacementGroup', {
      ...sharedNgConfig,
      labels: {
        placementGroup: 'false',
      },
    }); 

Terraform deployment steps

The Terraform code will create a new VPC with two public subnets and an EKS cluster with two managed node groups, one with placement group enabled and the other without placement group enabled. All the nodes in the same node group will stay in the same Availability Zone for performance testing purposes.

  1. For information about installing the terraform command, see Terraform. In the example, we are using Terraform v1.0.8. Follow the documentation to download the Terraform CLI. After installing it, verify the version as shown below:
     $ terraform -v 
    Terraform v1.0.8
  2. Use the git clone command to clone the repo that contains all the Terraform code used in this blog.
    $ cd ~ 
    $ git clone https://github.com/aws-samples/eks-manage-node-groups-placement-group.git
    
  3. Run terraform init to install the required Terraform providers. In this blog post example, the Terraform state file terraform.tfstate is stored locally.
    $ cd eks-manage-node-groups-placement-group 
    $ cd terraform 
    $ terraform init
    Initializing the backend... 
    - Initializing provider plugins... 
    - Installing hashicorp/kubernetes v2.5.0... 
    - Installed hashicorp/kubernetes v2.5.0 (signed by HashiCorp) 
    - Installing hashicorp/aws v3.60.0... 
    - Installed hashicorp/aws v3.60.0 (signed by HashiCorp) 
    - Installing hashicorp/random v3.1.0...
    ... 
    - Installed hashicorp/local v2.1.0 (signed by HashiCorp) 
    - Installing hashicorp/null v3.1.0... 
    - Installed hashicorp/null v3.1.0 (signed by HashiCorp) 
    Partner and community providers are signed by their developers. 
    If you'd like to know more about provider signing, you can read about it here:
    https://www.terraform.io/docs/cli/plugins/signing.html
    Terraform has been successfully initialized!
  4. Use the terraform apply command and enter the Region name to deploy the AWS resources and Kubernetes workloads. It takes approximately 30 minutes to provision the cluster.
    $ terraform apply 
    provider.aws.region
      The region where AWS operations will take place. Examples
      are us-east-1, us-west-2, etc.
    
      Enter a value: region-name
      ...
    
    Do you want to perform these actions?
      Terraform will perform the actions described above.
      Only 'yes' will be accepted to approve.
    
      Enter a value: yes
  5. After it has been successfully deployed, run the following command to get kubectl access to the EKS cluster.
    aws eks update-kubeconfig --name <cluster-name> --region <region-name>

Terraform code deep dive

  • In launch_template.tf, the placement group is configured.
  • In the placement block of the aws_launch_template resource, we set availability_zone to a single Availability Zone and set group_name to the aws_placement_group created with the cluster strategy.
    resource "aws_placement_group" "eks" {
      name     = "eks-placement-group"
      strategy = "cluster"
      tags = {
        placementGroup = "true",
        applicationType = "eks"
      }
    }
    
    resource "aws_launch_template" "default" {
      name_prefix            = "eks-example-placementgroup-"
      description            = "Default Launch-Template for Placement Group"
      update_default_version = true
    
      ...
    
      placement {
        availability_zone = data.aws_availability_zones.available.names[0]
        group_name = aws_placement_group.eks.name
      }
    
      ...
      
    }
  • The EKS managed node group defined in eks-cluster.tf uses the launch template created in launch_template.tf.

We create two managed node groups in this example. One uses the launch template created in launch_template.tf, and the other uses no launch template but is still pinned to a single Availability Zone. In addition, the labels placementGroup="true" and placementGroup="false" are passed to the respective node groups and propagated to the Kubernetes nodes. These labels allow us to schedule the Kubernetes deployments onto the two different node groups.

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = local.cluster_name
  cluster_version = "1.21"
  subnets         = module.vpc.public_subnets

  ...

  node_groups = {
    placementgroup01 = {
      name_prefix = "placementgroup"
      desired_capacity = 2
      max_capacity     = 2
      min_capacity     = 2

      launch_template_id      = aws_launch_template.default.id
      launch_template_version = aws_launch_template.default.default_version

      // This is to get the subnet ID from the subnet ARN, as data.aws_subnet does not have a subnet ID attribute.
      subnets = [split("/", data.aws_subnet.selected.arn)[1]]
      
      instance_types = ["c5.large"]
            
      k8s_labels = {
        placementGroup = "true"
      }

      additional_tags = {
        placementgroup = "true"
      }
    },
    nonplacementgroup02 = {
      name_prefix = "non-placementgroup"
      desired_capacity = 2
      max_capacity     = 5
      min_capacity     = 2

      instance_types = ["c5.large"]
      // This is to get the subnet ID from the subnet ARN, as data.aws_subnet does not have a subnet ID attribute.
      subnets = [split("/", data.aws_subnet.selected.arn)[1]]
      
      k8s_labels = {
        placementGroup = "false"
      }

      additional_tags = {
        placementGroup = "false"
      }
    }
  }
}

Performance testing

The sample code mentioned previously creates a new VPC and deploys an EKS cluster with two node groups. Each node group contains two c5.large instances; one node group has a cluster placement group enabled and the other does not. iperf3, a popular network bandwidth and performance-testing tool, is deployed into both node groups to evaluate network performance. We also use the ping command to test round-trip latency in the two scenarios. The following steps walk through the performance testing process.

Install iperf3 and ping performance testing tools

The following command deploys the iperf3 and ping tools onto the node group with placement group enabled and onto the node group without placement group enabled, which is in the same Availability Zone.

$ kubectl apply -f ~/eks-manage-node-groups-placement-group/yaml/deployment.yaml
deployment.apps/cluster-placementgroup-enabled created
deployment.apps/cluster-placementgroup-disabled created

Review the YAML manifest

Let’s take a look at the cluster-placementgroup-enabled deployment. Similar settings are applied to the cluster-placementgroup-disabled deployment as well.

  • nodeSelector – the nodeSelector helps deploy the two replicas onto the nodes with the label placementGroup=true, which are the nodes with placement group enabled.
  • podAntiAffinity – the podAntiAffinity spec ensures that the two replicas are not scheduled on the same node.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cluster-placementgroup-enabled
      labels:
        app: cluster-placementgroup-enabled
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: cluster-placementgroup-enabled
      template:
        metadata:
          labels:
            app: cluster-placementgroup-enabled
        spec:
          nodeSelector:
            placementGroup: "true"
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                    - cluster-placementgroup-enabled
                topologyKey: "kubernetes.io/hostname"
          containers:
          - name: iperf
            image: networkstatic/iperf3
            args: ['-s']
            ports:
            - containerPort: 5201
              name: server
          terminationGracePeriodSeconds: 0

With the preceding specs in place, the deployment pods are scheduled onto the right nodes for our performance testing.

Check the EKS Nodes

In this example, we can see the nodes with placement group enabled and the nodes without placement group enabled. They have true and false respectively as the value for the placementGroup label.

$ kubectl get nodes -l placementGroup=true
NAME                                            STATUS   ROLES    AGE   VERSION
ip-10-0-40-89.ap-southeast-2.compute.internal   Ready    <none>   15m   v1.21.2-eks-55daa9d
ip-10-0-7-22.ap-southeast-2.compute.internal    Ready    <none>   14m   v1.21.2-eks-55daa9d

$ kubectl get nodes -l placementGroup=false
NAME                                             STATUS   ROLES    AGE   VERSION
ip-10-0-42-228.ap-southeast-2.compute.internal   Ready    <none>   15m   v1.21.2-eks-55daa9d
ip-10-0-57-161.ap-southeast-2.compute.internal   Ready    <none>   15m   v1.21.2-eks-55daa9d
  • The EC2 nodes with placement group enabled (Image: console view of EC2 nodes with placement group enabled)
  • The EC2 nodes WITHOUT placement group enabled (Image: console view of EC2 nodes without placement group enabled)

Return pods running in the default namespace

We can see the deployment cluster-placementgroup-enabled is installed in the placement group enabled nodes, and cluster-placementgroup-disabled is installed in the placement group disabled nodes.

$ kubectl get pods -o wide
NAME                                              READY   STATUS    RESTARTS   AGE     IP           NODE                                            NOMINATED NODE   READINESS GATES
cluster-placementgroup-enabled-868c59f745-qknl5   1/1     Running   0          3m37s   10.0.15.0    ip-10-0-7-22.ap-southeast-2.compute.internal    <none>           <none>
cluster-placementgroup-enabled-868c59f745-t8rct   1/1     Running   0          3m37s   10.0.37.71   ip-10-0-40-89.ap-southeast-2.compute.internal   <none>           <none>
cluster-placementgroup-disabled-94c99786d-699fz   1/1     Running   0          3m41s   10.0.22.48   ip-10-0-42-228.ap-southeast-2.compute.internal   <none>           <none>
cluster-placementgroup-disabled-94c99786d-d2kz9   1/1     Running   0          3m41s   10.0.1.149   ip-10-0-57-161.ap-southeast-2.compute.internal   <none>           <none>

Run performance test from one pod to the other

In this exercise, we use kubectl exec to get a shell to one of the pods to perform the simple test.

$ kubectl exec -i -t <pod-name> -- bash -c "iperf3 -c <destination-pod-ip-address>"
$ kubectl exec -i -t <pod-name> -- bash -c "ping -c 30 <destination-pod-ip-address>"
  • Result with placement group configured
    $ kubectl exec -i -t cluster-placementgroup-enabled-868c59f745-qknl5 -- bash -c "iperf3 -c 10.0.37.71"
    Connecting to host 10.0.37.71, port 5201
    [  5] local 10.0.15.0 port 46464 connected to 10.0.37.71 port 5201
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec  1.11 GBytes  9.55 Gbits/sec  107   1.66 MBytes       
    [  5]   1.00-2.00   sec  1.11 GBytes  9.52 Gbits/sec  230   1.19 MBytes       
    [  5]   2.00-3.00   sec  1.11 GBytes  9.50 Gbits/sec   70   1.17 MBytes       
    [  5]   3.00-4.00   sec  1.10 GBytes  9.46 Gbits/sec  386   1.20 MBytes       
    [  5]   4.00-5.00   sec  1.10 GBytes  9.47 Gbits/sec  111   1.64 MBytes       
    [  5]   5.00-6.00   sec  1.10 GBytes  9.48 Gbits/sec  353   1.24 MBytes       
    [  5]   6.00-7.00   sec  1.11 GBytes  9.53 Gbits/sec   76   1.66 MBytes       
    [  5]   7.00-8.00   sec  1.11 GBytes  9.52 Gbits/sec  413   1.15 MBytes       
    [  5]   8.00-9.00   sec  1.11 GBytes  9.53 Gbits/sec  176   1.53 MBytes       
    [  5]   9.00-10.00  sec  1.10 GBytes  9.49 Gbits/sec  133   1.33 MBytes       
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Retr
    [  5]   0.00-10.00  sec  11.1 GBytes  9.50 Gbits/sec  2055             sender
    [  5]   0.00-10.00  sec  11.1 GBytes  9.50 Gbits/sec                  receiver
    iperf Done.
    
    $ kubectl exec -i -t cluster-placementgroup-enabled-868c59f745-qknl5 -- bash -c "ping -c 30 10.0.37.71"
    PING 10.0.37.71 (10.0.37.71) 56(84) bytes of data.
    64 bytes from 10.0.37.71: icmp_seq=1 ttl=253 time=0.213 ms
    64 bytes from 10.0.37.71: icmp_seq=2 ttl=253 time=0.124 ms
    64 bytes from 10.0.37.71: icmp_seq=3 ttl=253 time=0.151 ms
    64 bytes from 10.0.37.71: icmp_seq=4 ttl=253 time=0.144 ms
    64 bytes from 10.0.37.71: icmp_seq=5 ttl=253 time=0.182 ms
    ...
    64 bytes from 10.0.37.71: icmp_seq=26 ttl=253 time=0.163 ms
    64 bytes from 10.0.37.71: icmp_seq=27 ttl=253 time=0.150 ms
    64 bytes from 10.0.37.71: icmp_seq=28 ttl=253 time=0.164 ms
    64 bytes from 10.0.37.71: icmp_seq=29 ttl=253 time=0.155 ms
    64 bytes from 10.0.37.71: icmp_seq=30 ttl=253 time=0.152 ms
    --- 10.0.37.71 ping statistics ---
    30 packets transmitted, 30 received, 0% packet loss, time 716ms
    rtt min/avg/max/mdev = 0.118/0.155/0.213/0.022 ms
  • Result without placement group but in the same Availability Zone
    $ kubectl exec -i -t cluster-placementgroup-disabled-94c99786d-699fz -- bash -c "iperf3 -c 10.0.1.149"
    Connecting to host 10.0.1.149, port 5201
    [  5] local 10.0.22.48 port 34756 connected to 10.0.1.149 port 5201
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec   584 MBytes  4.90 Gbits/sec   94   1.48 MBytes       
    [  5]   1.00-2.00   sec   581 MBytes  4.88 Gbits/sec   24   1.31 MBytes       
    [  5]   2.00-3.00   sec   591 MBytes  4.96 Gbits/sec    7   1.75 MBytes       
    [  5]   3.00-4.00   sec   590 MBytes  4.95 Gbits/sec    4   2.52 MBytes       
    [  5]   4.00-5.00   sec   592 MBytes  4.97 Gbits/sec    6   2.24 MBytes       
    [  5]   5.00-6.00   sec   590 MBytes  4.95 Gbits/sec   19   1.78 MBytes       
    [  5]   6.00-7.00   sec   591 MBytes  4.96 Gbits/sec    0   2.91 MBytes       
    [  5]   7.00-8.00   sec   591 MBytes  4.96 Gbits/sec   73   1.90 MBytes       
    [  5]   8.00-9.00   sec   591 MBytes  4.96 Gbits/sec    7   2.13 MBytes       
    [  5]   9.00-10.00  sec   591 MBytes  4.96 Gbits/sec   41   1.83 MBytes       
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Retr
    [  5]   0.00-10.00  sec  5.76 GBytes  4.94 Gbits/sec  275             sender
    [  5]   0.00-10.00  sec  5.75 GBytes  4.94 Gbits/sec                  receiver
    iperf Done.
    
    $ kubectl exec -i -t cluster-placementgroup-disabled-94c99786d-699fz -- bash -c "ping -c 30 10.0.1.149"
    PING 10.0.1.149 (10.0.1.149) 56(84) bytes of data.
    64 bytes from 10.0.1.149: icmp_seq=1 ttl=253 time=0.603 ms
    64 bytes from 10.0.1.149: icmp_seq=2 ttl=253 time=0.440 ms
    64 bytes from 10.0.1.149: icmp_seq=3 ttl=253 time=0.465 ms
    64 bytes from 10.0.1.149: icmp_seq=4 ttl=253 time=0.442 ms
    64 bytes from 10.0.1.149: icmp_seq=5 ttl=253 time=0.441 ms
    ...
    64 bytes from 10.0.1.149: icmp_seq=26 ttl=253 time=0.459 ms
    64 bytes from 10.0.1.149: icmp_seq=27 ttl=253 time=0.466 ms
    64 bytes from 10.0.1.149: icmp_seq=28 ttl=253 time=0.509 ms
    64 bytes from 10.0.1.149: icmp_seq=29 ttl=253 time=0.454 ms
    64 bytes from 10.0.1.149: icmp_seq=30 ttl=253 time=0.497 ms
    --- 10.0.1.149 ping statistics ---
    30 packets transmitted, 30 received, 0% packet loss, time 731ms
    rtt min/avg/max/mdev = 0.418/0.457/0.603/0.039 ms

Performance testing summary

From the sample performance test results shown above, we can see that the inter-node pod-to-pod throughput with the placement group is approximately double that without the placement group (9.50 Gbits/sec vs. 4.94 Gbits/sec), and the average latency is around 66 percent lower (0.155 ms vs. 0.457 ms). This demonstrates better performance in both throughput and latency with a cluster placement group.
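As a quick back-of-the-envelope check of the figures quoted above (a sketch, not part of the original test harness):

// Rough arithmetic behind the summary numbers above.
const throughputRatio = 9.50 / 4.94;          // ≈ 1.92, i.e. roughly double the throughput
const latencyReduction = 1 - 0.155 / 0.457;   // ≈ 0.66, i.e. about 66 percent lower latency
console.log(throughputRatio.toFixed(2), `${(latencyReduction * 100).toFixed(0)}%`);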

  • Note:
    • There is a chance that inter-node pod-to-pod traffic without placement group enabled can achieve the same level of performance as with placement group enabled, because the two underlying EC2 nodes may happen to sit on nearby racks in the same Availability Zone. However, this cannot be guaranteed. To achieve consistently high inter-node pod-to-pod performance, we recommend enabling a placement group for the underlying nodes in the Kubernetes cluster.
    • For latency, during our tests we found it was around 24 to 66 percent lower with placement group enabled vs. placement group disabled.

Cleaning Up

To avoid ongoing charges to your account, run the following commands to clean up resources. The cleanup process takes approximately 30 minutes.

AWS CDK

$ cd ~/eks-manage-node-groups-placement-group/cdk
$ cdk destroy
Type "y" to agree the question of "Are you sure you want to delete: PlacementGroupDemoEksStack"
<< Take around 30 mins>>

Terraform

$ cd ~/eks-manage-node-groups-placement-group/terraform
$ terraform destroy
var.region
  Enter a value: <region-name>
random_string.suffix: Refreshing state... [id=9rl5]
module.vpc.aws_vpc.this[0]: Refreshing state... [id=vpc-077feb5d9f65c3439]
aws_placement_group.eks: Refreshing state... [id=eks-placement-group]
module.eks.aws_iam_policy.cluster_elb_sl_role_creation[0]: Refreshing state... [id=arn:aws:iam::030977880123:policy/PlacementGroupDemoEKS-9rl5-elb-sl-role-creation20211001114109805100000001]
...
Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.
  Enter a value: yes
...

Conclusion

Amazon EKS managed node groups support a variety of customized instance configurations now that launch templates are officially supported, including the placement groups introduced in this blog post. With the benefits of lower latency and higher throughput, our customers can leverage the placement group option for their low-latency container applications running on EKS managed node groups.

Haofei Feng

Haofei is a Senior Cloud Architect at AWS with 16+ years of experience in Containers, DevOps, and IT Infrastructure. He enjoys helping customers with their cloud journey and is keen to assist them in designing and building scalable, secure, and optimized container workloads on AWS. In his spare time, he spends time with his family and his lovely Border Collies. Haofei is based in Sydney, Australia.

Xin Chen

Xin Chen is a Cloud Architect at AWS, focusing on Containers and Serverless Platform. He engages with customers to create innovative solutions that address customer business problems and to accelerate the adoption of AWS services. In his spare time, Xin enjoys spending time with his family, reading books, and watching movies.