
Deploying and scaling Apache Kafka on Amazon EKS

Introduction

Apache Kafka, a distributed streaming platform, has become a popular choice for building real-time data pipelines, streaming applications, and event-driven architectures. It is horizontally scalable, fault-tolerant, and performant. However, managing and scaling Kafka clusters can be challenging and often time-consuming. This is where Kubernetes, an open-source platform for automating deployment, scaling, and management of containerized applications, comes into play.

Kubernetes provides a robust platform for running Kafka. It offers features like rolling updates, service discovery, and load balancing which are beneficial for running stateful applications like Kafka. Moreover, Kubernetes’ ability to automatically heal applications ensures that Kafka brokers remain operational and highly available.

Scaling Kafka on Kubernetes involves increasing or decreasing the number of Kafka brokers or changing the resources (CPU, memory, and disk) available to them. Kubernetes provides mechanisms for auto-scaling applications based on resource utilization and workload, which can be used to keep resource allocation optimal at all times. This lets you respond to variable workloads while optimizing cost by elastically scaling up and down as needed to meet performance demands.

You can install Kafka on a Kubernetes cluster using Helm or a Kubernetes Operator. Helm is a package manager for Kubernetes applications. It allows you to define, install, and manage complex Kubernetes applications using pre-configured packages called charts. A Kubernetes Operator is a custom controller that extends the Kubernetes API to manage and automate the deployment and management of complex, stateful applications. Operators are typically used for applications that require domain-specific knowledge or complex lifecycle management beyond what is provided by native Kubernetes resources.
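
For example, if you wanted to install the Strimzi operator yourself with Helm rather than through the blueprint used in this post, the install might look like the following (a minimal sketch using the public Strimzi chart defaults; the blueprint handles this step for you):

helm repo add strimzi https://strimzi.io/charts/
helm repo update
helm install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
  --namespace strimzi-kafka-operator --create-namespace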

In this post, we demonstrate how to build and deploy a Kafka cluster on Amazon Elastic Kubernetes Service (Amazon EKS) using Data on EKS (DoEKS). DoEKS is an open-source project that empowers customers to expedite their data platform development on Kubernetes. It provides templates that incorporate best practices around security, performance, and cost optimization, among others, which can be used as is or fine-tuned to meet more unique requirements.

DoEKS makes it easy to start using these templates through the DoEKS blueprints project. The project provides blueprints and practical examples tailored to various data frameworks, written as infrastructure as code in Terraform or the AWS Cloud Development Kit (AWS CDK), so you can stand up infrastructure built on best practices and performance benchmarks in minutes without writing any infrastructure as code yourself.

For deploying the Kafka cluster, we use the Strimzi Operator. Strimzi simplifies running Apache Kafka in a Kubernetes cluster by providing container images and operators for Kafka. The operators simplify deploying and running Kafka clusters, configuration and secure access, upgrading and managing Kafka, and even managing topics and users within Kafka itself.
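
To give a sense of what the operator manages, the following is a minimal sketch of a Strimzi Kafka custom resource; the listener, storage, and configuration values here are illustrative assumptions rather than the exact manifest the blueprint deploys:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: cluster
  namespace: kafka
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    config:
      default.replication.factor: 3
      min.insync.replicas: 2
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 20Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
  cruiseControl: {}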

Solution overview

The following detailed diagram shows the design we will implement in this post.

Diagram illustrating the architecture of Apache Kafka running on an Amazon EKS cluster, showing the interaction between various components and services.

In this post, you will learn how to provision the resources depicted in the diagram using Terraform modules that leverage DoEKS blueprints. These resources are:

  • A sample Virtual Private Cloud (VPC) with three private subnets and three public subnets.
  • An internet gateway for the public subnets, a Network Address Translation (NAT) gateway for the private subnets, and VPC endpoints for Amazon Elastic Container Registry (Amazon ECR), Amazon Elastic Compute Cloud (Amazon EC2), AWS Security Token Service (AWS STS), and other services.
  • An Amazon EKS cluster control plane with a public endpoint (for demonstration purposes only) and two managed node groups.
  • Amazon EKS managed add-ons: VPC CNI, CoreDNS, kube-proxy, and the Amazon EBS CSI driver.
  • Metrics Server, Cluster Autoscaler, Cluster Proportional Autoscaler for CoreDNS, and kube-prometheus-stack with a local Prometheus instance and, optionally, an Amazon Managed Service for Prometheus workspace.
  • The Strimzi Operator for Apache Kafka, deployed to the strimzi-kafka-operator namespace. By default, the operator watches and handles Kafka resources in all namespaces (a quick verification command follows this list).
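
After the deployment in the walkthrough below completes, you can confirm the operator is running with a command like the following (the namespace is the one listed above):

kubectl get pods -n strimzi-kafka-operator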

Walkthrough

Deploy infrastructure

Clone the code repository to your local machine.

git clone https://github.com/awslabs/data-on-eks.git
cd data-on-eks/streaming/kafka
export AWS_REGION="us-west-2" # Select your own region

You need to set up your AWS credentials profile locally before running Terraform or AWS CLI commands. Run the following commands to deploy the blueprint. The deployment takes approximately 20-30 minutes.
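
Before running the deployment, you can optionally confirm that your credentials are configured correctly with a standard AWS CLI call (this check is not specific to the blueprint):

aws sts get-caller-identity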

terraform init
terraform apply -var region=$AWS_REGION --auto-approve

Verify the deployment

List Amazon EKS nodes

The following command will update the kubeconfig on your local machine and allow you to interact with your Amazon EKS Cluster using kubectl to validate the deployment.

aws eks --region $AWS_REGION update-kubeconfig --name kafka-on-eks

The Kafka cluster is deployed in a multi-Availability Zone (AZ) configuration for high availability. This offers stronger fault tolerance by distributing brokers across multiple AZs within an AWS Region, so that the failure of a single AZ does not cause Kafka downtime. To achieve high availability, a minimum of three brokers is required.
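
If you want to see which AZ each node landed in, you can display the standard topology.kubernetes.io/zone label as an extra column (an optional check, not part of the original walkthrough):

kubectl get nodes -L topology.kubernetes.io/zone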

Check that the deployment has created six nodes: three for the core node group and three for the Kafka brokers, spread across three AZs.

kubectl get nodes -l 'NodeGroupType=core'
# Output should look like below
NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-1-0-100.us-west-2.compute.internal   Ready    <none>   62m   v1.27.7-eks-e71965b
ip-10-1-1-163.us-west-2.compute.internal   Ready    <none>   27m   v1.27.7-eks-e71965b
ip-10-1-2-209.us-west-2.compute.internal   Ready    <none>   63m   v1.27.7-eks-e71965b

kubectl get nodes -l 'NodeGroupType=kafka'
# Output should look like below
NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-1-0-97.us-west-2.compute.internal    Ready    <none>   35m   v1.27.7-eks-e71965b
ip-10-1-1-195.us-west-2.compute.internal   Ready    <none>   25m   v1.27.7-eks-e71965b
ip-10-1-2-35.us-west-2.compute.internal    Ready    <none>   92m   v1.27.7-eks-e71965b

Verify Kafka Brokers and Zookeeper

Verify the Kafka broker and Zookeeper pods created by the Strimzi Operator, as well as the desired number of Kafka replicas. Because Strimzi uses custom resources to extend the Kubernetes API, you can also use kubectl to work with the Kafka resources it manages.

kubectl get strimzipodsets -n kafka
# Output should look like below
NAME                PODS   READY PODS   CURRENT PODS   AGE
cluster-kafka       3      3            3              47h
cluster-zookeeper   3      3            3              47h

kubectl get kafka -n kafka
# Output should look like below
NAME     DESIRED KAFKA REPLICAS DESIRED ZK REPLICAS READY WARNINGS
cluster  3                      3                   True  

kubectl get all -n kafka
# Output should look like below
NAME                                          READY  STATUS    
pod/cluster-cruise-control-7bffd9b4bf-np7vp   1/1    Running   
pod/cluster-entity-operator-5676fdf75c-h2qzb  3/3    Running   
pod/cluster-kafka-0                           1/1    Running   
pod/cluster-kafka-1                           1/1    Running   
pod/cluster-kafka-2                           1/1    Running   
pod/cluster-kafka-exporter-7bcc56654f-gxjbw   1/1    Running   
pod/cluster-zookeeper-0                       1/1    Running   
pod/cluster-zookeeper-1                       1/1    Running   
pod/cluster-zookeeper-2                       1/1    Running   

NAME                              TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)                              
service/cluster-cruise-control    ClusterIP  172.20.126.223  <none>      9090/TCP                             
service/cluster-kafka-bootstrap   ClusterIP  172.20.188.47   <none>    9091/TCP,9092/TCP,9093/TCP           
service/cluster-kafka-brokers     ClusterIP  None            <none>      9090/TCP,9091/TCP,9092/TCP,9093/TCP  
service/cluster-zookeeper-client  ClusterIP  172.20.117.150  <none>       2181/TCP                             
service/cluster-zookeeper-nodes   ClusterIP  None            <none>       2181/TCP,2888/TCP,3888/TCP           

NAME                                     READY UP-TO-DATE  AVAILABLE  
deployment.apps/cluster-cruise-control   1/1   1           1          
deployment.apps/cluster-entity-operator  1/1   1           1          
deployment.apps/cluster-kafka-exporter   1/1   1           1          

NAME                                                DESIRED  CURRENT  READY  
replicaset.apps/cluster-cruise-control-7bffd9b4bf   1        1        1      
replicaset.apps/cluster-entity-operator-5676fdf75c  1        1        1      
replicaset.apps/cluster-kafka-exporter-7bcc56654f   1        1        1 

Create Kafka topics and run sample test

In Apache Kafka, a topic is a category used to organize messages. Each topic has a unique name across the entire Kafka cluster. Messages are sent to and read from specific topics. Topics are partitioned and replicated across brokers in our implementation.

The producers and consumers play a crucial role in the handling of data. Producers are source systems that write data to Kafka topics; they send streams of messages to the Kafka brokers. Consumers are target systems that read data from Kafka brokers; they can subscribe to one or more topics in the Kafka cluster and process the data.

In Kafka, producers and consumers are fully decoupled and agnostic of each other, which is a key design element to achieve high scalability.

You will create a Kafka topic named my-topic and run a Kafka producer that publishes messages to it. A Kafka Streams application will consume messages from my-topic and send them to another topic, my-topic-reversed. A Kafka consumer will then read messages from the my-topic-reversed topic.

To make it easy to interact with the Kafka cluster, create a pod that contains the Kafka CLI tools.

kubectl -n kafka run --restart=Never --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 kafka-cli -- /bin/sh -c "exec tail -f /dev/null"

Create the Kafka topics

kubectl apply -f examples/kafka-topics.yaml
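
The applied manifest defines the topics as Strimzi KafkaTopic custom resources. As a point of reference, a KafkaTopic matching the my-topic settings shown later in this post (12 partitions, replication factor 3, min ISR 2) might look like the following minimal sketch; this is an assumption about the manifest's shape, not a copy of the repository file:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec:
  partitions: 12
  replicas: 3
  config:
    min.insync.replicas: 2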

Verify the status of the Kafka topics:

kubectl -n kafka get KafkaTopic
# Output should look like below
NAME                                                                                              CLUSTER  PARTITIONS  REPLICATION FACTOR  READY
consumer-offsets---84e7a678d08f4bd226872e5cdd4eb527fadc1c6a                                        cluster   50           3                    True
my-topic                                                                                           cluster   12           3                    True
my-topic-reversed                                                                                  cluster   12           3                    True
strimzi-store-topic---effb8e3e057afce1ecf67c3f5d8e4e3ff177fc55                                     cluster   1            3                    True
strimzi-topic-operator-kstreams-topic-store-changelog---b75e702040b99be8a9263134de3507fc0cc4017b   cluster   1            3                    True
strimzi.cruisecontrol.metrics                                                                      cluster   1            3                    True
strimzi.cruisecontrol.modeltrainingsamples                                                         cluster   32           2                    True
strimzi.cruisecontrol.partitionmetricsamples                                                       cluster   32           2                    True

The cluster has a minimum of three brokers, each in a different AZ, with topics using a replication factor of three and a minimum in-sync replica (ISR) count of two. This allows a single broker to be down without affecting producers that use acks=all. Fewer brokers would compromise either availability or durability.

Verify that the topic partitions are replicated across all three brokers.

kubectl -n kafka exec -it kafka-cli -- bin/kafka-topics.sh \
--describe \
--topic my-topic \
--bootstrap-server cluster-kafka-bootstrap:9092
# Output should look like below
Topic: my-topic TopicId: c9kRph4QSHOUaNkW6A9ctw PartitionCount: 12      ReplicationFactor: 3    Configs: min.insync.replicas=2,message.format.version=3.0-IV1
Topic: my-topic Partition: 0    Leader: 1      Replicas: 1,0,2 Isr: 1,0,2
Topic: my-topic Partition: 1    Leader: 0      Replicas: 0,2,1 Isr: 0,2,1
Topic: my-topic Partition: 2    Leader: 2      Replicas: 2,0,1 Isr: 2,0,1
Topic: my-topic Partition: 3    Leader: 1      Replicas: 1,0,2 Isr: 1,0,2
Topic: my-topic Partition: 4    Leader: 0      Replicas: 0,2,1 Isr: 0,2,1
Topic: my-topic Partition: 5    Leader: 2      Replicas: 2,0,1 Isr: 2,0,1
Topic: my-topic Partition: 6    Leader: 1      Replicas: 1,0,2 Isr: 1,0,2
Topic: my-topic Partition: 7    Leader: 0      Replicas: 0,2,1 Isr: 0,2,1
Topic: my-topic Partition: 8    Leader: 2      Replicas: 2,0,1 Isr: 2,0,1
Topic: my-topic Partition: 9    Leader: 1      Replicas: 1,0,2 Isr: 1,0,2
Topic: my-topic Partition: 10   Leader: 0      Replicas: 0,2,1 Isr: 0,2,1
Topic: my-topic Partition: 11   Leader: 2      Replicas: 2,0,1 Isr: 2,0,1

The Kafka topic my-topic has 12 partitions, each with its own leader and set of replicas. All replicas of a partition are placed on separate brokers for better scalability and fault tolerance. The replication factor of three ensures that each partition has three replicas, so data remains available even if one or two brokers fail. The same applies to the my-topic-reversed topic.

Reading and writing messages to the cluster

Deploy a simple Kafka producer, Kafka Streams application, and Kafka consumer.

kubectl apply -f examples/kafka-producers-consumers.yaml

kubectl -n kafka get pod -l 'app in (java-kafka-producer,java-kafka-streams,java-kafka-consumer)'
# Output should look like below
NAME                                       READY   STATUS    RESTARTS       AGE
java-kafka-consumer-fb9fdd694-2dxh5        1/1     Running   0              4s
java-kafka-producer-6459bd5c47-69rjx       1/1     Running   0              5s
java-kafka-streams-7967f9c87f-lsckv        1/1     Running   0              5s

Verify the messages moving through your Kafka brokers. Split your terminal into three panes and run one of the following commands in each.

kubectl -n kafka logs \
$(kubectl -n kafka get pod -l app=java-kafka-producer -o jsonpath='{.items[*].metadata.name}')

kubectl -n kafka logs \
$(kubectl -n kafka get pod -l app=java-kafka-streams -o jsonpath='{.items[*].metadata.name}')

kubectl -n kafka logs \
$(kubectl -n kafka get pod -l app=java-kafka-consumer -o jsonpath='{.items[*].metadata.name}')

Displaying the output of a Kafka producer, Kafka streams, and Kafka consumer interacting for given Kafka topic my-topic

Scaling your Kafka cluster

Scaling Kafka brokers is essential for a few key reasons. Firstly, scaling up the cluster by adding more brokers allows for the distribution of data and replication across multiple nodes. This reduces the impact of failures and ensures the system’s resilience. Secondly, with more broker nodes in the cluster, Kafka can distribute the workload and balance the data processing across multiple nodes, which leads to reduced latency. This means that data can be processed faster, improving the overall performance of the system.

Furthermore, Kafka is designed to be scalable. As your needs change, you can scale out by adding brokers to a cluster or scale in by removing brokers. In either case, data should be redistributed across the brokers to maintain balance. This flexibility allows you to adapt your system to meet changing demands and ensure optimal performance.

To scale the Kafka cluster, modify the Kafka.spec.kafka.replicas configuration to add or remove brokers.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: cluster
  namespace: kafka
spec:
  kafka:
    ...
    replicas: 4
    ...
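
One way to apply this change without editing files is a kubectl patch against the Kafka custom resource (a sketch assuming the resource is named cluster in the kafka namespace, as above; editing and re-applying the manifest works equally well):

kubectl -n kafka patch kafka cluster --type merge \
  -p '{"spec":{"kafka":{"replicas":4}}}'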

The number of Zookeeper replicas can stay the same. One of the main purposes of Zookeeper in distributed systems is leader election in case of a broker failure, and it needs a strict majority when the vote happens. It is advisable to use at least three Zookeeper servers in a quorum for small production Kafka clusters, and five or seven Zookeeper servers for medium to very large Kafka clusters.

You now have four Kafka brokers.

kubectl -n kafka get pod -l app.kubernetes.io/name=kafka
# Output should look like below
NAME              READY   STATUS    RESTARTS        AGE
cluster-kafka-0   1/1     Running   0               21m
cluster-kafka-1   1/1     Running   0               21m
cluster-kafka-2   1/1     Running   0               21m
cluster-kafka-3   1/1     Running   0               14m

However, the existing partitions of the Kafka topics are still concentrated on the original brokers.

kubectl -n kafka exec -it kafka-cli -- bin/kafka-topics.sh \
  --describe \
  --topic my-topic \
  --bootstrap-server cluster-kafka-bootstrap:9092

You need to redistribute these partitions to balance the cluster.

Cluster balancing

To balance the partitions across your brokers, you can use Kafka Cruise Control. Cruise Control is a tool designed to fully automate the dynamic workload rebalancing and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.

Cruise Control’s optimization approach allows it to perform several different operations linked to balancing cluster load. As well as rebalancing a whole cluster, Cruise Control also has the ability to do this balancing automatically when adding and removing brokers or changing topic replica values.

Kafka Cruise Control is already running in the kafka namespace and should appear as follows:

kubectl -n kafka get pod -l app.kubernetes.io/name=cruise-control
# Output should look like below
NAME                                      READY   STATUS    RESTARTS   AGE
cluster-cruise-control-6689dddcd9-rxgm7   1/1     Running   0          9m25s

Strimzi provides a way of interacting with the Cruise Control API using the Kubernetes CLI, through KafkaRebalance resources.

You can create a KafkaRebalance resource like this:

kubectl apply -f kafka-manifests/kafka-rebalance.yaml
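
The referenced manifest might look similar to the following minimal sketch (an assumption about its shape rather than a copy of the repository file; an empty spec asks Cruise Control to use its default rebalance goals):

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: my-rebalance
  namespace: kafka
  labels:
    strimzi.io/cluster: cluster
spec: {}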

Creating a KafkaRebalance resource generates an optimization proposal, which you can then use to execute a partition rebalance.

The Cluster Operator will subsequently update the KafkaRebalance resource with the details of the optimization proposal for review.

kubectl -n kafka describe KafkaRebalance my-rebalance
# Output should look like below
Name:         my-rebalance
Namespace:    kafka
Labels:       strimzi.io/cluster=cluster
Annotations:  <none>
API Version:  kafka.strimzi.io/v1beta2
Kind:         KafkaRebalance
...
Status:
  Conditions:
    Last Transition Time:  2023-11-22T17:36:21.571935895Z
    Status:                True
    Type:                  ProposalReady
  Observed Generation:     1
  Optimization Result:
    After Before Load Config Map:  my-rebalance
    Data To Move MB:               0
    Excluded Brokers For Leadership:
    Excluded Brokers For Replica Move:
    Excluded Topics:
    Intra Broker Data To Move MB:         0
    Monitored Partitions Percentage:      100
    Num Intra Broker Replica Movements:   0
    Num Leader Movements:                 3
    Num Replica Movements:                126
    On Demand Balancedness Score After:   74.16652216699164
    On Demand Balancedness Score Before:  59.473726784274575
    Provision Recommendation:             
    Provision Status:                     RIGHT_SIZED
    Recent Windows:                       1
  Session Id:                             93fe6347-2700-486b-b866-83f31b75e0b8
Events:                                   <none>

When the proposal is ready and looks good, you can execute the rebalance based on that proposal by annotating the KafkaRebalance resource like this:

kubectl -n kafka annotate kafkarebalance my-rebalance strimzi.io/rebalance=approve

Wait for the KafkaRebalance resource to become Ready:

Status:
  Conditions:
    Last Transition Time:  2023-11-22T17:41:04.048629616Z
    Status:                True
    Type:                  Ready
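
One way to wait for this condition from the command line is kubectl wait (a sketch; the timeout value is an arbitrary choice):

kubectl -n kafka wait kafkarebalance/my-rebalance \
  --for=condition=Ready --timeout=15m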

Check that the partitions are now spread across all four Kafka brokers:

kubectl -n kafka exec -it kafka-cli -- bin/kafka-topics.sh \
  --describe \
  --topic my-topic \
  --bootstrap-server cluster-kafka-bootstrap:9092
# Output should look like below
Topic: my-topic TopicId: qOyjJRSuQTCrOAldQZB0yw PartitionCount: 12      ReplicationFactor: 3    Configs: min.insync.replicas=2,message.format.version=3.0-IV1
        Topic: my-topic Partition: 0    Leader: 3      Replicas: 3,0,1 Isr: 1,0,3
        Topic: my-topic Partition: 1    Leader: 0      Replicas: 0,2,3 Isr: 0,2,3
        Topic: my-topic Partition: 2    Leader: 2      Replicas: 2,0,3 Isr: 2,0,3
        Topic: my-topic Partition: 3    Leader: 3      Replicas: 3,0,2 Isr: 0,2,3
        Topic: my-topic Partition: 4    Leader: 0      Replicas: 0,2,3 Isr: 0,2,3
        Topic: my-topic Partition: 5    Leader: 2      Replicas: 2,0,3 Isr: 2,0,3
        Topic: my-topic Partition: 6    Leader: 3      Replicas: 3,0,1 Isr: 1,0,3
        Topic: my-topic Partition: 7    Leader: 3      Replicas: 3,1,0 Isr: 0,1,3
        Topic: my-topic Partition: 8    Leader: 1      Replicas: 1,0,3 Isr: 0,1,3
        Topic: my-topic Partition: 9    Leader: 3      Replicas: 3,0,1 Isr: 1,0,3
        Topic: my-topic Partition: 10   Leader: 0      Replicas: 0,2,3 Isr: 0,2,3
        Topic: my-topic Partition: 11   Leader: 2      Replicas: 2,0,3 Isr: 2,0,3

Benchmark your Kafka cluster

Kafka includes a set of tools, namely kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh, to test performance and simulate the expected load on the cluster.

Create a topic for performance test

Earlier, you created topics using Kubernetes constructs and Strimzi Custom Resource Definitions (CRDs). This time, create the topic using the kafka-topics.sh script. Either option can be used.

kubectl exec -it kafka-cli -n kafka -- bin/kafka-topics.sh \
  --create \
  --topic test-topic-perf \
  --partitions 3 \
  --replication-factor 3 \
  --bootstrap-server cluster-kafka-bootstrap:9092
# Output should look like below
Created topic test-topic-perf.

Test the producer performance

Then run the producer performance test script with different configuration settings. The following example uses the topic created above to store 100 million messages with a size of 100 bytes each. The -1 value for --throughput means that messages are produced as quickly as possible, with no throttling limit. Kafka producer configuration properties such as acks and bootstrap.servers are passed as part of the --producer-props argument.

kubectl exec -it kafka-cli -n kafka -- bin/kafka-producer-perf-test.sh \
  --topic test-topic-perf \
  --num-records 100000000 \
  --throughput -1 \
  --producer-props bootstrap.servers=cluster-kafka-bootstrap:9092 \
      acks=all \
  --record-size 100 \
  --print-metrics

The output will look similar to this:

...
1825876 records sent, 365175.2 records/sec (34.83 MB/sec), 829.4 ms avg latency, 930.0 ms max latency.
1804564 records sent, 360912.8 records/sec (34.42 MB/sec), 839.0 ms avg latency, 1039.0 ms max latency.
1835644 records sent, 367128.8 records/sec (35.01 MB/sec), 827.1 ms avg latency, 1038.0 ms max latency.
1828244 records sent, 365648.8 records/sec (34.87 MB/sec), 827.4 ms avg latency, 933.0 ms max latency.
100000000 records sent, 363854.676442 records/sec (34.70 MB/sec), 825.36 ms avg latency, 1065.00 ms max latency, 876 ms 50th, 910 ms 95th, 941 ms 99th, 1032 ms 99.9th.

In this example, approximately 364,000 messages were produced per second (364,000 × 100 bytes ≈ 34.7 MB/sec), with an average latency around 825 ms and a maximum latency of about one second.

Next, you test the consumer performance.

Test the consumer performance

The consumer performance test script reads the 100 million messages from test-topic-perf.

kubectl exec -it kafka-cli -n kafka -- bin/kafka-consumer-perf-test.sh \
  --topic test-topic-perf \
  --messages 100000000 \
  --bootstrap-server cluster-kafka-bootstrap.kafka.svc.cluster.local:9092 | \
  jq -R .|jq -sr 'map(./",")|transpose|map(join(": "))[]'

The output shows a total of 9536.74 MB of data consumed across 100,000,000 messages, with a throughput of roughly 265 MB/sec and about 2.78 million messages/sec.

start.time:  2023-11-22 18:08:11:395
end.time:  2023-11-22 18:08:47:333
data.consumed.in.MB:  9536.7432
MB.sec:  265.3666
data.consumed.in.nMsg:  100000000
nMsg.sec:  2782569.9816
rebalance.time.ms:  3387
fetch.time.ms:  32551
fetch.MB.sec:  292.9785
fetch.nMsg.sec:  3072102.2396

View Kafka cluster in Grafana

The blueprint installs a local instance of Prometheus and Grafana with predefined dashboards, and has the option to create an Amazon Managed Service for Prometheus workspace. The secret for logging in to the Grafana dashboard is stored in AWS Secrets Manager.

Access the Grafana dashboard by port-forwarding the Grafana service with the following command.

kubectl port-forward svc/kube-prometheus-stack-grafana 8080:80 -n kube-prometheus-stack

Open your browser to the local Grafana web UI at http://localhost:8080.

Enter admin as the username; the password can be retrieved from AWS Secrets Manager with the following command.

aws secretsmanager get-secret-value \
 --secret-id kafka-on-eks-grafana --region $AWS_REGION --query "SecretString" --output text

Open Strimzi Kafka dashboard

Grafana Dashboard showing Strimzi Kafka parameters

Test Kafka disruption

When you deploy Kafka on Amazon Elastic Compute Cloud (Amazon EC2) instances, you can configure storage in two primary ways: Amazon Elastic Block Store (Amazon EBS) and instance storage. With Amazon EBS, a volume is attached to an instance over the network, whereas instance storage is directly attached to the instance.

Amazon EBS volumes offer consistent I/O performance and flexibility in deployment. This is crucial when considering Kafka’s built-in fault tolerance of replicating partitions across a configurable number of brokers. In the event of a broker failure, a new replacement broker fetches all the data the original broker previously stored from the other brokers in the cluster that host the other replicas. Depending on your application, this could involve copying tens of gigabytes or terabytes of data, which takes time and increases network traffic, potentially impacting the performance of the Kafka cluster. A properly designed Kafka cluster based on Amazon EBS storage can virtually eliminate the re-replication overhead that would otherwise be triggered by an instance failure, because the Amazon EBS volumes can be reassigned to a new instance quickly and easily.

For instance, if an underlying Amazon EC2 instance fails or is terminated, the broker’s on-disk partition replicas remain intact and can be mounted by a new Amazon EC2 instance. By using Amazon EBS, most of the replica data for the replacement broker will already be in the Amazon EBS volume and hence won’t need to be transferred over the network.

When replacing a Kafka broker, it is recommended to reuse the failed broker’s ID on the replacement broker. This approach is operationally simple because the replacement broker automatically resumes the work that the failed broker was doing.

Next, simulate a node failure on an Amazon EKS node from the Kafka node group that hosts one of the brokers.

Create a topic and send some data to it:

kubectl exec -it kafka-cli -n kafka -- bin/kafka-topics.sh \
  --create \
  --topic test-topic-failover \
  --partitions 1 \
  --replication-factor 3 \
  --bootstrap-server cluster-kafka-bootstrap:9092
# Output should look like below
Created topic test-topic-failover

kubectl exec -it kafka-cli -n kafka -- bin/kafka-topics.sh \
  --describe \
  --topic test-topic-failover \
  --bootstrap-server cluster-kafka-bootstrap:9092
# Output should look like below
Topic: test-topic-failover        TopicId: zpk0hKUkSM61sK-haPpaNA PartitionCount: 1        ReplicationFactor: 3           Configs: min.insync.replicas=2,message.format.version=3.0-IV1
       Topic: test-topic-failover      Partition: 0    Leader: 0       Replicas: 0,2,3 Isr: 0,2,3

kubectl exec -it kafka-cli -n kafka -- bin/kafka-console-producer.sh \
  --topic test-topic-failover \
  --bootstrap-server cluster-kafka-bootstrap:9092
>This is message for testing failover

Leader: 0 in the output above means the leader for topic test-topic-failover is broker 0, which corresponds to the pod cluster-kafka-0.

Find the node on which the pod cluster-kafka-0 is running.

kubectl -n kafka get pod cluster-kafka-0 -owide
# Output should look like below
NAME              READY   STATUS    RESTARTS   AGE   IP           NODE                                      NOMINATED NODE   READINESS GATES
cluster-kafka-0   1/1     Running   0          62m   10.1.2.168   ip-10-1-2-35.us-west-2.compute.internal   <none>           <none>

Note: Your node name will be different.

Drain the node where the pod cluster-kafka-0 is running:

kubectl drain ip-10-1-2-35.us-west-2.compute.internal \
  --delete-emptydir-data \
  --force \
  --ignore-daemonsets

Get the instance ID and terminate it:

ec2_instance_id=$(aws ec2 describe-instances \
  --filters "Name=private-dns-name,Values=ip-10-1-2-35.us-west-2.compute.internal" \
  --query 'Reservations[*].Instances[*].{Instance:InstanceId}' \
  --region $AWS_REGION --output text)
  
aws ec2 terminate-instances --instance-id ${ec2_instance_id} --region $AWS_REGION > /dev/null

The cluster-kafka-0 pod will be recreated on a new node.

kubectl -n kafka get pod cluster-kafka-0 -owide
# Output should look like below
NAME              READY   STATUS    RESTARTS   AGE     IP          NODE                                       NOMINATED NODE   READINESS GATES
cluster-kafka-0   1/1     Running   0          4m19s   10.1.2.11   ip-10-1-2-100.us-west-2.compute.internal   <none>           <none>

Verify that the message is still available in the test-topic-failover topic.

kubectl exec -it kafka-cli -n kafka -- bin/kafka-console-consumer.sh \
 --topic test-topic-failover \
 --bootstrap-server cluster-kafka-bootstrap:9092 \
 --partition 0 \
 --from-beginning
# Output should look like below
This is message for testing failover

Prerequisites

Ensure that you have an AWS account, an AWS profile with administrator permissions configured, and the following tools installed locally (a quick version check follows the list):

  1. AWS Command Line Interface (AWS CLI)
  2. kubectl
  3. terraform >=1.0.0
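
A quick way to confirm the tools are available locally (standard version commands, not specific to this post):

aws --version
kubectl version --client
terraform version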

Cleaning up

Delete the resources created in this post to avoid incurring costs. You can run the following script:

./cleanup.sh

Conclusion

In this post, I showed you how to deploy Apache Kafka on Amazon EKS while ensuring high availability across multiple nodes with automatic failover. Additionally, I explored the benefits of using Data on EKS blueprints to quickly deploy infrastructure using Terraform. If you would like to learn more, then you can visit the Data on EKS GitHub repository to collaborate with others and access additional resources.

Ovidiu Valeanu

Ovidiu Valeanu is a Senior Specialist Solutions Architect, Containers focused on Kubernetes, Data Analytics and Machine Learning at Amazon Web Services. He enjoys collaborating on Open-Source projects and helping teams design, build, and scale distributed systems.