Integrate KEDA with a Kubernetes cluster to achieve event-driven scalability
This Guidance demonstrates how to implement event-driven autoscaling for Amazon Elastic Kubernetes Service (Amazon EKS) applications using Kubernetes Event-Driven Autoscaler (KEDA). The Guidance shows how to scale deployments based on custom metrics, rather than solely CPU and memory utilization. KEDA integrates with Kubernetes, extending autoscaling capabilities for event-driven workloads. By following this Guidance, you can optimize resource provisioning, improve cost efficiency, and enhance the customer experience for your event-driven applications on Amazon EKS using custom metrics-based autoscaling with KEDA.
Please note: See the Disclaimer section at the end of this Guidance.
Architecture Diagram
EKS Cluster
This architecture diagram shows how to deploy KEDA on an Amazon EKS cluster. For an overview of KEDA, see the KEDA Overview section that follows.
Step 1
Set up an AWS Cloud9 environment with AWS Identity and Access Management (IAM) permissions.
Step 2
Install helm, eksctl, kubectl, and the AWS Command Line Interface (AWS CLI) in the AWS Cloud9 environment.
Step 3
Launch an Amazon EKS cluster and EKS managed node groups from AWS Cloud9.
Step 4
Deploy KEDA with the required IAM roles for service accounts (IRSA).
Step 5
Deploy Amazon Simple Queue Service (Amazon SQS) to decouple communication between applications, and attach a policy to the KEDA IRSA that grants access to Amazon SQS.
Step 6
Create an Amazon Managed Service for Prometheus workspace and, optionally, an Amazon Managed Grafana workspace.
Step 7
Configure AWS Distro for OpenTelemetry, deployed with its own required IRSA, to send application metrics to Amazon Managed Service for Prometheus.
Step 8
Configure the SigV4 proxy pod, deployed with its own required IRSA, to authenticate KEDA with Amazon Managed Service for Prometheus.
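As a reference point for steps 3 and 4, a single eksctl configuration file can declare the cluster, a managed node group, and the OIDC provider that IRSA depends on. The following is a minimal sketch; the cluster name, region, and node group sizing are illustrative assumptions rather than values prescribed by this Guidance:

```yaml
# cluster.yaml -- minimal sketch of the cluster from steps 3 and 4.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: keda-demo     # hypothetical cluster name
  region: us-east-1   # example region

# An OIDC provider is required so that KEDA and the other add-ons
# can assume IAM roles for service accounts (IRSA).
iam:
  withOIDC: true

managedNodeGroups:
  - name: default
    instanceType: m5.large   # example instance type
    minSize: 2
    maxSize: 5
    desiredCapacity: 2
```

You would create the cluster with `eksctl create cluster -f cluster.yaml` and then install KEDA, for example from its official Helm chart, into a dedicated namespace.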
KEDA Overview
This architecture diagram shows an overview of how KEDA, the Kubernetes Horizontal Pod Autoscaler (HPA), and external event sources work together. For how KEDA scales pods, see the Scaling with KEDA section that follows.
Step 1
The ScaledObject is a custom resource, defined through a CustomResourceDefinition (CRD), that configures the event source, the deployment to be scaled, and the scaling behavior.
Step 2
KEDA activates and deactivates Kubernetes deployments, scaling them to and from zero when no events occur. This is one of the primary roles of the keda-operator container that runs when you install KEDA.
Step 3
KEDA feeds custom metrics to the HPA, which scales the deployment from one pod to the required number of pods.
Step 4
The HPA scales the pods based on the metrics KEDA provides.
Step 5
KEDA supports more than 60 event sources; see Currently available scalers for KEDA.
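To make step 1 concrete, the sketch below shows a ScaledObject for a hypothetical deployment named sqs-consumer, scaled on an Amazon SQS queue. The queue URL, account ID, and thresholds are placeholder assumptions; setting minReplicaCount to 0 enables the scale-to-zero behavior described in step 2:

```yaml
# Example KEDA ScaledObject (step 1): event source, target deployment,
# and scaling behavior in one custom resource.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-consumer-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: sqs-consumer   # hypothetical Deployment to scale
  minReplicaCount: 0     # scale to zero when no events occur (step 2)
  maxReplicaCount: 20
  pollingInterval: 30    # seconds between event-source checks
  cooldownPeriod: 300    # seconds to wait before scaling back to zero
  triggers:
    - type: aws-sqs-queue
      metadata:
        # Placeholder queue URL and region; replace with your own.
        queueURL: https://sqs.us-east-1.amazonaws.com/111122223333/demo-queue
        queueLength: "5"          # target messages per replica
        awsRegion: us-east-1
        identityOwner: operator   # use the KEDA operator's IRSA credentials
```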
Scaling with KEDA
This architecture diagram shows KEDA scaling deployment pods based on custom metric sources. For the cluster deployment, see the EKS Cluster section above.
Step 1
The application uses Amazon SQS to decouple communication between microservices.
Step 2
AWS Distro for OpenTelemetry collects metrics from the application and sends them to Amazon Managed Service for Prometheus.
Step 3
KEDA is configured with the Amazon SQS and Amazon Managed Service for Prometheus scalers to retrieve the SQS queue length and Prometheus custom metrics.
Step 4
KEDA (keda-operator-metrics-apiserver) exposes the event data for the HPA to scale on.
Step 5
The HPA scales the deployment to the appropriate number of pods.
Step 6
Cluster Autoscaler (CA) provisions the required nodes through an Auto Scaling group. You can also use Karpenter instead of CA.
Step 7
New capacity is provisioned as required.
Step 8
You can optionally configure Amazon Managed Grafana to display metrics from Amazon Managed Service for Prometheus in a dashboard.
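Step 3 allows multiple triggers on the same ScaledObject. The excerpt below sketches a second, prometheus-type trigger that queries Amazon Managed Service for Prometheus through the in-cluster SigV4 proxy; the proxy Service name, port, workspace ID, metric name, and threshold are all illustrative assumptions:

```yaml
# ScaledObject excerpt (step 3): a Prometheus trigger alongside the
# SQS trigger. serverAddress points at a SigV4 proxy Service that
# signs requests to the Amazon Managed Service for Prometheus workspace.
triggers:
  - type: prometheus
    metadata:
      # Hypothetical proxy Service and workspace ID; replace with your own.
      serverAddress: http://keda-sigv4-proxy.keda.svc.cluster.local:8080/workspaces/ws-EXAMPLE11111
      query: sum(rate(app_messages_processed_total[2m]))   # example custom metric
      threshold: "100"   # target value per replica
```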
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This event-driven architecture allows you to define precise scaling rules, granting fine-grained control over how your applications respond to specific events or metrics. By using KEDA, you can enhance the efficiency, resource utilization, and responsiveness of your Kubernetes environment, driving operational excellence.
Security
Use IAM roles to obtain temporary credentials, enabling access to diverse AWS services. For your Kubernetes-based applications, integrate with native authentication and authorization mechanisms to securely interact with the Kubernetes API server. By precisely defining IAM policies to grant the minimal necessary permissions, you effectively mitigate unauthorized access and strengthen the security of your environment.
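As one example of least privilege in this architecture, you can attach an inline policy to the KEDA operator's IRSA that allows it to read only the attributes of the single queue it scales on. This eksctl ClusterConfig excerpt is a sketch with placeholder account and queue names:

```yaml
# ClusterConfig excerpt: least-privilege IRSA for the KEDA operator.
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: keda-operator   # service account the KEDA operator runs as
        namespace: keda
      attachPolicy:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - sqs:GetQueueAttributes   # only what the SQS scaler needs
            Resource: arn:aws:sqs:us-east-1:111122223333:demo-queue
```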
Reliability
Using pod topology spread constraints in Kubernetes can help bolster the availability and resilience of your applications. This feature lets you control how pods are distributed across failure domains, such as hosts or Availability Zones. By ensuring a balanced, optimal spread, you minimize the impact of a failure within any single domain and enhance the overall integrity of your application.
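For example, a pod template excerpt like the following (the app label is an assumed name) spreads replicas evenly across Availability Zones:

```yaml
# Pod template excerpt: spread replicas across Availability Zones.
spec:
  topologySpreadConstraints:
    - maxSkew: 1                                # at most one pod of imbalance between zones
      topologyKey: topology.kubernetes.io/zone  # failure domain: Availability Zone
      whenUnsatisfiable: ScheduleAnyway         # prefer balance, but do not block scheduling
      labelSelector:
        matchLabels:
          app: sqs-consumer                     # hypothetical app label
```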
Performance Efficiency
The Amazon EKS cluster's multi-Availability Zone setup allows for low-latency performance. While inter-subnet traffic across Availability Zones may occur, the resulting latency is expected to have a minimal impact on your application's performance. In scenarios where even lower latency is required, you can direct traffic within a single Availability Zone to further optimize network performance.
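One way to keep traffic within a zone is Kubernetes topology-aware routing, which a Service can opt into with an annotation. The sketch below uses assumed names and ports, and the annotation shown applies to Kubernetes 1.27 and later; it is a hint, so kube-proxy may still route across zones when endpoints are unevenly distributed:

```yaml
# Service excerpt: prefer same-zone endpoints where possible.
apiVersion: v1
kind: Service
metadata:
  name: sqs-consumer   # hypothetical Service name
  annotations:
    service.kubernetes.io/topology-mode: Auto   # topology-aware routing hint
spec:
  selector:
    app: sqs-consumer
  ports:
    - port: 80
      targetPort: 8080   # example container port
```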
Cost Optimization
KEDA's automated pod scaling capabilities can help provide cost savings. By using custom metrics to initiate scaling, you can support optimal pod availability to meet your application's needs, avoiding overprovisioning and unnecessary costs.
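Scaling policy itself can also protect against cost from replica churn: a ScaledObject can tune the scale-in behavior of the HPA it manages. A sketch with illustrative values:

```yaml
# ScaledObject spec excerpt: dampen scale-in to avoid thrashing.
advanced:
  horizontalPodAutoscalerConfig:
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 300   # wait 5 minutes before scaling in
        policies:
          - type: Percent
            value: 50           # remove at most half of the pods...
            periodSeconds: 60   # ...per minute
```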
Sustainability
You can achieve sustainable resource management through Kubernetes HPA. KEDA uses this capability to effectively scale your workloads based on custom metrics so that only the required pods run. With access to over 60 different scalers, you can tailor the scaling behavior to your specific needs, maximizing the utilization of your deployed resources and preventing overallocation.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.