Container orchestration platforms, such as Amazon Elastic Kubernetes Service (Amazon EKS), have simplified the process of building, securing, operating, and maintaining container-based applications, freeing organizations to focus on the applications themselves. Customers have started adopting event-driven deployments, allowing Kubernetes workloads to scale dynamically in response to metrics from various sources.
By implementing event-driven autoscaling, customers can reduce costs by provisioning compute on demand and scaling to match their actual needs. KEDA (Kubernetes-based Event Driven Autoscaler) lets you drive the autoscaling of Kubernetes workloads based on events, such as a scraped custom metric breaching a specified threshold, or a message arriving on an Amazon Managed Streaming for Apache Kafka (Amazon MSK) topic.
Amazon CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), IT managers, and product owners. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events. You get a unified view of operational health, and you gain complete visibility of your AWS resources, applications, and services running on AWS and on-premises.
This post will show you how to use KEDA to autoscale Amazon EKS pods by querying the metrics stored in CloudWatch.
Solution Overview
The following diagram shows the complete setup that we will walk through in this post.

Prerequisites
You will need the following to complete the steps in this post: an AWS account, plus the AWS Command Line Interface (AWS CLI), eksctl, kubectl, and Helm installed and configured.
Create an Amazon EKS Cluster
You start by setting a few environment variables:
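A minimal sketch of the variables used in the rest of this walkthrough; the cluster name is a placeholder, and us-west-2 matches the Region used later in this post:

```bash
# Placeholder values -- adjust the cluster name and Region for your environment.
export AWS_REGION=us-west-2
export CLUSTER_NAME=keda-cloudwatch-demo
export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
```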
Next, you prepare the required Kubernetes scripts with a shell script from this GitHub repository and create an Amazon EKS cluster using eksctl:
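The script in the repository may configure additional options, but a minimal eksctl invocation would look something like the following (the node count is an assumption; --with-oidc enables IRSA, which the ADOT collector and sigv4 proxy need to call AWS APIs):

```bash
# Create a small EKS cluster for this walkthrough.
eksctl create cluster \
  --name "$CLUSTER_NAME" \
  --region "$AWS_REGION" \
  --with-oidc \
  --nodes 2
```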
Creating a cluster can take up to 10 minutes. When the creation completes, proceed to the next steps.
Deploying a KEDA Operator
Next, you install the keda operator in the keda namespace of your Amazon EKS cluster by using the following commands:
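KEDA's documented installation path is its Helm chart; a typical installation looks like this:

```bash
# Add the KEDA chart repository and install the operator into the keda namespace.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```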
Now you can check on the keda operator pods:
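```bash
kubectl get pods -n keda
# Wait until the keda-operator and keda-operator-metrics-apiserver pods
# (the Helm chart defaults) are in the Running state before continuing.
```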
Deploy sample application
You will use a sample application called ho11y, a synthetic signal generator that lets you test observability solutions for microservices. It emits logs, metrics, and traces in a configurable manner. For more information, see the AWS O11y Recipes repository.
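The full manifests live in the recipes repository; as a rough sketch (the image reference is a placeholder, the listen port is an assumption, and the frontend and downstream1 services follow the same pattern), deploying one of the services looks like this:

```bash
# Sketch only: assumes you have built and pushed a ho11y image
# as described in the recipes repository.
kubectl create namespace ho11y

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: downstream0            # the deployment KEDA will scale later
  namespace: ho11y
spec:
  replicas: 1
  selector:
    matchLabels:
      app: downstream0
  template:
    metadata:
      labels:
        app: downstream0
    spec:
      containers:
        - name: ho11y
          image: <your-registry>/ho11y:latest   # placeholder image reference
          ports:
            - containerPort: 8765               # assumed ho11y listen port
---
apiVersion: v1
kind: Service
metadata:
  name: downstream0
  namespace: ho11y
spec:
  selector:
    app: downstream0
  ports:
    - port: 8765
      targetPort: 8765
EOF
# Repeat the same pattern for the frontend and downstream1 services.
```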
These commands create the Kubernetes deployments and services; you can verify them as shown in the following:
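```bash
kubectl get deployments,services -n ho11y
# You should see the ho11y deployments and their services listed.
```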
Scrape metrics using AWS Distro for OpenTelemetry (ADOT)
Next, you will deploy an AWS Distro for OpenTelemetry (ADOT) collector to scrape the Prometheus metrics emitted from the ho11y application and ingest them into CloudWatch.
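How you deploy the collector varies (the ADOT EKS add-on, a Helm chart, or raw manifests), but the heart of it is the pipeline configuration. A minimal sketch, assuming the service name, port, and CloudWatch namespace used elsewhere in this post:

```bash
# Illustrative collector pipeline: scrape ho11y's Prometheus endpoint and
# export the metrics to CloudWatch through the EMF exporter. The collector
# needs IAM permissions to publish to CloudWatch; the EMF exporter writes
# via CloudWatch Logs.
cat <<'EOF' > otel-collector-config.yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: ho11y
          scrape_interval: 15s
          static_configs:
            - targets: ['downstream0.ho11y:8765']   # assumed service and port
exporters:
  awsemf:
    namespace: ho11y      # CloudWatch namespace the metrics land in (assumed)
    region: us-west-2
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [awsemf]
EOF
```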
After the ADOT collector is deployed, it will collect the metrics and ingest them into the specified CloudWatch namespace. The scrape configuration is similar to that of a Prometheus server. We have added the necessary configuration for scraping metrics from the ho11y application.
Navigate to your CloudWatch console and look at the ho11y_total metric. The deep link opens in the Oregon (us-west-2) Region. You can specify a different Region in the top-right corner of the console.
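You can also confirm that the metric arrived from the CLI (assuming the ho11y CloudWatch namespace from the collector configuration):

```bash
aws cloudwatch list-metrics --namespace ho11y --region us-west-2
```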

Configure sigv4 authentication for querying custom metrics from CloudWatch
AWS Signature Version 4 (sigv4) is a process for adding authentication information to requests made to AWS APIs over HTTP. The AWS Command Line Interface (AWS CLI) and AWS SDKs use this protocol to make calls to the AWS APIs. CloudWatch API calls require sigv4 authentication. Because KEDA doesn't support sigv4, we'll deploy a sigv4 proxy as a Kubernetes Service to act as a gateway for KEDA to query the CloudWatch API endpoints.
Execute the following commands to deploy the sigv4 proxy:
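A sketch of such a proxy, modeled on the aws-sigv4-proxy project; the namespace, replica count, and IAM wiring (the pod needs credentials that allow the CloudWatch read APIs) are assumptions:

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sigv4-proxy
  namespace: ho11y
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sigv4-proxy
  template:
    metadata:
      labels:
        app: sigv4-proxy
    spec:
      containers:
        - name: sigv4-proxy
          image: public.ecr.aws/aws-observability/aws-sigv4-proxy:latest
          args:
            - --name
            - monitoring                           # CloudWatch's sigv4 service name
            - --region
            - us-west-2
            - --host
            - monitoring.us-west-2.amazonaws.com   # CloudWatch API endpoint
          ports:
            - containerPort: 8080                  # the proxy's default listen port
---
apiVersion: v1
kind: Service
metadata:
  name: sigv4-proxy
  namespace: ho11y
spec:
  selector:
    app: sigv4-proxy
  ports:
    - port: 80
      targetPort: 8080
EOF
```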
Set up autoscaling using a KEDA scaled object
Next, you will create the ScaledObject that will scale the deployment by querying the metrics stored in CloudWatch. A ScaledObject represents the desired mapping between an event source, such as a Prometheus metric, and a Kubernetes Deployment, StatefulSet, or any custom resource that defines the /scale sub-resource.
Behind the scenes, KEDA monitors for the event source, and then feeds that data to Kubernetes and the HPA (Horizontal Pod Autoscaler) to drive the scaling of the specified Kubernetes resource. Each replica of a resource is actively pulling items from the event source.
The following commands deploy a ScaledObject named ho11y-hpa that queries the CloudWatch endpoint for a custom metric called ho11y_total. The ho11y_total metric represents the number of application invocations, and the threshold is specified as one. Depending on the metric's value over a period of one minute, the downstream0 deployment scales out or in between 1 and 10 pods.
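A minimal sketch of that ScaledObject, using KEDA's aws-cloudwatch trigger. The CloudWatch namespace and dimension are assumptions that must match what your collector actually emits, and the available trigger parameters vary by KEDA version:

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ho11y-hpa
  namespace: ho11y
spec:
  scaleTargetRef:
    name: downstream0          # the deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  pollingInterval: 30          # how often KEDA queries CloudWatch (seconds)
  cooldownPeriod: 300          # wait before scaling back down (seconds)
  triggers:
    - type: aws-cloudwatch
      metadata:
        namespace: ho11y               # CloudWatch namespace (assumed)
        metricName: ho11y_total
        dimensionName: OTelLib         # assumed dimension added by the collector
        dimensionValue: ho11y
        targetMetricValue: "1"         # scale out above this value
        minMetricValue: "0"
        metricStat: Average
        metricStatPeriod: "60"         # evaluate over one minute
        awsRegion: us-west-2
EOF
```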
KEDA also supports the scaling behavior that you configure in the Horizontal Pod Autoscaler. To make your scaling even more powerful, you can tune the pollingInterval and cooldownPeriod settings. Follow this link for more details on the CloudWatch trigger and the scaled object. Moreover, KEDA supports various additional scalers, and a current list is available on the KEDA home page.
Once you deploy the scaledobject, KEDA will also create an HPA object in the ho11y namespace with the configuration specified in scaledobject.yaml:
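```bash
kubectl get hpa -n ho11y
# KEDA names the HPA it manages keda-hpa-<scaledobject-name>, so expect an
# HPA called keda-hpa-ho11y-hpa targeting the downstream0 deployment.
```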
Then take a quick look at the deployment and pods for ho11y:
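```bash
kubectl get deployment downstream0 -n ho11y
kubectl get pods -n ho11y
```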
Loading the ho11y application
You need to place some load on the application by running the following commands:
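A simple way to do this is to run a throwaway pod that calls the application in a loop; the frontend service name and port here are assumptions, so substitute whatever your ho11y deployment exposes:

```bash
# Hit the application every half second until you interrupt the pod.
kubectl run -n ho11y load-gen --rm -it --restart=Never --image=busybox -- \
  /bin/sh -c 'while true; do wget -q -O /dev/null http://frontend.ho11y:8765/; sleep 0.5; done'
```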
Next, you will investigate the deployment to see if the downstream0 deployment is scaling out to spin up more pods in response to the load on the application. Increased load will cause the ho11y_total custom metric in CloudWatch to reach one or higher, which triggers the deployment/pod scaling.
Note that it can take a few minutes before you observe the deployment scaling out.
Describe the HPA using the following command, and you should see SuccessfulRescale events from the horizontal-pod-autoscaler:
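```bash
# The HPA name follows KEDA's keda-hpa-<scaledobject-name> convention.
kubectl describe hpa keda-hpa-ho11y-hpa -n ho11y
# Look for SuccessfulRescale events as the replica count moves between 1 and 10.
```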
This concludes the usage of KEDA to successfully autoscale the application using the metrics ingested into CloudWatch.
Clean-up
You will continue to incur costs until you delete the infrastructure created for this post. Delete the cluster resources using the following commands:
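```bash
# Remove the sample workloads and KEDA, then delete the cluster itself.
kubectl delete namespace ho11y
helm uninstall keda --namespace keda
eksctl delete cluster --name "$CLUSTER_NAME" --region "$AWS_REGION"
```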
Conclusion
This post demonstrated the detailed steps for using the KEDA operator to autoscale deployments based on custom metrics that an instrumented application emits and that are ingested into CloudWatch. This capability helps customers scale compute capacity on demand by provisioning pods only when they are needed to serve bursts of traffic. CloudWatch lets you store the metrics reliably, and KEDA can monitor them and efficiently scale workloads out and in as events occur.
Also check out the Proactive autoscaling of Kubernetes workloads with KEDA post if you are curious about autoscaling your Kubernetes workloads using metrics ingested into Amazon Managed Service for Prometheus.
Authors: