AWS Cloud Operations Blog
Proactive autoscaling of Kubernetes workloads with KEDA using metrics ingested into Amazon Managed Service for Prometheus
UPDATE: This blog post has been updated to include information about the recently added support for KEDA with Amazon Managed Service for Prometheus (AMP).
Orchestration platforms such as Amazon EKS and Amazon ECS have simplified the process of building, securing, operating, and maintaining container-based applications, thereby helping organizations focus on building applications. We simplified this further with the launch of AWS Fargate, where customers no longer have to manage Amazon Elastic Compute Cloud (Amazon EC2) instances for their Amazon EKS/ECS clusters. Customers have started adopting event-driven deployment, which is the ability to activate a Kubernetes deployment (i.e., scale from no pods to a single pod) and subsequently scale to more pods based on events from various event sources.
By implementing event-driven deployment and autoscaling, customers can achieve cost savings by providing on-demand compute and autoscaling efficiently based on application logic. Kubernetes supports cluster-level autoscaling using the Cluster Autoscaler and application-level autoscaling using the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler. The HPA and Vertical Pod Autoscaler can scale pods based on observed CPU utilization and memory usage. For complex scenarios like event-driven autoscaling, customers rely on metrics related to application logic.
KEDA (Kubernetes-based Event Driven Autoscaler) lets you drive the autoscaling of Kubernetes workloads based on the number of events, such as a scraped custom metric breaching a specified threshold or a message arriving in a Kafka queue. KEDA is a single-purpose and lightweight component that can be added to any Kubernetes cluster. It works alongside standard Kubernetes components, such as the Horizontal Pod Autoscaler, and can extend functionality without overwriting or duplication. KEDA lets you explicitly map the apps you want to scale in an event-driven way, while other apps continue to function as usual. This makes KEDA a flexible and safe option to run alongside any number of other Kubernetes applications or frameworks.
Amazon Managed Service for Prometheus is a serverless, Prometheus-compatible monitoring service for container metrics that makes it easier to securely monitor container environments at scale. Amazon Managed Service for Prometheus lets you use the same open-source Prometheus data model and query language that you use today to monitor the performance of your containerized workloads, and also enjoy improved scalability, availability, and security without having to manage the underlying infrastructure.
In this post, we will show you how you can use KEDA to autoscale Amazon EKS Pods by querying the metrics stored in Amazon Managed Service for Prometheus.
Prerequisites
You will need the following to complete the steps in this post:
- AWS CLI version 2
- eksctl
- kubectl
- jq
- helm
- An Amazon Managed Service for Prometheus workspace configured in your AWS account. For instructions, see Create a workspace in the Amazon Managed Service for Prometheus User Guide.
The following diagram shows the complete setup that we will walk through in this blog post.
Figure 1: Architecture diagram
Step 1: Create an Amazon EKS Cluster and KEDA operator
First, create an Amazon EKS cluster for running the sample application and the KEDA operator. The eksctl CLI tool will deploy the cluster using the eks-cluster-config.yaml file:
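A minimal sketch of eks-cluster-config.yaml is shown below. The region, Kubernetes version, node sizing, and the policy attached to the keda-operator service account are assumptions; adjust them to your environment.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: AMP-KEDA-CLUSTER
  region: us-east-1           # assumption: choose your AWS Region
  version: "1.24"             # assumption: choose a supported Kubernetes version
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: keda-operator
        namespace: keda
      # Assumption: query access so the KEDA scaler can read metrics from the workspace
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess
managedNodeGroups:
  - name: default-ng
    instanceType: m5.large
    desiredCapacity: 2
```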
Execute the following command to create the EKS cluster:
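For example, assuming the configuration file above is saved in the current directory:

```bash
eksctl create cluster -f eks-cluster-config.yaml
```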
This creates an Amazon EKS cluster named AMP-KEDA-CLUSTER and a service account named keda-operator that the KEDA scaler will use.
KEDA performs two key roles within Kubernetes. The agent activates and deactivates Kubernetes Deployments to scale to and from zero when there are no events; this is one of the primary roles of the keda-operator container that runs when you install KEDA. KEDA also acts as a Kubernetes metrics server that exposes rich event data, such as queue length or any custom metric, to the Horizontal Pod Autoscaler to drive scale out. It is up to the Deployment to consume the events directly from the source. This preserves rich event integration and enables gestures like completing or abandoning queue messages to work out of the box. Serving metrics is the primary role of the keda-operator-metrics-apiserver container that runs when you install KEDA.
Next, we will install the KEDA operator in the keda namespace. First, create a values.yaml for the KEDA operator to set the right security context, then install the operator with the following commands:
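A minimal sketch of values.yaml, assuming the keda-operator service account was already created by eksctl (exact value keys may vary slightly between chart versions):

```yaml
# values.yaml (sketch): reuse the existing IRSA-backed service account and
# run the operator with a non-root security context.
serviceAccount:
  create: false
  name: keda-operator
podSecurityContext:
  fsGroup: 1001
securityContext:
  runAsNonRoot: true
  runAsUser: 1001
```

Then install the operator with Helm:

```bash
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda -f values.yaml
```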
We can describe the pods in the keda namespace to verify the installation:
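For example:

```bash
# The keda-operator and keda-operator-metrics-apiserver pods should be in the Running state.
kubectl get pods -n keda
kubectl describe pods -n keda
```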
Step 2: Create an Amazon Managed Service for Prometheus workspace
The Amazon Managed Service for Prometheus workspace ingests the Prometheus metrics collected from the ho11y sample application. A workspace is a logical and isolated Prometheus server dedicated to Prometheus resources such as metrics. A workspace supports fine-grained access control for authorizing its management, such as update, list, describe, and delete, as well as ingesting and querying metrics.
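If you have not already created a workspace (see Prerequisites), you can create one with the AWS CLI. The alias below is an assumption; note the workspaceId returned, as it is needed for the remote write and query URLs.

```bash
aws amp create-workspace --alias keda-blog-workspace --region AWS_REGION
```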
Next, optionally create an interface VPC endpoint to securely access the managed service from resources deployed in your VPC. This makes sure that data ingested by the managed service won’t leave your AWS account’s VPC. Utilize the AWS CLI as shown here. Replace the placeholder strings, such as VPC_ID and AWS_REGION, with your values.
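For example (SECURITY_GROUP_IDS and SUBNET_IDS are placeholders for your own values):

```bash
aws ec2 create-vpc-endpoint \
  --vpc-id VPC_ID \
  --service-name com.amazonaws.AWS_REGION.aps-workspaces \
  --security-group-ids SECURITY_GROUP_IDS \
  --vpc-endpoint-type Interface \
  --subnet-ids SUBNET_IDS
```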
Step 3: Deploy sample application and scrape metrics using AWS Distro for OpenTelemetry
In this post, we will use a sample application called ho11y, which is a synthetic signal generator letting you test observability solutions for microservices. It emits logs, metrics, and traces in a configurable manner.
To install the application, execute the following commands:
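A sketch of the installation, assuming you build the ho11y image from the aws-observability/ho11y repository and push it to an Amazon ECR repository you own; the ECR_REPO_URI variable and the manifest path are placeholders:

```bash
git clone https://github.com/aws-observability/ho11y.git
cd ho11y
docker build -t "${ECR_REPO_URI}:latest" .
aws ecr get-login-password --region AWS_REGION | \
  docker login --username AWS --password-stdin "${ECR_REPO_URI%%/*}"
docker push "${ECR_REPO_URI}:latest"

kubectl create namespace ho11y
# Apply the frontend, downstream0, and downstream1 deployments and their services
# (manifest path is an assumption; adjust it to where your manifests live).
kubectl apply -n ho11y -f kubernetes/
```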
This will create three deployments, frontend, downstream0, and downstream1, and a service for each deployment.
Next, we will have to deploy a collector that can scrape the application metrics emitted by ho11y. Amazon Managed Service for Prometheus does not directly scrape operational metrics from containerized workloads in a Kubernetes cluster. You must deploy and manage a Prometheus server or an OpenTelemetry agent, such as the AWS Distro for OpenTelemetry Collector (ADOT) or the Grafana Agent to perform this task. The ADOT-AMP pipeline lets us use the ADOT Collector to scrape a Prometheus-instrumented application, and then send the scraped metrics to Amazon Managed Service for Prometheus.
This post will also walk you through the steps to configure an AWS Identity and Access Management (IAM) role to send Prometheus metrics to Amazon Managed Service for Prometheus. We install the ADOT collector on the Amazon EKS cluster and forward metrics to Amazon Managed Service for Prometheus.
Configure permissions
We will be deploying the ADOT collector to run under the identity of a Kubernetes service account named amp-iamproxy-service-account. IAM roles for service accounts (IRSA) lets you associate an IAM role carrying the AmazonPrometheusRemoteWriteAccess policy with a Kubernetes service account, thereby providing IAM permissions to any pod utilizing the service account to ingest metrics into Amazon Managed Service for Prometheus.
You need kubectl and eksctl CLI tools to run the script. They must be configured to access your Amazon EKS cluster.
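For example, the service account can be created with eksctl; the namespace is an assumption and should match wherever you run the ADOT collector:

```bash
eksctl create iamserviceaccount \
  --name amp-iamproxy-service-account \
  --namespace ho11y \
  --cluster AMP-KEDA-CLUSTER \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
  --approve \
  --override-existing-serviceaccounts
```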
Now create a manifest file, amp-eks-adot-prometheus-daemonset.yaml, with the scrape configuration to extract the ho11y metrics and deploy the ADOT collector. This example creates a deployment named adot that collects metrics from pods on the cluster.
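A condensed sketch of the collector configuration portion of that manifest is shown below. It assumes the prometheusremotewrite exporter with the sigv4auth extension; the full manifest also contains the adot Deployment, which should run under the amp-iamproxy-service-account service account. Replace WORKSPACE_ID and AWS_REGION with your values.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adot-collector-config
  namespace: ho11y            # assumption: deploy the collector alongside the application
data:
  adot-collector-config.yaml: |
    receivers:
      prometheus:
        config:
          global:
            scrape_interval: 15s
          scrape_configs:
            - job_name: 'ho11y'
              kubernetes_sd_configs:
                - role: pod
                  namespaces:
                    names:
                      - ho11y
    extensions:
      sigv4auth:
        region: AWS_REGION
    exporters:
      prometheusremotewrite:
        endpoint: "https://aps-workspaces.AWS_REGION.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/remote_write"
        auth:
          authenticator: sigv4auth
    service:
      extensions: [sigv4auth]
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [prometheusremotewrite]
```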
After the ADOT collector is deployed, it will collect the metrics and ingest them into the specified Amazon Managed Service for Prometheus workspace. The scrape configuration is similar to that of a Prometheus server. We have added the necessary configuration for scraping the ho11y metrics.
Step 4: Configure sigv4 authentication for querying Amazon Managed Service for Prometheus and set up autoscaling using a KEDA scaled object
AWS Signature Version 4 is a process to add authentication information to requests made to AWS APIs using HTTP. The AWS Command Line Interface (AWS CLI) and the AWS SDKs already use this protocol to make calls to the AWS APIs. Amazon Managed Service for Prometheus requires the API calls to have sigv4 authentication, and since KEDA doesn’t support sigv4, we will be deploying a sigv4 proxy service to act as a gateway for KEDA to access the query endpoint of Amazon Managed Service for Prometheus.
Execute the following commands to deploy the sigv4 proxy:
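A minimal sketch of the proxy Deployment and Service, assuming the public aws-sigv4-proxy image and the keda-operator service account (which has query permissions through IRSA); the names, namespace, and port are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sigv4-proxy
  namespace: keda
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sigv4-proxy
  template:
    metadata:
      labels:
        app: sigv4-proxy
    spec:
      serviceAccountName: keda-operator   # assumption: IRSA role with AmazonPrometheusQueryAccess
      containers:
        - name: aws-sigv4-proxy
          image: public.ecr.aws/aws-observability/aws-sigv4-proxy:latest
          args:
            - --name
            - aps
            - --region
            - AWS_REGION
            - --host
            - aps-workspaces.AWS_REGION.amazonaws.com
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sigv4-proxy-service
  namespace: keda
spec:
  selector:
    app: sigv4-proxy
  ports:
    - port: 8080
      targetPort: 8080
```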
Let’s create the ScaledObject that will scale the deployment by querying the metrics stored in Amazon Managed Service for Prometheus.
A ScaledObject represents the desired mapping between an event source, such as a Prometheus metric, and a Kubernetes Deployment, StatefulSet, or any Custom Resource that defines the /scale subresource.
Behind the scenes, KEDA monitors the event source and feeds that data to Kubernetes and the HPA (Horizontal Pod Autoscaler) to drive rapid scaling of a resource. Each replica of a resource is actively pulling items from the event source.
KEDA also supports the scaling behavior that we configure in the Horizontal Pod Autoscaler. This means that we can configure pollingInterval, cooldownPeriod, and fallback configuration to make the scaling even more powerful.
The following commands will deploy the scaledobject named ho11y-hpa, which will query the Amazon Managed Service for Prometheus query endpoint for a metric called ho11y_total.
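A minimal sketch of scaledobject.yaml, assuming the SigV4 proxy Service created earlier; the PromQL query, namespace, and polling settings are illustrative:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ho11y-hpa
  namespace: ho11y
spec:
  scaleTargetRef:
    name: downstream0              # deployment to scale in/out
  minReplicaCount: 1
  maxReplicaCount: 10
  pollingInterval: 30
  cooldownPeriod: 300
  triggers:
    - type: prometheus
      metadata:
        # Route queries through the SigV4 proxy to the AMP workspace (assumption)
        serverAddress: http://sigv4-proxy-service.keda.svc.cluster.local:8080/workspaces/WORKSPACE_ID
        metricName: ho11y_total
        query: sum(rate(ho11y_total[1m]))
        threshold: "1"
```

Deploy it with:

```bash
kubectl apply -f scaledobject.yaml
```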
The ho11y_total metric represents the number of invocations of the application, and the threshold is specified as 1. Depending on the value over a period of one minute, KEDA can scale the downstream0 deployment in or out between 1 and 10 pods.
Once we deploy the scaledobject, KEDA will also create an HPA object in the ho11y namespace with the configuration specified in scaledobject.yaml:
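For example, you can list the HPA that KEDA created (KEDA typically names it keda-hpa-<scaledobject-name>):

```bash
kubectl get hpa -n ho11y
```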
Let’s put some load on the application by running the following command:
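An illustrative load generator is shown below; the frontend service name and port 8765 are assumptions about the ho11y deployment, so adjust them to match your manifests.

```bash
# Repeatedly call the frontend service from a temporary pod inside the cluster.
kubectl run -n ho11y load-gen --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O /dev/null http://frontend:8765/; sleep 0.1; done"
```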
Describing the HPA should show something such as the following:
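For example (the HPA name keda-hpa-ho11y-hpa follows KEDA's naming convention and is an assumption):

```bash
kubectl describe hpa keda-hpa-ho11y-hpa -n ho11y
```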
This confirms that KEDA can successfully scale the application using the metrics ingested into Amazon Managed Service for Prometheus.
Clean-up
Use the following commands to delete resources created during this post:
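A sketch of the cleanup, assuming the resource names used throughout this post:

```bash
kubectl delete -f scaledobject.yaml
kubectl delete namespace ho11y
helm uninstall keda --namespace keda
aws amp delete-workspace --workspace-id WORKSPACE_ID
eksctl delete cluster --name AMP-KEDA-CLUSTER
```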
Conclusion
This post demonstrated how we can utilize the KEDA operator to autoscale deployments based on custom metrics stored in Amazon Managed Service for Prometheus. This helps customers provide on-demand compute by provisioning pods only when they are needed. Amazon Managed Service for Prometheus lets you store the metrics reliably, and KEDA can monitor them and scale the workloads out and in efficiently as events occur.
Further Reading
- KEDA Concepts
- Getting Started with Amazon Managed Service for Prometheus
- AWS One Observability Demo Workshop: What’s new with Prometheus, Grafana, and OpenTelemetry
- Using Prometheus Adapter to autoscale applications running on Amazon EKS