Automated scaling is an approach to scaling workloads up or down automatically based on resource usage. In Kubernetes, the Horizontal Pod Autoscaler (HPA) can scale pods based on observed CPU utilization and memory usage. In more complex scenarios, other metrics are taken into account before making a scaling decision. For example, most web and mobile backends require automated scaling based on requests per second in order to handle traffic bursts. For ETL apps, automated scaling could be triggered by the job queue length exceeding a particular threshold, and so on. Instrumenting your applications with Prometheus and exposing the right metrics for autoscaling lets you fine-tune your apps to handle bursts better and ensure high availability.
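For example, a basic CPU-based HPA can be created with a single kubectl command; the deployment name below is only a placeholder:
# Illustrative only: scale a hypothetical "web" deployment between 2 and 10 replicas,
# targeting 70% average CPU utilization across its pods.
kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=10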
Prometheus is an open-source monitoring and alerting toolkit that collects and stores its metrics as time series data. In other words, each metric sample is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels. The Prometheus Adapter queries the custom metrics collected by Prometheus and exposes them through a Kubernetes API service, where they can be readily consumed by the Horizontal Pod Autoscaler to make scaling decisions.
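As a quick illustration (the label values here are placeholders), a counter such as the Envoy request counter used later in this post is stored as a labeled time series, and PromQL can convert it into a per-second rate:
# A stored sample: metric name, labels, and value (the timestamp is implicit)
envoy_cluster_upstream_rq_total{namespace="prod", app="jazz"}  1027
# A PromQL query converting the counter into a per-second request rate over one minute
rate(envoy_cluster_upstream_rq_total{namespace="prod", app="jazz"}[1m])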
Managing long-term Prometheus storage infrastructure is challenging. To remove this heavy lifting, AWS launched Amazon Managed Service for Prometheus, a Prometheus-compatible monitoring service for container infrastructure and application metrics that makes it easy to securely monitor container environments at scale. Amazon Managed Service for Prometheus automatically scales the ingestion, storage, alerting, and querying of operational metrics as workloads scale up and down.
This post will show how to utilize the Prometheus Adapter to autoscale Amazon EKS pods running an AWS App Mesh workload. AWS App Mesh is a service mesh that makes it easy to monitor and control services. A service mesh is an infrastructure layer dedicated to handling service-to-service communication, usually through an array of lightweight network proxies deployed alongside the application code. We will register the custom metric via a Kubernetes API service that the HPA will eventually use to make scaling decisions.
Prerequisites
To complete the steps in this blog post, you will need an AWS account and the AWS CLI, eksctl, kubectl, Helm, jq, and openssl installed and configured.
Step 1: Create an Amazon EKS cluster
Figure 1: Architecture diagram
We will create a custom metric from the envoy_cluster_upstream_rq counter exposed by Envoy. The same approach can be extended to any custom metric that an application emits.
First, create an Amazon EKS cluster enabled with AWS App Mesh for running the sample application. The eksctl CLI tool will deploy the cluster using the eks-cluster-config.yaml file:
export AMP_EKS_CLUSTER=AMP-EKS-CLUSTER
export AMP_ACCOUNT_ID=<Your Account id>
export AWS_REGION=<Your Region>
cat << EOF > eks-cluster-config.yaml
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: $AMP_EKS_CLUSTER
  region: $AWS_REGION
  version: '1.18'
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: appmesh-controller
      namespace: appmesh-system
      labels: {aws-usage: "application"}
    attachPolicyARNs:
    - "arn:aws:iam::aws:policy/AWSAppMeshFullAccess"
managedNodeGroups:
- name: default-ng
  minSize: 1
  maxSize: 3
  desiredCapacity: 2
  labels: {role: mngworker}
  iam:
    withAddonPolicies:
      certManager: true
      cloudWatch: true
      appMesh: true
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]
EOF
Execute the following command to create the EKS cluster:
eksctl create cluster -f eks-cluster-config.yaml
This creates an Amazon EKS cluster named AMP-EKS-CLUSTER and a service account named appmesh-controller that the AWS App Mesh controller for EKS will use.
Next, use the following commands to install the AWS App Mesh controller and get its Custom Resource Definitions (CRDs) in place. Add the eks-charts Helm repository, then install the appmesh-controller chart into the appmesh-system namespace using the service account created earlier:
helm repo add eks https://aws.github.io/eks-charts
helm upgrade -i appmesh-controller eks/appmesh-controller \
--namespace appmesh-system \
--set region=${AWS_REGION} \
--set serviceAccount.create=false \
--set serviceAccount.name=appmesh-controller
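Optionally, confirm that the controller is running and that the App Mesh CRDs are in place before moving on:
kubectl -n appmesh-system get deployment appmesh-controller
kubectl get crds | grep appmesh.k8s.aws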
Step 2: Deploy sample application and enable AWS App Mesh
To install an application and inject an envoy container, use the AWS App Mesh controller for Kubernetes that you created earlier. AWS App Mesh Controller for K8s manages App Mesh resources in your Kubernetes clusters. The controller is accompanied by CRDs that allow you to define AWS App Mesh components, such as meshes and virtual nodes, via the Kubernetes API just as you define native Kubernetes objects, such as deployments and services. These custom resources map to AWS App Mesh API objects that the controller manages for you. The controller watches these custom resources for changes and reflects them into the AWS App Mesh API.
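For illustration, a virtual node for the jazz-v1 service could be declared roughly as sketched below; the actual resources applied in the next step come from the sample repository, so treat this only as a shape reference rather than something to apply:
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: jazz-v1
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: jazz
  listeners:
    - portMapping:
        port: 9080
        protocol: http
  serviceDiscovery:
    dns:
      hostname: jazz-v1.prod.svc.cluster.local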
## Install the base application
git clone https://github.com/aws/aws-app-mesh-examples.git
kubectl apply -f aws-app-mesh-examples/examples/apps/djapp/1_base_application
kubectl get all -n prod ## check the pod status and make sure it is running
## Now install the App Mesh controller and meshify the deployment
kubectl apply -f aws-app-mesh-examples/examples/apps/djapp/2_meshed_application/
kubectl rollout restart deployment -n prod dj jazz-v1 metal-v1
kubectl get all -n prod ## Now we should see two containers running in each pod
Step 3: Create an Amazon Managed Service for Prometheus workspace
The Amazon Managed Service for Prometheus workspace ingests the Prometheus metrics collected from envoy. A workspace is a logical and isolated Prometheus server dedicated to Prometheus resources such as metrics. A workspace supports fine-grained access control for authorizing its management, such as update, list, describe, and delete, as well as ingesting and querying metrics.
aws amp create-workspace --alias AMP-APPMESH --region $AWS_REGION
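You can confirm that the workspace was created, and retrieve its workspace ID, with:
aws amp list-workspaces --alias AMP-APPMESH --region $AWS_REGION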
Next, optionally create an interface VPC endpoint to securely access the managed service from resources deployed in your VPC. This ensures that data ingested by the managed service does not leave your AWS account's VPC. (An Amazon Managed Service for Prometheus public endpoint is also available.) Use the AWS CLI as shown here, replacing the placeholder strings, such as VPC_ID, SECURITY_GROUP_IDS, and SUBNET_IDS, with your values.
export VPC_ID=<Your EKS Cluster VPC Id>
aws ec2 create-vpc-endpoint \
--vpc-id $VPC_ID \
--service-name com.amazonaws.$AWS_REGION.aps-workspaces \
--security-group-ids <SECURITY_GROUP_IDS> \
--vpc-endpoint-type Interface \
--subnet-ids <SUBNET_IDS>
Step 4: Scrape the metrics using AWS Distro for OpenTelemetry
Amazon Managed Service for Prometheus does not directly scrape operational metrics from containerized workloads in a Kubernetes cluster. You must deploy and manage a Prometheus server or an OpenTelemetry agent, such as the AWS Distro for OpenTelemetry Collector or the Grafana Agent, to perform this task. This post walks you through configuring the AWS Distro for OpenTelemetry (ADOT) Collector to scrape the Envoy metrics. The ADOT-AMP pipeline lets us use the ADOT Collector to scrape a Prometheus-instrumented application, and then send the scraped metrics to Amazon Managed Service for Prometheus.
This post will also walk you through the steps to configure an IAM role to send Prometheus metrics to Amazon Managed Service for Prometheus. We install the ADOT collector on the Amazon EKS cluster and forward metrics to Amazon Managed Service for Prometheus.
Configure permissions
We will deploy the ADOT Collector to run under the identity of a Kubernetes service account named amp-iamproxy-service-account. With IAM roles for service accounts (IRSA), you can associate an IAM role carrying the AmazonPrometheusRemoteWriteAccess policy with a Kubernetes service account, thereby granting any pod that uses the service account the IAM permissions needed to ingest metrics into Amazon Managed Service for Prometheus.
You need kubectl and eksctl CLI tools in order to run the script. They must be configured to access your Amazon EKS cluster.
kubectl create namespace prometheus
eksctl create iamserviceaccount --name amp-iamproxy-service-account --namespace prometheus --cluster $AMP_EKS_CLUSTER --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess --approve
export WORKSPACE=$(aws amp list-workspaces | jq -r '.workspaces[] | select(.alias=="AMP-APPMESH").workspaceId')
export REGION=$AWS_REGION
export REMOTE_WRITE_URL="https://aps-workspaces.$REGION.amazonaws.com/workspaces/$WORKSPACE/api/v1/remote_write"
Now create a manifest file, amp-eks-adot-prometheus-daemonset.yaml, with the scrape configuration in order to extract envoy metrics and deploy the ADOT collector. This example deploys a DaemonSet named adot-collector. The adot-collector DaemonSet collects metrics from pods on the cluster.
cat > amp-eks-adot-prometheus-daemonset.yaml <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: adot-collector-conf
  namespace: prometheus
  labels:
    app: aws-adot
    component: adot-collector-conf
data:
  adot-collector-config: |
    receivers:
      prometheus:
        config:
          global:
            scrape_interval: 15s
            scrape_timeout: 10s
          scrape_configs:
          - job_name: 'appmesh-envoy'
            metrics_path: /stats/prometheus
            kubernetes_sd_configs:
            - role: pod
            relabel_configs:
            - source_labels: [__meta_kubernetes_pod_container_name]
              action: keep
              regex: '^envoy$'
            - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
              action: replace
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: \${1}:9901
              target_label: __address__
            - action: labelmap
              regex: __meta_kubernetes_pod_label_(.+)
            - source_labels: [__meta_kubernetes_namespace]
              action: replace
              target_label: namespace
            - source_labels: ['app']
              action: replace
              target_label: service
            - source_labels: [__meta_kubernetes_pod_name]
              action: replace
              target_label: kubernetes_pod_name
          - job_name: 'kubernetes-service-endpoints'
            kubernetes_sd_configs:
            - role: endpoints
            tls_config:
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
              insecure_skip_verify: true
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            relabel_configs:
            - source_labels: [__meta_kubernetes_service_annotation_scrape]
              action: keep
              regex: true
    exporters:
      awsprometheusremotewrite:
        # replace this with your endpoint
        endpoint: "$REMOTE_WRITE_URL"
        # replace this with your region
        aws_auth:
          region: "$REGION"
          service: "aps"
      logging:
        loglevel: info
    extensions:
      health_check:
      pprof:
        endpoint: :1888
      zpages:
        endpoint: :55679
    service:
      extensions: [pprof, zpages, health_check]
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [logging, awsprometheusremotewrite]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: adotcol-admin-role
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: adotcol-admin-role-binding
subjects:
- kind: ServiceAccount
  name: amp-iamproxy-service-account
  namespace: prometheus
roleRef:
  kind: ClusterRole
  name: adotcol-admin-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Service
metadata:
  name: adot-collector
  namespace: prometheus
  labels:
    app: aws-adot
    component: adot-collector
spec:
  ports:
  - name: metrics # Default endpoint for querying metrics.
    port: 8888
  selector:
    component: adot-collector
  type: NodePort
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: adot-collector
  namespace: prometheus
  labels:
    app: aws-adot
    component: adot-collector
spec:
  selector:
    matchLabels:
      app: aws-adot
      component: adot-collector
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: aws-adot
        component: adot-collector
    spec:
      serviceAccountName: amp-iamproxy-service-account
      containers:
      - command:
        - "/awscollector"
        - "--config=/conf/adot-collector-config.yaml"
        image: public.ecr.aws/aws-observability/aws-otel-collector:latest
        name: adot-collector
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 200m
            memory: 400Mi
        ports:
        - containerPort: 8888 # Default endpoint for querying metrics.
        volumeMounts:
        - name: adot-collector-config-vol
          mountPath: /conf
        livenessProbe:
          httpGet:
            path: /
            port: 13133 # Health Check extension default port.
        readinessProbe:
          httpGet:
            path: /
            port: 13133 # Health Check extension default port.
      volumes:
      - configMap:
          name: adot-collector-conf
          items:
          - key: adot-collector-config
            path: adot-collector-config.yaml
        name: adot-collector-config-vol
---
EOF
kubectl apply -f amp-eks-adot-prometheus-daemonset.yaml
After the ADOT collector is deployed, it will collect the metrics and ingest them into the specified Amazon Managed Service for Prometheus workspace. The scrape configuration is similar to that of a Prometheus server; the appmesh-envoy job above contains the configuration needed to scrape the Envoy metrics.
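To verify the pipeline, check that a collector pod is running on each node; if you have the awscurl utility installed, you can also query the workspace directly for one of the Envoy counters (both checks are optional):
kubectl get pods -n prometheus
awscurl --service aps --region $AWS_REGION \
  "https://aps-workspaces.$AWS_REGION.amazonaws.com/workspaces/$WORKSPACE/api/v1/query?query=envoy_cluster_upstream_rq_total"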
Step 5: Deploy the Prometheus Adapter to register the custom metric
We will create a service account named monitoring that will be used to run the Prometheus Adapter, and assign it the AmazonPrometheusQueryAccess policy using IRSA.
kubectl create namespace monitoring
eksctl create iamserviceaccount --name monitoring --namespace monitoring --cluster AMP-EKS-CLUSTER --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess --approve --override-existing-serviceaccounts
cat > pma-cm.yaml << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'envoy_cluster_upstream_rq{namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "envoy_cluster_upstream_rq"
        as: "appmesh_requests_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
EOF
kubectl apply -f pma-cm.yaml
Next, generate a self-signed certificate and store it in a Kubernetes secret; the custom metrics API server deployed below uses it to serve TLS:
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 365 -nodes -out serving.crt -keyout serving.key -subj "/C=CN/CN=custom-metrics-apiserver.monitoring.svc.cluster.local"
kubectl create secret generic -n monitoring cm-adapter-serving-certs --from-file=serving.key=./serving.key --from-file=serving.crt=./serving.crt
The Envoy sidecar used by AWS App Mesh exposes counters of upstream requests, such as envoy_cluster_upstream_rq_total. The adapter configuration above converts the request counter into a requests-per-second rate and exposes it as appmesh_requests_per_second. The adapter connects to the Amazon Managed Service for Prometheus query endpoint through a SigV4 proxy sidecar.
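For example, when the HPA later asks for appmesh_requests_per_second for the pods of the jazz-v1 deployment, the adapter expands the metricsQuery template into a PromQL query roughly like the following (the pod-name matcher here is only illustrative; the adapter lists the exact pod names):
sum(rate(envoy_cluster_upstream_rq{namespace="prod",kubernetes_pod_name=~"jazz-v1-.*"}[1m])) by (kubernetes_pod_name)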
We will now deploy the Prometheus adapter to create the custom metric:
cat > prometheus-adapter.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-reader
subjects:
- kind: ServiceAccount
  name: monitoring
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: monitoring
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: monitoring
  namespace: monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: custom-metrics-apiserver
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
      name: custom-metrics-apiserver
    spec:
      serviceAccountName: monitoring
      containers:
      - name: custom-metrics-apiserver
        image: directxman12/k8s-prometheus-adapter-amd64
        args:
        - /adapter
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://localhost:8080/workspaces/$WORKSPACE
        - --metrics-relist-interval=30s
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: true
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
      - name: aws-iamproxy
        image: public.ecr.aws/aws-observability/aws-sigv4-proxy:1.0
        args:
        - --name
        - aps
        - --region
        - $AWS_REGION
        - --host
        - aps-workspaces.$AWS_REGION.amazonaws.com
        ports:
        - containerPort: 8080
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: cm-adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config
---
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  ports:
  - port: 443
    targetPort: 6443
  selector:
    app: custom-metrics-apiserver
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
EOF
kubectl apply -f prometheus-adapter.yaml
The manifest also registers an APIService so that the Prometheus Adapter is reachable through the Kubernetes API aggregation layer, which is how the Horizontal Pod Autoscaler fetches the metric. We can query the custom metrics API to confirm that the metric has been registered:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/appmesh_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/appmesh_requests_per_second",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
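You can also fetch the current per-pod values the same way that the HPA will, for example for all pods in the prod namespace:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/prod/pods/*/appmesh_requests_per_second" | jq .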
Now you can use the appmesh_requests_per_second metric in the HPA definition with the following HPA resource:
cat > hpa.yaml <<EOF
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: envoy-hpa
  namespace: prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: jazz-v1
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: appmesh_requests_per_second
      targetAverageValue: 10m
EOF
kubectl apply -f hpa.yaml
The HPA will now scale out the pods when the average appmesh_requests_per_second per pod exceeds the target of 10m (that is, 0.01 requests per second, a low target that makes the scaling action easy to trigger).
Let us add some load to experience the autoscaling actions:
dj_pod=`kubectl get pod -n prod --no-headers -l app=dj -o jsonpath='{.items[*].metadata.name}'`
loop_counter=0
while [ $loop_counter -le 300 ] ; do kubectl exec -n prod -it $dj_pod -c dj -- curl jazz.prod.svc.cluster.local:9080 ; echo ; loop_counter=$[$loop_counter+1] ; done
Describing the HPA will show the scaling actions resulting from the load we introduced.
kubectl describe hpa -n prod
Name: envoy-hpa
Namespace: prod
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 06 Sep 2021 04:19:37 +0000
Reference: Deployment/jazz-v1
Metrics: ( current / target )
"appmesh_requests_per_second" on pods: 622m / 10m
Min replicas: 1
Max replicas: 10
Deployment pods: 8 current / 10 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 10
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric appmesh_requests_per_second
ScalingLimited True TooManyReplicas the desired replica count is more than the maximum replica count
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetPodsMetric 58m (x44 over 69m) horizontal-pod-autoscaler unable to get metric appmesh_requests_per_second: unable to fetch metrics from custom metrics API: the server could not find the metric appmesh_requests_per_second for pods
Normal SuccessfulRescale 41s horizontal-pod-autoscaler New size: 4; reason: pods metric appmesh_requests_per_second above target
Normal SuccessfulRescale 26s horizontal-pod-autoscaler New size: 8; reason: pods metric appmesh_requests_per_second above target
Normal SuccessfulRescale 11s horizontal-pod-autoscaler New size: 10; reason: pods metric appmesh_requests_per_second above target
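You can also watch the replica count converge in real time while the load loop runs:
kubectl get hpa envoy-hpa -n prod -w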
Clean-up
Use the following commands to delete resources created during this post:
aws amp delete-workspace --workspace-id $WORKSPACE
eksctl delete cluster $AMP_EKS_CLUSTER
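If you created the optional interface VPC endpoint, delete it as well; look up its ID first and substitute it for the placeholder below:
aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=$VPC_ID --query 'VpcEndpoints[*].VpcEndpointId'
aws ec2 delete-vpc-endpoints --vpc-endpoint-ids <YOUR_ENDPOINT_ID>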
Conclusion
This blog demonstrated how to utilize the Prometheus Adapter to autoscale deployments based on custom metrics. For the sake of simplicity, we fetched only one metric from Amazon Managed Service for Prometheus. However, the adapter ConfigMap can be extended to fetch some or all of the available metrics and utilize them for autoscaling.