AWS Open Source Blog
Amazon EKS Control Plane Metrics with Prometheus
Kubernetes core components provide a rich set of metrics you can use to observe what is happening in the Control Plane. You can see how many watchers are on each resource in the API Server, the number of audit trail events, the latency of requests to the API Server, and much more. These metrics come from the Kubernetes API Server, the Kubelet, the Cloud Controller Manager, and the Scheduler. Each of these components exposes a metrics endpoint over HTTP at /metrics with a text/plain content type. This post will walk you through how to get the API Server metrics from an Amazon Elastic Container Service for Kubernetes (EKS) cluster.
Prerequisites
You’ll first need to set up an Amazon EKS cluster. For this demo, we’ll use eksctl with its cluster config file mechanism. Before you begin, make sure you have the tools used throughout this post installed: eksctl, kubectl, and helm.
With all the necessary tools installed, you can get started launching your EKS cluster. In this example, we’re deploying the cluster in us-east-2, AWS’ Ohio region; you can replace the AWS_REGION with any region that supports Amazon EKS.
Deploy Cluster
export AWS_REGION=us-east-2
Once you’ve exported the region, you can create the ClusterConfig as follows:
cat >cluster.yaml <<EOF
apiVersion: eksctl.io/v1alpha4
kind: ClusterConfig
metadata:
  name: control-plane-metrics
  region: us-east-2
nodeGroups:
  - name: ng-1
    desiredCapacity: 2
EOF
After the file has been created, create the cluster using the eksctl create cluster command:
eksctl create cluster -f cluster.yaml
This will take roughly 10–15 minutes to complete; once it finishes, you’ll have an Amazon EKS cluster ready to go.
Raw metrics
Before you visualize, monitor, and alert on your metrics, it helps to first look at the raw output of these metrics endpoints:
kubectl get --raw /metrics
These metrics are output in a Prometheus format. Prometheus is a Cloud Native Computing Foundation (CNCF) graduated project. It can scan and scrape metrics endpoints within your cluster, and will even scan its own endpoint. The syntax for a Prometheus metric is:
metric_name {[ "tag" = "value" ]*} value
This allows you to set a metric_name, define tags on the metric which can be used for querying, and set a value. An example of this for apiserver_request_count would be:
apiserver_request_count{client="kube-apiserver/v1.11.8 (linux/amd64) kubernetes/7c34c0d",code="200",contentType="application/vnd.kubernetes.protobuf",resource="pods",scope="cluster",subresource="",verb="LIST"} 7
This tells us that there have been 7 LIST requests to the pods resource.
Next, we’ll set up Prometheus using helm.
Configuring Helm
Once the cluster is created, you can set up helm locally so that you don’t need to have tiller running within your cluster. Follow the steps in the post Using Helm with Amazon EKS.
After you have completed those steps, you can deploy Prometheus.
Deploy Prometheus
First, create a Kubernetes namespace and use helm to deploy the stable/prometheus chart:
kubectl create namespace prometheus
helm install stable/prometheus \
  --name prometheus \
  --namespace prometheus \
  --set alertmanager.persistentVolume.storageClass="gp2" \
  --set server.persistentVolume.storageClass="gp2" \
  --set server.service.type=LoadBalancer
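If you prefer not to pass everything on the command line, the same settings can live in a values file. This is a sketch; the keys mirror the --set flags above and assume the stable/prometheus chart's value layout:

```yaml
# values.yaml -- equivalent to the --set flags above
alertmanager:
  persistentVolume:
    storageClass: gp2
server:
  persistentVolume:
    storageClass: gp2
  service:
    type: LoadBalancer
```

You would then pass it with helm install stable/prometheus --name prometheus --namespace prometheus -f values.yaml.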
Once that is installed, you can get the Load Balancer’s address by listing services:
kubectl get svc -o wide --namespace prometheus
Navigate to this Load Balancer address in your browser, which will load the Prometheus UI. From here you can go to Status → Targets; this page will show you the Control Plane nodes:
If you can see your nodes, you can inspect some of the metrics. Navigate to Graph and, in the drop-down labeled – insert metric at cursor –, select any metric starting with apiserver_, then click Execute. This will load the most recently scraped data from the API Server.
Now that you can see your metrics in the Console view, you can switch over to the Graph and visualize this data:
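A useful starting query for the Graph view is the per-verb request rate. This is an illustrative example, not the only sensible query: the metric name comes from the raw output shown earlier, and the 5m window is an arbitrary choice.

```
sum(rate(apiserver_request_count[5m])) by (verb)
```

This plots how many requests per second the API Server is handling, broken out by HTTP verb.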
Teardown
If you deployed a cluster specifically to run this test and you’d like to tear it down, you can do so by first deleting the prometheus namespace, and then deleting the cluster:
kubectl delete namespace prometheus
eksctl delete cluster -f cluster.yaml
Using Prometheus, you can see what is happening within the Kubernetes API Server, and you can graph those metrics over time. You can also use Prometheus to set alerting rules which will populate the Alerts tab. With this helm chart, you can also deploy Alertmanager, which allows you to configure alerts based on whatever alerting rules you define. Try setting some rules on your own by modifying the prometheus-server configmap:
kubectl get configmap -n prometheus prometheus-server -o yaml
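As a sketch of what such a rule could look like, here is one in Prometheus 2.x rule-file syntax. The group name, alert name, threshold, and labels are illustrative examples, not values taken from the chart:

```yaml
# Illustrative alerting rule; the alert name and threshold are examples only.
groups:
  - name: apiserver.rules
    rules:
      - alert: HighAPIServerErrorRate
        # Fires if the API Server keeps returning 5xx responses for 10 minutes.
        expr: sum(rate(apiserver_request_count{code=~"5.."}[5m])) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: API Server is returning 5xx responses
```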
If you want to learn about using metrics in your own applications the same way you can in the Kubernetes API, check out the talk at KubeCon + CloudNativeCon North America 2018 – Monitor the World: Meaningful Metrics for Containerized Apps & Clusters by Nicholas Turner and Nic Cope.