AWS Open Source Blog

Monitor AWS services used by Kubernetes with Prometheus and PromCat

AWS offers Amazon CloudWatch to provide observability into the operational health of your AWS resources and applications through logs, metrics, and events. CloudWatch is a great way to monitor and visualize AWS resource metrics and logs. Recently I've found that some customers are adopting Prometheus as their monitoring standard because it offers the ability to gather metrics from any source via exporters, including detailed metrics specific to Kubernetes and other cloud-native tools. At the same time, those customers still want their CloudWatch metrics because AWS services are used under the hood by some of these tools, such as the AWS ALB Ingress Controller and the AWS CSI storage driver, and they want the ability to monitor everything in a single pane of glass.

In this post, I will walk through recent open source contributions to YACE (Yet Another CloudWatch Exporter) made by Sysdig. YACE is a Prometheus exporter designed to pull in CloudWatch metrics and enrich your existing Prometheus setup with AWS service metrics. Sysdig is making these enhancements available through PromCat, an open source resource catalog for Prometheus monitoring. I'll also look at examples of how to take advantage of these integrations, including:

  • How to monitor services such as AWS Elastic Load Balancer (ELB) using the provided exporter configurations
  • What types of metrics are available through CloudWatch for the supported AWS services
  • How to visualize the Prometheus and CloudWatch metrics using pre-built Grafana and Sysdig dashboards

Sysdig is an AWS Partner Network (APN) Advanced Technology Partner with an AWS competency in containers, and has a strong open source background, with engineers who have co-founded projects such as Wireshark and Falco.

Prerequisites

First you'll need to set up an Amazon Elastic Kubernetes Service (Amazon EKS) cluster. For this demo, we'll use eksctl to launch the cluster. Start by downloading these prerequisites: eksctl, kubectl, and the AWS CLI.
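If you don't already have these tools, the commands below are one way to install them on Linux; adjust for your platform, and check each project's documentation for the latest instructions.

# eksctl
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin

# kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl && sudo mv kubectl /usr/local/bin

# AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
unzip awscliv2.zip && sudo ./aws/install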

With the necessary tools installed, launch your Amazon EKS cluster. In this example, we're deploying the cluster in us-east-1, the AWS Northern Virginia Region; you can replace the region with any region that supports Amazon EKS.

Deploying the cluster

Create the cluster using the eksctl create cluster command. This will create the control plane and two worker nodes by default.

eksctl create cluster promcat --region=us-east-1

This will take roughly 10–15 minutes to complete, then you’ll have an Amazon EKS cluster ready to go.

To test that your installation finished correctly, list the services in the default namespace; you should see a service listed.

kubectl get svc
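The output should look roughly like the following (your cluster IP and age will differ):

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   10m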

Installing Helm

Once the cluster is created, you can set up Helm locally by following the steps in the post "Using Helm with Amazon EKS." After you have completed those steps, you can deploy Prometheus and Grafana.
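If you are running Helm 3, also add the stable chart repository that the Prometheus and Grafana installs below reference; at the time of writing the stable charts are served from charts.helm.sh (the older storage.googleapis.com URL has been retired):

helm repo add stable https://charts.helm.sh/stable
helm repo update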

Installing YACE exporters

YACE, or "Yet Another CloudWatch Exporter," is an open source project created by InVision to help standardize pulling AWS service metrics into Prometheus. Benefits of YACE compared to existing CloudWatch exporters include:

  • Auto discovery of resources via tags
  • Automatic adding of tag labels to metrics
  • CloudWatch metrics with timestamps (disabled by default)
  • Concurrency options
  • Batch requests to reduce API throttling
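To illustrate the tag-based auto-discovery, here is a rough sketch of the kind of discovery configuration YACE accepts; key names differ slightly between YACE versions, so treat this as illustrative and prefer the config bundled with the PromCat resource.

discovery:
  jobs:
    - type: elb                 # discover Classic Load Balancers
      regions:
        - us-east-1
      searchTags:               # only load balancers carrying this tag are scraped
        - key: Environment
          value: production
      metrics:
        - name: Latency         # CloudWatch metric to export
          statistics: [Average, p95]
          period: 300
          length: 600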

First, let's create an AWS Identity and Access Management (IAM) service user for YACE. Add this policy to the service user in the IAM console (or with the AWS CLI, as sketched after the policy) to allow it to gather ELB metrics from CloudWatch; the policy can also be copied from PromCat under the "AWS permissions" section of each exporter. Save the service user's credentials to a local file; I used ~/.aws/yace_user. We'll then install a YACE secret containing a Base64-encoded copy of those credentials.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CloudWatchExporterPolicy",
            "Effect": "Allow",
            "Action": [
                "tag:GetResources",
                "cloudwatch:ListTagsForResource",
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:GetMetricData",
                "cloudwatch:ListMetrics"
            ],
            "Resource": "*"
        }
    ]
}
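If you prefer the AWS CLI over the console, a sketch like the following could create the service user, attach the inline policy, and generate access keys. The user name yace-exporter is just an example; save the policy above as policy.json first.

aws iam create-user --user-name yace-exporter
aws iam put-user-policy --user-name yace-exporter \
    --policy-name CloudWatchExporterPolicy \
    --policy-document file://policy.json
aws iam create-access-key --user-name yace-exporter

The access key ID and secret from the last command go into ~/.aws/yace_user; since the secret mirrors ~/.aws/credentials, the standard credentials file format should work:

[default]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>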

Next, Base64-encode that file:

cat ~/.aws/yace_user | base64

Next, create a file called secret.yaml and paste the Base64 output from the previous command into the credentials field. Then create the monitoring namespace (if it doesn't already exist), apply the secret, and check that it exists.

apiVersion: v1
kind: Secret
metadata:
  name: yace-credentials
  namespace: monitoring
data:
  # Add in credentials the result of:
  # cat ~/.aws/credentials | base64
  credentials: |
    <REPLACE_WITH_YOUR_HASH>

kubectl create namespace monitoring
kubectl apply -f secret.yaml --namespace monitoring
kubectl get secret yace-credentials --namespace monitoring

Now we will create a YACE exporter deployment for each service we want to monitor. For this demo, we'll monitor an AWS Classic Load Balancer by applying a file from the sample GitHub repo. The deployment mounts the yace-credentials secret we created above as a volume to grant the exporter access to CloudWatch.

kubectl apply -f https://raw.githubusercontent.com/jonahjon/promcat/master/yace/yace_elb.yaml --namespace monitoring
kubectl get deployment yace-elb --namespace monitoring

This file is currently set to pull metrics from the us-east-1 region. You can change the region parameter in yace_elb.yaml.

To verify the Prometheus metrics we are pulling, let’s try out the metrics endpoint on the deployment. We will port forward the deployment locally and navigate to http://localhost:5000/metrics:

kubectl port-forward -n monitoring deploy/yace-elb 5000:5000
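With the port-forward running, you can also hit the endpoint from another terminal; if everything is wired up you should see aws_elb_* series (the exact names depend on the metrics the exporter is configured to pull):

curl -s http://localhost:5000/metrics | grep aws_elb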

Awesome, we can see our metrics are being pulled by the exporter and exposed for Prometheus.

Installing Prometheus

Next we are going to deploy Prometheus into the monitoring namespace we created earlier, so that it runs alongside the YACE exporter and matches the data source URL we will configure in Grafana.

helm install prometheus stable/prometheus \
             --namespace monitoring \
             --set alertmanager.persistentVolume.storageClass="gp2",server.persistentVolume.storageClass="gp2",server.service.type=LoadBalancer

Let’s grab the load balancer URL and navigate to Prometheus to check that the auto-discovery mechanism is pulling in the YACE exporter ELB metrics by querying for the “aws_elb_latency_p95” metric.
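prometheus-server is the service name the stable/prometheus chart creates for a release named prometheus, so a command along these lines should print the load balancer hostname:

kubectl get svc --namespace monitoring prometheus-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'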

It looks good. Next let’s set up a Grafana dashboard with the help of PromCat to monitor the ELB metrics.

Installing Grafana

Let's install Grafana with the PromCat AWS ELB Golden Signals dashboard pre-loaded, which lets us visualize the Prometheus metrics pulled in the previous steps. In the install, we point a dashboard provider at the pre-built PromCat dashboard URL and set the service type to LoadBalancer as well.

helm upgrade -i grafana stable/grafana \
    --namespace monitoring \
    --set persistence.storageClassName="gp2" \
    --set adminPassword='P@ssword123!' \
    --set datasources."datasources\.yaml".apiVersion=1 \
    --set datasources."datasources\.yaml".datasources[0].name=Prometheus \
    --set datasources."datasources\.yaml".datasources[0].type=prometheus \
    --set datasources."datasources\.yaml".datasources[0].url=http://prometheus-server.monitoring.svc.cluster.local \
    --set datasources."datasources\.yaml".datasources[0].access=proxy \
    --set datasources."datasources\.yaml".datasources[0].isDefault=true \
    --set dashboardProviders."dashboardproviders\.yaml".apiVersion=1 \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].name=default \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].orgId=1 \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].folder="" \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].type=file \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].disableDeletion=false \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].editable=true \
    --set dashboardProviders."dashboardproviders\.yaml".providers[0].options.path=/var/lib/grafana/dashboards/default \
    --set dashboards.default.elb.url="https://raw.githubusercontent.com/sysdiglabs/promcat-resources/master/resources/aws-elb/dashboards.yaml" \
    --set service.type=LoadBalancer

Now let's grab the load balancer URL, navigate to it in the browser, and log in using the credentials we set during the Helm install:

kubectl get svc --namespace monitoring grafana -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

Username: admin
Password: P@ssword123!

After logging in, navigate to Dashboards > AWS ELB Golden Signals to see the metrics from Prometheus about the ELBs in the current AWS region.

Cleanup

To avoid incurring future charges to your AWS accounts, delete the resources created in your AWS account for this project:

helm delete grafana --namespace monitoring
helm delete prometheus --namespace monitoring
eksctl delete cluster promcat --region=us-east-1

Summary

In this post, we walked through an overview of how the Sysdig PromCat integration with Amazon EKS works so you can:

  • Use the YACE Exporter in Amazon EKS to gather metrics from AWS services.
  • View AWS resource metrics from Prometheus launched in Amazon EKS.
  • Use PromCat-provided dashboards to visualize AWS service metrics scraped by YACE.

Next steps

Check out PromCat for an enterprise-grade Prometheus catalog, or the YACE GitHub repo to see the configuration options it provides for 24 AWS services. For more information on why and how Sysdig contributed to YACE and built PromCat, check out their engineering blog post.

To contribute to PromCat, visit the PromCat Resources repository on GitHub. Learn more about the AWS containers roadmap on GitHub.

Feature image via Pixabay.

Jonah Jones

Jonah is a Solutions Architect at AWS working within the APN Program and with container partners to help customers adopt container technologies. Before this, Jonah worked in AWS ProServe as a DevOps Consultant and at several startups in the Northeast. You can find him on Twitter at @jjonahjon.