Enhance Operational Insight by Converting the Output of Any AWS SDK Command to Prometheus Metrics
Have you ever wished you had the output of an AWS Command to enrich your dashboards or alerts? The AWS control plane contains a rich set of information that can be operationally insightful!
Recently I encountered a customer running multiple Amazon Elastic Kubernetes Service (Amazon EKS) clusters in an IP-constrained environment. When a subnet ran out of IP addresses, pods could no longer be run and the cluster became unhealthy! We needed a way to get the number of remaining IP addresses for each subnet into Prometheus and Grafana.
The remaining IP addresses in a subnet are available through the AWS DescribeSubnets command. The challenge is to execute this command regularly, extract the information that we need, and make it available for Prometheus to ‘scrape’.
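To make the ‘extract and expose’ step concrete, here is a minimal Python sketch that turns a DescribeSubnets-shaped response into the Prometheus text exposition format. It is not the tool's implementation. The response fields (SubnetId, AvailabilityZone, AvailableIpAddressCount) are the ones the API returns, while the function name, label names, and sample values are assumptions made for illustration.

```python
from typing import Any


def subnets_to_prometheus(response: dict[str, Any],
                          metric: str = "awssdk_remaining_ips") -> str:
    """Render a DescribeSubnets-style response as Prometheus text lines."""
    lines = [
        f"# HELP {metric} Available IP addresses remaining in the subnet",
        f"# TYPE {metric} gauge",
    ]
    for subnet in response.get("Subnets", []):
        # Label names are illustrative; the real tool may use different ones.
        labels = (
            f'subnet_id="{subnet["SubnetId"]}",'
            f'availability_zone="{subnet["AvailabilityZone"]}"'
        )
        lines.append(f'{metric}{{{labels}}} {subnet["AvailableIpAddressCount"]}')
    return "\n".join(lines) + "\n"


# Hand-written sample data shaped like a DescribeSubnets response.
sample_response = {
    "Subnets": [
        {"SubnetId": "subnet-0aaa", "AvailabilityZone": "us-east-1a",
         "AvailableIpAddressCount": 12},
        {"SubnetId": "subnet-0bbb", "AvailabilityZone": "us-east-1b",
         "AvailableIpAddressCount": 240},
    ]
}

print(subnets_to_prometheus(sample_response))
```

Each subnet becomes one gauge sample, so a dashboard or alert can filter on the subnet or Availability Zone label.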
I created the aws-sdk-prom-metrics tool with an easy-to-understand configuration file to solve this. For this example, we’ll solve for DescribeSubnets, but the tool is flexible enough to be used for any AWS SDK Command!
Prerequisites
- Create an AWS Cloud9 Environment following all of the steps in the Amazon EKS Workshop section ‘Start the Workshop…’.
- Create an Amazon EKS Cluster following all of the steps in the Amazon EKS Workshop section ‘Launch using eksctl’.
- Configure/install Amazon Managed Service for Prometheus (AMP) on the cluster above by following the quickstart guide.
- Configure Amazon Managed Grafana with the AMP environment above as a data source by following the quickstart guide.
Clone the repository and install libraries
We’ll deploy a sample configuration that’s included with the project. This will execute the DescribeSubnets command every minute, and it will publish the results as Prometheus metrics.
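For readers curious about what ‘run a command on a schedule and publish the results’ involves, the following is a rough Python sketch of the pattern using only the standard library. The tool itself is a Node.js project, so this is an illustration of the idea rather than its code. The names MetricsStore, start_exporter, and fake_fetch are hypothetical, and fake_fetch stands in for a real AWS SDK call.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


class MetricsStore:
    """Thread-safe holder for the most recently rendered metrics text."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._text = ""

    def set(self, text: str) -> None:
        with self._lock:
            self._text = text

    def get(self) -> str:
        with self._lock:
            return self._text


def make_handler(store: MetricsStore):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = store.get().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type",
                             "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args) -> None:
            pass  # keep the sketch quiet

    return Handler


def start_exporter(fetch: Callable[[], str], interval_seconds: float = 60.0,
                   port: int = 0):
    """Serve /metrics and refresh the stored text every interval_seconds."""
    store = MetricsStore()
    store.set(fetch())  # populate once so the first scrape is not empty
    stop = threading.Event()

    def poll() -> None:
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval_seconds):
            store.set(fetch())

    server = HTTPServer(("127.0.0.1", port), make_handler(store))
    threading.Thread(target=poll, daemon=True).start()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, stop


def fake_fetch() -> str:
    # Placeholder for "call the AWS API and render the result as metrics".
    return 'awssdk_remaining_ips{subnet_id="subnet-0aaa"} 12\n'
```

Calling start_exporter(fake_fetch) returns the server and a stop event. Prometheus then only ever reads the cached text, so a scrape never has to wait for an AWS API call.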
Clone the repository from GitHub and install the required NPM libraries for it to function:
Build and push the image to Amazon Elastic Container Registry (Amazon ECR)
These commands will create our Docker image, tag it correctly, then push it to Amazon ECR.
Update the configuration file to include deployment details
We’ll use the AWS Cloud9 Editor to modify the configuration file that we’ve been using to include deployment configuration details.
In AWS Cloud9, select the folder icon, and navigate to find the subnet-remaining-ips.yaml file. Double-click the filename to edit the contents.
Remove the comments from the deploymentConfig section to match the following image:
To find the correct value for oidcProvider, run the command:
This will output something similar to the following:
Use the identifier at the end and the ‘us-east-1’ region code to update the oidcProvider with a correct value.
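If it helps to see how the pieces fit together, here is a small Python sketch that assembles a provider string from a region and identifier, or derives it from a full issuer URL. Amazon EKS issuer URLs have the form https://oidc.eks.REGION.amazonaws.com/id/IDENTIFIER. That oidcProvider expects this value without the https:// prefix is my assumption here, so compare the result against the sample configuration. The function names and the identifier below are placeholders.

```python
from urllib.parse import urlparse


def build_oidc_provider(region: str, identifier: str) -> str:
    """Combine a region code and cluster OIDC identifier into a provider string."""
    return f"oidc.eks.{region}.amazonaws.com/id/{identifier}"


def oidc_provider_from_issuer(issuer_url: str) -> str:
    """Strip the scheme from an EKS OIDC issuer URL."""
    parsed = urlparse(issuer_url)
    if parsed.scheme != "https" or not parsed.path.startswith("/id/"):
        raise ValueError(f"unexpected issuer URL: {issuer_url!r}")
    return f"{parsed.netloc}{parsed.path}"


# Placeholder identifier, not a real cluster.
example_id = "EXAMPLED539D4633E53DE1B71EXAMPLE"
print(build_oidc_provider("us-east-1", example_id))
print(oidc_provider_from_issuer(
    f"https://oidc.eks.us-east-1.amazonaws.com/id/{example_id}"))
```

Both calls print the same string, which is a quick way to check a hand-edited value.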
To find the value for imageUri, we can use our repository that we saved earlier. Use the value from the following output for imageUri:
Finally, we must place our AWS Account ID in the configuration file. Note that the imageUri value also starts with your AWS Account ID. Make sure that you surround the account ID in quotes so that YAML interprets it as a string rather than a number!
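Why do the quotes matter? An AWS account ID is a 12-digit string that can begin with zero, and a parser that treats an unquoted value as a number will not preserve it exactly. The Python sketch below imitates that effect with int(). It is an analogy and not the YAML parser the tool uses, and the account ID and helper name are made up for illustration.

```python
# Hypothetical account ID used only for illustration.
account_id_text = "012345678901"

# Roughly what happens when a parser decides the value is an integer.
as_number = int(account_id_text)


def looks_like_account_id(value: object) -> bool:
    """Return True only for a 12-character string of digits."""
    return isinstance(value, str) and len(value) == 12 and value.isdigit()


print(looks_like_account_id(account_id_text))  # True
print(looks_like_account_id(as_number))        # False
print(str(as_number))                          # 12345678901 (leading zero lost)
```

Quoting the value in the configuration file keeps it a string, so every digit survives.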
When you’re finished, your configuration file will look something like this:
Deploy an IAM Role for Service Accounts (IRSA)
We use the Kubernetes SDK to create a deployment in the aws-sdk-prom-metrics namespace.
It may take a minute or two before everything is deployed. Let’s confirm using the kubectl command.
You should see output similar to this showing that everything is ready/running:
Visualize the data
Prometheus will automatically collect data from pods that contain an annotation of prometheus.io/scrape: true. Our deployed Pods contain this annotation, so no action is required to have them ‘scraped’ by Prometheus. We should have data flowing right away!
Sign in to Amazon Managed Grafana, and use the metric explorer. You should see a metric named awssdk_remaining_ips.
Use the Grafana Import Dashboard utility to import a dashboard. Copy and paste the contents from ~/environment/sample-dashboards/subnet-remaining-ips-dashboard.json into Grafana.
You can now see the remaining IP addresses for each subnet, by Subnet Identifier and filtered by Availability Zone!
Building your own configuration file and more information
I hope that you find this tool as useful as I have for gaining deeper insights into the operation of your AWS environments!
Note that the tool isn’t limited to DescribeSubnets, and has the capability to run any AWS SDK Command. Find more samples in the ‘config’ folder of the GitHub repository.
For more details, and a walkthrough of how to create your own configuration files, check out the README in the GitHub repository.