AWS Cloud Operations Blog
Using Curated Packages and AWS managed Open Source services to observe your On Premises Kubernetes environment
Customers who run containerized workloads on Kubernetes clusters on their own hardware use Amazon EKS Anywhere (Amazon EKS-A). These customers look for prescriptive guidance on observability for the modern applications they run on EKS-A. AWS-managed open-source services such as AWS Distro for OpenTelemetry (ADOT), Amazon Managed Service for Prometheus, and Amazon Managed Grafana let customers offload the operational burden of managing the infrastructure behind their observability tooling.
Amazon EKS-A curated packages are trusted, up-to-date, and compatible software supported by Amazon to extend your EKS-A cluster’s functionality while reducing the need for multiple vendor support agreements. ADOT is now available as an EKS-A curated package. The ADOT Collector is an AWS-supported distribution of the OpenTelemetry Collector, which provides a vendor-agnostic solution to receive, process, and export telemetry data. It removes the need to run, operate, and maintain multiple agents or collectors.
The grafana-operator is a Kubernetes operator built to help you manage your Grafana instances inside Kubernetes. It lets you create and manage Grafana dashboards, data sources, and other resources declaratively across multiple instances in an easy and scalable way. The Grafana Operator now also supports managing resources such as dashboards and data sources hosted in external environments like Amazon Managed Grafana.
GitOps is an operational model in which application and infrastructure deployment is described declaratively in a Git repository. It allows you to manage the state of multiple Kubernetes clusters by applying the best practices of version control, immutable artifacts, and automation. Flux is a GitOps tool that automates the deployment of applications on Kubernetes. It works by continuously monitoring the state of a Git repository and applying any changes to a cluster. Together, the Grafana Operator and CNCF projects such as Flux let us use GitOps mechanisms to create and manage the lifecycle of resources in Amazon Managed Grafana from an Amazon EKS-A cluster.
In this post, we will show you how to use the ADOT EKS-A curated package, AWS managed open source services, and the Grafana Operator to observe your on-premises Kubernetes cluster.
Solution Overview
Solution Walkthrough
In this solution, we start by using the ADOT EKS-A curated package to remote write Prometheus-compatible metrics from your EKS-A cluster to Amazon Managed Service for Prometheus. We then use GitOps mechanisms with Flux and the Grafana Operator on your EKS-A cluster to create and manage Grafana resources, such as dashboards and data sources, hosted in an external environment like Amazon Managed Grafana, and use them to visualize metrics from your on-premises Kubernetes cluster.
Prerequisites
Ensure the following prerequisites are complete:
- A Linux-based host machine, such as an Amazon EC2 instance, an AWS Cloud9 instance, or a local machine, with access to your AWS account.
- Ensure your AWS account has access to EKS Anywhere curated packages. If not, please follow EKS Anywhere curated package management to get a subscription.
- Configure admin access to the EKS Anywhere cluster from the host machine.
- Configure IAM Roles for Service Accounts (IRSA) on the EKS Anywhere cluster.
- An existing Amazon Managed Grafana workspace in your AWS account.
- Install the following tools on the host machine:
  - AWS CLI version 2 to interact with AWS services using CLI commands
  - Helm to deploy and manage Kubernetes applications
  - kubectl to communicate with the Kubernetes API server
  - eksctl and eksctl anywhere to create and manage the EKS Anywhere cluster
  - Git to clone the required source repository from GitHub
  - curl to make HTTP requests
  - envsubst to substitute environment variables in shell
Setup Environment
Set the following environment variables:
Ensure that pod-identity-webhook is deployed in the observability namespace, where ADOT will be deployed. If it is not, follow the IAM Roles for Service Accounts configuration steps to deploy it.
Setting up Amazon Managed Service for Prometheus
Here, we will deploy the curated ADOT package with a configuration that writes metrics to Amazon Managed Service for Prometheus (AMP). Start by creating an Amazon Managed Service for Prometheus workspace using the command:
Set the following environment variables with values from the Amazon Managed Service for Prometheus workspace you created:
Then, run the steps to create an IAM role that grants fine-grained permissions on the AMP workspace, with the cluster's OIDC provider as the trusted entity allowed to assume this role.
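As a rough illustration of what such a role involves, the sketch below builds the two JSON policy documents in Python: a trust policy that lets a single Kubernetes service account assume the role through the OIDC provider, and a permission policy scoped to remote write on one workspace. The account ID, OIDC provider host, service account name, and workspace ARN are placeholders chosen for this example, not values from the original steps.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Trust policy: only the named service account may assume the role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}"
                    }
                },
            }
        ],
    }


def amp_remote_write_policy(workspace_arn):
    """Permission policy: allow remote write to a single AMP workspace only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["aps:RemoteWrite"], "Resource": workspace_arn}
        ],
    }


# Placeholder values (assumptions for illustration only).
trust = irsa_trust_policy("111122223333", "oidc.example.com/id", "observability", "adot-collector")
perms = amp_remote_write_policy("arn:aws:aps:us-west-2:111122223333:workspace/ws-EXAMPLE")
print(json.dumps(trust, indent=2))
print(json.dumps(perms, indent=2))
```

Pinning the `sub` condition to one namespace and service account is what keeps the role from being assumed by other workloads in the cluster.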
Deploy AWS Distro for OpenTelemetry (ADOT) curated package for EKS Anywhere
Create a service account for ADOT in EKS Anywhere cluster.
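A minimal sketch of such a service account is shown below, generated as JSON from Python. The `eks.amazonaws.com/role-arn` annotation is what the pod identity webhook looks for when injecting credentials. The service account name and role ARN are hypothetical placeholders.

```python
import json


def service_account_manifest(name, namespace, role_arn):
    """ServiceAccount annotated with the IAM role that its pods should assume."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


# Placeholder name and role ARN (assumptions for illustration only).
manifest = service_account_manifest(
    "adot-collector",
    "observability",
    "arn:aws:iam::111122223333:role/example-adot-amp-role",
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be piped to `kubectl apply -f -`.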
The pod-identity-webhook deployment in the observability namespace should be complete before proceeding to the next step.
Create an ADOT package configuration file with AMP. See the ADOT configuration for more details.
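To make the shape of the collector pipeline concrete, here is a hedged sketch of the OpenTelemetry Collector section of such a configuration, built as a Python dictionary. It wires a Prometheus receiver to a `prometheusremotewrite` exporter that signs requests with the `sigv4auth` extension. The scrape job, region, and workspace URL are illustrative assumptions, and how this section nests inside the curated package's configuration is not shown here.

```python
import json


def collector_config(region, remote_write_url):
    """Collector pipeline: scrape Prometheus metrics, remote write them to AMP with SigV4."""
    return {
        "extensions": {
            # Signs outgoing requests for the AMP service ("aps").
            "sigv4auth": {"region": region, "service": "aps"}
        },
        "receivers": {
            "prometheus": {
                "config": {
                    "scrape_configs": [
                        {
                            # Illustrative job only; real scrape jobs will differ.
                            "job_name": "kubernetes-pods",
                            "scrape_interval": "30s",
                            "kubernetes_sd_configs": [{"role": "pod"}],
                        }
                    ]
                }
            }
        },
        "exporters": {
            "prometheusremotewrite": {
                "endpoint": remote_write_url,
                "auth": {"authenticator": "sigv4auth"},
            }
        },
        "service": {
            "extensions": ["sigv4auth"],
            "pipelines": {
                "metrics": {
                    "receivers": ["prometheus"],
                    "exporters": ["prometheusremotewrite"],
                }
            },
        },
    }


# Placeholder region and workspace ID (assumptions for illustration only).
cfg = collector_config(
    "us-west-2",
    "https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-EXAMPLE/api/v1/remote_write",
)
print(json.dumps(cfg, indent=2))
```

Every component referenced in `service` must also be defined in its own top-level section, otherwise the collector rejects the configuration at startup.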
Validate installation using the command.
Installing External Secrets Operator
We will set up External Secrets Operator to securely access the Amazon Managed Grafana workspace API key.
Follow the steps to create the Amazon Managed Grafana workspace API key and store it as the secret /eksa/amg-api-key in AWS Secrets Manager.
Install External Secrets Operator using the command:
Confirm installation using the command:
Then, create IRSA for accessing AWS Secrets Manager secret with fine-grained access.
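For illustration, a fine-grained permission policy for this role could be limited to reading the one secret, as in the sketch below. The region and account ID are placeholders. The `-??????` suffix in the resource ARN accounts for the six random characters that Secrets Manager appends to secret ARNs.

```python
import json


def secret_read_policy(region, account_id, secret_name):
    """Permission policy: read-only access to a single Secrets Manager secret."""
    resource = f"arn:aws:secretsmanager:{region}:{account_id}:secret:{secret_name}-??????"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "secretsmanager:GetSecretValue",
                    "secretsmanager:DescribeSecret",
                ],
                "Resource": resource,
            }
        ],
    }


# Placeholder region and account ID (assumptions for illustration only).
print(json.dumps(secret_read_policy("us-west-2", "111122223333", "/eksa/amg-api-key"), indent=2))
```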
Then, create a service account for ExternalSecret.
Create ClusterSecretStore with service account-based authentication
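A ClusterSecretStore that authenticates through a service account might look like the following sketch, generated as JSON from Python. The store name, service account name, namespace, and region are hypothetical, and the `apiVersion` shown is an assumption that depends on the External Secrets Operator release you installed.

```python
import json


def cluster_secret_store(name, region, sa_name, sa_namespace):
    """ClusterSecretStore backed by AWS Secrets Manager, using service account (JWT) auth."""
    return {
        # Assumption: adjust to the API version served by your operator release.
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ClusterSecretStore",
        "metadata": {"name": name},
        "spec": {
            "provider": {
                "aws": {
                    "service": "SecretsManager",
                    "region": region,
                    "auth": {
                        "jwt": {
                            "serviceAccountRef": {
                                "name": sa_name,
                                "namespace": sa_namespace,
                            }
                        }
                    },
                }
            }
        },
    }


# Placeholder names (assumptions for illustration only).
store = cluster_secret_store("example-secret-store", "us-west-2", "example-eso-sa", "external-secrets")
print(json.dumps(store, indent=2))
```

Because a ClusterSecretStore is cluster-scoped, the service account reference needs an explicit namespace.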
Verify the ClusterSecretStore status using the command:
Then, create an ExternalSecret in the grafana-operator namespace with the target secret name grafana-admin-credentials. This configuration syncs the Kubernetes secret grafana-admin-credentials with the AWS Secrets Manager secret /eksa/amg-api-key every hour. Grafana Operator expects this secret to be available through the data key GF_SECURITY_ADMIN_APIKEY.
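The ExternalSecret described above can be sketched as follows. The namespace, target secret name, remote key, data key, and one-hour refresh interval come from the text, while the resource name, the ClusterSecretStore name, and the `apiVersion` are assumptions made for this example.

```python
import json


def grafana_external_secret(store_name):
    """ExternalSecret that syncs the AMG API key into grafana-admin-credentials hourly."""
    return {
        # Assumption: adjust to the API version served by your operator release.
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {
            "name": "example-amg-api-key",  # hypothetical resource name
            "namespace": "grafana-operator",
        },
        "spec": {
            "refreshInterval": "1h",
            "secretStoreRef": {"name": store_name, "kind": "ClusterSecretStore"},
            "target": {"name": "grafana-admin-credentials"},
            "data": [
                {
                    "secretKey": "GF_SECURITY_ADMIN_APIKEY",
                    "remoteRef": {"key": "/eksa/amg-api-key"},
                }
            ],
        },
    }


# "example-secret-store" is a placeholder ClusterSecretStore name.
print(json.dumps(grafana_external_secret("example-secret-store"), indent=2))
```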
Validate configuration using the command
We can verify the value of the synced Kubernetes secret using the command.
If we need to force a sync for troubleshooting, we can run the commands.
Installing Grafana Operator
Install Grafana Operator in namespace grafana-operator
Verify installation by using command:
Installing Prometheus Node Exporter
Run the command to deploy prometheus-node-exporter to generate various metrics.
Verify the prometheus-node-exporter status using the command.
GitOps with Amazon Managed Grafana
We will use GitOps sync via Flux to create Grafana Datasources and Dashboards in Amazon Managed Grafana using Grafana Operator. Deploy Flux in your EKS Anywhere cluster using the command:
Use the declarative code snippet from the One-Observability-demo GitHub repo to create data sources for Amazon Managed Service for Prometheus and dashboards in Amazon Managed Grafana. This snippet needs variables such as AMG_AWS_REGION, AMP_ENDPOINT_URL, AMG_ENDPOINT_URL, and GRAFANA_NODEEXP_DASH_URL to be set with the required values. We will use Flux post-build variable substitution to render these variables dynamically from a ConfigMap and avoid hardcoding values in manifest files.
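The idea behind this substitution can be illustrated with a small Python sketch that replaces `${VAR}` placeholders in manifest text with values from a dictionary, standing in for the ConfigMap data. This is a simplified illustration, not Flux's implementation. Raising an error on a missing variable is a choice made for this sketch, and the example values are placeholders.

```python
import re

# Matches ${NAME} placeholders; bare $NAME is deliberately not matched.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(manifest_text, values):
    """Replace each ${NAME} in manifest_text with values[NAME]."""

    def _lookup(match):
        name = match.group(1)
        if name not in values:
            # Sketch-only behavior: fail loudly instead of guessing a value.
            raise KeyError(f"no value supplied for ${{{name}}}")
        return values[name]

    return _VAR.sub(_lookup, manifest_text)


# Placeholder values standing in for the ConfigMap data.
template = "region: ${AMG_AWS_REGION}\nurl: ${AMP_ENDPOINT_URL}\n"
config_map_data = {
    "AMG_AWS_REGION": "us-west-2",
    "AMP_ENDPOINT_URL": "https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-EXAMPLE",
}
print(substitute(template, config_map_data))
```

Keeping the values in one ConfigMap means the manifests in Git stay identical across environments, and only the ConfigMap differs per cluster.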
Then, set One-Observability-demo GitHub repo as source GitRepository in Flux and verify using the commands.
Next, set up a Kustomization for Flux to sync the GitRepository and verify using the following commands.
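As a hedged sketch, a Flux Kustomization that syncs a GitRepository and applies post-build substitution from a ConfigMap has roughly the following shape. The resource names, path, and interval are hypothetical placeholders, not the values used by the commands in this post.

```python
import json


def flux_kustomization(name, git_repository, path, configmap_name):
    """Flux Kustomization that renders ${VAR} placeholders from a ConfigMap after build."""
    return {
        "apiVersion": "kustomize.toolkit.fluxcd.io/v1",
        "kind": "Kustomization",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {
            "interval": "1m0s",
            "path": path,
            "prune": True,
            "sourceRef": {"kind": "GitRepository", "name": git_repository},
            "postBuild": {
                "substituteFrom": [{"kind": "ConfigMap", "name": configmap_name}]
            },
        },
    }


# All four arguments are placeholders (assumptions for illustration only).
ks = flux_kustomization(
    "example-grafana-resources",
    "example-git-repository",
    "./path/to/manifests",
    "example-substitution-vars",
)
print(json.dumps(ks, indent=2))
```

With `prune` enabled, resources removed from the Git repository are also removed from the cluster on the next reconciliation.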
Check the Grafana identity created for Amazon Managed Grafana, and its status, using the command.
Verify the data source configuration and status using the command. We should see the Amazon Managed Service for Prometheus endpoint and no errors in the status message as shown:
Also, verify the Grafana Dashboards status using the command.
Then, let us navigate to the Amazon Managed Grafana console and verify the data source grafana-operator-amp-datasource created by grafana-operator.
Click and open grafana-operator-amp-datasource, scroll to the bottom, and click “Save & test”.
Finally, let’s navigate to the Amazon Managed Grafana console and click Search Dashboards, where you will see a dashboard named Grafana Operator - Node Exporter / Nodes. Click it, set the data source to grafana-operator-amp-datasource, and view the Grafana dashboard, created out of the box, showing all the metrics from the Prometheus Node Exporter installed on your Amazon EKS Anywhere cluster.
Clean up
You will continue to incur costs until you delete the infrastructure created for this post. Use the commands to delete the resources created during this post.
Conclusion
In this post, you learned how to use the ADOT EKS-A curated package to remote write Prometheus-compatible metrics from your EKS-A cluster to Amazon Managed Service for Prometheus. We also used GitOps mechanisms with Flux and the Grafana Operator on your EKS-A cluster to create Grafana resources, such as dashboards and data sources, hosted in an external environment like Amazon Managed Grafana, and used them to visualize metrics from your on-premises Kubernetes cluster. Please read our blog on Using Open Source Grafana Operator on your Kubernetes cluster to manage Amazon Managed Grafana if you want to implement a similar solution on your Amazon EKS cluster in the AWS Cloud.
To learn more about AWS Observability services, check the resources below:
- AWS Observability Best Practices Guide
- One Observability Workshop
- Terraform AWS Observability Accelerator
- CDK AWS Observability Accelerator
- EKS Anywhere curated package management
- Blue/Green Kubernetes upgrades for Amazon EKS Anywhere using Flux
- Monitoring Amazon EKS Anywhere using Amazon Managed Service for Prometheus and Amazon Managed Grafana