Multi-cluster cost monitoring for Amazon EKS using Kubecost and Amazon Managed Service for Prometheus
Introduction
Amazon Managed Service for Prometheus is a Prometheus-compatible service that monitors and provides alerts on containerized applications and infrastructure at scale. In the previous post, Integrating Kubecost with Amazon Managed Service for Prometheus, we discussed how you can integrate Kubecost with Amazon Managed Service for Prometheus (AMP) to get granular visibility into your Amazon Elastic Kubernetes Service (Amazon EKS) cluster costs, letting you aggregate costs by the majority of Kubernetes contexts, starting from the cluster level down to the container level. The integration helps customers monitor a single Amazon EKS cluster without worrying about scaling the Prometheus instance. However, the complexity increases when your infrastructure grows to multiple Amazon EKS clusters running across numerous regions and AWS accounts. To track costs and generate reports for showback or chargeback purposes, you need to gather cost data from multiple endpoints. This is a time-consuming and complicated process.
As part of AWS’ partnership with Kubecost, we are excited to announce this subsequent integration with Amazon Managed Service for Prometheus to help customers effectively monitor their Kubernetes costs without worrying about scaling the Prometheus instance. With the Amazon EKS optimized Kubecost bundle or with the Kubecost Enterprise License, AWS customers can now get a unified view into Kubernetes costs across multiple Amazon EKS clusters. In this post, you’ll learn how to set up cost monitoring across multiple Amazon EKS clusters in a federated view with Kubecost and Amazon Managed Service for Prometheus.
Solution overview
The architecture of this integration is similar to Amazon EKS cost monitoring with Kubecost, which is described in the previous post, with some enhancements as follows:
- An additional AWS SigV4 container is added to the cost-analyzer pod. It acts as a proxy that queries metrics from Amazon Managed Service for Prometheus using the AWS SigV4 signing process. This enables password-less authentication, which reduces the risk of exposing your AWS credentials.
- When the Amazon Managed Service for Prometheus integration is enabled, the bundled Prometheus server in the Kubecost Helm Chart is configured in the remote_write mode. The bundled Prometheus server sends the collected metrics to Amazon Managed Service for Prometheus using the AWS SigV4 signing process. All metrics and data are stored in Amazon Managed Service for Prometheus, and Kubecost queries the metrics directly from Amazon Managed Service for Prometheus instead of the bundled Prometheus. This removes the need to maintain and scale the local Prometheus instance.
There are two architectures you can deploy:
- The Quick-Start architecture supports the setup of up to 100 clusters.
- The Federated architecture supports the setup of over 100 clusters.
Quick-Start architecture
The infrastructure can manage up to 100 clusters. The following architecture diagram illustrates the small-scale infrastructure setup:
Federated architecture
To support large-scale infrastructure with over 100 clusters, Kubecost uses Amazon Simple Storage Service (Amazon S3) to improve query performance. In addition to the Amazon Managed Service for Prometheus workspace, Kubecost stores its extract, transform, and load (ETL) data in a central Amazon S3 bucket. Kubecost’s ETL data is a computed cache based on Prometheus metrics, from which customers can perform all possible Kubecost queries. By storing the ETL data in an Amazon S3 bucket, this integration adds resiliency to your cost allocation data, improves performance, and enables a high availability architecture for your Kubecost setup.
The following architecture diagram illustrates the large-scale infrastructure setup:
Walkthrough
Prerequisites
- You have an existing AWS account.
- You have AWS Identity and Access Management (AWS IAM) credentials to create Amazon Managed Service for Prometheus and AWS IAM roles programmatically.
- You have an existing Amazon EKS cluster with OpenID Connect (OIDC) enabled.
- Your Amazon EKS clusters have the Amazon Elastic Block Store (Amazon EBS) Container Storage Interface (CSI) driver installed.
Create Amazon Managed Service for Prometheus workspace
Step 1: run the following command to get the information of your current EKS cluster:
Step 2: run the following command to create a new Amazon Managed Service for Prometheus workspace:
The Amazon Managed Service for Prometheus workspace should be created in a few seconds. Run the following command to get the workspace ID:
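The workspace steps above can be sketched with the AWS CLI as follows. This is a minimal sketch, not the exact commands from the original walkthrough: the alias `kubecost-amp` and the helper function names are hypothetical, and it assumes the AWS CLI is configured with credentials that can manage Amazon Managed Service for Prometheus.

```bash
# Hypothetical alias for the workspace; choose any name that suits your setup.
AMP_ALIAS="${AMP_ALIAS:-kubecost-amp}"

# Create a workspace with the alias above in the given region.
create_amp_workspace() {
  local region="$1"
  aws amp create-workspace --alias "${AMP_ALIAS}" --region "${region}"
}

# Print the ID of the first workspace whose alias begins with AMP_ALIAS.
get_amp_workspace_id() {
  local region="$1"
  aws amp list-workspaces --alias "${AMP_ALIAS}" --region "${region}" \
    --output text --query 'workspaces[0].workspaceId'
}

# Usage (assumes AWS_REGION is already set):
#   kubectl config current-context   # one way to confirm the current cluster
#   create_amp_workspace "${AWS_REGION}"
#   export AMP_WORKSPACE_ID="$(get_amp_workspace_id "${AWS_REGION}")"
```

Because the alias filter matches by prefix, pick an alias that is unique in the region so that the first result is the workspace you just created.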
Set up the environment
Step 1: set environment variables for integrating Kubecost with Amazon Managed Service for Prometheus
Run the following command to set these variables:
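A minimal sketch of the environment setup, using the variable names that appear in the Helm command later in this post. All values in angle brackets are placeholders, `set_amp_endpoints` is a hypothetical helper, and the query URL assumes that the SigV4 proxy container listens on port 8005 inside the cost-analyzer pod.

```bash
# Placeholder values; replace them with your own.
export RELEASE="kubecost"
export YOUR_CLUSTER_NAME="<YOUR_EKS_CLUSTER_NAME>"
export AWS_REGION="<YOUR_AWS_REGION>"
export AWS_ACCOUNT_ID="<YOUR_AWS_ACCOUNT_ID>"
export VERSION="<KUBECOST_CHART_VERSION>"
export KC_BUCKET="<YOUR_S3_BUCKET_NAME>"
export AMP_WORKSPACE_ID="<YOUR_AMP_WORKSPACE_ID>"

# Derive the remote write and query endpoints from the region and workspace ID.
set_amp_endpoints() {
  local region="$1" workspace_id="$2"
  # Remote write endpoint of the Amazon Managed Service for Prometheus workspace.
  export REMOTEWRITEURL="https://aps-workspaces.${region}.amazonaws.com/workspaces/${workspace_id}/api/v1/remote_write"
  # Assumption: queries go through the SigV4 proxy on localhost:8005.
  export QUERYURL="http://localhost:8005/workspaces/${workspace_id}"
}

set_amp_endpoints "${AWS_REGION}" "${AMP_WORKSPACE_ID}"
```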
Step 2: set up Amazon S3 bucket, AWS IAM policy, and Kubernetes secret for storing Kubecost ETL files
Note: You can skip step 2 for the small-scale (Quick-Start) infrastructure setup.
a. Create an Amazon S3 bucket as the object store for Kubecost ETL data:
Run the following command in your workspace:
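One way to create the bucket is with `aws s3 mb`, sketched below. The function name is hypothetical, and the bucket name and region are whatever you set in KC_BUCKET and AWS_REGION.

```bash
# Create the S3 bucket that will hold Kubecost ETL files.
create_kubecost_bucket() {
  local bucket="$1" region="$2"
  aws s3 mb "s3://${bucket}" --region "${region}"
}

# Usage:
#   create_kubecost_bucket "${KC_BUCKET}" "${AWS_REGION}"
```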
b. Create an AWS IAM policy to grant access to the Amazon S3 bucket.
The following policy is for demo purposes only. You may need to consult your security team and make appropriate changes depending on your organization’s requirements.
Run the following command in your workspace:
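A minimal sketch of such a policy follows. The set of actions shown is an assumption about what Kubecost needs to list, read, write, and delete ETL files; it is not the policy from the original walkthrough, and the function name, file name, and policy name are hypothetical.

```bash
# Write a demo IAM policy document that grants access to one bucket.
write_kubecost_s3_policy() {
  local bucket="$1" out_file="${2:-policy.json}"
  cat > "${out_file}" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "KubecostBucketList",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Sid": "KubecostObjectAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Usage (the policy name is a hypothetical example):
#   write_kubecost_s3_policy "${KC_BUCKET}" policy.json
#   aws iam create-policy \
#     --policy-name "kubecost-s3-federated-policy-${YOUR_CLUSTER_NAME}" \
#     --policy-document file://policy.json
```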
c. Create a Kubernetes secret to allow Kubecost to write ETL files to the Amazon S3 bucket.
Run the following command in your workspace:
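The secret can be sketched as an object store configuration file that is loaded into a Kubernetes secret. The secret name `kubecost-object-store`, the key `federated-store.yaml`, and the configuration fields shown here are assumptions; they need to match what the federation values file used during installation expects.

```bash
# Write an S3 object store configuration for Kubecost ETL files.
write_federated_store_config() {
  local bucket="$1" region="$2" out_file="${3:-federated-store.yaml}"
  cat > "${out_file}" <<EOF
type: S3
config:
  bucket: "${bucket}"
  endpoint: "s3.amazonaws.com"
  region: "${region}"
EOF
}

# Create the secret from the configuration file (the namespace must exist).
create_object_store_secret() {
  local namespace="$1" config_file="${2:-federated-store.yaml}"
  kubectl create secret generic kubecost-object-store \
    --namespace "${namespace}" \
    --from-file=federated-store.yaml="${config_file}"
}

# Usage:
#   write_federated_store_config "${KC_BUCKET}" "${AWS_REGION}"
#   kubectl create namespace "${RELEASE}"
#   create_object_store_secret "${RELEASE}"
```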
Step 3: set up IAM roles for service accounts (IRSA) to allow Kubecost and Prometheus to read metrics from and write metrics to Amazon Managed Service for Prometheus
The following commands automate these tasks:
- Create an AWS IAM role with the AWS managed IAM policy and trust policy for the following service accounts: kubecost-cost-analyzer-amp, kubecost-prometheus-server-amp.
- Annotate the existing Kubernetes service accounts to attach the new AWS IAM role.
Run the following command in your workspace:
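A minimal sketch of the IRSA setup with eksctl, assuming eksctl is installed and the cluster has an OIDC provider associated. It attaches the AWS managed policies AmazonPrometheusQueryAccess and AmazonPrometheusRemoteWriteAccess to both service accounts; the function name is hypothetical.

```bash
# Create IAM roles and annotated service accounts for Kubecost and Prometheus.
create_kubecost_irsa() {
  local cluster="$1" region="$2" namespace="$3" sa
  for sa in kubecost-cost-analyzer-amp kubecost-prometheus-server-amp; do
    eksctl create iamserviceaccount \
      --name "${sa}" \
      --namespace "${namespace}" \
      --cluster "${cluster}" \
      --region "${region}" \
      --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
      --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
      --override-existing-serviceaccounts \
      --approve
  done
}

# Usage:
#   create_kubecost_irsa "${YOUR_CLUSTER_NAME}" "${AWS_REGION}" "${RELEASE}"
```

For the federated setup, the role presumably also needs the Amazon S3 policy created in step 2, which could be added as one more `--attach-policy-arn` entry.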
For more information, see the AWS documentation on AWS IAM roles for service accounts. To learn more about the Amazon Managed Service for Prometheus managed policies, see Identity-based policy examples for Amazon Managed Service for Prometheus.
Integrating Kubecost with Amazon Managed Service for Prometheus
Prepare the configuration file
Run the following command to create a file called config-values.yaml, which contains the defaults that Kubecost uses for connecting to your Amazon Managed Service for Prometheus workspace.
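A sketch of what config-values.yaml might contain. The keys `global.amp.prometheusServerEndpoint` and `global.amp.remoteWriteService` appear in the Helm command in this post; the remaining keys (`global.amp.enabled`, `global.amp.sigv4.region`, and the `sigV4Proxy` block) and the port 8005 are assumptions about the Kubecost Helm chart that should be checked against the chart version you install.

```bash
# Write Helm values that point Kubecost at the AMP workspace.
write_config_values() {
  local region="$1" workspace_id="$2" out_file="${3:-config-values.yaml}"
  cat > "${out_file}" <<EOF
global:
  amp:
    enabled: true
    prometheusServerEndpoint: http://localhost:8005/workspaces/${workspace_id}
    remoteWriteService: https://aps-workspaces.${region}.amazonaws.com/workspaces/${workspace_id}/api/v1/remote_write
    sigv4:
      region: ${region}
sigV4Proxy:
  region: ${region}
  host: aps-workspaces.${region}.amazonaws.com
EOF
}

# Usage:
#   write_config_values "${AWS_REGION}" "${AMP_WORKSPACE_ID}"
```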
Primary cluster
Run this command to install Kubecost and integrate it with the Amazon Managed Service for Prometheus workspace as the primary:
Additional clusters
The installation steps are similar to those for the primary cluster, except that you skip the section Create Amazon Managed Service for Prometheus workspace and update the environment variables to match each additional cluster. Note that AMP_WORKSPACE_ID and KC_BUCKET are the same as for the primary cluster.
Run this command to install Kubecost and integrate it with the Amazon Managed Service for Prometheus workspace as the additional cluster:
```bash
# For the small-scale (Quick-Start) setup, remove the -f line that points
# to agent-federated.yaml.
helm upgrade -i ${RELEASE} \
  oci://public.ecr.aws/kubecost/cost-analyzer --version $VERSION \
  --namespace ${RELEASE} --create-namespace \
  -f https://tinyurl.com/kubecost-amazon-eks \
  -f config-values.yaml \
  -f https://raw.githubusercontent.com/kubecost/poc-common-configurations/main/etl-federation/agent-federated.yaml \
  --set global.amp.prometheusServerEndpoint=${QUERYURL} \
  --set global.amp.remoteWriteService=${REMOTEWRITEURL} \
  --set kubecostProductConfigs.clusterName=${YOUR_CLUSTER_NAME} \
  --set kubecostProductConfigs.projectID=${AWS_ACCOUNT_ID} \
  --set prometheus.server.global.external_labels.cluster_id=${YOUR_CLUSTER_NAME} \
  --set serviceAccount.create=false \
  --set prometheus.serviceAccounts.server.create=false \
  --set serviceAccount.name=kubecost-cost-analyzer-amp \
  --set prometheus.serviceAccounts.server.name=kubecost-prometheus-server-amp \
  --set federatedETL.federator.useMultiClusterDB=true
```
Monitoring costs of your multi-cluster infrastructure
Expose Kubecost dashboard
After you install Kubecost on the primary cluster and all additional clusters, you can switch back to your primary cluster and run the following command to expose the Kubecost dashboard:
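For a quick local check, the dashboard can be reached with `kubectl port-forward`, as sketched below. The deployment name `<release>-cost-analyzer` is an assumption based on the Helm release name, and the function name is hypothetical.

```bash
# Forward local port 9090 to the Kubecost cost-analyzer deployment.
expose_kubecost_dashboard() {
  local release="${1:-kubecost}"
  kubectl port-forward --namespace "${release}" \
    "deployment/${release}-cost-analyzer" 9090
}

# Usage:
#   expose_kubecost_dashboard "${RELEASE}"
```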
In your web browser, navigate to http://localhost:9090 to access the dashboard.
You can now start monitoring your Amazon EKS cluster cost and efficiency. Depending on your organization’s requirements and setup, there are several options to expose Kubecost for ongoing internal access. You can also check this AWS workshop to learn how to expose Kubecost using AWS Load Balancer Controller.
Using Kubecost dashboard
When you access the Kubecost dashboard, the default Overview view shows you comprehensive information about all Amazon EKS clusters monitored by Kubecost with Amazon Managed Service for Prometheus (active) and a list of unmonitored Amazon EKS clusters (unmonitored). You can see it in the following example screenshot:
In the Monitor/Allocation view, Kubecost provides granular visibility into costs across your Amazon EKS clusters, aggregated by different Kubernetes contexts such as namespaces, controllers, pods, or labels. This helps you understand which parts of your applications or projects contribute to Amazon EKS spend. The following screenshot shows an example of Amazon EKS cluster cost aggregated by Namespace.
Additionally, to monitor your AWS services costs in one platform, you can integrate Kubecost with your AWS Cost and Usage reports and enable Cloud Costs to see the costs of each AWS service across your AWS accounts. The following example screenshot shows the cost of each AWS service in the Monitor/Cloud Costs view.
Additional usage with Amazon Managed Service for Prometheus
Because all cost metrics emitted by Kubecost are centrally stored and managed in Amazon Managed Service for Prometheus for multiple Amazon EKS clusters, you can integrate with other observability tools supported by Amazon Managed Service for Prometheus to use that data. For example, you can write custom cost-related PromQL queries and visualize the results in Amazon Managed Grafana, or use Alert Manager in multi-cluster mode. You can learn more about these integrations at Using AWS Observability Accelerator. To learn more about the Amazon Managed Service for Prometheus service quotas, refer to the documentation at Amazon Managed Service for Prometheus service quotas.
Cleaning up
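A possible cleanup sequence is sketched below, assuming the resources were created as described in this walkthrough. The function name is hypothetical. The sketch uninstalls the Helm release, removes the two IRSA service accounts, and deletes the workspace. Run the first two steps on every cluster, and delete the workspace only once, after all clusters are cleaned up.

```bash
# Remove Kubecost, its IRSA service accounts, and the AMP workspace.
cleanup_kubecost() {
  local release="$1" cluster="$2" region="$3" workspace_id="$4" sa
  helm uninstall "${release}" --namespace "${release}"
  for sa in kubecost-cost-analyzer-amp kubecost-prometheus-server-amp; do
    eksctl delete iamserviceaccount \
      --name "${sa}" --namespace "${release}" \
      --cluster "${cluster}" --region "${region}"
  done
  aws amp delete-workspace --workspace-id "${workspace_id}" --region "${region}"
}

# Usage:
#   cleanup_kubecost "${RELEASE}" "${YOUR_CLUSTER_NAME}" "${AWS_REGION}" "${AMP_WORKSPACE_ID}"
```

If you created the Amazon S3 bucket and AWS IAM policy for the federated setup, remove them separately once you no longer need the ETL data.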
Conclusion
In this post, we showed you how you can use Kubecost to monitor multi-cluster Amazon EKS environments using Amazon Managed Service for Prometheus as the metrics store so you don’t have to worry about managing your own infrastructure to store Kubecost data. In collaboration with Kubecost, we’re excited to release this new feature that allows you to monitor and track the costs of multiple Amazon EKS clusters in a single pane of glass. This setup offers rich features exclusively to Amazon EKS customers with no additional Kubecost license required, and includes Kubecost troubleshooting support. If you have Kubecost’s Enterprise license, additional features are enabled, such as Governance features that allow you to set budget rules for different projects or audit the costly deployments on your Amazon EKS cluster. The enterprise licenses are available from Kubecost or through AWS Marketplace. If you would like to learn more from the Kubecost team, contact them here.
Other useful resources for AWS Observability:
- Hands-on workshop for all AWS Observability services – One Observability Workshop
- Terraform-based, easy-to-use Observability setup for Amazon EKS – AWS Observability Accelerator
- AWS Observability Best Practices Guide
Linh Lam, Solutions Architect, Kubecost
Linh Lam is a Kubecost Solution Architect, ISV, focusing on integration and building solutions for customers. He is also passionate about application modernization, serverless, and container technology. Outside of work he enjoys hiking, camping, and building his home audio systems.