Containers

How to track costs in multi-tenant Amazon EKS clusters using Kubecost

Many AWS customers use Amazon Elastic Kubernetes Service (Amazon EKS) to operate multi-tenant Kubernetes clusters where workloads that belong to different teams or projects run in a shared cluster. Customers like that Kubernetes offers centralized management of workloads, enabling administrators to create, update, scale, and secure workloads using a single API. In this post we demonstrate the options for breaking down the operating costs of EKS clusters.

Customers have shared with us the importance of cost allocation by tenant in EKS clusters. We are working on providing a native way to track and allocate costs in multi-tenant Kubernetes environments. Today, you can use one of these methods to distribute costs by tenant:

  • Hard multi-tenancy — Run separate EKS clusters in dedicated AWS accounts.
  • Soft multi-tenancy — Run multiple node groups in a shared EKS cluster.
  • Consumption-based billing — Use resource consumption to calculate the cost incurred in a shared EKS cluster.

Hard multi-tenancy

Running workloads in separate AWS accounts is the easiest way to track costs. By creating a separate EKS cluster for each tenant in dedicated accounts, you can identify the cost incurred for the cluster and its dependencies without having to run reports to determine each tenant’s spend.

However, this requires maintaining multiple EKS clusters, which may increase your AWS spend and complicate your network architecture. If services in one cluster need to communicate with services running in other accounts, you will need to provide connectivity between those accounts. Not only will you pay for multiple EKS control planes, but you will also incur network transfer charges whenever traffic crosses the VPC boundary.

You also lose the efficiency benefits of a shared cluster; you will have to run shared components for monitoring, logging, and networking (like a service mesh) in each cluster. For these reasons, this option is usually disadvantageous unless a tenant requires a fully isolated bill that includes nothing but the costs incurred by its own workloads.

Soft multi-tenancy

Another method to enable cost allocation is sharing a cluster in which each tenant gets its own dedicated node group. You can use Kubernetes features like Node Selectors and Node Affinity to instruct Kubernetes Scheduler to run a tenant’s workload on dedicated node groups. You can tag the EC2 instances in a node group with an identifier (like product name or team name) and use tags to distribute costs.
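As a sketch of this pattern (the label key and tenant names below are hypothetical), you could apply a tenant label to each node group at creation time and pin that tenant's workloads to it with a nodeSelector:

```yaml
# Hypothetical manifest: the pod is scheduled only onto nodes carrying the
# tenant label, which you would set via the node group's Kubernetes labels.
apiVersion: v1
kind: Pod
metadata:
  name: team-a-api
  namespace: team-a
spec:
  nodeSelector:
    tenant: team-a   # matches the label applied to team-a's dedicated node group
  containers:
  - name: api
    image: public.ecr.aws/nginx/nginx:latest
```

Node Affinity supports the same idea with more expressive matching rules, but a plain nodeSelector is sufficient for one-node-group-per-tenant layouts.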

A downside of this approach is that you may end up with unused capacity in each node group and may not fully utilize the cost savings that come when you run a densely packed cluster.

With this method, you will still need to allocate the cost of shared resources like the EKS control plane, shared cluster-level services (used for logging, monitoring, governance, etc.), Elastic Load Balancing (ELB), NAT Gateways, and network transfer charges.

Consumption-based billing

The most efficient way to track costs in multi-tenant Kubernetes clusters is to distribute incurred costs based on the amount of resources consumed by workloads. This pattern allows you to maximize the utilization of your EC2 instances because different workloads can share nodes, which allows you to increase the pod-density on your nodes.

However, calculating costs by workload or namespaces is a challenging task. To determine the resources used by a group of pods, you have to aggregate the compute resource usage (used or reserved CPU, memory, disk) for a given period and calculate costs. But containers are generally short-lived and may scale frequently, and the actual resource usage fluctuates over time. In other words, you cannot simply take a day’s usage and multiply it by thirty to estimate the monthly bill. Understanding the cost-responsibility of a workload requires aggregating all the resources consumed or reserved during a timeframe, and evaluating the charges based on the cost of the resource and the duration of the usage. Much easier said than done.
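To make the aggregation concrete, here is a minimal sketch of the arithmetic using hypothetical hourly rates and usage numbers (illustrative only, not real AWS prices):

```shell
# Hypothetical per-resource hourly rates (illustrative, not AWS list prices)
CPU_RATE=0.04     # $ per vCPU-hour
MEM_RATE=0.005    # $ per GiB-hour

# Usage aggregated for one namespace over a billing period, e.g. summed
# from per-pod samples of max(requested, actually used) resources
CPU_HOURS=120     # vCPU-hours
MEM_HOURS=240     # GiB-hours

# Cost = sum over resources of (resource-hours x hourly rate)
awk -v c="$CPU_HOURS" -v cr="$CPU_RATE" -v m="$MEM_HOURS" -v mr="$MEM_RATE" \
    'BEGIN { printf "namespace cost: $%.2f\n", c*cr + m*mr }'
# prints: namespace cost: $6.00
```

In practice a tool must repeat this per pod, per short time window, with rates that change as instances launch and terminate, which is exactly the bookkeeping Kubecost automates.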

Kubecost is one such tool that lets you distribute the cost of running your Kubernetes cluster based on resource usage. Following multi-tenancy best practices, you can create a namespace for each tenant and use Kubecost to determine each tenant’s cost responsibility. It allows you to drill down into expenditure by service, deployment, namespace, label, pod, container, team, and product.

With Kubecost, you can track spend by tenant without creating separate clusters or separating workloads by worker nodes. It provides an easy-to-implement, cost-effective way to perform chargebacks in multi-tenant Kubernetes clusters.

About Kubecost

Kubecost, an open-core tool by Stackwatch, provides cost monitoring and capacity management solutions. One of its primary use cases is to give cost visibility across Kubernetes clusters. It uses three metrics to determine the cost of a workload:

  • Time in running state
  • Resources consumed or reserved
  • Price of resources consumed or reserved

The open source version of Kubecost limits pricing data storage to fifteen days. To retain metrics beyond fifteen days, you’ll have to upgrade to the paid version.

Prometheus

The default installation of Kubecost includes an optimized Prometheus server that only contains metrics that are useful to Kubecost. This optimized version retains 70-90% fewer metrics than a standard Prometheus deployment. You can also use an existing Prometheus installation.

Using Kubecost

Kubecost runs in your cluster. The steps below will guide you through the installation process.

Create a namespace for Kubecost’s components:

kubectl create namespace kubecost

Helm is the recommended way to install Kubecost. See using Helm with Amazon EKS if you don’t have Helm installed. If you have an aversion to Helm, Kubecost also offers other installation methods. The steps in this tutorial use Helm 3.

Add Kubecost Helm repository:

helm repo add kubecost https://kubecost.github.io/cost-analyzer/

Before proceeding, you need to get a unique token by visiting kubecost.com/install. You will have to enter your token during installation.

Install Kubecost:

helm install kubecost kubecost/cost-analyzer \
    --namespace kubecost \
    --set kubecostToken="<Your kubecostToken>"

The custom values that you can use with this Helm chart are found here.

If your cluster already has Prometheus installed, you can customize the Helm chart to skip kube-state-metrics and node-exporter installation. You can read more about using custom Prometheus with Kubecost here.

helm upgrade kubecost kubecost/cost-analyzer \
    --namespace kubecost \
    --set kubecostToken="<Your kubecostToken>" \
    --set prometheus.kubeStateMetrics.enabled=false \
    --set prometheus.nodeExporter.enabled=false

Kubecost, by default, uses AWS public pricing to calculate the cost of running the cluster (control plane and data plane). It allows you to slice and dice costs by various dimensions like namespace, service, deployment, etc. Additionally, by attaching an IAM policy and granting access to the Cost and Usage Report, Kubecost can also reflect effective Reserved Instance and Savings Plans rates.
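The exact permissions are defined in Kubecost’s AWS integration documentation; as an illustrative (not authoritative) sketch, the IAM policy attached to Kubecost’s role needs read access to the Cost and Usage Report definition and the S3 bucket that stores it. The bucket name below is a placeholder:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DescribeCUR",
      "Effect": "Allow",
      "Action": "cur:DescribeReportDefinitions",
      "Resource": "*"
    },
    {
      "Sid": "ReadCURBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<your-cur-bucket>",
        "arn:aws:s3:::<your-cur-bucket>/*"
      ]
    }
  ]
}
```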

Kubecost dashboard

Kubecost provides a web dashboard that you can access through kubectl port-forward, an Ingress, or a load balancer. An Ingress with basic authentication is used in this tutorial. The paid version of Kubecost also supports restricting access to the dashboard using SSO/SAML and providing varying levels of access, for example, restricting a team’s view to only the products it is responsible for.

You can access the Kubecost dashboard from your device using kubectl port-forward:

kubectl port-forward --namespace kubecost \
    deployment/kubecost-cost-analyzer 9090

The dashboard will be accessible at http://localhost:9090.

Install NGINX Ingress

You can use NGINX Ingress to make the Kubecost dashboard available to those who don’t have access to kubectl.

Install the NGINX Ingress Controller:

# Add Helm stable repo
helm repo add stable https://charts.helm.sh/stable

# Install nginx-ingress 
helm install example-ingress stable/nginx-ingress -n kubecost

I don’t have a registered domain to use for the Kubecost dashboard, so I had to change the host in the Ingress spec to match the ELB DNS name.

Export the DNS name of NGINX Ingress’s ELB:

export ELB=$(kubectl get svc -n kubecost example-ingress-nginx-ingress-controller \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

Create an Ingress:

echo "
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-realm: Authentication Required - ok
    nginx.ingress.kubernetes.io/auth-secret: kubecost-auth
    nginx.ingress.kubernetes.io/auth-type: basic
  labels:
    app: cost-analyzer
    app.kubernetes.io/instance: kubecost
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cost-analyzer
    helm.sh/chart: cost-analyzer-1.60.1
  name: kubecost-cost-analyzer
  namespace: kubecost
spec:
  rules:
  - host: $ELB
    http:
      paths:
      - backend:
          serviceName: kubecost-cost-analyzer
          servicePort: 9090
        path: /
" | kubectl apply -f -

We will use basic auth to restrict access to the dashboard. Create a password file:

htpasswd -c auth kubecost-admin
New password:
Re-type new password:
Adding password for user kubecost-admin

Create a secret from the password file:

kubectl create secret generic \
    kubecost-auth \
    --from-file auth \
    -n kubecost

The final ingress configuration should look like this (some lines have been removed from the output):

kubectl get ingress kubecost-cost-analyzer -o yaml -n kubecost

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-realm: Authentication Required - ok
    nginx.ingress.kubernetes.io/auth-secret: kubecost-auth
    nginx.ingress.kubernetes.io/auth-type: basic
  labels:
    app: cost-analyzer
    app.kubernetes.io/instance: kubecost
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cost-analyzer
    helm.sh/chart: cost-analyzer-1.60.1
  name: kubecost-cost-analyzer
  namespace: kubecost
spec:
  rules:
  - host: aa777bde02780402173825f8fb4c6c6-1099488085.us-west-2.elb.amazonaws.com
    http:
      paths:
      - backend:
          serviceName: kubecost-cost-analyzer
          servicePort: 9090
        path: /
status:
  loadBalancer:
    ingress:
    - ip: 192.168.130.94

You can now access the Kubecost dashboard through the Ingress by going to the ELB’s address, which you can obtain like this:

echo https://$ELB

Depending on your browser, you may be prompted to accept the certificate.

The dashboard allows you to drill down monthly run-rate by a dimension like namespace. It also supports exporting reports in CSV format.

You can run reports to view charges incurred by services in a shared cluster. Kubecost offers saved reports where you can access commonly viewed reports for chargebacks.

Cost optimization recommendations

Kubecost also presents cost-savings opportunities. It can help you locate unused volumes, over-provisioned replicas, or abandoned workloads. The abandoned workloads section under Savings shows pods that have not received any meaningful traffic. You can adjust the traffic threshold and time duration to find pods that may not be actively used.

Kubecost provides accurate Spot pricing using the AWS Spot Instance data feed. It can also assist with pod right-sizing. It tracks the declared requests for containers and provides recommendations based on usage. The included Grafana dashboards show you resource utilization in your cluster.
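The “declared requests” Kubecost compares against are the standard resource requests in your pod specs. A hypothetical example of a container whose request could be right-sized downward if actual usage stays well below it:

```yaml
# Hypothetical container spec: Kubecost compares actual usage against
# these declared requests when suggesting right-sizing.
containers:
- name: api
  image: public.ecr.aws/nginx/nginx:latest
  resources:
    requests:
      cpu: "500m"     # reserved; if usage averages ~100m, this is over-provisioned
      memory: "1Gi"
    limits:
      cpu: "1"
      memory: "2Gi"
```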

Cluster right-sizing

Kubecost provides right-sizing recommendations based on Kubernetes-native metrics. It uses two primary inputs: first, your own description of the type of work running on the cluster (e.g. development, production, high availability); second, it detects the historical “shape” of each workload’s resource requirements, as measured by Kubecost metrics. The product then considers different heuristic or bin-packing strategies for meeting the cluster’s requirements.

Conclusion

We showed you how to use Kubecost to track costs in a shared EKS environment. If you opt for hard multi-tenancy, you can either track cost separately using a different instance of Kubecost per cluster, or upgrade to Kubecost paid support, which provides cost-visibility across multiple clusters using the same dashboard.

According to Nathan Taber, PM for EKS, “Customers should think of multi-tenancy options as a spectrum where you balance between operational simplicity and control. On one side, having a single cluster is operationally simple, but more difficult to audit and control. On the other side, having one cluster per account is easy to audit and control, but can create operational complexities if you have a lot of clusters. The right decision for you will depend on the number of clusters you need to run, the structure of your organization, and the operational tools you are willing to adopt and maintain to manage multiple clusters and accounts.” Whichever pattern you go with, you can use Kubecost to understand, allocate and reduce costs. Kubecost is available in the AWS Marketplace.

Further reading

Other options for Kubernetes cost allocation include Spot.io’s Ocean, Cloudability, and Yotascale. EKS Workshop also has a module that includes an Ocean walkthrough.

Re Alvarez-Parmar

Re Alvarez-Parmar is a Container Specialist Solutions Architect at Amazon Web Services. He helps customers use AWS container services to design scalable and secure applications. He is based out of New York and uses Twitter, sparingly, @realz