Understand Your Amazon EKS Spend and Enable FinOps for Kubernetes with Anodot
By Roni Karp, CTO – Anodot
By David Feldstein, Sr. Containers Specialist – AWS
As the adoption of Amazon Elastic Kubernetes Service (Amazon EKS) for running Kubernetes clusters grows, customers are seeking ways to better understand and control their costs.
Analyzing, optimizing, and tying costs to business KPIs is a challenge that interests many Amazon Web Services (AWS) customers. In this post, you will learn how to gain visibility into your containerized application costs, optimize usage, and form a shared language for DevOps and finance teams—a practice often referred to as FinOps.
Amazon EKS gives you the flexibility to run containerized applications on Kubernetes at scale, in the cloud or on premises when using Amazon EKS Anywhere. Amazon EKS provides highly available and secure Kubernetes clusters while automating key tasks such as security patching, node provisioning, and cluster updates.
While Amazon EKS simplifies Kubernetes operations tasks, customers are also interested in understanding the cost drivers for containerized applications running on EKS, and learning best practices to control and optimize those costs.
To address these customer needs and shed more light on Amazon EKS costs, AWS has collaborated with Anodot, an AWS Partner that uses machine learning (ML) to autonomously monitor and alert on costs in near real time, and provides recommendations to help customers quickly resolve cost anomalies.
Understanding Amazon EKS Pricing
The Amazon EKS pricing model is straightforward and contains two major components: You pay $0.10 per hour for each Amazon EKS cluster you configure, and pay for the AWS resources (compute like Amazon EC2 instances, and storage like Amazon EBS volumes) that you create within each cluster to run your Kubernetes worker nodes.
Let’s consider an example: you run your containerized workloads on a single EKS cluster in N. Virginia with four Kubernetes worker nodes on Amazon Elastic Compute Cloud (Amazon EC2). If your worker nodes are C6i.2xlarge EC2 instances with 30 GB EBS storage, your monthly cost breakdown is as follows:
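Total monthly cost for the EKS cluster = 1 cluster + C6i.2xlarge x 4 = approximately $1,075 (about $73 for the cluster at $0.10 per hour, with the remainder covering the four On-Demand instances and their EBS volumes)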
This estimate was generated via AWS Pricing Calculator, which provides only an estimate of your AWS fees and doesn’t include any taxes that may apply. Your actual fees depend on a variety of factors, including your actual usage of AWS services.
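The arithmetic behind an estimate like this can be sketched in a few lines of Python. This is a minimal illustration, not the AWS Pricing Calculator's logic. Only the $0.10 per cluster-hour comes from the pricing model described above; the function name `monthly_eks_cost`, the 730-hour month, and the instance and EBS rates passed in are assumptions chosen for illustration, so check current AWS pricing for your Region before relying on any figure.

```python
HOURS_PER_MONTH = 730  # assumption: a common "average month" convention


def monthly_eks_cost(clusters, nodes, node_hourly_usd,
                     ebs_gb_per_node, ebs_usd_per_gb_month,
                     cluster_hourly_usd=0.10):
    """Estimate monthly cost for a fixed-size cluster (no autoscaling)."""
    control_plane = clusters * cluster_hourly_usd * HOURS_PER_MONTH
    compute = nodes * node_hourly_usd * HOURS_PER_MONTH
    storage = nodes * ebs_gb_per_node * ebs_usd_per_gb_month
    return {
        "control_plane": round(control_plane, 2),
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "total": round(control_plane + compute + storage, 2),
    }


# Assumed illustrative rates: $0.34 per instance-hour and $0.08 per GB-month.
estimate = monthly_eks_cost(
    clusters=1,
    nodes=4,
    node_hourly_usd=0.34,
    ebs_gb_per_node=30,
    ebs_usd_per_gb_month=0.08,
)
for component, usd in estimate.items():
    print(f"{component}: ${usd:,.2f}")
```

With these assumed rates, compute dominates the bill and the per-cluster fee is a small fraction of the total, which is why node count matters so much in the sections that follow.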
When it comes to calculating actual costs, the reality is a little more complex, as you may not have a static number of worker nodes. The number of nodes may change depending on how you scale your workload. Cluster integrations, such as Cluster Autoscaler and Karpenter, alter the number of worker nodes in your cluster based on your workloads and their configurations.
Understanding the Cost Impact of Each Kubernetes Component
You can think of Kubernetes as a hierarchy of clusters, nodes, and pods. A cluster is a group of nodes (EC2 instances) and pods (your application) that run on these nodes.
Figure 1 – Kubernetes hierarchy.
Your Kubernetes costs within Amazon EKS are driven by the following components:
- Clusters: When you deploy an Amazon EKS cluster, AWS creates, manages, and scales the control plane nodes for you. You can use features like Managed Node Groups to create worker nodes for your clusters. These worker nodes host the pods that package your containerized applications.
- Nodes: Nodes are the actual Amazon EC2 instances that pods run on. Node resources (CPU and memory) are divided into resources needed to run the operating system; resources needed to run Kubernetes agents such as the kubelet and the container runtime; resources reserved for the eviction threshold; and resources available for your pods to run containers and your applications on.
- Pods: These are the smallest deployable units you can create and manage in Kubernetes. A pod is a group of one or more containers, with shared storage and network resources, and a specification for how to run the containers. When configuring pods, you should specify the amount of each resource (vCPU, RAM, and disk) a container requests, as well as the limit on how much of each resource the container can use.
- Other components: Additional Kubernetes components that impact your clusters are:
- Kubelet: Runs on each node and ensures each pod runs per predefined pod specifications.
- Control plane: Makes global decisions about the cluster and is responsible for responding to events such as scheduling, scaling and more.
- kube-scheduler: A control plane component responsible for scheduling new pods and selecting the nodes to run them on.
- etcd: The control plane's key-value store, which holds cluster state and configuration, including pod specifications such as the minimum and maximum number of pods per cluster.
- HorizontalPodAutoscaler (HPA): A control plane component responsible for scaling pods up and down to match demand.
Although each of these components impacts how many nodes you will use in your Amazon EKS deployment, the nodes (EC2 instances) are the only components for which you directly pay, other than the normal hourly rate for the cluster.
How Resource Requests Impact Your Kubernetes Costs
Pod resource requests are the primary driver of the number of EC2 instances needed to support your cluster. When configuring a pod, you specify resource requests and limits for vCPU and memory.
When a pod is deployed on a node, the requested resources are allocated and become unavailable to other pods deployed on the same node. Once a node’s resources are fully allocated, a cluster autoscaling tool (often Cluster Autoscaler or Karpenter) will spin up a new node to host additional pods.
Let’s assume you are using a C6i.large instance (2 vCPUs and 4 GiB of RAM) as a cluster node, and that 2 GiB of RAM and 0.2 vCPUs are used by the operating system (OS), Kubernetes agents, and other Kubernetes resources. In that case, the remaining 1.8 vCPUs and 2 GiB of RAM are available for running pods.
If you request 0.5 GiB of memory per pod, you’ll be able to run up to four pods on this node (Figure 2 – Image 1 below). When a fifth pod is needed, a new node will be added to the cluster, adding to your costs. If you request 0.25 GiB of memory per pod, you’ll be able to run eight pods on each instance (Figure 2 – Image 2 below).
Figure 2 – Two examples of how to stack pods on a C6i.large instance.
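The pod-stacking arithmetic above can be checked with a short Python sketch. It is a simplified model, not the Kubernetes scheduler: it considers only CPU and memory requests and works in millicores and MiB to avoid floating-point rounding. The function names and the 100-millicore CPU request are assumptions for illustration, since the example above specifies only memory requests.

```python
def pods_per_node(alloc_cpu_m, alloc_mem_mib, req_cpu_m, req_mem_mib):
    """Pods that fit on one node; the scarcer resource decides."""
    return min(alloc_cpu_m // req_cpu_m, alloc_mem_mib // req_mem_mib)


def nodes_needed(pod_count, per_node):
    """Ceiling division: a partially filled node still costs a full node."""
    if per_node < 1:
        raise ValueError("pod requests exceed what a single node can offer")
    return -(-pod_count // per_node)


# Allocatable capacity from the example: 1.8 vCPUs and 2 GiB of RAM.
ALLOC_CPU_M = 1800
ALLOC_MEM_MIB = 2048
ASSUMED_CPU_REQUEST_M = 100  # hypothetical; not stated in the example

half_gib_pods = pods_per_node(ALLOC_CPU_M, ALLOC_MEM_MIB,
                              ASSUMED_CPU_REQUEST_M, 512)
quarter_gib_pods = pods_per_node(ALLOC_CPU_M, ALLOC_MEM_MIB,
                                 ASSUMED_CPU_REQUEST_M, 256)

print(half_gib_pods, nodes_needed(5, half_gib_pods))
print(quarter_gib_pods, nodes_needed(5, quarter_gib_pods))
```

Under these assumptions memory is the binding constraint, so halving the memory request doubles the pods per node and lets five pods fit on one node instead of two.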
Incompletely configuring resource specifications can impact the node costs within your cluster:
- If you specify a container memory limit but do not specify a memory request, Kubernetes will automatically assign a memory request value that matches the limit. This can consume more node capacity than required and eventually cause unneeded nodes to be added to your cluster.
- If you specify a CPU limit but do not specify a CPU request value, Kubernetes automatically assigns a request value to match the limit. This results in more resources being assigned to each container than it actually needs, consuming node capacity and unnecessarily increasing the total number of nodes in your cluster.
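The defaulting behavior described in the two points above can be modeled in a few lines of Python. This is a simplified illustration of the rule, not Kubernetes source code; the function name `effective_requests` is hypothetical, and the dictionary mirrors the `resources` section of a container spec.

```python
def effective_requests(resources):
    """Return the requests the scheduler would see for one container.

    Any resource that has a limit but no explicit request gets a
    request equal to its limit.
    """
    requests = dict(resources.get("requests", {}))
    for name, limit in resources.get("limits", {}).items():
        requests.setdefault(name, limit)
    return requests


# Limits only: both requests silently inherit the limit values.
limits_only = {"limits": {"cpu": "500m", "memory": "512Mi"}}

# An explicit memory request is kept; CPU still falls back to its limit.
partly_specified = {
    "requests": {"memory": "128Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}

print(effective_requests(limits_only))
print(effective_requests(partly_specified))
```

Setting requests explicitly, at a level that reflects observed usage, keeps the reserved capacity per pod from silently growing to the size of the limit.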
Tying Kubernetes Investments to Business Value
In practice, understanding the business value of even a relatively simple Kubernetes deployment can be difficult. This is partially because the DevOps team managing your clusters and the finance stakeholders responsible for quantifying value rarely share a common language or equivalent technical understanding, and partially due to the inherent onion-like complexity of these environments.
The following questions can help teams codify indistinct concerns into shared deliverables to be addressed through FinOps collaboration:
- How can you track the overall cost efficiency of your Kubernetes deployment?
- How can you determine how well your pods and nodes are being utilized and set organizational and application-level benchmarks for utilizing node capacity?
- How can you identify scenarios in which pods request more resources than actually required and take action to ensure more prudent usage?
- How can you get a view of each application’s individual total costs, even though they likely share the same nodes with other applications, and possibly serve multiple customers from the same node?
Start by understanding the tagging paradigm when it comes to containers and Kubernetes. When using Amazon EC2 for non-containerized applications, you can tag EC2 instances and other resources supporting this application with an application tag. Later, you can sum up the overall costs of your application by using AWS Cost Explorer to group and report on everything associated with a specific application tag.
Within Kubernetes, however, this technique does not work as pods supporting different applications or customers run on shared EC2 instances. This means you need a different mechanism to help break down the resource costs per application, customer, or any other business dimension.
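One possible mechanism is to split each node's cost across the pods it hosts in proportion to their resource requests. The Python sketch below illustrates the idea only; it is not Anodot's method. The function name, the `app` label, the equal weighting of CPU and memory, and the $1.00 node cost are all assumptions chosen to keep the arithmetic easy to follow.

```python
from collections import defaultdict


def allocate_node_cost(node_cost_usd, node_cpu_m, node_mem_mib, pods):
    """Split one node's cost across applications by requested share.

    Each pod is charged the average of its CPU share and memory share
    of the node. Capacity nobody requested is reported as "(idle)".
    """
    costs = defaultdict(float)
    for pod in pods:
        cpu_share = pod["cpu_m"] / node_cpu_m
        mem_share = pod["mem_mib"] / node_mem_mib
        costs[pod["app"]] += node_cost_usd * (cpu_share + mem_share) / 2
    idle = node_cost_usd - sum(costs.values())
    result = dict(costs)
    result["(idle)"] = idle
    return result


# Hypothetical pods from two applications sharing one node.
pods = [
    {"app": "checkout", "cpu_m": 500, "mem_mib": 1024},
    {"app": "checkout", "cpu_m": 500, "mem_mib": 1024},
    {"app": "search", "cpu_m": 250, "mem_mib": 512},
]

shares = allocate_node_cost(1.00, node_cpu_m=2000, node_mem_mib=4096, pods=pods)
for app, usd in shares.items():
    print(f"{app}: ${usd:.3f}")
```

Reporting the idle share separately, rather than spreading it across applications, keeps unused capacity visible as its own line item.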
Anodot is a cloud cost management platform that monitors cloud metrics together with revenue and business metrics, so you can understand the true unit economics (revenue, cost, or margin) of your customers, applications, teams, and more. With Anodot, FinOps stakeholders from finance and DevOps can continuously optimize their cloud investments to drive strategic business initiatives.
Anodot integrates with Prometheus and Amazon CloudWatch Container Insights, which allow you to collect, aggregate, and summarize metrics and logs from your containerized applications and microservices. Using Container Insights incurs additional charges, as documented.
Anodot correlates the collected metrics with data from the AWS Cost and Usage Report, AWS pricing, and other sources. This correlation provides insight into pod resource utilization, node utilization, and waste, and gives you visibility into the true cost of each application you run.
Connecting Kubernetes with Anodot
To get started, connect your AWS account with the Anodot cost platform by following these steps. Next, connect your Kubernetes cluster to Anodot and you’ll be ready to go with analyzing and optimizing your Kubernetes clusters.
The data collected is valuable, and even more so once it has been translated into meaningful business insights like:
- The portion of all costs that can be attributed to each business object in a company (services, applications, functions).
- The portion of all costs that can be attributed to each department or business unit.
- The portion of all costs that can be attributed to each customer.
Figure 3 – Pod resource utilization and dollar amount of waste identified.
Anodot lets you break down these costs efficiently and share them within your organization. It displays the utilization level of each pod and node, reports on the waste identified within each, and shows the dollar value of that waste.
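To make the idea of a dollar value for waste concrete, here is a minimal Python sketch. It assumes waste means requested capacity that goes unused, priced at a flat unit cost. That definition, the function name `pod_waste`, and the $4.00 per GiB-month rate are illustrative assumptions, not a description of how Anodot computes its figures.

```python
def pod_waste(requested, used, unit_cost_usd):
    """Return (utilization, wasted_usd) for one resource of one pod.

    `requested` and `used` share a unit (for example GiB), and
    `unit_cost_usd` is the cost of one such unit over the period.
    """
    idle = max(requested - used, 0.0)
    utilization = used / requested if requested else 0.0
    return utilization, idle * unit_cost_usd


# Hypothetical pod: requests 2 GiB of memory but uses 0.5 GiB.
utilization, wasted_usd = pod_waste(requested=2.0, used=0.5, unit_cost_usd=4.00)
print(f"utilization: {utilization:.0%}, waste: ${wasted_usd:.2f}")
```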
Figure 4 – Node utilization over time and dollar amount of waste identified.
Anodot’s business mapping logic enables users to produce rule-powered maps that associate costs with business cost centers. Simple or complex rules can be defined for tags, namespaces, deployments, labels, and other identifiers, and can be refined through logical operators.
When a single cost line item is shared by more than one mapping, you can define a cost allocation strategy such as equal, rational, or proportional division to split costs between the maps. Finally, you can visualize the maps and create dashboards to enable users to understand cost per department, application, or unit metric.
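The difference between an equal and a proportional split can be shown with a small Python sketch. This illustrates the two strategies in general terms and is not Anodot's API; the function name `split_shared_cost`, the map names, and the weights are hypothetical.

```python
def split_shared_cost(cost_usd, weights, strategy="proportional"):
    """Divide one shared cost line item between several maps.

    `weights` maps each map name to a usage measure (for example
    requested vCPUs). The "equal" strategy ignores the weights.
    """
    if not weights:
        raise ValueError("at least one map is required")
    if strategy == "equal":
        share = cost_usd / len(weights)
        return {name: share for name in weights}
    if strategy == "proportional":
        total = sum(weights.values())
        if total == 0:
            raise ValueError("weights must not sum to zero")
        return {name: cost_usd * w / total for name, w in weights.items()}
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical: a $90 shared line item used by two teams.
usage = {"team-a": 1, "team-b": 2}
equal_split = split_shared_cost(90.0, usage, strategy="equal")
proportional_split = split_shared_cost(90.0, usage, strategy="proportional")
print(equal_split)
print(proportional_split)
```

An equal split is simplest to explain to stakeholders, while a proportional split better reflects who actually drives the cost. Which is appropriate depends on how the shared resource is consumed.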
Figure 5 – Understanding node utilization enables more efficient pod stacking.
With Anodot business mapping in place, you can slice and dice Kubernetes costs and associate them with business objects and internal identifiers, such as business units, cost centers, customers, and applications.
Figure 6 – Business mapping definition to enable division and allocation of shared costs.
Anodot also enables your FinOps team to break out and visualize the contribution of each service grouping (across compute, storage, and data transfer) and its components (such as the individual costs of Prometheus and kube-proxy) to total Kubernetes costs over time.
Figure 7 – See the complete cost of your business applications running on Kubernetes.
Understanding the financial impact of your Kubernetes clusters requires an understanding of the drivers behind their costs, as well as a tool that aggregates Kubernetes data, correlates it with other sources, and delivers meaningful insights into how to improve your true costs and business performance.
To support your Amazon EKS monitoring and optimization efforts, and give you visibility into your real containerized application costs, Anodot offers a complimentary 30-day trial of Kubernetes cost analysis and optimization. This gives you visibility into your EKS costs and utilization metrics, and reveals ways to increase the value gained from every resource you spin up.
Anodot – AWS Partner Spotlight
Anodot is an AWS Partner that uses machine learning to autonomously monitor and alert in real time on any business-related incidents across the organization.