AWS HPC Blog
Explore costs of AWS Batch jobs run on Amazon EKS using pod labels and Kubecost
Last October, AWS Batch introduced support for running batch workloads on your Amazon Elastic Kubernetes Service (Amazon EKS) clusters, and today we are excited to release support for adding pod labels within your EKS job definitions. One use for pod labels is tracking which job pods ran on which nodes, so that you can attribute costs to those jobs.
Since AWS Batch workloads on Amazon EKS usually leverage a pre-existing cluster that may be shared with other workloads, determining the resources used by AWS Batch pods can be a challenge. This is where Kubecost shines, as it provides Amazon EKS customers with the ability to accurately track Kubernetes resource level costs by namespace, cluster, pod, or organizational concepts (e.g., by team or application) and that includes pods running AWS Batch jobs. Our collaboration with Kubecost provides customers a way to install Kubecost via a Helm chart, or as an EKS add-on as part of their AWS Marketplace offering.
By default, AWS Batch launches pods with the following labels:
- batch.amazonaws.com/compute-environment-uuid – The UUID of the Batch compute environment.
- batch.amazonaws.com/job-id – The UUID of the Batch job corresponding to the running pod.
- batch.amazonaws.com/node-uid – The UID of the node that pods are placed on.
While these default labels are already useful with Kubecost, custom pod labels open up many more possibilities, such as tagging jobs by workload, project, or group within your organization. You can define custom pod labels that conform to the Kubernetes pod label specification using the metadata section of the eksProperties / podProperties request when registering job definitions, or of the eksPropertiesOverride / podProperties request when submitting jobs. Here is an example JSON fragment defining a few labels we will use in Kubecost.
"podProperties": {
    "metadata": {
        "labels": {
            "batchQueue": "batch-eks-JQ",
            "batchComputeEnv": "batch-eks-CE",
            "batchUser": "maxime",
            "jobName": "monte-carlo-run",
            "project": "simulations"
        }
    }
    # other podProperties objects …
}
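Because labels that break the Kubernetes rules are rejected, it can help to check them before building a request. The following is a minimal sketch, not AWS Batch's own validation: the helper names `validate_labels` and `build_pod_properties` are hypothetical, and the prefix check is simplified to a length limit. It applies the Kubernetes rule that a label name segment or value is at most 63 characters, starts and ends with an alphanumeric character, and may contain dashes, underscores, and dots in between.

```python
import re

# Kubernetes label rule for a name segment or a non-empty value:
# alphanumeric at both ends, with '-', '_' and '.' allowed in between.
_SEGMENT = re.compile(r"[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?")


def _valid_segment(text):
    """True if text is at most 63 characters and matches the label pattern."""
    return len(text) <= 63 and _SEGMENT.fullmatch(text) is not None


def validate_labels(labels):
    """Return a list of problems; an empty list means every label passes."""
    problems = []
    for key, value in labels.items():
        # An optional DNS-style prefix is separated from the name by '/'.
        prefix, _, name = key.rpartition("/")
        if not _valid_segment(name):
            problems.append(f"invalid label name: {key!r}")
        if len(prefix) > 253:  # simplified: only the prefix length is checked
            problems.append(f"label prefix too long: {key!r}")
        if value != "" and not _valid_segment(value):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems


def build_pod_properties(labels):
    """Hypothetical helper: wrap validated labels in a podProperties fragment."""
    problems = validate_labels(labels)
    if problems:
        raise ValueError("; ".join(problems))
    return {"metadata": {"labels": dict(labels)}}


labels = {
    "batchQueue": "batch-eks-JQ",
    "batchComputeEnv": "batch-eks-CE",
    "batchUser": "maxime",
    "jobName": "monte-carlo-run",
    "project": "simulations",
}
pod_properties = build_pod_properties(labels)
```

The resulting dictionary holds only the metadata part of podProperties. You would merge it with your container definitions before passing it as eksProperties (for example, through an SDK call that registers the job definition).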
Let’s take a look at some examples of the types of cost reporting you can get from Kubecost.
Example 1: Show me the total cost of my cluster by namespace
You can use Kubecost to quickly see an overview of costs across all of your namespaces. Since AWS Batch requires its own namespace in your Amazon EKS clusters, you can view the total costs of your AWS Batch jobs easily. In this example, we called our AWS Batch EKS namespace batch-eks-namespace.
Figure 1 shows the Kubecost Monitor -> Allocations view, which by default aggregates cost by Namespace.
Example 2: Show me the costs for a given AWS Batch EKS compute environment
AWS Batch allows you to attach multiple compute environments to a single Amazon EKS cluster, for example to separate GPU resources from CPU-only jobs. We can take advantage of the default pod label for compute environments to separate the costs per Batch compute environment.
To view costs by individual AWS Batch compute environment on EKS:
- Select Aggregate by
- In Label Name, enter batch.amazonaws.com/compute-environment-uuid
Compute environment ARNs cannot be used as pod labels, since ARNs can be longer than the 63-character limit for pod label names and values. Since it's not straightforward to map a compute environment UUID to the ARN, you can label pods with a short name for the compute environment. Our earlier pod label example used the batchComputeEnv label name, and we can use this to aggregate cost for the compute environments in the cluster (Figure 3).
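One way to get a short name is to derive it from the compute environment ARN itself. The sketch below is illustrative only: the function name `short_label_from_arn` is hypothetical, the ARN in the usage line uses a placeholder account ID, and it assumes the compute environment name is the part of the ARN after the last slash. It truncates the name to 63 characters, so two long names that share their first 63 characters would collide.

```python
import re

MAX_LABEL_LEN = 63  # Kubernetes limit for a label value


def short_label_from_arn(arn):
    """Derive a label-safe short name from a compute environment ARN."""
    # Assumption: the resource name follows the last '/' in the ARN.
    name = arn.rsplit("/", 1)[-1]
    # Replace any character that is not allowed in a label value.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "-", name)
    # Enforce the length limit, then make sure both ends are alphanumeric.
    return cleaned[:MAX_LABEL_LEN].strip("._-")


example_arn = "arn:aws:batch:us-east-1:123456789012:compute-environment/batch-eks-CE"
short_name = short_label_from_arn(example_arn)  # "batch-eks-CE"
```

The returned value can then be used for a label such as batchComputeEnv in the job definition.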
Example 3: Use a pod label for showing cost per project
You can use a pod label to tag pods with a project, a department or group within the organization, or a type of workload. In our example, we labeled pods with project and batchUser labels. Figure 4 shows the cost allocations using both of these labels in a multi-aggregation.
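The same multi-aggregation can also be requested programmatically. The sketch below only builds the request URL and rests on several assumptions you should check against the Kubecost documentation for your version: that the Allocation API is served at /model/allocation, that it accepts window and aggregate query parameters with a comma-separated list of label:<name> entries, and that Kubecost is reachable at http://localhost:9090 (for example, through a port-forward). The function name `allocation_url` is hypothetical.

```python
from urllib.parse import urlencode


def allocation_url(base_url, window, label_names):
    """Build an allocation query URL that aggregates by one or more pod labels."""
    # Assumed syntax: "label:<name>" entries joined by commas for multi-aggregation.
    aggregate = ",".join(f"label:{name}" for name in label_names)
    # Keep ':' and ',' unescaped so the query string stays readable.
    query = urlencode({"window": window, "aggregate": aggregate}, safe=":,")
    return f"{base_url.rstrip('/')}/model/allocation?{query}"


url = allocation_url("http://localhost:9090", "7d", ["project", "batchUser"])
```

Passing a single label name, such as batchComputeEnv, gives the single-label aggregation from Example 2.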
Conclusion
With support for Amazon EKS, AWS Batch enables you to leverage the Kubernetes ecosystem to run and monitor your AWS Batch jobs. Kubecost is a great example: it lets you monitor and track the costs of using AWS Batch on EKS in more detail.
If you want to get started using Kubecost to monitor the cost of AWS Batch jobs, follow the Kubecost deployment documentation to install it on an existing Amazon EKS cluster. Let us know how it works for you. You can find us at ask-hpc@amazon.com.