
Dynamic Kubernetes request right sizing with Kubecost

This post was co-written with Kai Wombacher, Founding Product Manager at Kubecost.

In this post we show you how to use the Kubecost Amazon Elastic Kubernetes Service (Amazon EKS) add-on to lower infrastructure costs and boost Kubernetes efficiency. The Container Request Right Sizing feature allows you to see how container requests are configured, identify inefficiencies, and fix them either manually or through automated remediation.

Specifically, we cover how to review Kubecost’s right sizing recommendations and take action on them using one-time updates or scheduled, automated resizing within your Amazon EKS environment to continuously optimize resource usage.

Over-requested containers are one of the most common sources of cloud resource waste in Kubernetes environments. Without visibility and automation, development teams can request far more resources than their applications use, which leads to overprovisioned nodes and higher costs.

What are container requests?

In Kubernetes, a container request is a declared amount of CPU and memory that a workload needs. It plays a crucial role in how workloads are scheduled and how nodes are used.

When a container specifies a CPU or memory request, the scheduler looks for a node that has at least that amount of unallocated capacity. When a pod is placed on a node, the requested resources are essentially reserved, regardless of whether the container uses them in practice.

Although this reservation behavior makes sure that workloads have access to the resources they need, it can also lead to inefficient resource usage if requests are set too high. For example, if a container requests 1 CPU but only uses 200 millicores (0.2 CPU), then the remaining 0.8 CPU goes unused, yet the node capacity is still reserved and charged for. On a large cluster with hundreds or thousands of containers, these small inefficiencies can add up to significant excess capacity and unnecessary spend.
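
To make the scenario concrete, the following is a minimal, illustrative Deployment snippet (the workload name and image are placeholders): the 1 CPU request reserves a full core on whichever node the pod lands on, even if the container typically uses only about 200 millicores.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app              # placeholder workload name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: nginx:1.25     # placeholder image
          resources:
            requests:
              cpu: "1"          # a full core is reserved at scheduling time
              memory: "512Mi"   # reserved whether or not the container uses it
              # If actual usage is closer to 200m CPU, roughly 800m of the
              # reserved core sits idle but still counts against node capacity.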

Beyond impacting cost, requests influence Quality-of-Service (QoS) classes, which affect how Kubernetes handles eviction decisions when resources run low. Correctly sizing requests helps strike a balance between performance, availability, and efficiency.

Kubecost savings insights

The Kubecost Amazon EKS add-on provides visibility into containers that are over-requesting resources. The built-in Container Request Sizing dashboard surfaces these inefficiencies and provides actionable recommendations, such as an estimate of how much you could save each month by right sizing requests.

In the Savings Insights dashboard, you will find:

  • A list of containers eligible for resizing.
  • The current efficiency of CPU and memory requests.
  • The containers’ maximum and average usage.
  • Estimated dollar savings per container.
  • Total potential savings for your cluster.

Figure 1: Container Request Right Sizing recommendations

Right sizing is especially impactful in development, testing, and staging environments, where workloads often have relaxed performance requirements and can tolerate more aggressive downsizing.

Customizing recommendations

Not all workloads are the same, and right sizing strategies shouldn’t be either. Kubecost allows you to tailor request sizing recommendations based on workload type, criticality, and operational requirements. This flexibility makes sure that optimization doesn’t come at the expense of availability or performance.

You can choose from preset optimization profiles, such as “development”, “production”, or “high availability”, or define custom profiles to suit your specific use cases.

Customizable parameters include the following:

  • Target CPU and memory usage percentages.
  • Query window duration, such as the past 48 hours or the past week.
  • Workload filters by label, namespace, or controller type.

For production environments, you may want to use more conservative usage targets, such as 65% to maintain headroom for traffic spikes. For less critical environments, you might set targets at 80% to reduce overprovisioning.

Fine-tuning these inputs allows you to make sure that Kubecost generates recommendations aligned with your team’s priorities and risk tolerance.

Acting on Kubecost recommendations

Spotting overprovisioned containers is just step one. Kubecost allows you to take immediate or scheduled action to reduce resource requests in a safe, scalable way with its automated container right sizing.

Although the Kubecost add-on for Amazon EKS supports multi-cluster views and cost analysis, the request-sizing capabilities described in this post apply only to the EKS cluster where the Kubecost Amazon EKS add-on is deployed.

If you want to apply similar automation across multiple EKS clusters, then check out Kubecost Enterprise for broader policy enforcement across your fleet.

One-time resizing

In many cases, teams may want to apply right sizing as a one-time action before enabling automation, allowing them to review changes, observe workload stability, or align adjustments with internal maintenance windows. This approach helps teams validate the impact of resizing, gain confidence in recommendations, and make sure that right sizing aligns with application requirements before moving to fully automated workflows.

One-time resizing can be used to:

  • Clean up resource waste following a deployment.
  • Apply savings during a monthly cost review.
  • Enforce new policies or team-level budgets.

Figure 2: Enable Resize Requests and Enable Autoscaling

Choosing Resize Requests Now in the dashboard applies the recommended request sizes directly to your workload controllers, as shown in the preceding figure. The change is applied immediately and reflects your chosen customization profile.

Based on your filter settings, the resize applies to any supported controller types, such as Deployments, StatefulSets, or Jobs.
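
As a rough sketch of the effect (the workload and amounts here are hypothetical), a one-time resize rewrites the requests in the controller’s pod template, so the result looks like an ordinary update to the container’s resources block:

# Before: requests set generously at deployment time
resources:
  requests:
    cpu: "1"
    memory: "1Gi"

# After a one-time resize: requests aligned with observed usage plus headroom
resources:
  requests:
    cpu: "250m"
    memory: "320Mi"

For Deployments, updating the pod template triggers the normal rolling update process, so the new requests take effect as pods are replaced.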

Scheduled right sizing

For workloads that change frequently or experience cyclical traffic patterns, one-time adjustments aren’t always enough. This is where scheduled automation comes in.

Choosing Enable Autoscaling on the container right sizing page (under Savings > Insights) allows you to configure recurring resizing jobs that adjust requests based on recent usage data, making sure that your workloads stay right sized over time.

For example, you could configure a resizing job that:

  • Runs every 2 hours.
  • Looks at the past 48 hours of usage data.
  • Targets 80% CPU and memory usage.
  • Applies only to Deployments in the development namespace.

These scheduled jobs are lightweight and non-intrusive. They provide an ongoing mechanism to eliminate wasted spend on compute from your workloads without needing manual intervention.

Automating resizing with Helm

You can also automate container request right sizing through Helm during Kubecost installation. This provides an infrastructure as code (IaC) approach to configuring resizing jobs as part of your cluster setup.

The following example configuration enables resizing for all Deployments in a cluster:

clusterController:
  enabled: true                            # enable Kubecost's cluster controller, which applies the resizing actions
  actionConfigs:
    containerRightsize:
      filterConfig:
        # Scope the job to workloads whose controller kind is Deployment
        - filter: |
            controllerKind:"deployment"
      schedule:
        start: "2024-08-20T00:00:00Z"      # when the recurring job begins
        frequencyMinutes: 120              # run every two hours
        recommendationQueryWindow: "48h"   # base recommendations on the past 48 hours of usage
        targetUtilizationCPU: 0.8          # size CPU requests for 80% target usage
        targetUtilizationMemory: 0.8       # size memory requests for 80% target usage

This configuration runs the resizing job every two hours, targeting 80% usage and making sure that requests reflect the most recent usage patterns. This is ideal for platform teams looking to enforce best practices at scale, without needing to write and maintain custom automation scripts.
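
For a more conservative production profile, such as the 65% usage target mentioned earlier, you could adjust the same values. The following is a sketch only: the namespace filter, start time, and schedule shown here are illustrative, so check Kubecost’s filter and configuration documentation for the exact syntax your version supports.

clusterController:
  enabled: true
  actionConfigs:
    containerRightsize:
      filterConfig:
        # Illustrative: scope the job to a single namespace
        - filter: |
            namespace:"production"
      schedule:
        start: "2024-08-20T00:00:00Z"
        frequencyMinutes: 1440              # run once per day
        recommendationQueryWindow: "168h"   # base recommendations on the past week of usage
        targetUtilizationCPU: 0.65          # leave roughly 35% headroom for traffic spikes
        targetUtilizationMemory: 0.65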

Conclusion

Organizations using Kubecost’s request sizing features have reported substantial benefits, such as:

  • 20–60% reduction in compute costs in non-production environments.
  • Higher node usage, leading to better ROI from infrastructure.
  • Faster pod scheduling due to reduced resource contention.
  • Greater visibility into cluster performance bottlenecks.

In multi-tenant environments, right sizing can also help teams meet internal chargeback or showback policies by improving the accuracy of cost attribution based on actual usage.

In this post, we showed that right sizing container resource requests is an effective and low-risk way to reduce Kubernetes infrastructure costs. The Kubecost Amazon EKS add-on allows you to do the following:

  • Identify inefficient workloads.
  • Receive data-driven sizing recommendations.
  • Customize optimization strategies to match your environment.
  • Apply changes manually or automate them on a schedule.

You can do this without installing any third-party agents or writing custom automation: just deploy the Kubecost Amazon EKS add-on, connect to Kubecost, and start optimizing.

If you’re not currently using Kubecost, then the Get Started page can provide you with the steps to install Kubecost on your EKS cluster.


About the authors

Kai Wombacher is the Founding Product Manager at IBM Kubecost. Kai works on Kubecost’s solutions for monitoring, managing, and optimizing Kubernetes spend at scale. He has years of experience delivering solutions for technical organizations, including cutting-edge K8s tools and automated, end-to-end machine learning models.

Jason Janiak is a Partner Solutions Architect at AWS. Jason contributes to partner success through collaboration, creating opportunities that further partners’ growth and integration with AWS. Outside of work he enjoys hiking, travel, and meditation.

Mike Stefaniak is a Senior Manager on the Amazon EKS Product team at Amazon Web Services. He focuses on all things Kubernetes, delivering features that help customers accelerate their modernization journey on AWS.