Containers
Category: *Post Types
How to run AI model inference with GPUs on Amazon EKS Auto Mode
In this post, we show you how to swiftly deploy inference workloads on EKS Auto Mode and demonstrate key features that streamline GPU management. We walk through a practical example by deploying open weight models from OpenAI using vLLM, while showing best practices for model deployment and maintaining operational efficiency.
Dynamic Kubernetes request right sizing with Kubecost
In this post, we demonstrate how to utilize the Kubecost Amazon EKS add-on to reduce infrastructure costs and enhance Kubernetes efficiency through Container Request Right Sizing, which helps identify and fix inefficient container resource configurations. We explore how to review Kubecost’s right sizing recommendations and implement them through either one-time updates or scheduled automated resizing within Amazon EKS environments for continuous resource optimization.
Introducing Seekable OCI Parallel Pull mode for Amazon EKS
In this post, we explore how SOCI Parallel Pull Mode transforms container image pulls through configurable parallelization strategies, addressing performance bottlenecks in both download and unpacking phases. The solution demonstrates significant improvements in pull times, showing nearly 60% acceleration when tested with a 10GB Deep Learning Container image, making it particularly valuable for AI/ML workloads with large, complex images.
Migrate to Amazon EKS: Data plane cost modeling with Karpenter and KWOK
In this post, we demonstrate how to use Karpenter and KWOK to simulate Kubernetes migrations to Amazon EKS, enabling organizations to estimate compute costs before actual migration. The solution involves creating a test environment, backing it up with Velero, restoring it in a new EKS cluster, and analyzing Karpenter’s node provisioning decisions to build accurate cost estimates.
Best practices for resilience and availability on Amazon ECS
In this post, we explore advanced implementation patterns for building highly available services on Amazon ECS, including idempotency, resilience to transient failures, static stability across Availability Zones, deployment safety, and chaos engineering techniques. The post provides detailed guidance on how these patterns can be implemented when deploying applications on Amazon ECS to ensure maximum resilience and availability.
Simplify network connectivity using Tailscale with Amazon EKS Hybrid Nodes
This post guides readers through integrating Tailscale with Amazon EKS Hybrid Nodes to simplify and secure network connectivity between on-premises infrastructure and AWS. The integration enables encrypted point-to-point connections using the WireGuard protocol, creating a peer-to-peer mesh network that streamlines the network architecture needed for EKS Hybrid Nodes.
Scaling beyond IPv4: integrating IPv6 Amazon EKS clusters into existing Istio Service Mesh
Organizations are increasingly adopting IPv6 for their Amazon Elastic Kubernetes Service (Amazon EKS) deployments, driven by three key factors: depletion of private IPv4 addresses, the need to streamline or eliminate overlay networks, and improved network security requirements on Amazon Web Services (AWS). In IPv6-enabled EKS clusters, each pod receives a unique IPv6 address from the […]
Deep dive into cluster networking for Amazon EKS Hybrid Nodes
In this post, we dive deep into cluster networking configurations for Amazon EKS Hybrid Nodes, exploring different Container Network Interface (CNI) options and load balancing solutions to meet various networking requirements. The post demonstrates how to implement BGP routing with Cilium CNI, static routing with Calico CNI, and set up both on-premises load balancing using MetalLB and external load balancing using AWS Load Balancer Controller.
Under the hood: Amazon EKS ultra scale clusters
This post was co-authored by Shyam Jeedigunta, Principal Engineer, Amazon EKS; Apoorva Kulkarni, Sr. Specialist Solutions Architect, Containers and Raghav Tripathi, Sr. Software Dev Manager, Amazon EKS. Today, Amazon Elastic Kubernetes Service (Amazon EKS) announced support for clusters with up to 100,000 nodes. With Amazon EC2’s new generation accelerated computing instance types, this translates to […]
Amazon EKS enables ultra scale AI/ML workloads with support for 100K nodes per cluster
We’re excited to announce that Amazon Elastic Kubernetes Service (Amazon EKS) now supports up to 100,000 worker nodes in a single cluster, enabling customers to scale up to 1.6 million AWS Trainium accelerators or 800K NVIDIA GPUs to train and run the largest AI/ML models. This capability empowers customers to pursue their most ambitious AI […]