
Amazon EKS

Getting started with Amazon EKS

Choose your own path

Amazon EKS is a fully managed Kubernetes service that makes it easy to run containers at scale on AWS. Whether you're modernizing with microservices, running large-scale machine learning workloads, or building with emerging technologies like generative AI, Amazon EKS helps customers run their mission-critical containerized applications while reducing operational overhead and accelerating innovation. Choose your path to learn how EKS can help you efficiently operate production-grade Kubernetes environments and follow curated steps to get started with your specific use case.

Path 1-0: Agentic AI


Amazon EKS enables two distinct approaches to Agentic AI. First, you can deploy and scale autonomous agents as containerized applications, giving you control over your agent infrastructure. Second, you can streamline Kubernetes operations and application development to enable agents and AI assistants to simplify operations and troubleshoot issues through natural language interactions using Agent2Agent Protocol (A2A) and Model Context Protocol (MCP). This path guides you through both approaches — deploying agents on Amazon EKS and using agentic AI to enhance the Amazon EKS developer and operator experience.

Path 1-1: Deploying agents


Deploy and scale autonomous AI agents on Amazon EKS using the open-source Strands Agents SDK or your preferred agent framework. This approach gives you complete control over your agent infrastructure, allowing you to use any model and customize your implementation. EKS provides production-grade capabilities for running containerized AI agents with high availability and scalability.

Explore the fundamentals of building and deploying AI agents on EKS. Learn about the Strands Agents SDK and how it simplifies agent development, or apply these concepts to your preferred framework. Study a real-life weather-forecasting example to understand how a simple agent can integrate with external APIs, handle streaming responses, and process natural language queries. This example demonstrates key concepts like system prompts, tool integration, and API workflows that you'll need when deploying agents on EKS.
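The pattern behind the weather example can be sketched framework-agnostically: a system prompt, a registered tool, and a loop that decides when to call the tool. The sketch below is illustrative only; the class and function names are hypothetical, and the real Strands SDK API differs.

```python
# Framework-agnostic sketch of an agent with one tool. All names are
# illustrative; a real agent would send the query plus tool schemas to an
# LLM and act on the model's tool-call decision.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str
    fn: Callable[[str], str]


def get_forecast(city: str) -> str:
    # In the real example this would call an external weather API.
    return f"Sunny, 22°C in {city}"


class WeatherAgent:
    def __init__(self) -> None:
        self.system_prompt = "You are a weather assistant. Use tools for live data."
        self.tools = {
            "get_forecast": Tool("get_forecast", "Fetch a city forecast", get_forecast)
        }

    def run(self, query: str) -> str:
        # Stand-in for the model's routing decision.
        if "weather" in query.lower():
            city = query.rsplit(" ", 1)[-1].strip("?")
            return self.tools["get_forecast"].fn(city)
        return "I can only answer weather questions."


agent = WeatherAgent()
print(agent.run("What is the weather in Seattle?"))  # → Sunny, 22°C in Seattle
```

The same shape (prompt, tool registry, routing) carries over to any agent framework you containerize for EKS.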

Follow our step-by-step guide to deploying Strands Agents SDK agents to Amazon EKS. Start by learning how to containerize your agent, set up FastAPI endpoints, implement streaming responses, and package your application with Docker. Use our sample project to understand essential concepts like EKS Auto Mode configuration, Helm deployments, and basic testing. While this guide uses the Strands SDK, the principles apply to deploying any containerized agent on EKS.
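The streaming piece of that setup boils down to a generator that yields response chunks, which a FastAPI endpoint wraps in a `StreamingResponse`. The sketch below keeps the core dependency-free, with the FastAPI wiring shown in comments; the endpoint path and function names are assumptions, not the guide's actual sample code.

```python
# Minimal streaming sketch: the agent yields chunks from a generator.
from typing import Iterator


def stream_answer(query: str) -> Iterator[str]:
    # A real agent would yield tokens as the model produces them.
    for chunk in ["The ", "forecast ", "for ", query, " is sunny."]:
        yield chunk


# FastAPI wiring (sketch, hypothetical route):
#   from fastapi import FastAPI
#   from fastapi.responses import StreamingResponse
#   app = FastAPI()
#   @app.get("/ask")
#   def ask(q: str):
#       return StreamingResponse(stream_answer(q), media_type="text/plain")

print("".join(stream_answer("Seattle")))  # → The forecast for Seattle is sunny.
```

A Dockerfile then packages this app, and a Helm chart deploys it to the cluster as described in the guide.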

Learn to scale and operate your agent deployments reliably in production. Implement automated scaling to handle varying workloads, achieve high availability through backup and failover configurations, and set up comprehensive monitoring using CloudWatch Container Insights. Follow our EKS Best Practices Guide for Running AI/ML Workloads to ensure your agent infrastructure is secure and observable. Take our self-paced Agentic AI on EKS Workshop to get step-by-step guidance for deploying AI agents at scale.
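When you configure automated scaling for agent pods, the Kubernetes Horizontal Pod Autoscaler drives replica counts with a documented proportional formula, sketched here:

```python
# The Horizontal Pod Autoscaler's core algorithm, as documented upstream:
# desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
import math


def hpa_desired_replicas(current_replicas: int,
                         current_value: float,
                         target_value: float) -> int:
    return math.ceil(current_replicas * (current_value / target_value))


# 4 pods averaging 90% CPU against a 60% target → scale out to 6.
print(hpa_desired_replicas(4, 90, 60))  # → 6
```

Understanding this ratio helps you choose metric targets so agent deployments scale out before latency degrades, without thrashing on small metric fluctuations.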

Path 1-2: Agentic operations for Amazon EKS


Transform your Kubernetes operations by providing AI coding assistants with real-time tools and resources through the Amazon EKS MCP server. This equips AI agents to interact directly with your EKS clusters, with contextual guidance and automation through natural language interactions. From cluster creation to troubleshooting, these AI agents help streamline your Kubernetes operations while maintaining AWS best practices.

Learn how the different AWS MCP servers facilitate the interaction between AI models and AWS services and resources. Explore the EKS MCP Server Guide to understand how AI agents can help automate common operational tasks, from cluster management to troubleshooting. Set up your development environment to configure AI assistants like Amazon Q Developer CLI or Cline with the EKS MCP server integration.
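MCP clients such as these assistants are typically wired to a server through a small JSON configuration. The fragment below shows the standard `mcpServers` shape; the launch command and package name are assumptions for illustration, so check the EKS MCP Server Guide for the exact values.

```json
{
  "mcpServers": {
    "eks": {
      "command": "uvx",
      "args": ["awslabs.eks-mcp-server@latest"]
    }
  }
}
```

Once registered, the assistant can discover and invoke the server's EKS tools during a conversation.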

Follow our step-by-step guide to streamlining Kubernetes operations with the Amazon EKS MCP server. Learn to use natural language commands for containerizing and deploying applications on EKS. See this demo to learn how AI agents can help generate Kubernetes manifests, manage cluster resources, and automate deployment workflows using the EKS MCP server's tools.
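To make "generate Kubernetes manifests" concrete, here is the kind of Deployment an agent might produce from a request like "deploy my image with two replicas". The builder function, image name, and labels are placeholders, not output from the actual MCP tools.

```python
# Sketch of the Deployment manifest an agent generates from a
# natural-language request. All names are placeholders.
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


print(json.dumps(deployment_manifest("my-agent", "my-registry/my-agent:v1", 2), indent=2))
```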

Follow our AI-assisted troubleshooting walkthrough with the Amazon EKS MCP server to see how AI agents can help monitor application health and resolve common issues. Through practical examples of debugging pod failures and infrastructure problems, learn to use natural language queries to check CloudWatch metrics, analyze logs, and diagnose problems. This hands-on guide demonstrates how AI assistance can help you leverage Amazon CloudWatch and other AWS services to maintain healthy applications on EKS.

Path 2-0: Generative AI


The generative AI landscape is evolving rapidly, with organizations building, deploying, and scaling diverse AI/ML workloads for use cases ranging from distributed model training and fine-tuning to large-scale inference deployments. Customers including Anthropic and Adobe are choosing Amazon EKS to get fine-grained control over compute resources while maintaining operational efficiency. Check out this guide for an overview of why customers choose EKS for AI/ML for common use cases such as model training and deployment, retrieval-augmented generation (RAG), and inference.

Path 2-1: Model deployment and inference


Amazon EKS enables production-grade inference deployments with support for GPU optimization, multi-model serving, and automated scaling. Organizations can leverage their existing EKS expertise and operational practices to quickly deploy and manage inference workloads alongside other applications. Through integration with open-source tools and the breadth of accelerators on AWS, companies like Vannevar Labs and Omi have achieved significant cost reductions and performance improvements while maintaining operational consistency across their infrastructure.

Learn the infrastructure and architecture fundamentals for deploying inference workloads on EKS in this solution guide, which covers key topics like GPU support, model serving patterns, and resource optimization. Explore the open-source AI on EKS project, which provides ready-to-deploy blueprints, such as a scalable LLM inference service, with infrastructure-as-code templates for production deployment.

Start with our Best Practices Cluster Setup Guide for Real-Time Inference to create an EKS cluster optimized for production inference workloads. Deploy models using our production-ready AI on EKS inference charts, which provide Helm charts and infrastructure-as-code templates for popular frameworks like vLLM and NVIDIA Triton. For traditional ML workloads and deployment patterns, check out the AWS Deep Learning Containers Developer Guide for CPU- and GPU-based inference.
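Once a vLLM service is running, clients talk to it through its OpenAI-compatible HTTP API. The sketch below builds such a request using only the standard library; the in-cluster service hostname and model ID are assumptions for illustration.

```python
# Build a chat-completion request for a vLLM OpenAI-compatible endpoint.
# The base URL and model name below are placeholder assumptions.
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://vllm.default.svc.cluster.local:8000",
                         "meta-llama/Llama-3-8B-Instruct", "Hello!")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (from inside the cluster, or via port-forwarding) returns the standard chat-completion JSON response.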

Follow our hands-on workshops to deploy inference workloads on EKS using your choice of accelerator: the NVIDIA-based workshop for GPU-based inference, and the AWS Neuron-based workshop for Inferentia and Trainium accelerators. Both workshops cover essential tasks like device plugin setup, resource management, and monitoring. Refer to the comprehensive EKS Best Practices Guide for AI/ML workloads to ensure your inference deployments follow proven patterns for compute, networking, storage, and observability. These guides serve as ongoing references as you operate and evolve your inference architecture on EKS.

Path 3-0: No use case in mind?


New to Amazon EKS? Follow the steps in this path and set up your first Kubernetes cluster in just a few minutes.

With Amazon EKS, you can be set up and launching containers in minutes by logging into the AWS Management Console.
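Outside the console, a common way to create a first cluster is a declarative `eksctl` config file. The fragment below is a minimal sketch; the cluster name, region, and instance sizing are placeholder assumptions to adapt to your account.

```yaml
# cluster.yaml — minimal eksctl cluster definition (placeholder values).
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-first-cluster
  region: us-west-2
managedNodeGroups:
  - name: default
    instanceType: m5.large
    desiredCapacity: 2
```

Running `eksctl create cluster -f cluster.yaml` provisions the control plane and a managed node group in one step.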

Learn how Amazon EKS works through the Containers from the Couch video series and the Containers Blog.

Learn how to deploy workloads and add-ons to Amazon EKS with sample deployments on Linux and Windows.

Core concepts

The future of Kubernetes on AWS

How to build scalable platforms with Amazon EKS

Amazon EKS hybrid nodes for edge and hybrid use cases

Automate your entire Kubernetes cluster with Amazon EKS Auto Mode

Networking strategies for Kubernetes

Building production-grade resilient architectures with Amazon EKS