Containers

Category: Technical How-to

Back up and restore your Amazon EKS cluster resources using Velero

In this post, you’ll learn to back up and restore Amazon EKS cluster resources and persistent volume data using Velero. You’ll deploy a sample stateful application, back it up, and restore it to a different namespace within the same cluster. Along the way, you’ll configure least-privilege AWS Identity and Access Management (AWS IAM) roles using Amazon EKS Pod Identity and scope Velero’s Kubernetes permissions with a custom ClusterRole. A ClusterRole is a Kubernetes resource that defines cluster-wide permissions.

Implement centralized observability for multi-account Amazon EKS

This post shows you how to unify your existing Container Insights and CloudWatch data into a centralized monitoring hub using a hub-and-spoke architecture. The result is a single pane of glass that maintains security boundaries while removing the need to switch between accounts and Regions, which reduces incident response time. The solution requires no changes to your existing monitoring infrastructure; it connects what you already have. From one console, you will identify clusters experiencing elevated error rates, spot pod CPU and memory spikes, and track which clusters require version upgrades organization-wide. This visibility helps you add capacity before issues occur.

Cross-Region disaster recovery for Amazon EKS using AWS Backup

In this post, we walk you through a complete cross-Region DR implementation for Amazon EKS using AWS Backup. We deploy a stateful retail store application in a source Region, back it up, copy the backup to a DR Region, and restore the full application, including its persistent data, to a pre-provisioned cluster in the secondary Region. By the end of this walkthrough, you will have a fully functional DR environment with your application running in the secondary Region with all stateful data intact.

Implement SPIFFE/SPIRE authorization on Amazon EKS

In this post, we show you how to implement SPIFFE/SPIRE on Amazon EKS to establish secure service-to-service communication using a nested architecture. You’ll learn how to deploy SPIRE across multiple Amazon EKS clusters, configure workload attestation, and implement fine-grained authorization policies that scale with your infrastructure.

Deploying Model Context Protocol (MCP) servers on Amazon ECS

In this post, we will walk you through a three-tier MCP application deployed entirely on Amazon ECS, using Service Connect for service-to-service communication and Express Mode for automated load balancing, to show how to take an MCP-based workload from concept to production.

Building intelligent knowledge graphs for Amazon EKS operations using AWS DevOps Agent

In this post, we demonstrate how AWS DevOps Agent works, from alert generation through identifying the affected EKS cluster and building knowledge graphs to troubleshooting application or infrastructure issues, ultimately reducing MTTI and MTTR for your Kubernetes operations.

Building PCI DSS-Compliant Architectures on Amazon EKS

In this post, we explore key considerations, best practices, and architectural decisions for hosting applications on Amazon EKS in shared tenancy environments while maintaining PCI DSS compliance. Please note that this information is for reference purposes only and does not constitute legal or compliance advice. Customers remain responsible for making their own independent assessment, and AWS products or services are provided ‘as is’ without warranties, representations, or conditions of any kind.

Deploy production generative AI at the edge using Amazon EKS Hybrid Nodes with NVIDIA DGX

This post presents a real-world example of integrating EKS Hybrid Nodes with NVIDIA DGX Spark, a compact and energy-efficient GPU platform optimized for edge AI deployment. We walk you through deploying a large language model (LLM) for low-latency generative AI inference on premises and setting up node monitoring and GPU observability, with centralized management through Amazon EKS.

Automated deployments with GitHub Actions for Amazon ECS Express Mode

In this post, we will walk you through building an automated deployment pipeline using GitHub Actions. You will create a workflow that triggers on code changes, builds Docker images, pushes them to Amazon ECR, and deploys to Amazon ECS Express Mode using IAM roles for secure authentication. By the end, you will have a continuous integration and continuous delivery (CI/CD) workflow that automatically deploys your application when you push code.