Containers

Category: Learning Levels

Deploy production generative AI at the edge using Amazon EKS Hybrid Nodes with NVIDIA DGX

This post demonstrates a real-world example of integrating EKS Hybrid Nodes with NVIDIA DGX Spark, a compact and energy-efficient GPU platform optimized for edge AI deployment. We walk you through deploying a large language model (LLM) for low-latency generative AI inference on premises and setting up node monitoring and GPU observability, with centralized management through Amazon EKS.

Automated deployments with GitHub Actions for Amazon ECS Express Mode

In this post, we will walk you through building an automated deployment pipeline using GitHub Actions. You will create a workflow that triggers on code changes, builds Docker images, pushes them to Amazon ECR, and deploys to Amazon ECS Express Mode using IAM roles for secure authentication. By the end, you will have a continuous integration and continuous delivery (CI/CD) workflow that automatically deploys your application when you push code.

Beyond metrics: Extracting actionable insights from Amazon EKS with Amazon Q Business

In this post, we demonstrate a solution that uses Amazon Data Firehose to aggregate logs from the Amazon EKS control plane and data plane and send them to Amazon Simple Storage Service (Amazon S3). We then use Amazon Q Business and its Amazon S3 connector to synchronize and index the log data in Amazon S3, enabling a chat experience powered by the generative AI capabilities of Amazon Q Business.

Simplify Kubernetes cluster management using ACK, kro and Amazon EKS

In this blog post, we show how to create and manage a fleet of Amazon Elastic Kubernetes Service (Amazon EKS) clusters using Kube Resource Orchestrator (kro), AWS Controllers for Kubernetes (ACK), and Argo CD. These tools allow you to implement a GitOps-based cluster management solution to increase productivity and improve consistency and standardization by using the Kubernetes API for end-to-end operations.

Monitor Amazon ECS Events with Amazon EventBridge Filtering

In this post, we demonstrate how to capture specific Amazon ECS events using EventBridge rules for enhanced monitoring and troubleshooting of your containerized applications. We show you how to customize EventBridge filtering patterns to capture the specific Amazon ECS events that matter for your troubleshooting and monitoring needs.
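As a rough illustration of what such a filtering pattern can look like, the sketch below builds an EventBridge event pattern for stopped ECS tasks and checks it locally against sample events. The choice to filter on stopped tasks is an assumption made for illustration and is not taken from the post. The `matches` helper is hypothetical and supports only exact-value matching; it is not the EventBridge matching engine.

```python
import json

# Event pattern for ECS task state change events whose task has stopped.
# Filtering on STOPPED is an illustrative assumption, not the post's rule.
STOPPED_TASK_PATTERN = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Task State Change"],
    "detail": {"lastStatus": ["STOPPED"]},
}


def matches(pattern, event):
    """Hypothetical local check: exact-value matching on scalar fields only.

    Prefix, numeric, anything-but, and array-valued event fields are
    not handled here.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the matching nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: the event value must equal one of the listed values.
            return False
    return True


def pattern_json(pattern=STOPPED_TASK_PATTERN):
    """Serialize the pattern to the JSON form an EventBridge rule expects."""
    return json.dumps(pattern, indent=2)
```

The JSON returned by `pattern_json()` is the kind of document you would supply as a rule's event pattern; narrowing the `detail` block further (for example, by cluster) reduces the events the rule captures.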

Part 2: Observing and scaling MLOps infrastructure on Amazon EKS

In this post, we focus on observing and scaling ML operations (MLOps) infrastructure on Kubernetes. MLOps platforms running on Amazon EKS provide powerful built-in capabilities for logging, monitoring, and alerting that are essential for maintaining healthy ML systems at scale.

Deep dive: Streamlining GitOps with Amazon EKS capability for Argo CD

In this deep dive, we explore advanced scenarios with Argo CD, including hub-and-spoke multi-cluster deployments, native AWS service integrations, multi-tenancy implementation, scaling with advanced Argo CD configurations, and integration with CI/CD pipelines.