Containers

Category: Technical How-to

Maximizing GPU utilization with NVIDIA’s Multi-Instance GPU (MIG) on Amazon EKS: Running more pods per GPU for enhanced performance

With the surge in Generative Artificial Intelligence (GenAI) and machine learning (ML), GPU-intensive tasks such as ML, graphics rendering, and high-performance computing are becoming increasingly prevalent. However, many of these tasks do not always require the full performance and resources of a high-end GPU. This underutilization of GPU resources leads to inefficiencies, increased costs, and […]
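To make the idea concrete, here is a minimal sketch (not taken from the post) of a pod that requests a single MIG slice instead of a whole GPU. It assumes the NVIDIA device plugin on the node runs with the mixed MIG strategy and advertises profiles such as nvidia.com/mig-1g.5gb; the pod name, image, and profile are placeholders.

# Sketch only: schedule a pod onto a 1g.5gb MIG slice so several pods can share one GPU.
# Assumes the NVIDIA device plugin exposes nvidia.com/mig-1g.5gb on the node (mixed MIG strategy).
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-demo"),  # placeholder name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04",  # placeholder image
                command=["nvidia-smi", "-L"],  # prints the MIG device visible to the container
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/mig-1g.5gb": "1"}  # one MIG slice, not the full GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)

Requesting a named MIG profile rather than nvidia.com/gpu is what lets the scheduler pack several such pods onto one physical GPU.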

Deploy Generative AI Models on Amazon EKS

Introduction Generative Artificial Intelligence (Gen AI) is transforming the way businesses function and is accelerating the pace of innovation. More broadly, AI is changing the way businesses use technology. Generative AI involves tuning and deploying Large Language Models (LLMs) and giving developers access to those models to execute prompts and conversations. Platform […]

Run Spark-RAPIDS ML workloads with GPUs on Amazon EMR on EKS

Introduction Apache Spark revolutionized big data processing with its distributed computing capabilities, which enabled efficient data processing at scale. It offers the flexibility to run on traditional Central Processing Units (CPUs) as well as specialized Graphics Processing Units (GPUs), which provide distinct advantages for various workloads. As the demand for faster and more efficient machine […]
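As a rough illustration of what running Spark on GPUs means in practice (a sketch, not the post's EMR on EKS job configuration), the RAPIDS Accelerator is switched on through Spark properties; on EMR on EKS these would normally be passed in the job's sparkSubmitParameters, and the GPU amounts below are assumptions.

# Sketch: a PySpark session with the RAPIDS Accelerator plugin enabled.
# Assumes the RAPIDS Accelerator jar is on the Spark classpath and executors have GPUs.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("spark-rapids-sketch")
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")  # RAPIDS SQL plugin
    .config("spark.rapids.sql.enabled", "true")
    .config("spark.executor.resource.gpu.amount", "1")      # one GPU per executor (assumption)
    .config("spark.task.resource.gpu.amount", "0.25")       # four tasks share each GPU (assumption)
    .getOrCreate()
)

# A trivial query; with the plugin active, eligible operators run on the GPU.
spark.range(1_000_000).selectExpr("sum(id) AS total").show()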

Improving operational visibility with AWS Fargate task retirement notifications

Introduction AWS Fargate, the serverless compute engine for containerized workloads, removes the undifferentiated heavy lifting of securing and patching the underlying infrastructure. In this blog post, we dive into AWS Fargate task retirement, one of the ways AWS keeps that infrastructure secure and up to date. AWS recently updated the AWS Fargate task retirement […]

Serve distinct domains with TLS powered by ACM on Amazon EKS

Introduction AWS Elastic Load Balancers provide native ingress solutions for workloads deployed on Amazon Elastic Kubernetes Service (Amazon EKS) clusters at both L4 and L7 with Network Load Balancer and Application Load Balancer (ALB). The AWS Load Balancer Controller, formerly called the AWS ALB Ingress Controller, satisfies Kubernetes ingress using ALB and service type load […]
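As a simplified example of that pattern (a sketch, not the post's multi-domain setup), an Ingress handled by the AWS Load Balancer Controller can terminate TLS on the ALB with an ACM certificate; the certificate ARN, host, and service names below are placeholders.

# Sketch: an Ingress that asks the AWS Load Balancer Controller for an internet-facing ALB
# with HTTPS terminated using an ACM certificate. ARN, host, and service are placeholders.
from kubernetes import client, config

config.load_kube_config()

ingress = client.V1Ingress(
    metadata=client.V1ObjectMeta(
        name="app-one",
        annotations={
            "alb.ingress.kubernetes.io/scheme": "internet-facing",
            "alb.ingress.kubernetes.io/listen-ports": '[{"HTTPS":443}]',
            "alb.ingress.kubernetes.io/certificate-arn":
                "arn:aws:acm:us-west-2:111122223333:certificate/EXAMPLE",  # placeholder ARN
        },
    ),
    spec=client.V1IngressSpec(
        ingress_class_name="alb",
        rules=[
            client.V1IngressRule(
                host="app-one.example.com",  # placeholder domain
                http=client.V1HTTPIngressRuleValue(
                    paths=[
                        client.V1HTTPIngressPath(
                            path="/",
                            path_type="Prefix",
                            backend=client.V1IngressBackend(
                                service=client.V1IngressServiceBackend(
                                    name="app-one-svc",  # placeholder Service
                                    port=client.V1ServiceBackendPort(number=80),
                                )
                            ),
                        )
                    ]
                ),
            )
        ],
    ),
)

client.NetworkingV1Api().create_namespaced_ingress(namespace="default", body=ingress)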

Multi-account infrastructure provisioning with AWS Control Tower and AWS Proton

Introduction The majority of enterprise customers tend to establish centralized control and well-architected, organization-wide policies when it comes to distributing cloud resources across multiple teams. These teams are primarily divided into three categories: IT operations, Enterprise Security, and Application (App) development. While delivery of business value from an application standpoint falls under the purview of […]

Using SBOM to find vulnerable container images running on Amazon EKS clusters

Introduction When you purchase a packaged food item in your local grocery store, you probably check the list of ingredients to understand what’s inside and to make sure you aren’t inadvertently consuming ingredients you don’t want or that are known to have adverse health effects. Do you think in a similar way when you […]

Implement custom service discovery for Amazon ECS Anywhere tasks

Introduction Amazon Elastic Container Service (Amazon ECS) is a managed container orchestration service offered by AWS. It simplifies the deployment, management, and scaling of containerized applications using Amazon ECS task definitions through the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS Software Development Kits (AWS SDKs). Customers who need to run containerized workloads, […]

Automating custom networking to solve IPv4 exhaustion in Amazon EKS

Introduction When the Amazon VPC Container Network Interface (CNI) plugin assigns IPv4 addresses to Pods, it allocates them from the VPC CIDR range assigned to the cluster. While this makes Pods first-class citizens within the VPC network, it often leads to exhaustion of the limited number of IPv4 addresses available in the VPC. The long-term […]
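For context on what custom networking configures (a hand-written sketch, not the automation the post builds), the VPC CNI can be told to allocate pod IPs from a secondary CIDR by creating per-Availability-Zone ENIConfig objects; the subnet and security group IDs below are placeholders, and the CNI must also run with AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG set to true.

# Sketch: an ENIConfig pointing pod ENIs at a subnet carved from a secondary VPC CIDR
# (for example 100.64.0.0/16), so pod IPs no longer consume the primary CIDR.
from kubernetes import client, config

config.load_kube_config()

eniconfig = {
    "apiVersion": "crd.k8s.amazonaws.com/v1alpha1",
    "kind": "ENIConfig",
    "metadata": {"name": "us-west-2a"},  # conventionally named after the Availability Zone
    "spec": {
        "subnet": "subnet-0123456789abcdef0",        # placeholder: subnet in the secondary CIDR
        "securityGroups": ["sg-0123456789abcdef0"],  # placeholder security group
    },
}

client.CustomObjectsApi().create_cluster_custom_object(
    group="crd.k8s.amazonaws.com",
    version="v1alpha1",
    plural="eniconfigs",
    body=eniconfig,
)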

Application first delivery on Kubernetes with Open Application Model

This post was co-written with Daniel Higuero, CTO, Napptive. Introduction In the era of cloud-native applications, Kubernetes has emerged as a prominent technology in the container orchestration space. However, using Kubernetes requires users not only to run and manage cluster configurations, cluster-wide add-ons, and auxiliary tooling, but also to understand application deployment configurations (e.g., Deployments, […]
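To show what application-first delivery looks like in OAM terms (a minimal sketch using KubeVela's core.oam.dev/v1beta1 API, not Napptive's platform specifically; the component name and image are placeholders), a single Application object describes the workload by intent instead of raw Deployments and Services.

# Sketch: an OAM Application whose single "webservice" component is rendered into the
# underlying Kubernetes objects by the OAM runtime, hiding Deployment/Service details.
from kubernetes import client, config

config.load_kube_config()

application = {
    "apiVersion": "core.oam.dev/v1beta1",
    "kind": "Application",
    "metadata": {"name": "hello-app", "namespace": "default"},  # placeholder name
    "spec": {
        "components": [
            {
                "name": "frontend",
                "type": "webservice",  # component type provided by the OAM runtime
                "properties": {
                    "image": "nginx:1.25",  # placeholder image
                    "ports": [{"port": 80, "expose": True}],
                },
            }
        ]
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="core.oam.dev",
    version="v1beta1",
    namespace="default",
    plural="applications",
    body=application,
)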