Distributed machine learning with Amazon ECS

Running distributed machine learning (ML) workloads on Amazon Elastic Container Service (Amazon ECS) allows ML teams to focus on creating, training, and deploying models rather than managing the container orchestration engine. With a simple architecture, transparent control plane upgrades, and native AWS Identity and Access Management (IAM) authentication, Amazon ECS provides a great environment […]

How Grover saves costs with 80% Spot in production using Karpenter with Amazon EKS

This post is co-written with Suraj Nair, Sr. DevOps Engineer at Grover. Introduction: Grover is a Berlin-based global leader in technology rentals, enabling people and empowering businesses to subscribe to tech products monthly instead of buying them. As a pioneer in the circular economy, Grover’s business model of renting out and refurbishing tech products results […]

Monitoring Windows pods with Prometheus and Grafana

This post was co-authored by Cezar Guimarães, Sr. Software Engineer, VTEX. Introduction: Customers across the globe are increasingly adopting Amazon Elastic Kubernetes Service (Amazon EKS) to run their Windows workloads. This is because customers have found that refactoring existing Windows-based applications for an open-source environment, while ideal, is a very complex task. It […]

How VMware Tanzu CloudHealth modernized container workloads from self-managed Kubernetes to Amazon Elastic Kubernetes Service

This post is co-written with Rivlin Pereira, Staff DevOps Engineer at VMware. Introduction: VMware Tanzu CloudHealth is the cloud cost management platform of choice for more than 20,000 organizations worldwide that rely on it to optimize and govern the largest and most complex multi-cloud environments. In this post, we will talk about how VMware Tanzu […]

How Perry Street Software Implemented Resilient Deployment Strategies with Amazon ECS

This post was co-authored by Ben Duffield and Eric Silverberg at Perry Street Software, with contributions from Adam Tucker, Piotr Wald, and Cristian Constantinescu of PSS. Introduction: You just finished deploying that important change you spent weeks preparing, when you see this email subject in your inbox: Alarm: HTTPCode_Target_5XX_Count. Ugh. The code you have just […]

Configure Amazon EKS for environmental sustainability

Introduction: Sustainable cloud design requires understanding and minimizing the impacts of architectural decisions. With conscientious cloud architecture, we can innovate rapidly while treading lightly on our shared environment. As cloud computing becomes ubiquitous, it’s imperative that we build sustainable cloud architectures that minimize environmental impacts. While cloud economies of scale improve efficiency, our design choices […]

Enabling mTLS with ALB in Amazon EKS

Introduction: In today’s interconnected world, communication faces evolving security threats. From sensitive financial transactions in online banking to secure data transmissions in the automobile industry, ensuring trust and authenticity between businesses is becoming increasingly critical. This is where Mutual Transport Layer Security (mTLS) can offer enhanced security through advanced […]

Deep dive into Amazon EKS scalability testing

Introduction: The “Elastic” in Amazon Elastic Kubernetes Service (Amazon EKS) refers to the ability to “acquire resources as you need them and release resources when you no longer need them”. Amazon EKS should scale to handle almost all workloads, but we often hear questions from Amazon EKS customers like: “What is the maximum number of […]

Build preview environments for Amazon ECS applications with AWS Copilot

Introduction: In software development, evaluating every code change immediately and deploying pull requests to active environments for preview and feedback are essential. This practice is instrumental in reducing post-deployment issues and operational disruptions, underscoring the need for dedicated preview environments. Without these environments, the risk of merging unassessed features into the […]

Secure Amazon Elastic Container Service workloads with Amazon ECS Service Connect

Introduction: With this release, Amazon Elastic Container Service (Amazon ECS) integrates with AWS Private Certificate Authority (CA) and automates the process of issuing, distributing, and rotating certificates, which makes it simple for customers to secure traffic between services without adding extra operational workload. Now Amazon ECS Service Connect customers can encrypt service-to-service communication using Transport […]