AWS Cloud Operations Blog

Category: Compute

Optimize your cloud deployments with Prioritized Trusted Advisor recommendations in your operational workflows

AWS Trusted Advisor Priority helps you focus on the most important recommendations for optimizing your cloud deployments, improving resilience, and addressing security gaps. As an AWS Enterprise Support customer, you gain access to prioritized and context-driven recommendations, curated both by your AWS account team and machine-generated checks from AWS services. Note: AWS Trusted Advisor Priority […]

Gain operational insights for NVIDIA GPU workloads using Amazon CloudWatch Container Insights

As machine learning models grow more advanced, they require extensive computing power to train efficiently. Many organizations are turning to GPU-accelerated Kubernetes clusters for both model training and online inference. However, properly monitoring GPU usage is critical for machine learning engineers and cluster administrators to understand model performance and to optimize infrastructure utilization. Without visibility […]

Get Disk Utilization of Your Fleet Using AWS Systems Manager Custom Inventory Types

Some of my customers need assistance while operating their Amazon Elastic Compute Cloud (Amazon EC2) infrastructure. They need to: Review the disk usage of various volumes/disks within an EC2 instance. To do it in a scalable way, one does not need to access the instance either through a Remote Desktop Session (RDP) or use […]

Accelerate VMware Migrations to AWS using AWS Migration Hub Journeys

In January 2024, we introduced Migration Hub Journeys to guide and accelerate the migration and modernization of applications. Journeys help optimize planning, execution, and tracking through task-based templates with expert guidance, specialized tools, and cross-team collaboration, enabling you to migrate and modernize applications seamlessly. Today, we’re excited to publish new migration journey templates for AWS […]

Enabling Self Service for Cloud Custodian policies on AWS using AWS Service Catalog

Customers are increasingly seeking tools and solutions that can help them achieve their desired outcomes more efficiently and effectively. In the context of cloud management, the need for self-service capabilities has become more pronounced as organizations strive to optimize their cloud resources, improve security, and enhance their overall cloud operations. AWS Service Catalog offers the […]

Enhancing observability with a managed monitoring solution for Amazon EKS

Keeping a watchful eye on your Kubernetes infrastructure is crucial for ensuring optimal performance, identifying bottlenecks, and troubleshooting issues promptly. In the ever-evolving world of cloud-native applications, Amazon Elastic Kubernetes Service (EKS) has emerged as a popular choice for deploying and managing containerized workloads. However, monitoring Kubernetes clusters can be challenging due to their […]

Event Driven Architecture using Amazon EventBridge - Part 1

This post is co-authored with Andy Suarez and Kevin Breton (from KnowBe4). For any successful growing organization, there comes a point when the technical architecture struggles to meet the demands of an expanding and interconnected business environment. The increasing complexity and technical debt in legacy systems create pain points that constrain innovation. To overcome these […]

How to automate application log ingestion from Amazon EKS on Fargate into AWS CloudTrail Lake

Customers often look for options to capture and centrally store application logs from Amazon Elastic Kubernetes Service on Fargate (Amazon EKS on Fargate) Pods to investigate root causes or analyze security incidents. Customers also like the capability to easily query the logs to assist with security investigations. In this blog post, we show you […]

Migrate VMware virtual machines (VMs) to Amazon EC2 with the AWS Application Migration Service Replication Agent

In this blog post, we will walk you through the step-by-step process of completing VMware virtual machine (VM) migrations to Amazon Elastic Compute Cloud (Amazon EC2) using AWS Application Migration Service. Moreover, we will show how to apply a custom post-launch action script to remove proprietary VMware tools from the migrated VMs. Migrating on-premises […]

How StormForge reduces complexity and ensures scalability with Amazon Managed Service for Prometheus

This blog post was co-written by Brent Eager, Senior Software Engineer, StormForge. StormForge is the creator of Optimize Live, a Kubernetes vertical rightsizing solution that is compatible with the Kubernetes HorizontalPodAutoscaler (HPA). Using cluster-based agents, machine learning, and Amazon Managed Service for Prometheus, Optimize Live is able to continuously calculate and apply optimal resource requests, […]