AWS Cloud Operations & Migrations Blog

Category: Management Tools

Identifying resilience drift using AWS Resilience Hub

Most people think of disaster recovery as a mechanism to protect their applications against big events. However, in the fast-paced world of development where new code and infrastructure changes are occurring several times a month, it is important to put mechanisms in place to proactively understand impacts to the resilience posture of your applications. In […]

VTEX scales to 150 million metrics using Amazon Managed Service for Prometheus

VTEX scales to 150 million metrics using Amazon Managed Service for Prometheus

VTEX is a multi-tenant platform with a distributed engineering operation. Observing hundreds of services in real time in an efficient manner is a technical challenge for the business. In this blog, we will show how VTEX created a resilient open source-based architecture aligned with a sharding strategy, using Amazon Managed Service for Prometheus (AMP) to […]

Automating Amazon EC2 Instances Monitoring with Prometheus EC2 Service Discovery and AWS Distro for OpenTelemetry

Traditionally, scraping application Prometheus metrics required manual updates to a configuration file, posing challenges in dynamic AWS environments where Amazon EC2 instances are frequently created or terminated. This not only proves time consuming but also introduces the risk of configuration errors, lacking the agility necessary in dynamic environments. In this blog post, we will demonstrate […]

Monitor your AWS resources on your mobile device with AWS Console Mobile Application

AWS customers are increasingly relying on AWS User Notifications to monitor and get real-time notifications about the AWS resources that are most important to them. The AWS Console Mobile Application can be configured as a notification delivery channel, where users can monitor AWS resources, get detailed resource notifications, diagnose issues, and take remedial actions, from […]

Streamline Platform Engineering using AWS CodeStar Connections with AWS Service Catalog

Introduction AWS Service Catalog and AWS CloudFormation now support Git-sync capabilities to allow Platform Engineers to streamline their DevOps processes by keeping their Infrastructure as Code (IaC) templates in their source control libraries like GitHub and BitBucket. These enhancements help Platform Engineers to more effectively create, version, and manage their Well-Architected patterns with application teams […]

Accelerate troubleshooting with structured logs in Amazon CloudWatch

Accelerate troubleshooting with structured logs in Amazon CloudWatch

Troubleshooting often involves complex analysis across fragmented telemetry data. While alarms on metrics can signal high-level deviations, deeper context often resides in other areas such as log messages, which help uncover the root cause. This disjointed approach not only consumes time and effort, but also inflates telemetry costs. In this post, we’ll showcase how structured […]

Enhance Kubernetes Operational Visibility with AWS Chatbot

Many customers run their mission critical container workloads on Amazon Web Services (AWS)  using Amazon Elastic Kubernetes Service (Amazon EKS). One of the key focus areas for them is to analyze and act on operational events quickly. Getting real-time visibility into performance issues, traffic spikes and infrastructure events can enable teams to quickly address issues and […]

Enhance your AWS cloud infrastructure security with AWS Managed Services (AMS)

Introduction A security or data loss incident can lead to both financial and reputational losses. Maintaining security and compliance is a shared responsibility between AWS and you (our customer), where AWS is responsible for “Security of the Cloud” and you are responsible for “Security in the Cloud”. However, security in the cloud has a much […]

Simplify query authoring in AWS Config advanced queries with natural language query generation

AWS Config advanced queries provide a SQL-based querying interface to retrieve resource configuration metadata of AWS resources and identify resource compliance state. You can use AWS Config advanced queries in a single AWS Account and Region or in a multi-account and cross-region setup with AWS Config configuration aggregators. Writing queries requires you to know SQL […]

How EverQuote Underwent a Serverless Transformation using AWS

This post is co-written with Conor Teer, Senior Software Engineer, at EverQuote, David Kelly, Principal Software Engineer at EverQuote, and Mark O’Connell, SVP of Engineering at EverQuote. EverQuote is a leading online insurance marketplace that helps protect life’s most important assets- family, property, and future by simplifying the experience of shopping for insurance, making it […]