AWS Cloud Operations Blog
Category: Management & Governance
VTEX scales to 150 million metrics using Amazon Managed Service for Prometheus
VTEX is a multi-tenant platform with a distributed engineering operation. Observing hundreds of services in real time in an efficient manner is a technical challenge for the business. In this blog, we will show how VTEX created a resilient open source-based architecture aligned with a sharding strategy, using Amazon Managed Service for Prometheus (AMP) to […]
Automating Amazon EC2 Instances Monitoring with Prometheus EC2 Service Discovery and AWS Distro for OpenTelemetry
Traditionally, scraping application Prometheus metrics required manual updates to a configuration file, posing challenges in dynamic AWS environments where Amazon EC2 instances are frequently created or terminated. This not only proves time consuming but also introduces the risk of configuration errors, lacking the agility necessary in dynamic environments. In this blog post, we will demonstrate […]
Monitor your AWS resources on your mobile device with AWS Console Mobile Application
AWS customers are increasingly relying on AWS User Notifications to monitor and get real-time notifications about the AWS resources that are most important to them. The AWS Console Mobile Application can be configured as a notification delivery channel, where users can monitor AWS resources, get detailed resource notifications, diagnose issues, and take remedial actions, from […]
Accelerate troubleshooting with structured logs in Amazon CloudWatch
Troubleshooting often involves complex analysis across fragmented telemetry data. While alarms on metrics can signal high-level deviations, deeper context often resides in other areas such as log messages, which help uncover the root cause. This disjointed approach not only consumes time and effort, but also inflates telemetry costs. In this post, we’ll showcase how structured […]
Enhance your AWS cloud infrastructure security with AWS Managed Services (AMS)
Introduction A security or data loss incident can lead to both financial and reputational losses. Maintaining security and compliance is a shared responsibility between AWS and you (our customer), where AWS is responsible for “Security of the Cloud” and you are responsible for “Security in the Cloud”. However, security in the cloud has a much […]
Optimize AWS Resource Management with Tag Inventory Reports leveraging AWS Resource Explorer
Customers are increasingly seeking an efficient solution to manage their expanding AWS resources, spanning AWS accounts and Regions, amidst changes like mergers, acquisitions, and cloud migrations. AWS Tags offer an effective solution for organizing, identifying, and filtering resources by categorizing them based on criteria such as purpose, owner, or environment. AWS customers would like to […]
Leveraging custom AWS Config rules to optimize cost saving on AWS
AWS Config assesses, audits, and evaluates the configurations and relationships of your resources in your AWS account. Why might we want to use this service for cost optimization? Well consider a scenario where we can be alerted if a specific Amazon Relational Database Service (Amazon RDS) instance is deployed in the account. If a larger […]
Implementing automated and centralized tagging controls with AWS Config and AWS Organizations
Introduction This blog post is for customers who want to implement automated tagging controls and strategy for cost allocation. Customers want to centralize and maintain consistency for tags across AWS Organizations so they are available outside their AWS environment (e.g. in build scripts, etc.) or enforce centralized conditional tagging on existing and new AWS resources […]
From Planning to Execution – Harnessing AWS Migration Hub Journeys to Accelerate Migrations and Modernization
Cloud migrations and modernization are a lengthy, intricate, and continually evolving processes. Despite this, McKinsey studies indicate that customers are increasing cloud budgets and the number of applications that they plan to migrate. One of the primary complexities of migration and modernization projects are that collaboration with stakeholders can be cumbersome, relying on random ad-hoc […]
Monitoring Windows services with Amazon CloudWatch
If you run Windows workloads on Amazon Elastic Compute Cloud (Amazon EC2), monitoring the health and performance of your Windows Services is essential for reliable systems administration. It’s not just about ensuring uptime; it’s about having a pulse on your system’s health and performance. With a variety of services operating in the background, each playing […]