AWS Cloud Operations Blog
Category: Amazon CloudWatch
How Audible used Amazon CloudWatch cross-account observability to resolve severity tickets faster
This blog was co-written with Audible’s Apurva Jatakia, Kaushik S., and David Etler. Audible’s consumption services platform serves thousands of requests every second, and each incoming request is served by a distributed set of microservices owned by different teams. An Audible team, in charge of a platform called Stagg, is responsible for five separate microservices. […]
Build Cloud Operations skills using the new AWS Observability Training
Full-stack observability at AWS includes AWS-native, Application Performance Monitoring (APM), and open-source solutions, giving you the ability to understand what is happening across your technology stack at any time. AWS Observability lets you collect, correlate, aggregate, and analyze telemetry from your network, infrastructure, and applications in cloud, hybrid, or on-premises environments so you can gain […]
Using Amazon CloudWatch metrics to monitor time to expiration for Reserved Instances
This post shows you how to monitor the days remaining for Amazon EC2 Reserved Instances. The solution uses a custom Amazon CloudWatch metric published via an AWS Lambda function. It creates a CloudWatch alarm and an Amazon Simple Notification Service (Amazon SNS) topic that sends a notification when the metric crosses the user-defined threshold. CloudWatch allows you […]
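As a rough illustration of the idea in this excerpt, here is a minimal Python sketch of a Lambda function that publishes days-to-expiration as a custom metric. It is not the code from the post itself: the metric name `DaysToExpiration`, the namespace `Custom/ReservedInstances`, and the helper names are assumptions made up for this example. Only the boto3 calls `describe_reserved_instances` and `put_metric_data` are real APIs.

```python
from datetime import datetime, timezone


def days_remaining(end, now):
    """Whole days left until a reservation ends (never negative)."""
    return max((end - now).days, 0)


def build_metric_data(reservations, now):
    """Build one PutMetricData datum per reservation.

    The metric name below is a hypothetical choice for this sketch.
    """
    return [
        {
            "MetricName": "DaysToExpiration",
            "Dimensions": [
                {"Name": "ReservedInstancesId", "Value": r["ReservedInstancesId"]}
            ],
            "Timestamp": now,
            "Value": float(days_remaining(r["End"], now)),
            "Unit": "Count",
        }
        for r in reservations
    ]


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without the AWS SDK.
    import boto3

    ec2 = boto3.client("ec2")
    cloudwatch = boto3.client("cloudwatch")

    # Only active reservations have a meaningful time to expiration.
    response = ec2.describe_reserved_instances(
        Filters=[{"Name": "state", "Values": ["active"]}]
    )
    now = datetime.now(timezone.utc)
    data = build_metric_data(response["ReservedInstances"], now)
    if data:
        # The namespace is an assumption for this sketch.
        cloudwatch.put_metric_data(
            Namespace="Custom/ReservedInstances", MetricData=data
        )
    return {"published": len(data)}
```

A CloudWatch alarm on this metric would then typically be configured to fire when the value drops below the chosen number of days, with the SNS topic as the alarm action.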
How Capgemini used AWS Systems Manager and AWS cloud native observability to provide self-service logging and analytics
This post was written in collaboration with David Wansell, an Enterprise Cloud Architect at Capgemini with over 20 years of experience across multiple enterprise domains. He designs and builds automation and solutions that enable customers to deliver on their desired outcomes in their cloud adoption journey. Log analysis helps customers to manage infrastructure and applications […]
Monitoring Amazon RDS and Amazon Aurora using Amazon Managed Grafana
Organizations running critical applications on AWS using fully managed database services such as Amazon Relational Database Service (Amazon RDS) and Amazon Aurora rely on robust monitoring to ensure that their databases are performant and cause no service disruptions to their customers. Amazon Managed Grafana is a fully managed and secure data visualization service that you […]
How to Automate Incident Response with PagerDuty and AWS Systems Manager Incident Manager
Incident response is a core operations capability for organizations to develop, and a core element in the AWS Cloud Adoption Framework (AWS CAF). Responding to operations incidents quickly is important to minimize their impacts. Automating incident response helps you scale your capabilities, rapidly reduce the recovery time, and reduce repetitive work by your cloud operations teams. […]
Group Amazon CloudWatch Synthetics canaries for an aggregated view across regions
Customers frequently use CloudWatch canaries to monitor their applications, which enables them to identify issues proactively and resolve them before they reach their end users. Now that the cloud makes it much simpler to expand globally and provision infrastructure in different parts of the world, customers tend to localize their infrastructure to the […]
How CloudWatch cross-account observability helps JPMorgan Chase improve Federated Data Lake Monitoring
AWS best practices guide customers to deploy their applications across multiple AWS accounts to establish security and billing boundaries between teams and to reduce the impact of operational events. As enterprises grow and scale to large numbers of resources, customers often need a unified observability experience to help them search, visualize, and analyze their cross-account telemetry […]
Using Amazon CloudWatch RUM with a React web application in five steps
In this post we will explain how you can use Amazon CloudWatch RUM to monitor a single-page web application built using React. CloudWatch RUM is a real user monitoring (RUM) capability which helps you identify and debug client-side issues and enhance the end user’s digital experience. The data that you can visualize and analyze includes […]
How Thomson Reuters used Amazon CloudWatch to improve availability and operational efficiency of Directory Services
Thomson Reuters Corporation (TR) is a Canadian multinational media company that provides critical online and print information, know-how, decision making tools, software, and services for the legal industry. TR’s Tax and Accounting business serves law firms, tax and accounting firms, global trade organizations, educational institutions, and more. Thomson Reuters operates in more than 100 countries […]