AWS Cloud Operations & Migrations Blog

Tag: observability

How Audible used Amazon CloudWatch cross-account observability to resolve severity tickets faster

This blog was co-written with Audible’s Apurva Jatakia, Kaushik S., and David Etler. Audible’s consumption services platform serves thousands of requests every second, and each incoming request is served by a distributed set of microservices owned by different teams. An Audible team, in charge of a platform called Stagg, is responsible for five separate microservices. […]

Announcing inbound network access control in Amazon Managed Grafana

Many customers that use Amazon Managed Grafana have a need to restrict the Grafana workspace public access and enable fine-grained control to allow which traffic sources can reach the Grafana workspace. Today, we are announcing Amazon Managed Grafana’s new feature that supports inbound network access control. This enables you to secure Grafana workspaces using VPC […]

How CloudWatch cross-account observability helps JPMorgan Chase improve Federated Data Lake Monitoring

AWS best practices guide customers to deploy their applications across multiple AWS accounts to establish security and billing boundary between teams and to reduce the impact of operational events. As enterprises grow and scale with tons of resources, customers often need a unified observability experience to help them search, visualize, and analyze their cross-account telemetry […]

Title of blog on box image

What’s new in AWS Observability at re:Invent 2022

Kick off your AWS re:Invent 2022 week with a round-up of the AWS Observability launches across Amazon CloudWatch, AWS X-Ray, Amazon Managed Grafana, and Amazon Managed Service for Prometheus. From understanding impact of internet issues on your application performance and availability with CloudWatch, to VPC support and Prometheus alerting in Managed Grafana, read on to […]

How to develop an Observability strategy – Part 2

Your observability strategy starts with your business. “Observability” describes how well you can understand what’s happening in a system. Developing an observability strategy isn’t a one-time effort. It’s a continuous improvement effort that occurs throughout the lifecycle of your workloads. It enables your teams to determine whether or not the workloads they design and run […]

Viewing collectd statistics with Amazon Managed Service for Prometheus and Amazon Managed Service for Grafana

Monitoring systems are essential for a resilient solution. A popular tool to monitor Linux-based physical or virtual machines is collectd – a daemon to collect system and application performance metrics periodically. However, collectd doesn’t provide long-term storage for metrics, rich querying, visualization, or an alerting solution. The Amazon Managed Service for Prometheus is a serverless […]

Amazon Athena, Amazon Redshift Plugins and New Features in Amazon Managed Grafana

During late August 2021, we made Amazon Managed Grafana generally available, and around re:Invent we launched some new features, specifically for new plugins. This post provides you with the high-level overview and shows you some of them in action. Amazon Managed Grafana is a fully managed service that handles the provisioning, setup, scaling, and maintenance […]

How BT uses Amazon CloudWatch to monitor millions of devices

In this guest post, Ciaran Kearney, Data Engineer at multinational telecommunications company BT discusses how BT built a monitoring solution using Amazon CloudWatch dashboards, composite alarms, and embedded metric format to support the monitoring of millions of devices. Customers with high-cardinality monitoring use cases often face challenges when it comes to implementing observability. Monitoring high-cardinality workloads […]

Getting Started with Amazon Managed Service for Prometheus

4/9/2021 – Updated the Prometheus server deployment setup part by removing the AWS SigV4 side-car proxy container. This is no longer needed as the Prometheus server now directly signs requests made to the AMP remote write API. Amazon Managed Service for Prometheus (AMP) is a Prometheus-compatible monitoring service for container infrastructure and application metrics for […]

Featured Image

Cross-Region application monitoring using Amazon CloudWatch Synthetics and AWS CloudFormation

Customers need a way to find problems with their application before the real end users encounter them. They need to predict how their application will perform in supported geographies and isolate the root cause of any detected bottlenecks. Synthetic monitoring allows customers to emulate business processes or user transactions from different geographies and monitor their […]