AWS Cloud Operations Blog

Category: Monitoring and observability

Use Amazon CloudWatch Internet Monitor for greater visibility into online experiences

Today, millions of internet users access applications hosted globally across 167,000 cities served by over 74,000 autonomous systems (ASNs). Tracking constantly changing network routes can be a daunting task for Site Reliability Engineers (SREs), application developers, network operators, systems engineers, and cloud solutions architects. With Amazon CloudWatch Internet Monitor, teams can quickly identify the network […]

Visualize and gain insights into your VPC Flow logs with Amazon Managed Grafana

Modern IT infrastructure in the cloud is becoming increasingly distributed and data intensive. With the growing number of devices, applications, and users consuming services, the amount of data transmitted across networks is increasing rapidly. This increase requires organizations to have visibility into their network traffic. Analysis of network traffic can help in […]

How Hapag-Lloyd established observability for serverless multi-account workloads

This post is co-authored by Grzegorz Kaczor from Hapag-Lloyd AG and Michael Graumann and Daniel Moser from AWS. Establishing observability over the state, performance, health, and security posture of applications is key to successfully operating multi-account workloads in the cloud. As the number and size of workloads increase, finding and correlating all available information […]

How CloudWatch cross-account observability helps JPMorgan Chase improve Federated Data Lake Monitoring

AWS best practices guide customers to deploy their applications across multiple AWS accounts to establish security and billing boundaries between teams and to reduce the impact of operational events. As enterprises grow and scale to large numbers of resources, customers often need a unified observability experience to help them search, visualize, and analyze their cross-account telemetry […]

Top 10 AWS Cloud Operations and Migrations Blog posts of 2022

With 2022 behind us, we want to take the opportunity to highlight our readers and the top blog posts from 2022. A big thank you to all our readers, and to our authors, who continue to work on delighting our customers with their blog posts. #1 Announcing AWS CloudTrail Lake – a managed audit and […]

Monitoring the status of Windows services with Amazon CloudWatch

When you have an application that relies on a specific Windows service being up and running, knowing the status of this service can be a useful part of your observability solution. This service status data can be displayed on dashboards, used to create alarms, or used to trigger automated resolutions. This post presents a solution […]

Visualizing Amazon CloudWatch Costs – Part 2 – Where does the data come from?

In part 1 of this series we explored an Amazon CloudWatch dashboard which provides a real-time view of some of the typical main contributors to CloudWatch costs. In this second post, we’ll look at how the CloudWatch dashboard widgets were created so that you can learn how to create something similar, or modify the widgets […]

Visualizing Amazon CloudWatch Costs – Part 1

Amazon CloudWatch monitors your AWS resources and the applications you run on AWS in real time. You can use CloudWatch to collect metrics, logs, and traces, set up alarms, create synthetic checks, and more. The information you collect lets you observe, validate, and alert on areas of interest to you. In this two-part post, we’ll explore a […]

Know Before You Go – AWS re:Invent 2022 Monitoring & Observability

Whether you are building out applications in the cloud, modernizing your environment, or migrating workloads, observability is vital to your success. Monitoring and observability provide operational visibility and insight into your workloads and are crucial to operational excellence. AWS Observability will be at re:Invent 2022 to share how you can leverage observability for your organization. […]

How to Monitor Databricks with Amazon CloudWatch

This post was written by Lei Pan and Sajith Appukuttan from Databricks. In this post, we look closely at monitoring and alerting systems – both critical components of any production-level environment. We’ll start with a review of the key reasons why engineers should build a monitoring/alerting system for their environment, the benefits, as well as […]