AWS Cloud Operations Blog
Getting Started with CloudWatch agent and collectd
Observability helps you understand the health, usage, performance, and customer experience for your workloads. Observability can support many use cases, from detecting incidents and supporting incident resolution, to understanding the impact of new features on your users and workflow. Establishing the right solution depends on being able to gather the right data for your situation. […]
Monitoring the status of Windows services with Amazon CloudWatch
When you have an application that relies on a specific Windows service being up and running, knowing the status of this service can be a useful part of your observability solution. This service status data can be displayed on dashboards, used to create alarms, or used to trigger automated resolutions. This post presents a solution […]
Visualizing Amazon CloudWatch Costs – Part 2 – Where does the data come from?
In part 1 of this series we explored an Amazon CloudWatch dashboard which provides a real-time view of some of the typical main contributors to CloudWatch costs. In this second post, we’ll look at how the CloudWatch dashboard widgets were created so that you can learn how to create something similar, or modify the widgets […]
Visualizing Amazon CloudWatch Costs – Part 1
Amazon CloudWatch monitors your AWS resources and the applications you run on AWS in real time. You can use CloudWatch to collect metrics, logs, and traces, set up alarms, create synthetic checks, and more. The information you collect lets you observe, validate, and alert on areas of interest to you. In this two-part post, we’ll explore a […]
Extending and exploring alarm history in Amazon CloudWatch – part 1
Alarm history data can be invaluable in diagnosing trends, impacts, and root causes for issues in your application. In this two-part blog series, we will demonstrate how to move beyond the standard 14-day alarm history, and turn your Amazon CloudWatch alarm state changes into logs and metrics that you can graph on your CloudWatch […]
Extending and exploring alarm history in Amazon CloudWatch – part 2
In part 1 of this blog series, we demonstrated how to utilize an Amazon EventBridge rule to create Amazon CloudWatch logs and metrics from a change in state of your CloudWatch alarms. To diagnose trends, impacts, and root causes, you may want to see trends in alarm history or visualize this data alongside other CloudWatch […]
Operational insights in Systems Manager OpsCenter help you identify duplicate issues and noisy event sources
If you use AWS Systems Manager OpsCenter, you might be familiar with the challenges of large numbers of OpsItems. When the same problem causes the creation of a significant number of OpsItems, it can be hard to see that these OpsItems are in fact the result of a single issue. It can also be difficult […]



