AWS Cloud Operations & Migrations Blog
Category: Monitoring and observability
Simplify your canary by batching multiple URLs in Amazon CloudWatch Synthetics
Learn with Shree how to simplify your canary by batching multiple URLs in Amazon CloudWatch Synthetics.
Collect, aggregate, and analyze Rancher Kubernetes Cluster logs with Amazon CloudWatch
Rancher is a popular open-source container management tool, used by many organizations, that provides an intuitive user interface for deploying and managing Kubernetes clusters on Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Compute Cloud (Amazon EC2). When Rancher deploys Kubernetes onto nodes in Amazon EC2, it uses Rancher Kubernetes Engine (RKE), which is Rancher’s […]
SNMP monitoring using Amazon CloudWatch and Elastic Logstash
Customers want a single pane of glass for their systems operations where they can visualize the health and performance of applications running in several AWS Regions and in their on-premises environment. Simple Network Management Protocol (SNMP) is an internet standard protocol for collecting and organizing information about managed devices on IP networks and for modifying […]
Amazon Managed Grafana is now Generally Available
At re:Invent 2020, we introduced Amazon Managed Grafana and made it available in preview. Since then, we’ve been working on numerous enhancements that were made available during preview. Now we’re excited to launch Amazon Managed Grafana in General Availability (GA), and with this post we’ll lay out exactly what this means. […]
Amazon Managed Grafana supports direct SAML integration with identity providers
In response to customer requests, Amazon Managed Grafana now supports direct Security Assertion Markup Language (SAML) 2.0 integration, without the need to go through AWS Identity and Access Management (AWS IAM) or AWS Single Sign-On (AWS SSO). SAML authentication support enables you to use your existing identity provider to offer single sign-on for logging into […]
Enhance CloudWatch metrics with metric math functions
In June 2021, the Amazon CloudWatch team launched 14 new metric math functions. In this blog post, I’ll describe these new functions and show how you can use them to enhance your existing CloudWatch metrics, dashboards, and alarms. Metrics are an important part of observability and monitoring. A numerical representation of data measured over time, […]
Create fine-grained CloudWatch canary schedules with cron expressions
In this post, I’ll explain how to create fine-grained canary schedules to meet your business requirements using built-in cron expression scheduling in Amazon CloudWatch Synthetics. You can use CloudWatch Synthetics to create canaries, configurable scripts that run on a schedule, to monitor your endpoints and APIs. Because canaries follow the same routes and perform the […]
Implement operations observability in landing zone environments
In an earlier blog post, Automate customized deployment of cross-account/cross-region CloudWatch dashboards using tags, we showed you how to implement Amazon CloudWatch dashboards for specific events with automation. This solution is great for seasonal events, holidays, important releases, and other use cases. In this blog post, we will review a landing zone environment and share a […]
Use Amazon EventBridge rules to run AWS Systems Manager automation in response to CloudWatch alarms
Since its launch in 2009, Amazon CloudWatch has become the cloud-native choice for a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view […]
Improve monitoring of AWS Systems Manager Agent
The ability to present a single pane of glass simplifies the process of tracking and controlling IT systems. Enterprises that run workloads on AWS use AWS Systems Manager because of its security, ease of management, and centralized reporting. When an agent loses connection to the management platform, you can lose visibility into system behavior and […]