AWS Cloud Operations & Migrations Blog

Category: Monitoring and observability

Collect, aggregate, and analyze Rancher Kubernetes Cluster logs with Amazon CloudWatch

Collect, aggregate, and analyze Rancher Kubernetes Cluster logs with Amazon CloudWatch

Rancher is a popular open-source container management tool utilized by many organizations that provides an intuitive user interface for managing and deploying the Kubernetes clusters on Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Compute Cloud (Amazon EC2). When Rancher deploys Kubernetes onto nodes in Amazon EC2, it uses Rancher Kubernetes Engine (RKE), which is Rancher’s […]

SNMP monitoring using Amazon CloudWatch and Elastic Logstash

SNMP monitoring using Amazon CloudWatch and Elastic Logstash

Customers want a single pane of glass for their systems operations where they can visualize the health and performance of applications running in several AWS Regions and in their on-premises environment. Simple Network Management Protocol (SNMP) is an internet standard protocol for collecting and organizing information about managed devices on IP networks and for modifying […]

Amazon Managed Grafana is now Generally Available

Amazon Managed Grafana is now Generally Available

At re:Invent 2020, we introduced Amazon Managed Grafana and made it available in preview. Since then, we’ve been working on numerous enhancements that were made available during preview. Now we’re excited to launch Amazon Managed Grafana in General Availability (GA), and with this post we’ll lay out exactly what this means. Figure 1: List of […]

Amazon Managed Grafana supports direct SAML integration with identity providers

Amazon Managed Grafana supports direct SAML integration with identity providers

In response to customer requests, Amazon Managed Grafana now supports direct Security Assertion Markup Language (SAML) 2.0 integration, without the need to go through AWS Identity and Access Management (AWS IAM) or AWS Single Sign-On (AWS SSO). SAML authentication support enables you to use your existing identity provider to offer single sign-on for logging into […]

Enhance CloudWatch metrics with metric math functions

Enhance CloudWatch metrics with metric math functions

In June 2021, the Amazon CloudWatch team launched 14 new metric math functions. In this blog post, I’ll describe these new functions and show how you can use them to enhance your existing CloudWatch metrics, dashboards, and alarms. Metrics are an important part of observability and monitoring. A numerical representation of data measured over time, […]

Create fine-grained CloudWatch canary schedules with cron expressions

Create fine-grained CloudWatch canary schedules with cron expressions

In this post, I’ll explain how to create fine-grained canary schedules to meet your business requirements using built-in cron expression scheduling in Amazon CloudWatch Synthetics. You can use CloudWatch Synthetics to create canaries, configurable scripts that run on a schedule, to monitor your endpoints and APIs. Because canaries follow the same routes and perform the […]

Implement operations observability in landing zone environments

Implement operations observability in landing zone environments

In an earlier blog post, Automate customized deployment of cross-account/cross-region CloudWatch dashboards using tags, we showed you how to implement Amazon CloudWatch dashboards for specific events with automation. This solution is great for seasonal events, holidays, important releases, and other use cases. In this blog post, we will review a landing zone environment and share a […]

Use Amazon EventBridge rules to run AWS Systems Manager automation in response to CloudWatch Alarms

Use Amazon EventBridge rules to run AWS Systems Manager automation in response to CloudWatch alarms

Since its launch in 2009, Amazon CloudWatch has become the cloud-native choice for a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view […]

Improve monitoring of AWS Systems Manager Agent

Improve monitoring of AWS Systems Manager Agent

The ability to present a single pane of glass simplifies the process of tracking and controlling IT systems. Enterprises that run workloads on AWS use AWS Systems Manager because of its security, ease of management, and centralized reporting. When an agent loses connection to the management platform, you can lose visibility into system behavior and […]