AWS Cloud Operations Blog

Category: Management Tools

Exploring AWS Config data using Amazon Athena and Amazon Managed Grafana

This post is co-written with Jacob Rickerd, Principal Security Engineer at Attentive. The post walks through an example dashboard that Attentive, an AI-powered mobile marketing platform, uses for resource inventory, serving as a starting point for you to build comprehensive dashboards tailored to your environment and tag policies. Attentive is the AI-powered SMS and email […]

Support for Amazon CloudWatch Evidently ending soon

After careful consideration, we have made the decision to discontinue CloudWatch Evidently, effective 10/17/2025. Active customers will be able to use the service as normal until 10/17/2025, when support for the service will end. During this period, we will continue to provide critical security patches, but will no longer support any limit increase requests. On […]

Streamline change processes ­and improve governance with AWS Well-Architected

The AWS Well-Architected Framework (WA Framework) is designed to help cloud architects build secure, resilient, high-performing, and efficient workloads on AWS. It is structured around six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability. Figure 1. The pillars of AWS Well-Architected Framework This post provides insights on how to streamline your change-management […]

Centralized monitoring and alerting for AWS Systems Manager Agent status on managed nodes across AWS Organization

Has the AWS Systems Manager Agent (SSM Agent) running on your critical servers on-premises or on Amazon Elastic Compute Cloud (Amazon EC2) lost healthy connection to AWS Systems Manager (SSM) for some reason and you wanted to be proactively notified when this happens? Do you wish to improve observability of your SSM Agent status and […]

Analyzing your custom metrics spend contributors in Amazon CloudWatch

With an ever-growing volume of custom metrics in Amazon CloudWatch, customers often find it difficult to understand and manage their spend on this service. One of the most common questions they have is how to identify which metrics contribute the most to their spend in CloudWatch. This blog post introduces a solution that lets you […]

Enhanced dashboard, latency suggestions in Amazon CloudWatch Internet Monitor

Amazon CloudWatch Internet Monitor provides near-continuous internet measurements for your internet traffic, including availability and performance metrics, tailored to your specific workload footprint on AWS. With Internet Monitor, you can get insights into average internet performance metrics over time, as well as get alerts for issues (health events). You’re notified about events that impact your end […]

Bootstrap your chaos engineering journey with AWS Fault Injection Service Scenarios Library

Ensuring the reliability and resilience of applications is crucial for maintaining business continuity, delivering a superior customer experience, and staying compliant with industry regulations. As defined in the AWS Well-Architected Framework Reliability Pillar, testing reliability plays an important role in ensuring reliability. Chaos engineering is a powerful way to not only test how your systems […]

Title image that says managing access to AWS accounts from Microsoft Teams and Slack at scale using AWS Organizations and AWS Chatbot

Managing access to AWS accounts from Microsoft Teams and Slack at scale using AWS Organizations and AWS Chatbot

Customers use chat collaboration applications like Microsoft Teams and Slack to collaborate and manage their AWS applications. AWS Chatbot is a ChatOps service that enables customers to monitor, troubleshoot issues, and manage AWS applications from chat channels. AWS Chatbot provides autonomy and customizability to DevOps teams operating their AWS environments on the go from chat […]

Accelerating migrations and IT Tasks for DKB using AWS Systems Manager

Accelerating migrations and IT Tasks for DKB using AWS Systems Manager

Deutsche Kreditbank AG (DKB), one of Germany’s largest direct banks with over five million customers. In 2023, DKB migrated their back-office IT infrastructure to Amazon Web Services (AWS). This Included their diverse infrastructure, backup, networking, and both Windows and Linux servers, while managing risks like downtime, data integrity, and security vulnerabilities. Customers in regulated industries […]

Automating metrics collection on Amazon EKS with Amazon Managed Service for Prometheus managed scrapers

Managing and operating monitoring systems for containerized applications can be a significant operational burden for customers such as metrics collection. As container environments scale, customers have to split metric collection across multiple collectors, right-size the collectors to handle peak loads, and continuously manage, patch, secure, and operationalize these collectors. This overhead can detract from an […]