AWS Cloud Operations Blog
Category: Amazon CloudWatch
Amazon CloudWatch Application Signals new enhancements for application monitoring
Today, we’re excited to announce new enhanced features in Amazon CloudWatch Application Signals that simplifies how you monitor large-scale distributed applications. Improvements to CloudWatch Application Signals application map automatically discovers and organizes services into groups based on their relationships, with support for custom grouping that aligns with your business perspective. You can now view the […]
Embracing AI- driven operations and observability at re:Invent 2025
As organizations continue to scale their cloud presence, effective operations become increasingly critical for success. AWS re:Invent 2025’s Cloud Operations track brings together industry experts, AWS leaders, and customers to share insights on modernizing monitoring & observability through This blog post will guide you through the key themes of operations and observability and highlight sessions […]
Reimagine AIOps with Amazon CloudWatch Investigations and Amazon Nova Sonic
Reimagine AIOps with Amazon CloudWatch Investigations and Amazon Nova Sonic in Amazon Bedrock to transform how cloud operations teams handle incidents. Traditional monitoring approaches require engineers to navigate multiple complex dashboards, analyze extensive logs, and manually execute remediation steps—a process that becomes particularly challenging during after-hours incidents or when away from workstations. When minutes matter […]
Simplifying Log Management using Amazon CloudWatch Logs Centralization
Managing logs across multiple AWS accounts and regions has always been a complex challenge for organizations. As AWS infrastructure grows to include separate accounts for production, development, and staging environments, along with regions, the complexity of log management increases exponentially. During critical incidents, especially during off-hours, teams spend valuable time, searching through multiple accounts, correlating […]
Optimizing metrics ingestion with Amazon Managed Service for Prometheus
Managing metrics collection at scale in complex cloud environments presents significant challenges for organizations, particularly when it comes to controlling costs and maintaining operational efficiency. As the volume of metrics grows exponentially with the expansion of container deployments and other cloud-native workloads, customers often struggle to balance comprehensive monitoring with resource optimization. This can lead […]
Advanced analytics using Amazon CloudWatch Logs Insights
Effective log management and analysis are critical for maintaining robust, secure, and high-performing systems. Amazon CloudWatch Logs Insights has long been a powerful tool for searching, filtering, and analyzing log data across multiple log groups. The addition of OpenSearch Piped Processing Language (PPL) and OpenSearch SQL language query support offers greater flexibility and familiarity in […]
Enhance your AIOps: Introducing Amazon CloudWatch and Application Signals MCP servers
Modern architectures generate vast amounts of observability data across metrics, logs, and traces. When issues arise, teams spend hours—sometimes days—manually correlating information across multiple dashboards to identify root causes, directly impacting MTTR and productivity. Amazon CloudWatch Application Signals addresses this challenge by providing deep application visibility through automatic instrumentation, capturing key metrics like latency, error […]
Alarming on SLOs in Amazon Search with CloudWatch Application Signals – Part 2
In practice: SLO monitoring with CloudWatch Application Signals In the previous post, we’ve shared the basic concepts and benefits of burn rate monitoring. In this post, we, the Amazon Product Search team, will share anecdotes from our migration from an in-house solution to CloudWatch Application Signals, and introduce how we actually implement monitoring and dashboards. […]
Alarming on SLOs in Amazon Search with CloudWatch Application Signals – Part 1
In theory: SLO concepts applied to Amazon Product Search In this series of posts, we will show you how we, the Amazon Product Search team, monitor key systems using Service Level Objectives (SLOs) and share our migration journey from an in-house solution to Amazon CloudWatch Application Signals. Amazon Product Search is a large distributed system […]
Using Amazon Bedrock and Amazon Nova for AI-Powered Incident Response
In today’s cloud-native world, incident response teams face overwhelming challenges. When critical applications fail, engineers must sift through mountains of observability data across multiple services; all while under intense pressure to restore service quickly. This manual correlation process is time-consuming, error-prone, and often delays resolution, resulting in extended outages and frustrated customers. Traditional monitoring tools […]