Monitoring and observability | AWS Cloud Operations Blog

Using Amazon Bedrock and Amazon Nova for AI-Powered Incident Response

In today’s cloud-native world, incident response teams face overwhelming challenges. When critical applications fail, engineers must sift through mountains of observability data across multiple services; all while under intense pressure to restore service quickly. This manual correlation process is time-consuming, error-prone, and often delays resolution, resulting in extended outages and frustrated customers. Traditional monitoring tools […]

Launching Amazon CloudWatch generative AI observability  (Preview)

As organizations rapidly deploy large language models (LLMs) and generative AI agents to power increasingly intelligent workloads, they struggle to monitor and troubleshoot the complex interactions within their AI applications. Traditional monitoring tools fall short in providing the visibility across components, leading to developers and AI/ML engineers to manually correlate interaction logs or building custom […]

SAP on AWS – Streamlined Operations and Monitoring

SAP ERP (Enterprise Resource Planning) systems are at the core of many enterprises, supporting a wide range of mission-critical processes, including Procure to Pay, Order to Cash, Production Planning, Financial Accounting, Supply Chain Management (SCM), and Human Capital Management. Given the critical role of SAP ERP, maintaining the stability, security, and efficiency of these ERP […]

Observing Agentic AI workloads using Amazon CloudWatch agent

Introduction As the adoption of agentic AI applications continues to grow, ensuring the reliability, performance, and overall observability of these systems becomes increasingly critical. Agentic AI applications, powered by large language models (LLM) and integrated with various data sources and APIs, can quickly become complex, making it challenging to gain visibility into their inner workings […]

How Indegene Optimizes User Experience with Amazon CloudWatch

In today’s digital healthcare landscape, optimal application performance and user experience are crucial for business success. Indegene, a digital-first life sciences commercialization company, combines deep medical expertise with domain-contextualized technology to help clients accelerate innovation, modernize operations, and improve customer experience. With the world’s top 20 pharma companies among its clientele, Indegene brings an AI-first […]

Key Governance, Risk, and Compliance Sessions at re:Inforce 2025

We are incredibly excited to see you at AWS re:Inforce, in Philadelphia, Pennsylvania, on June 16-18, 2025. This year’s Governance, Risk, and Compliance track features sessions on automating compliance, enhancing risk visibility, using generative AI for business growth, and maintaining security at scale, including 5 breakout sessions, 8 builder sessions, 7 chalk talks, 2 code […]

How Hapag-Lloyd automated incident management using AWS Step Functions

This post is co-authored by Grzegorz Kaczor and Daniel Steenbock from Hapag-Lloyd AG and Michael Graumann and Daniel Moser from AWS. Introduction In today’s fast-paced digital landscape, efficient incident management is crucial for maintaining high-quality customer experiences. In our previous article we discussed how the Web and Mobile department at Hapag-Lloyd established observability for serverless […]

Enhance your global network performance: A deep dive into Internet Monitor’s new optimization tools

Overview The Internet Monitor feature of Amazon CloudWatch Network Monitoring now includes enhanced traffic optimization recommendation guidance that you can use to explore how to help optimize your application’s latency by using different AWS Regions or Local Zones, or by using Amazon CloudFront. You can also learn how to reduce latency by routing specific IP […]

Get Operational Insights Fast with AWS Health and Amazon Q

For organizations with multiple AWS accounts, staying on top of planned AWS service changes and events is critical to keep operations and business running smoothly. Organizations use AWS Health for ongoing visibility into resource performance and the availability of AWS services and accounts, but the volume of notifications from AWS Health can sometimes be overwhelming. […]

How to detect and monitor Amazon Simple Storage Service (S3) access with AWS CloudTrail and Amazon CloudWatch

While protection of data is critical, equally important is observing who accesses it. AWS services allow you to control your data by determining where it’s stored, who has access, and how it’s secured. AWS CloudTrail provides an effective way to track data access activities. You can detect access attempts, and identify potential unauthorized attempts. CloudTrail, […]

AWS Cloud Operations Blog

Category: Monitoring and observability