AWS Cloud Operations Blog
Embracing AI-driven operations and observability at re:Invent 2025
As organizations continue to scale their cloud presence, effective operations become increasingly critical for success. AWS re:Invent 2025’s Cloud Operations track brings together industry experts, AWS leaders, and customers to share insights on modernizing monitoring & observability through
This blog post will guide you through the key themes of operations and observability and highlight sessions that will help you transform your cloud operations strategy.
Plan Your Monitoring & Observability Track Experience
With 30 sessions spanning five key themes, the operations management track offers something for everyone, from hands-on workshops to expert-level discussions. To make the most of your re:Invent experience, we recommend:
- Focus on your priorities: Select sessions that align with your organization’s immediate operational challenges
- Mix formats: Combine lecture-style sessions with interactive workshops and builders’ sessions
- Plan for skill development: Choose sessions that match your current skill level and those that stretch your capabilities
- Reserve early: Popular sessions fill up quickly, so reserve your spot as soon as registration opens
Generative AI & Intelligent Operations
The future of cloud operations is AI-driven, and this year’s sessions showcase groundbreaking implementations:
COP334 | Accelerating incident response through AIOps | Breakout Session
Location: Tuesday, Dec 2 3:00 PM – 4:00 PM PST | Wynn
Learn how generative AI is transforming operational practices by automatically analyzing telemetry data, identifying patterns, surfacing actionable insights, and leveraging AI agents. See how modern AI capabilities can convert hours of manual troubleshooting across multiple systems into streamlined investigations resolved in minutes.
COP326 | Elevate application and generative AI observability | Breakout Session
Location: Wednesday, Dec 3 2:30 PM – 3:30 PM PST | Wynn
This session demonstrates how to leverage Amazon CloudWatch’s comprehensive observability capabilities for both traditional applications and generative AI workloads. Customers building generative AI-powered workloads face challenges in understanding end-user outcomes, AI performance, health, accuracy, and quality issues at scale.
COP335 | Observability for AI Agents and Traditional Workloads | Breakout Session
Location: Wednesday, Dec 3 8:30 AM – 9:30 AM PST | Wynn
This session demonstrates using Amazon CloudWatch to observe your complete application stack—from AI agent decision-making and behavior patterns to traditional infrastructure metrics like CPU and memory usage. Learn practical techniques for tracking AI agent performance alongside existing application monitoring, correlating issues between AI components and traditional services, and using Amazon CloudWatch’s search capabilities to identify problems quickly. Feat. CCC Intelligent Solutions
COP403 | Automate cloud operations with AI agents | Workshop
Location: Tuesday, Dec 2 12:30 PM – 2:30 PM PST | Wynn
Join this technical workshop to build automated cloud operations solutions using AI agents. Get hands-on experience implementing Amazon CloudWatch investigations and Amazon Bedrock Agents for streamlined debugging and intelligent analysis.
COP405 | Building agentic workflows for augmented observability | Code Talk
Location: Tuesday, Dec 2 11:30 AM – 12:30 PM PST | Wynn
In this live, interactive coding session, we’ll build an AI agent that transforms raw telemetry into actionable intelligence. We’ll build a system that correlates metrics, logs, and traces from Amazon CloudWatch. Our agent will analyze patterns, create relevant monitoring artifacts, and provide natural language explanations of complex issues. We’ll show how to productionize the agent to autonomously respond to operational events, creating a proactive observability workflow.
COP418 | Monitor the quality and accuracy of your generative AI workloads | Code Talk
Location: Tuesday, Dec 4 2:00 PM – 3:00 PM PST |Wynn
Join this live coding session to learn how Amazon CloudWatch enables AI observability and troubleshooting. We’ll build agentic applications using Amazon Bedrock AgentCore and Amazon EKS with the Strands Agents SDK, automatically instrumented through AWS Distro for OpenTelemetry (ADOT). Learn how to visualize agent and model telemetry in CloudWatch’s generative AI dashboard, and troubleshoot common challenges like latency, errors, throttling, and token usage. Leave with practical skills to build and maintain reliable AI applications.
Observability and Performance Monitoring
Modern applications require comprehensive observability across the entire stack:
COP336 | Elevating application reliability | Breakout Session
Location: Wednesday, Dec 3 4:00 PM – 5:00 PM PST | MGM
Discover how to build and maintain resilient infrastructure using AWS native services, including Amazon CloudWatch, AWS Systems Manager, and AWS CloudTrail. We’ll demonstrate automatic anomaly detection and preventive measures implementation. Explore robust logging architectures for rapid incident investigation and continuous operational visibility. Learn to use generative AI for accelerated incident analysis and automated response playbooks. Whether dealing with infrastructure failures, security incidents, or performance degradation
COP404 | Build full-stack observability from applications to databases | Workshop
Location: Monday, Dec 1 3:00 PM – 5:00 PM PST | MGM
In this hands-on workshop, implement comprehensive observability using Amazon CloudWatch Application Signals and Database Insights. Learn to trace requests as they flow through your applications, correlate them with database performance metrics, and quickly identify bottlenecks. Get experience using CloudWatch to monitor application traces, metrics, and logs on unified dashboards, and to analyze Aurora database performance with SQL.
COP329 | Application Performance Monitoring: From design to implementation | Chalk Talk
Location: Monday, Dec 1 4:00 PM – 5:00 PM PST | Wynn
In this chalk talk, we will explore real-world application performance monitoring (APM) challenges and how to solve them using Amazon CloudWatch. Join us for a collaborative session where we dive deep into architecture designs, identify common pitfalls, and show you how to gain deep visibility into distributed systems.
COP421 | Design effective Amazon CloudWatch dashboards and alarms | Chalk Talk
Location: Monday, Dec 3 3:00 PM – 4:00 PM PST |Wynn
Security and Compliance
Security remains a top priority, with sessions focused on integrated security operations:
COP307 | Enhancing security visibility: building scalable log analytics | Chalk Talk
Location: Wednesday, Dec 3 9:00 AM – 10:00 AM PST | MGM
In this interactive chalk talk, we’ll demonstrate scalable solutions for comprehensive log analysis and real-time threat detection. Discover practical techniques for creating custom security dashboards, implementing automated alerting, and establishing compliance monitoring workflows.
COP417 | Scale security monitoring using AWS CloudTrail with generative AI | Chalk Talk
Location: Wednesday, Dec 3 10:30 AM – 11:30 AM PST | MGM
This interactive chalk talk explores building enterprise-scale security monitoring using AWS CloudTrail. Participants will discuss how to design comprehensive security architectures that leverage VPC endpoint network events, AI-powered natural language queries, and unified compliance dashboards.
COP337 | Correlating compliance signals across AWS | Chalk Talk
Location: Friday, Dec 5 11:30 AM – 12:30 PM PST | Caesars Forum
This interactive chalk talk explores how you can use the new zero-ETL in Amazon OpenSearch Service to eliminate data silos and enable comprehensive governance observability across your organization. You will learn about designing correlation workflows that directly query data from Amazon S3, including on-premises and multi-cloud environments, without data movement or duplication.
Open Source and Modern Operations
For teams leveraging open-source tools:
COP333 | Scaling open source observability stack feat. Warner Bros Discovery | Breakout Session
Location: Monday, Dec 1 11:30 AM – 12:30 PM PST | Wynn
In this session, discover how to leverage AWS managed open-source services, including Amazon Managed Service for Prometheus, Amazon OpenSearch Service, Amazon Managed Grafana, and OpenTelemetry, to address the challenges of scaling an observability stack. You will learn how Warner Bros. Discovery implemented open source observability using modern architectural patterns for ingesting and processing telemetry data at scale, while maintaining security and cost efficiency, to accelerate incident response.
COP412 | Observability: The open source way | Workshop
Location: Wednesday, Dec 3 3:30 PM – 5:30 PM PST | Venetian
Get hands-on experience with AWS managed services for Prometheus, OpenSearch, and Grafana. You will deploy a sample application and use OpenTelemetry for instrumentation to collect, store, analyze, and visualize observability data. Gain practical experience building cost-effective, scalable observability solutions using familiar tools without the burden of infrastructure management.
Special Highlight: AWS Behind the Scenes
COP415 | AWS Behind the Scenes: How AWS drives operational excellence & reliability | Breakout Session
Location: Monday, Dec 1 1:30 PM – 2:30 PM PST | Caesars Forum
In this technical deep dive, we’ll take you behind the scenes on how AWS services are operated. We’ll dive deep into how we instrument and monitor our services, and how we leverage Amazon CloudWatch and Amazon OpenSearch, including their latest features, in our daily operations. Through real operational stories and hands-on demonstrations with sample applications, you’ll learn the patterns and practices that help AWS teams maintain reliability.
Practical Recommendations for Attendees
- Start with foundational sessions like COP328 “Implementing Observability at Scale” before diving into advanced topics.
- Don’t miss the hands-on workshops – COP403, COP404, and COP408 offer invaluable practical experience.
- For those interested in AI operations, the COP334, COP335, and COP413 session series provides a comprehensive overview.
Join us at re:Invent 2025 to learn how AWS is revolutionizing cloud operations through AI, automation, and advanced observability. Registration is now open – secure your spot today! And don’t forget to visit the Monitoring & observability and AIOps kiosks in the AWS Village at the Venetian!
Haven’t registered? There’s still time to attend! Register through the re:Invent portal.