AWS Cloud Operations Blog

AWS Observability ICYMI: Jan-May 2026

Welcome to the first edition of the AWS Observability ICYMI (In Case You Missed It) recap! The first five months of 2026 have been transformational for AWS observability, with over 40 launches across Amazon CloudWatch, AWS X-Ray, Amazon Managed Grafana, and Amazon Managed Service for Prometheus. Two major themes defined this period: OpenTelemetry as the unified instrumentation standard, and AI-powered operations that make observability accessible to everyone.

Whether you’re running containers on Amazon Elastic Kubernetes Service (EKS), managing databases across Regions, or building AI-assisted workflows, there’s something here for you. Let’s dive in.

Fig 1. AWS Observability release count by category

Fig 2. AWS Observability Release Calendar List

 

OpenTelemetry Goes Native in CloudWatch

CloudWatch OpenTelemetry Metrics (Preview)

Send metrics directly using the OpenTelemetry Protocol (OTLP) without custom conversion logic or additional tooling. OpenTelemetry metrics support up to 150 labels per metric and metric types including gauge, sum, histogram, and exponential histogram.
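As a rough illustration of what an OTLP metrics payload looks like, here is a minimal sketch that builds an OTLP/JSON export body for a single gauge data point using only the Python standard library. The function name, metric name, scope name, and labels are hypothetical, and the sketch deliberately stops short of sending anything: this post does not cover the CloudWatch OTLP endpoint or its authentication, so check the CloudWatch documentation for those details. The only CloudWatch-specific assumption is the 150-label limit quoted above.

```python
import json
import time

MAX_LABELS = 150  # per-metric label limit quoted in this post


def build_gauge_payload(name, value, labels, service_name="demo-service"):
    """Build an OTLP/JSON metrics export body holding one gauge data point."""
    if len(labels) > MAX_LABELS:
        raise ValueError(f"{len(labels)} labels exceeds the limit of {MAX_LABELS}")

    def attr(key, val):
        # OTLP attributes are key/value pairs with a typed value wrapper.
        return {"key": key, "value": {"stringValue": str(val)}}

    data_point = {
        # 64-bit integers are encoded as strings in OTLP/JSON.
        "timeUnixNano": str(time.time_ns()),
        "asDouble": float(value),
        "attributes": [attr(k, v) for k, v in sorted(labels.items())],
    }
    return {
        "resourceMetrics": [{
            "resource": {"attributes": [attr("service.name", service_name)]},
            "scopeMetrics": [{
                "scope": {"name": "icymi.example"},  # hypothetical scope name
                "metrics": [{
                    "name": name,
                    "unit": "1",
                    "gauge": {"dataPoints": [data_point]},
                }],
            }],
        }]
    }


# Hypothetical metric and labels, for illustration only.
payload = build_gauge_payload("queue_depth", 42, {"env": "prod", "region": "us-east-1"})
print(json.dumps(payload, indent=2))
```

In practice you would normally let an OpenTelemetry SDK or collector produce and export this payload; building it by hand is only useful for seeing the shape of the data.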

PromQL and Query Studio (Preview)

Query Studio combines PromQL and CloudWatch Metric Insights in a single interface. Query AWS vended metrics and OpenTelemetry metrics using the language you prefer, without switching between consoles. It includes a visual form builder and a code editor. Available in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), Asia Pacific (Singapore), and Europe (Ireland).

Container Insights with OpenTelemetry for EKS (Preview)

OpenTelemetry-native metrics collection comes to Kubernetes monitoring, aligning container observability with the broader OpenTelemetry strategy.

Cross-Region Telemetry Enablement Rules

Audit and enable telemetry across multiple Regions from a single Region. Available in all AWS commercial Regions.

 

AI-Powered Operations: The New Interface for Observability

CloudWatch Pipelines AI Configuration

Configure log processors using natural language descriptions powered by generative AI. Instead of manually writing transformation rules, describe your intent and have the system generate the appropriate configuration.

AWS Observability Kiro Power

Investigate application health issues faster with AI agent-assisted workflows in Kiro, which brings AI-native observability directly into developer tooling.

 

CloudWatch Logs: More Power, More Flexibility

Query Concurrency More Than Tripled

The concurrent query limit for Logs Insights QL increased from 30 to 100. You can now execute 10 StartQuery API calls and 10 GetQueryResults API calls per second per account per Region.
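If you script against these APIs, it can help to pace calls on the client side so you stay under the per-second limits. The following is a minimal sketch of a sliding-window rate limiter in plain Python. The class name and parameters are hypothetical, and it assumes one limiter instance per API, per account, per Region. It does not call AWS.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` calls in any sliding `period`-second window."""

    def __init__(self, max_calls=10, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))


# One limiter per API, matching the 10-calls-per-second figure above.
start_query_limiter = RateLimiter(max_calls=10, period=1.0)
for i in range(3):
    start_query_limiter.acquire()
    # A StartQuery call would go here.
    print(f"slot {i} acquired")
```

A sliding window is slightly stricter than a fixed one-second bucket, which keeps bursts that straddle a second boundary from exceeding the limit.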

HTTP-Based Ingestion

New support for HTTP Log Collector, ND-JSON, Structured JSON, and OpenTelemetry protocols for log ingestion gives you more flexible options beyond the CloudWatch agent or AWS SDK.

Infrequent Access Enhancements

The Infrequent Access log class gained data protection capabilities and support for OpenSearch PPL and SQL query languages, making the cost-optimized tier viable for workloads that previously required Standard class.

Telemetry Collection Auto-Enablement Expansion

Automatic enablement now supports Amazon CloudFront Standard access logs, AWS Security Hub CSPM finding logs, and Amazon Bedrock AgentCore memory and gateway logs/traces.

Multi-Account Centralization by Data Source

Centralize logs based on data source name and type across multiple accounts, with granular control over which logs flow to central accounts and which remain local.

CloudWatch Logs Insights lookup query command

With the lookup command, you can join log data against a lookup table at query time, automatically enriching your results with meaningful values.
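To make "enriching at query time" concrete, here is a plain-Python illustration of the idea. This is not Logs Insights syntax, and the function, field names, and sample data are all hypothetical. It only shows the semantics: each log event is matched against a lookup table by key, and the matching row's fields are merged into the result.

```python
def enrich(events, lookup_table, key, default=None):
    """Return copies of `events` with the lookup row for event[key] merged in."""
    enriched = []
    for event in events:
        # Fall back to `default` (or nothing) when the key has no lookup row.
        extra = lookup_table.get(event.get(key), default or {})
        enriched.append({**event, **extra})
    return enriched


# Hypothetical log events and lookup table.
events = [
    {"account_id": "111111111111", "status": 500},
    {"account_id": "222222222222", "status": 200},
    {"account_id": "333333333333", "status": 404},
]
lookup_table = {
    "111111111111": {"team": "payments"},
    "222222222222": {"team": "search"},
}

for row in enrich(events, lookup_table, "account_id", default={"team": "unknown"}):
    print(row)
```

The benefit of doing this at query time is that the raw log events stay unchanged, so updating the lookup table changes future query results without re-ingesting data.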

Logs Insights JOIN and sub-query commands

With JOIN and sub-query commands, you can accelerate troubleshooting across scenarios such as correlating application and infrastructure errors across different services and log groups, analyzing security events across multiple services, or tracking user sessions across distributed systems.

Logs Insights querying by log group tags

With this launch, customers can run a query across all log groups that share common tags. As log group tags are added or removed, queries automatically reflect the matching log groups, reducing operational overhead as environments grow.

 

CloudWatch Pipelines: Filter, Route, and Transform

Drop Events and Conditional Processing

A new drop events processor and conditional processing capabilities allow content-aware filtering, routing, and transformation within pipelines. For more information, see the Amazon CloudWatch Pipelines documentation.

Compliance and Governance Controls

Data integrity and access control capabilities for log pipelines, addressing enterprise requirements for audit trails and controlled data flow.

 

Metrics

Amazon Bedrock Time To First Token and Quota Consumption

TimeToFirstToken measures the latency from when a request is sent to when the first token is received, for streaming APIs (ConverseStream and InvokeModelWithResponseStream). EstimatedTPMQuotaUsage tracks your estimated Tokens Per Minute (TPM) quota consumption, including cache write tokens and output burndown multipliers, across all inference APIs (Converse, InvokeModel, ConverseStream, and InvokeModelWithResponseStream).
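For intuition about what a time-to-first-token measurement captures, here is a minimal client-side sketch in Python. It times how long a streaming iterator takes to yield its first item. The function name and the fake stream are hypothetical stand-ins, and a client-side number like this will not match the service-emitted TimeToFirstToken metric exactly, since it also includes network and client overhead.

```python
import time


def time_to_first_token(stream, clock=time.perf_counter):
    """Return (seconds until the first item, first item) for an iterator of tokens."""
    start = clock()
    first = next(iter(stream))  # blocks until the first token arrives
    return clock() - start, first


def fake_stream(delay=0.05):
    """Hypothetical stand-in for a streaming response; waits, then yields tokens."""
    time.sleep(delay)
    yield "Hello"
    yield " world"


latency, token = time_to_first_token(fake_stream())
print(f"first token {token!r} after {latency * 1000:.0f} ms")
```

Comparing a client-side measurement against the CloudWatch metric can help separate model latency from network and application overhead.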

Direct Connect BGP monitoring

Three new Amazon CloudWatch metrics for virtual interfaces (VIFs) that provide visibility into Border Gateway Protocol (BGP) session health and route counts. Network engineers and operations teams managing hybrid cloud connectivity can now monitor BGP sessions natively through CloudWatch without building custom solutions or polling APIs.

AWS Private CA utilization metrics

The new metrics track the number of certificates issued by each CA and the total number of CAs in each Region, so you can monitor usage against the corresponding quotas and proactively manage CA lifecycle to maintain high availability.

Amazon S3 Express One Zone request metrics

You can use request metrics to track performance and monitor the operational health of applications that use S3 Express One Zone.

Amazon ECS Managed Instances supports NVIDIA GPU metrics

With the new GPU metrics, Amazon ECS Managed Instances customers can now monitor GPU capacity, utilization, memory, hardware health, and thermal conditions directly in CloudWatch.

AWS Outposts racks LagStatus CloudWatch metric

This metric provides you with the ability to monitor Outposts LAG connectivity status directly within the CloudWatch console, without having to rely on external networking tools or coordination with other teams.

Amazon ElastiCache adds thirteen new Amazon CloudWatch metrics for network capacity planning and engine diagnostics

Amazon ElastiCache customers can now detect network throttling, memory fragmentation, and connection exhaustion, using thirteen new Amazon CloudWatch metrics for node-based clusters. You can monitor these host-level and engine-level diagnostics directly from CloudWatch without running INFO commands on individual nodes or calculating baselines from raw byte counters.

 

Alarms, Application Signals, and RUM

Alarm Mute Rules

Temporarily mute alarm notifications during planned deployments, maintenance windows, and off-hours without compromising monitoring visibility. Finally, a native solution to alert fatigue during planned changes.

Application Signals SLO Capabilities

Three new capabilities for Service Level Objectives:

● SLO Recommendations: Suggests appropriate SLO targets based on historical performance

● Service-Level SLOs: Aggregates individual SLOs into service-level views

● SLO Performance Report: Executive-ready summaries of SLO compliance

RUM in European Sovereign Cloud

CloudWatch RUM expanded to the AWS European Sovereign Cloud (eusc-de-east-1), so you can monitor web application performance without data leaving the sovereign boundary.

RUM App Monitors Overview

An improved overview surfaces fleet-wide health, SLO breaches, and distributed tracing coverage on a single page.

Amazon CloudWatch RUM Session Replay for Web Applications

Session Replay helps developers identify user experience issues (such as forms that fail to render or navigation flows that break) that can silently impact conversion and engagement, even when no one reports them.

 

Amazon CloudWatch Database Insights

Regional Expansion January 20 & March 11

On-demand analysis expanded to Asia Pacific (New Zealand, Taipei, Thailand) and Mexico (Central), followed by AWS GovCloud (US). The feature automatically compares selected time periods against baseline performance, identifies anomalies, and provides specific remediation advice.

Lock Contention Diagnostics for RDS PostgreSQL

Provides lock contention diagnostics for Amazon RDS for PostgreSQL instances. This feature helps you identify the root cause behind both ongoing and historical lock contention issues within minutes. The lock contention diagnostics feature is available exclusively in the Advanced mode of CloudWatch Database Insights.

 

Amazon EC2

Organization-Wide EC2 Detailed Monitoring

Auto-enable detailed monitoring for EC2 across your entire AWS Organization from a single configuration point.

EC2 Visual Agent Configuration Editor

The EC2 console now includes a visual configuration editor for the CloudWatch agent, so there is no more manual JSON editing.

 

AWS X-Ray: The OpenTelemetry Migration Is Official

X-Ray SDKs and Daemon Enter Maintenance Mode

The AWS X-Ray SDKs and Daemon formally entered maintenance mode. Going forward, releases are limited to security fixes only.

What this means for you:

● The X-Ray service remains fully supported: the console, trace processing, and backend capabilities continue unchanged

● Migrate to OpenTelemetry-based instrumentation via AWS Distro for OpenTelemetry (ADOT)

● Language-specific migration guides are available for Java, Python, Node.js, .NET, Go, and Ruby

● Both zero-code auto-instrumentation and manual instrumentation are supported

 

Amazon Managed Grafana (AMG)

AWS GovCloud (US) Availability

Available in both GovCloud US-West and US-East Regions for government customers and regulated industries.

Encrypt with KMS Customer Managed Keys

Encrypt workspace data with your own encryption keys for compliance. Available in all Regions except GovCloud.

Grafana 12.4 Workspace Creation

A packed release:

● Drilldown apps: Queryless, point-and-click exploration of Prometheus metrics, Loki logs, Tempo traces, and Pyroscope profiles

● Scenes-powered dashboards: Boosted rendering performance

● Enhanced CloudWatch plugin: PPL/SQL query support, cross-account Metrics Insights, and log anomaly detection

● Rebuilt table visualization: CSS cell styling and interactive Actions buttons

 

Amazon Managed Service for Prometheus (AMP)

While no standalone AMP features launched this period, two ecosystem integrations significantly enhance its value:

OpenSearch Ingestion → AMP Sink

Build fully managed, end-to-end metrics ingestion pipelines without custom forwarding infrastructure. Route metrics to AMP for PromQL analysis while sending logs/traces to OpenSearch through a single pipeline.

Amazon OpenSearch Service supports Managed Prometheus and agent tracing

Query Prometheus metrics directly using native PromQL syntax alongside logs and traces in OpenSearch without duplicating data.

 

Recap

If you take away three things:

1. Start your OpenTelemetry migration now. With native OpenTelemetry metrics in CloudWatch, PromQL Query Studio, and X-Ray SDKs entering maintenance mode, the path forward is clear. ADOT is your instrumentation layer.

2. AI is the new observability interface. MCP servers, natural-language pipeline configuration, and Kiro integration mean you can increasingly talk to your monitoring data instead of writing queries.

3. Cost optimization got easier. Infrequent Access log class enhancements, free OpenTelemetry metrics during preview, and no-cost Pipeline features lower the barrier to comprehensive observability.

Still looking for more?

● Join a live or hands-on session from our Cloud Operations Enablement series: https://aws-experience.com/amer/smb/events/series/Cloud-Operations-Enablement

● Check out the CloudWatch documentation: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/

● Read the MCP servers blog post: https://aws.amazon.com/blogs/mt/enhance-your-aiops-introducing-amazon-cloudwatch-and-application-signals-mcp-servers/

See you next quarter!

 

Joe Alioto

Joe is a Worldwide Senior Specialist Solutions Architect for Cloud Operations at AWS, focusing on observability, AI-powered operations, and centralized operations management. With over two decades of operations engineering experience and the past two years dedicated to AI and AIOps, he helps customers build intelligent observability strategies that connect application performance, infrastructure metrics, and database workloads - increasingly using AI agents and automation to reduce mean time to resolution and transform how operations teams work at scale.