AWS Cloud Operations Blog
Category: Technical How-to
Log analysis with facets, correlation, enrichment, and automation in Amazon CloudWatch Log Analytics
Teams working with distributed applications accumulate logs across multiple log groups, including application logs, access logs, and audit trails. When something needs investigating, an engineer opens the console and starts writing queries from scratch. The same query gets written differently by different people. The results lack context because the log event does not contain who […]
Analyzing Claude Code usage with CloudWatch and OpenTelemetry
If your engineering organization uses AI coding agents like Claude Code, usage is likely growing faster than your ability to track it. Token consumption, cost per team, and developer productivity are questions that existing dashboards don’t answer, because the telemetry never made it to your observability backend. With Amazon CloudWatch OpenTelemetry Protocol (OTLP) in General […]
Transfer AWS accounts between AWS Organizations while preserving AWS Lake Formation permissions
Many AWS customers move their AWS accounts between organizations. Your company may manage more than one organization and regularly move accounts between them, or you may be consolidating accounts after a merger, acquisition, or divestiture. Account migrations are part of operating on AWS. Previously, moving an account meant removing it from the source organization, making it standalone, then inviting it to the target organization. For accounts with AWS Resource Access […]
Build a Multi Account Patch Compliance Dashboard with Kiro Specs
Robust patch management is essential for maintaining system security, reliability, and compliance across your IT infrastructure. AWS Systems Manager Patch Manager provides a full-featured patching solution, enabling you to automate the deployment of operating system updates to managed nodes across AWS accounts, on-premises, and multicloud environments. However, as your organization scales across dozens or […]
Import Historical data from AWS CloudTrail Lake to Amazon CloudWatch
Organizations managing workloads on AWS rely on AWS CloudTrail to answer the fundamental questions: Who did what, where, and when? Since January 2022, customers have stored their CloudTrail activity logs in CloudTrail Lake, a managed data lake purpose-built for capturing, storing, and querying user and API activity across their AWS environment. As organizations scale across multiple […]
Introducing OpenTelemetry and PromQL support in Amazon CloudWatch
If you run Kubernetes or microservices workloads on AWS, your metrics likely carry dozens of labels: namespace, pod, container, node, deployment, replica set, and custom business dimensions. To get a complete picture of your environment, you may be splitting your metrics pipeline: Amazon CloudWatch for AWS metrics, and a separate Prometheus-compatible backend for high-cardinality (many […]
Adaptive sampling with AWS X-Ray to capture critical spans
Enterprise applications using AWS X-Ray generate large volumes of distributed tracing data across multiple services. Static sampling strategies keep costs down by capturing a fixed percentage of traffic. However, they frequently miss critical data during intermittent failures or sudden latency spikes. Tracing every request for maximum visibility at scale may increase sampling costs for […]
Automate AWS Systems Manager activation for hybrid-managed node registration
AWS Systems Manager (formerly known as SSM) is an AWS service that you can use to view and control your servers on AWS cloud and on-premises infrastructure. Systems Manager makes it easy to manage a hybrid environment. To set up servers and virtual machines (VMs) in your hybrid environment as Systems Manager managed instances, you […]
Simplify AWS Control Tower governance with enhanced AWS CloudFormation Hooks
Organizations using AWS Control Tower to govern their multi-account environments face a persistent challenge: when AWS CloudFormation deployments fail due to proactive control violations, teams receive minimal information about why the failure occurred or how to fix it. This lack of visibility leads to: Delayed deployments as developers struggle to understand cryptic error messages […]
Deploying custom Terraform to LZA-Managed Accounts with AFT
As organizations scale their AWS environments, managing infrastructure consistently while enabling team autonomy becomes increasingly challenging. Landing Zone Accelerator on AWS (LZA) and AWS Account Factory for Terraform (AFT) both extend AWS Control Tower to help customers manage AWS environments at scale, offering complementary strengths. Many AWS customers struggle to balance centralized security governance with […]








