AWS DevOps & Developer Productivity Blog

Category: DevOps

Feature Flag Orchestration with AWS DevOps Agent and LaunchDarkly

Organizations that use feature flags alongside incident response tooling often connect the two manually. When an outage occurs, engineers must identify which flags are relevant, decide whether to disable them, and coordinate the change across teams. This manual process adds latency at the moment it matters most. You can use AWS DevOps Agent and […]

Supercharge your cloud operations with the Kiro power for AWS DevOps Agent

When an alarm fires at 2 AM, the first thing most engineers do is grep logs, check recent deployments, and trace code paths. However, the context they need (metrics, traces, topology, configurations) lives in separate browser tabs and applications. What if your IDE could bring that cloud intelligence directly to your code, […]

Accelerate Incident Resolution with PagerDuty and AWS DevOps Agent

When something breaks in production, you find out fast. Understanding why it broke, before the damage spreads, is the hard part. That is where Site Reliability Engineering (SRE) teams lose the most time. Think about the last time you got paged at 2 a.m. The alert said something broke, not why. You open four or […]

Diagnose EKS Node Issues Faster with AWS DevOps Agent and Custom MCP

AWS DevOps Agent can investigate a growing range of production incidents autonomously. It diagnoses CrashLoopBackOff failures, traces ConfigMap deletions through audit logs, and correlates Amazon CloudWatch metrics with cluster events — all without human intervention. But AWS DevOps Agent has a visibility boundary. When the data it needs lives outside its native integrations — on […]

How AWS DevOps Agent uses multi-agent reasoning to find root causes

Confirmation bias is one of the most common reasons incident investigations take longer than they should. An on-call engineer gets alerted, forms a theory based on initial triage and experience, finds one piece of supporting evidence, and stops looking. The actual root cause — buried in a different service, a different signal, a different time […]

Automate root cause analysis across Datadog and Elasticsearch with AWS DevOps Agent

Modern distributed systems route business transactions through dozens of microservices, message queues, and event streams. When a message fails to process or processing exceeds SLA thresholds, troubleshooting requires correlating logs from tools like Elasticsearch, metrics from Datadog, and infrastructure change events in AWS CloudTrail. Correlating these signals manually across heterogeneous backends, each with different query […]

Building an end-to-end agentic SRE using AWS DevOps Agent

As modern applications evolve into complex ecosystems of serverless functions, microservices, and event-driven architectures, incident response becomes increasingly challenging. DevOps and SRE teams spend hours manually correlating data across observability tools and troubleshooting issues, racing against SLA deadlines. This reactive firefighting drains productivity, degrades reliability, and delays innovation. AWS DevOps Agent provides an opportunity […]

AWS Transform custom: Enterprise Code Modernization with the Learn-Scale-Improve Flywheel

Enterprise modernization has reached an inflection point. You can transform one repository easily. Existing tools, including AWS Transform custom, work well for individual repositories, and the process is understood. But what about 50 repositories? 100? 200? When you need to modernize at enterprise scale, transforming code is only part of the challenge. Coordinating people, capturing […]

Automating Incident Investigation with AWS DevOps Agent and Salesforce MCP Server

This post was co-written with Ross Belmont, Senior Director, and Rodrigo Duran, Strategist Director, at Salesforce. Every minute counts when managing a critical infrastructure incident. Organizations need to quickly identify issues, diagnose root causes, and implement solutions, all while keeping customers informed. AWS DevOps Agent changes this by automating investigation and response, reducing mean time to resolution […]

Securely connect AWS DevOps Agent to private services in your VPCs

by Alexandra Huides, Jordan Merrick, Mohak Kohli, and Tipu Qureshi in DevOps

AWS DevOps Agent is your always-available operations teammate that resolves and proactively prevents incidents, optimizes application reliability and performance, and handles on-demand SRE tasks across AWS, multicloud, and on-premises environments. It integrates with your existing observability tools to correlate telemetry, code, and deployment data to reduce Mean Time To Repair (MTTR) and drive operational excellence. […]