AWS DevOps Agent (Preview)

Drive operational excellence with a frontier agent that resolves and proactively prevents incidents

Why AWS DevOps Agent?

AWS DevOps Agent is a frontier agent that resolves and proactively prevents incidents, continuously improving reliability and performance. It investigates incidents and identifies operational improvements as an experienced DevOps engineer would: it learns your resources and their relationships, works with your observability tools, runbooks, code repositories, and CI/CD pipelines, and correlates telemetry, code, and deployment data across all of them, including for applications in multicloud and hybrid environments. AWS DevOps Agent uses this deep understanding of your operations and workloads to reduce mean time to resolution (MTTR) and drive operational excellence.
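As a point of reference for the MTTR metric mentioned above, the following is a minimal sketch of how mean time to resolution is commonly computed from incident timestamps. The function name `mean_time_to_resolution` and the sample timestamps are hypothetical and for illustration only; they are not part of AWS DevOps Agent or any AWS API.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average of (resolved - detected) over a list of incidents.

    `incidents` is a list of (detected_at, resolved_at) datetime pairs.
    This is a hypothetical helper, not an AWS DevOps Agent API.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [resolved - detected for detected, resolved in incidents]
    # Start the sum from an empty timedelta so the durations add correctly.
    return sum(durations, timedelta()) / len(durations)


# Illustrative data: one incident resolved in 30 minutes, one in 90 minutes.
sample = [
    (datetime(2025, 1, 1, 2, 0), datetime(2025, 1, 1, 2, 30)),
    (datetime(2025, 1, 2, 14, 0), datetime(2025, 1, 2, 15, 30)),
]
mttr = mean_time_to_resolution(sample)
```

With the two sample incidents above, the mean works out to one hour. Shortening any single investigation lowers the average, which is why time spent finding the root cause matters for this metric.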

Benefits

AWS DevOps Agent is your always-on, autonomous on-call engineer. It begins investigating the moment an alert comes in, whether at 2 AM or during peak hours, to quickly restore your application to optimal performance. AWS DevOps Agent autonomously triages incidents 24/7, providing root cause analysis and actions for resolution. It uses its knowledge of your application resources and their relationships to quickly map dependencies and interactions. AWS DevOps Agent streamlines incident response by automatically routing observations, findings, and mitigation steps through your preferred communication channels such as Slack, ServiceNow, and PagerDuty.

AWS DevOps Agent analyzes patterns across historical incidents to provide actionable recommendations that strengthen four key areas: observability, infrastructure optimization, deployment pipeline enhancement, and application resilience. For example, in the area of infrastructure optimization, if you experience unexpected traffic spikes, AWS DevOps Agent may recommend the Kubernetes Horizontal Pod Autoscaler (HPA) for EKS clusters to better distribute traffic.
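To make the HPA example above concrete, here is a minimal sketch of the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired replica count is the current count scaled by the ratio of the observed metric to the target, rounded up, with no change when the ratio is within a tolerance (0.1 by default in Kubernetes). The function name and the sample numbers are assumptions for illustration; this is not output from AWS DevOps Agent, and a real HPA also applies min/max replica bounds and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified HPA scaling rule (illustrative only).

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within `tolerance` of 1.0.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Traffic spike: 4 pods averaging 90% CPU against a 60% target.
scaled_up = desired_replicas(4, 90, 60)
# Small fluctuation: 63% against a 60% target stays within tolerance.
unchanged = desired_replicas(4, 63, 60)
```

In the spike case the sketch scales from 4 to 6 pods, while the small fluctuation leaves the count at 4, which is the behavior that lets an autoscaled workload absorb unexpected traffic without manual intervention.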

AWS DevOps Agent enables you to access the untapped insights in your operational data by securely integrating with your workflows and observability tools, runbooks, code repositories, and CI/CD pipelines. AWS DevOps Agent offers built-in integrations with observability tools such as Amazon CloudWatch, Dynatrace, Datadog, New Relic, and Splunk, and with code repositories and CI/CD pipelines such as GitHub and GitLab. You can extend AWS DevOps Agent beyond its built-in integrations by connecting it to your own MCP server, enabling integrations with additional tools such as your organization’s custom tools, specialized platforms, or proprietary ticketing systems.

Customers

Commonwealth Bank of Australia

Commonwealth Bank of Australia is one of Australia's leading providers of integrated financial services serving over 17 million customers. The bank's Cloud Foundations group manages over 1,700 AWS accounts and provides centralized cloud operation services for thousands of engineers. While prototyping their next-generation internal platform, the team replicated a complex network and identity management issue to test AWS DevOps Agent. These types of issues can take a seasoned DevOps engineer hours to identify, and the agent found the root cause in under 15 minutes. "AWS DevOps Agent thinks and acts like a seasoned DevOps Engineer, helping our engineers build a banking infrastructure that’s faster, more resilient, and designed to deliver better experiences for our customers. This isn't just about faster resolution times—it's about maintaining the trust our customers put in us."

“AWS DevOps Agent's seamless integration with our existing enterprise tools including ServiceNow, Splunk, and our custom MCP servers makes it even more valuable for our operations. For CBA, this opens significant opportunities. We're exploring ways to scale this across our platform teams and help every internal customer leverage these capabilities. The ability to integrate with our existing SLOs through Grafana and Prometheus makes it even more valuable for our operations. AWS DevOps Agent is helping us build a more resilient, efficient banking infrastructure for millions of Australians.”

Jason Sandery, Head of Cloud Services, Commonwealth Bank of Australia

Western Governors University

“At WGU about 200,000 students rely on 24/7 online learning, making system reliability critical to their success. To better serve our students, we implemented AWS DevOps Agent integrated with Dynatrace in our production environment, and the initial results are significant. When performance issues occur from third-party API dependencies, networking problems, or application-level errors, Dynatrace immediately detects them, and the AWS DevOps Agent autonomously investigates our entire technology stack to pinpoint root causes.

The service provides comprehensive observability across our infrastructure, giving us visibility into external service dependencies, network performance, and application behavior in one unified solution. For a university committed to accessible education, this deeper insight and faster issue resolution means uninterrupted learning experiences. What previously required our team to manually correlate data across multiple systems now happens automatically, allowing our lean IT team to focus on strategic initiatives rather than troubleshooting. As we scale to serve more students, this enhanced observability ensures we maintain the reliability essential to student success.”

Nate Cummings, Sr. Director of Infrastructure, Western Governors University  

Deriv

"Deriv is one of the world's largest online brokers, serving more than 3 million traders worldwide. It offers an expansive range of trade types and features over 300 assets across popular markets, available on Deriv's award-winning, intuitive trading platforms. At Deriv, we've built our 25-year legacy on technological innovation and have now embraced an AI-first approach across our operations. As we continue this transformation, AWS DevOps Agent represents a natural evolution in how we manage our infrastructure. AWS DevOps Agent's intelligent automation capabilities will enable our teams to shift from reactive incident response to proactive system optimization, particularly valuable given the complexity of maintaining 24/7 trading services across multiple regulatory jurisdictions. The contextual intelligence features will enable our engineering teams to quickly assess system relationships and dependencies, reducing mean time to resolution for issues that impact customer transactions. The seamless integration with our existing AWS and third-party toolchain, and the ability of AWS DevOps Agent to learn from our operational patterns align with our philosophy of using AI to boost engineering efficiency and deliver an exceptional customer experience."

Najib Huq, Engineering Senior Manager, Deriv

Dhan.co

"Dhan is a leading online trading platform for stocks, options, futures and commodities, serving over 1.2 million active customers and processing over 9 million transactions daily. For a regulated trading platform like ours, maintaining high availability is crucial. We expect AWS DevOps Agent's automated analysis and contextual recommendations will help our teams ensure consistent service delivery. The agent's ability to learn from our operational patterns while maintaining compliance standards will be particularly valuable across our extended trading hours, from market open through late-night sessions. We anticipate AWS DevOps Agent will enhance our ability to maintain reliable trading infrastructure and meet strict financial service requirements. The AWS DevOps Agent's integration with our existing monitoring stack will help us consolidate alerts across multiple systems and streamline our operational processes."


Alok Pandey, Co-Founder and CTO, Dhan (Raise Holdings)

RMIT University

“AWS DevOps Agent is proving to be a game-changer in our pursuit of zero-touch engineering at RMIT University. As a renowned global university serving over 105,000 students and 12,000 staff and researchers, we are constantly pushing the boundaries of cloud innovation. What’s remarkable about AWS DevOps Agent is its ability to reason across our entire landing zone topology - understanding the relationships between our workload, network, and management accounts as one cohesive ecosystem. We have tested and seen it dissect complex deployment issues in minutes, identifying network communication problems and dependency conflicts that would typically require extensive manual investigations across multiple teams. This level of intelligent automation will transform our troubleshooting cycles from 4-7 hours to under 30 minutes. We are building a future here where cloud operations are proactive, intelligent, and increasingly autonomous.”

Ken Mirvis, Senior Manager Cloud Engineering, RMIT University

Use cases

Incident response and resolution

AWS DevOps Agent autonomously triages incidents and guides teams to rapid resolution. It integrates with observability tools, code repositories, and CI/CD pipelines to correlate and analyze telemetry, code, and deployment data, sharing its hypotheses, observations, and findings as it goes. Through systematic investigations, AWS DevOps Agent identifies the root cause of issues stemming from system changes, input anomalies, resource limits, component failures, and dependency issues across your entire environment.

Automated incident coordination

You can initiate and guide investigations using interactive chat. AWS DevOps Agent acts as a member of your operations team, working directly within your collaboration tools like ServiceNow and Slack to share findings and coordinate response. When needed, create an AWS Support case directly from an investigation, giving AWS Support experts immediate context for faster resolution.

Prevent future operational incidents

AWS DevOps Agent analyzes patterns across historical incidents to provide actionable recommendations that strengthen four key areas: observability, infrastructure optimization, deployment pipeline enhancement, and application resilience. 
