Amazon DevOps Guru

ML-powered cloud operations service to improve application availability

Amazon DevOps Guru is a Machine Learning (ML) powered service that makes it easy to improve an application’s operational performance and availability. DevOps Guru detects behaviors that deviate from normal operating patterns so you can identify operational issues long before they impact your customers.

DevOps Guru uses machine learning models informed by years of Amazon.com and AWS operational excellence to identify anomalous application behavior (such as increased latency, error rates, and resource constraints) and surface critical issues that could cause outages or service disruptions. When DevOps Guru identifies a critical issue, it automatically sends an alert and provides a summary of related anomalies, the likely root cause, and context about when and where the issue occurred. When possible, DevOps Guru also provides recommendations on how to remediate the issue.

DevOps Guru automatically ingests operational data from your AWS applications and provides a single dashboard to visualize issues in your operational data. You can get started with DevOps Guru to improve application availability and reliability with no manual setup or machine learning expertise.

What is Amazon DevOps Guru?



Automatically detect operational issues

Using machine learning, Amazon DevOps Guru automatically collects and analyzes data such as application metrics, logs, and events, and identifies behaviors that deviate from normal operating patterns. It automatically detects and alerts on operational issues and risks, such as impending resource exhaustion, code and configuration changes that may cause outages, memory leaks, under-provisioned compute capacity, and database I/O overutilization.
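
To make the idea of "deviating from normal operating patterns" concrete, here is a minimal sketch of one common baseline technique: flagging values that sit far from the mean of a known-good window. This is only an illustration and is not how DevOps Guru itself works; its models are not described on this page. The function name `find_anomalies`, the threshold of 3 standard deviations, and the sample latency values are all assumptions made for the example.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(
    baseline: Sequence[float],
    recent: Sequence[float],
    threshold: float = 3.0,
) -> List[int]:
    """Return indexes in `recent` that deviate strongly from `baseline`.

    A value is flagged when it lies more than `threshold` sample standard
    deviations from the baseline mean. `baseline` needs at least two points.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any different value is a deviation.
        return [i for i, value in enumerate(recent) if value != mu]
    return [
        i for i, value in enumerate(recent)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical latency samples in milliseconds.
normal_latency = [100, 102, 98, 101, 99]
new_latency = [100, 103, 120, 97]
print(find_anomalies(normal_latency, new_latency))  # [2]
```

A static threshold like this is exactly the kind of rule that needs manual tuning as an application changes, which is the maintenance burden a managed ML service aims to remove.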


Resolve issues quickly with ML-powered insights

Amazon DevOps Guru helps reduce the time to identify and resolve the root cause of issues by correlating anomalous behavior and operational events. When an issue occurs, DevOps Guru generates insights with a summary of related anomalies, contextual information about the issue, and, when possible, actionable recommendations for remediation.


Easily scale and maintain availability

Amazon DevOps Guru saves you the time and effort involved in manually updating static rules and alarms so you can effectively monitor complex and evolving applications. When you migrate or adopt new AWS services, DevOps Guru automatically analyzes their metrics, logs, and events. Then it produces insights, helping you easily adapt to changing behavior and evolving system architecture.


Reduce noise and alarm fatigue

Amazon DevOps Guru helps developers and IT operators reduce alarm noise and overcome alarm fatigue by using pre-trained machine learning models to correlate and group related anomalies and surface the most critical alerts. With DevOps Guru, you can reduce the need to manage multiple monitoring tools and alarms, which means you can focus on the root cause of the issue and its remediation.
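
As a rough illustration of why grouping reduces noise, the sketch below clusters anomalies that start close together in time, so that several related alarms collapse into one group. DevOps Guru uses pre-trained ML models for correlation; this time-gap heuristic is a deliberately simple stand-in, and the `Anomaly` tuple, the `group_by_time` function, and the 300-second gap are hypothetical choices for the example.

```python
from typing import List, Tuple

# (start time in seconds, metric name); a hypothetical, simplified record.
Anomaly = Tuple[float, str]


def group_by_time(
    anomalies: List[Anomaly],
    gap_seconds: float = 300.0,
) -> List[List[Anomaly]]:
    """Group anomalies whose start times are within `gap_seconds` of the
    previous anomaly in the same group."""
    groups: List[List[Anomaly]] = []
    for anomaly in sorted(anomalies):
        if groups and anomaly[0] - groups[-1][-1][0] <= gap_seconds:
            groups[-1].append(anomaly)
        else:
            groups.append([anomaly])
    return groups


alerts = [(1000, "latency"), (1100, "errors"), (5000, "cpu"), (1200, "throttles")]
for group in group_by_time(alerts):
    print([name for _, name in group])
# ['latency', 'errors', 'throttles']
# ['cpu']
```

Four separate alerts become two groups to investigate, which is the effect the paragraph above describes at a much larger scale.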

Use cases

Operational audits

You can use Amazon DevOps Guru to get a quick summary of all the operationally significant events that have been identified, sorted by severity. Using the System Health Dashboard, you can search for issues in specific applications, identify trends, and decide where developers should spend their time and resources.

Proactive resource exhaustion planning

Build predictive alarming for exhaustible resources such as memory, CPU, and disk space. Amazon DevOps Guru forecasts when resource utilization will exceed the provisioned capacity, and informs you by creating a notification in the dashboard, helping you avoid an impending outage.
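
The forecasting idea can be sketched with a straight-line extrapolation: fit a linear trend to recent utilization readings and estimate when it crosses the provisioned capacity. This is a simplified illustration under stated assumptions, not DevOps Guru's forecasting method. The function name `hours_until_exhaustion`, the hourly sampling interval, and the sample values are hypothetical.

```python
from typing import Optional, Sequence


def hours_until_exhaustion(
    samples: Sequence[float],
    capacity: float,
) -> Optional[float]:
    """Estimate hours until a linear trend through `samples` reaches `capacity`.

    `samples` are hourly utilization readings, oldest first. Returns None when
    there are too few samples or the trend is flat or decreasing, and 0.0 when
    the latest reading has already reached capacity.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= capacity:
        return 0.0
    # Ordinary least-squares fit over x = 0, 1, ..., n - 1.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Solve slope * x + intercept = capacity, measured from the last sample.
    x_hit = (capacity - intercept) / slope
    return max(0.0, x_hit - (n - 1))


# Hypothetical disk usage in GB, sampled hourly, on a 100 GB volume.
print(hours_until_exhaustion([50, 60, 70, 80], capacity=100))  # 2.0
```

Real utilization is rarely linear, so a production forecast would need to handle seasonality and noise; the sketch only shows the shape of the calculation.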

Preventative maintenance

With Amazon DevOps Guru, you can address issues before they become incidents. DevOps Guru flags medium- and low-severity findings that might not be critical but that, if left alone, can worsen over time and affect the availability of your application. This helps you prioritize and avoid unforeseen downtime. For example, DevOps Guru notifies you about hitting the limits of your Auto Scaling groups, changes in latency patterns, or increased API call volume. DevOps Guru also identifies AWS best practices to help you increase the overall availability of your application.


“We run thousands of EC2 instances and I am always looking for ways to reduce the time my team spends on resolving operational issues. We are excited to use Amazon DevOps Guru and leverage its ML-powered insights to help us identify, correlate and remediate operational issues. This will help my team save hours and reduce our mean time to recovery (MTTR).”

- Valentino Volonghi
CTO, NextRoll

"My team follows an ops-for-life motto, and we are always on the lookout for ways to automate our manual activities. With Amazon DevOps Guru, we hope to realize that goal and let AIOps take over many of our day-to-day tasks, so my team can focus on IT innovation. We are now not only meeting the needs of the business but able to exceed them since we have more time to focus on what matters most – delivering value for our organization and our customers."

- Andrew Shieh
SmugMug’s Operations Director

Thomson Reuters
“Customer experience is vital to us. Dealing with multiple sources of alerts for availability, performance, and change requests can be a challenge when trying to prevent and mitigate incidents impacting our customers. We are excited to use Amazon DevOps Guru and leverage its ML-powered insights to provide clear paths for action. This allows us to mitigate issues quickly and avoid events that impact customers. The integration with PagerDuty is a bonus, as we can have recommendations delivered to the right people timely and efficiently.”

- Steve Thoennes
Director Infrastructure Hosting Portfolio


"Atlassian is proud to support Amazon on the launch of DevOps Guru and help empower teams to deploy code and operate services with confidence. With our new Opsgenie and Jira Service Management integration, the right teams can be immediately notified the instant DevOps Guru predicts a potential issue, or determines an incident has occurred. DevOps Guru provides a new dimension of insight, and Atlassian ensures the fastest response."

- Emel Dogrusoz
Head of Product, Opsgenie

Read how you can deliver operational insights directly to your on-call team by integrating Amazon DevOps Guru with Atlassian Opsgenie

“PagerDuty was built to drive the move to a DevOps culture by automating the entire incident response lifecycle with resolution. We’re excited to continue this commitment to DevOps with our latest integration with Amazon DevOps Guru. Leveraging Amazon’s decades of operational excellence and DevOps Guru’s machine learning capabilities, PagerDuty provides even more real-time signal-to-action capabilities to our joint customers. Through PagerDuty’s ingestion of DevOps Guru’s Amazon Simple Notification Service (SNS) notifications, AWS customers can take real-time action on operational issues before they become customer-impacting outages.”

- Jonathan Rende
SVP of Product

Learn more about delivering ML-powered operational insights to your on-call teams via PagerDuty and Amazon DevOps Guru

Blog posts & articles

New- Amazon DevOps Guru Helps Identify Application Errors and Fixes

December 2020

Harunobu Kameda

Easily configure Amazon DevOps Guru across multiple accounts and Regions using AWS CloudFormation StackSets

December 2020

Nikunj Vaidya & Nuatu Tseggai

AWS re:Invent 2020: Improve application availability with ML-powered insights using Amazon DevOps Guru

December 2020

Jacob Sullivan

Amazon DevOps Guru is powered by pre-trained ML models that encode operational excellence

February 2021

Caner Turkmen, Ravi Turlapati & Tim Januschowski

