AWS Incident Detection and Response

Proactive management for critical workloads

Why AWS Incident Detection and Response?

AWS Incident Detection and Response offers eligible AWS Enterprise Support customers proactive engagement and incident management to reduce the potential for failure and to accelerate recovery of critical workloads from disruption. It does so through joint preparation: you and AWS develop runbooks and response plans tailored to each workload onboarded to the service. A team of Incident Management Engineers (IMEs) monitors onboarded workloads 24x7, detects incidents, and engages you on a call bridge within 5 minutes of a critical alarm.

AWS Incident Detection and Response begins with a review of your workloads for reliability and operational excellence. AWS experts work with you to define critical metrics and alarms that give better visibility into the application and infrastructure layers of your workloads, making it easier to find and prioritize issues during an incident. AWS Incident Management Engineers continuously monitor your workloads, detect critical incidents, and engage you on a call bridge with the right AWS experts to accelerate recovery. All incidents are managed at the highest level of severity and escalation, and AWS remains engaged until each incident is resolved. Lessons learned from previous incidents feed back into response plans and workload architecture, creating a continuous improvement cycle that strengthens the resiliency of your workloads.
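As a rough illustration of what a "critical alarm" definition might look like, the sketch below builds the keyword arguments for an Amazon CloudWatch metric alarm on Application Load Balancer 5XX errors. The function name `build_critical_alarm`, the alarm naming scheme, and the threshold and period values are assumptions made for this example. The actual metrics and thresholds for a workload are agreed with AWS during onboarding.

```python
from typing import Any, Dict


def build_critical_alarm(workload: str, load_balancer: str) -> Dict[str, Any]:
    """Return keyword arguments describing a CloudWatch metric alarm.

    Hypothetical helper: the threshold, period, and naming convention are
    illustrative values only, not ones prescribed by the service.
    """
    return {
        "AlarmName": f"{workload}-critical-5xx",
        # Application Load Balancer metrics are published in this namespace.
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_Target_5XX_Count",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "Statistic": "Sum",
        # Evaluate 60-second periods; alarm after 3 consecutive breaches.
        "Period": 60,
        "EvaluationPeriods": 3,
        "Threshold": 50.0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Treat gaps in the metric as healthy rather than as a breach.
        "TreatMissingData": "notBreaching",
    }


alarm = build_critical_alarm("checkout", "app/checkout-alb/0123456789abcdef")
# Time the metric must stay above the threshold before the alarm fires.
seconds_to_alarm = alarm["Period"] * alarm["EvaluationPeriods"]
```

With boto3 installed and credentials configured, a dictionary like this could be passed to `boto3.client("cloudwatch").put_metric_alarm(**alarm)`. Keeping the definition as plain data makes it easy to review alongside the runbook for the same workload.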

AWS Incident Detection and Response is available in English for workloads hosted in eligible AWS Regions. Contact your account team to subscribe accounts and onboard your workloads to AWS Incident Detection and Response.

Benefits

We work with you to define critical metrics and alarms to provide improved visibility into the application and infrastructure layers of your workloads.

AWS Incident Management Engineers proactively engage you within 5 minutes of an alarm from your workloads or in response to a critical case you submit.

Recover faster from disruptions through rapid engagement with AWS experts using pre-defined response plans and runbooks.

Proactively mitigate issues by improving the architecture and operations of your workloads with best-practice guidance from AWS.