AWS Cloud Resilience
Build and run resilient, highly available applications in the AWS cloud
Why AWS Cloud Resilience?
Cloud resilience refers to the ability of an application to resist or recover from disruptions, including those related to infrastructure, dependent services, misconfigurations, transient network issues, and load spikes. Cloud resilience also plays a critical role in an organization’s broader business resilience strategy, including the ability to meet digital sovereignty requirements.
Resilient applications are built for high availability, measured as the percentage of time the application is available for use, and are backed by a disaster recovery or continuity of operations plan.
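To make the availability percentage concrete, here is a minimal sketch of the arithmetic. The function names and the 365-day measurement period are illustrative assumptions, not part of any AWS service or API.

```python
# Illustrative arithmetic only: these helpers are hypothetical and
# do not correspond to any AWS service or SDK call.

HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year (8,760 hours)


def availability_percent(uptime_hours: float, total_hours: float) -> float:
    """Return availability as the percentage of time the application was usable."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= uptime_hours <= total_hours:
        raise ValueError("uptime_hours must be between 0 and total_hours")
    return 100.0 * uptime_hours / total_hours


def allowed_downtime_hours(target_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget, in hours, implied by an availability target."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    return period_hours * (1 - target_percent / 100.0)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, a 99.9% availability target leaves a budget of roughly 8.76 hours of downtime per year, which is one way to translate a resilience goal into a number a team can measure against.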
Millions of customers trust that AWS is the right place to build and run their business and mission-critical applications with high availability.
AWS has made significant investments in building and running the world’s most resilient cloud. We have designed a unique and highly available global infrastructure, built safeguards into our service design and deployment mechanisms, and instilled resilience into our operational culture. AWS also makes it easier for you to build and run resilient applications in the cloud, with a comprehensive set of purpose-built resilience services, solutions, architectural best practices, and guidance.
Benefits
Highest network availability
AWS delivers the highest network availability of any cloud provider and is the only cloud provider to offer three or more Availability Zones (AZs) in all Regions, providing more redundancy and better isolation to contain issues.
Comprehensive resilience services and guidance
AWS makes it easier for customers to design, build, and run highly available applications through its comprehensive portfolio of purpose-built resilience services, integrated resilience features, and expert guidance.
Unparalleled operational expertise
AWS has over 17 years of proven operational expertise and unmatched scale helping millions of customers in regulated and non-regulated industries meet their resilience requirements.
Use cases
Designing and Building
Apply the best practices from the Reliability and Operational Excellence pillars of the AWS Well-Architected Framework to build resilient applications.
Evaluating and Testing
Continuously measure and test your workload performance against your resilience goals with AWS Resilience Hub and AWS Fault Injection Service.
Monitoring and Observability
Use monitoring and observability services such as Amazon CloudWatch to quickly detect, investigate, and remediate issues impacting your applications.
Failover and Failback
Use Amazon Application Recovery Controller, AWS Elastic Disaster Recovery, and AWS Backup to ensure your applications recover quickly.
Featured Services and Solutions
Broadridge
"At Broadridge, we have critical systems that can’t afford to be down. We developed an ‘always on’ program using AWS services to ensure we were having near-zero recovery time objectives and recovery point objectives."
Todd Peterson, Vice President of Broadridge

Ikano Bank
"At Ikano Bank, we wanted to fully realize the benefits of the cloud, particularly its disaster recovery capabilities, but didn’t have the in-house capability to make that happen. AWS Resilience Hub provided tailored recommendations based on the AWS Well-Architected Framework, ensuring that our implementation aligned with best practices for operational excellence and reliability. As a financial institution, this gives us peace of mind that we have resilience built into our systems."
Carl Lundquist, Head of IT Operations and Services, Ikano Bank

Ally Financial
"AWS enables us to be agile and rapidly deliver a highly resilient and highly reliable fault tolerant system so that we can prevent our customers from near zero critical failure points. Collaborating with AWS has really helped us push the boundaries of innovation and what we can deliver in financial services, which is a heavily regulated industry. It continues to help us innovate, it continues to help us provide net-new products to our customers, and just like AWS, helps us to focus on our customer needs and put them front and center."
Sada Rajagopalan, Senior Director Lead Cloud Engineering, Ally Financial

Featured Content
Resilience Lifecycle Framework
A continuous approach to resilience improvement
Improving the resilience posture of an application is not a one-time effort; it is a continuous process that should be incorporated into how you build and operate your applications. This whitepaper shares strategies, services, and mechanisms you can use to drive continuous resilience into your organization.