Let’s Architect! Creating resilient architecture
The AWS Well-Architected Framework defines resilience as “the capability to recover when stressed by load (more requests for service), attacks (either accidental through a bug, or deliberate through intention), and failure of any component in the workload’s components.”
The need for resilient workloads spans all customer industries, but resilience is often misunderstood. This can lead to workloads that do not incorporate resilient architecture at all, or to workloads that are over-engineered.
Resilience is a technical problem, but it’s also about people and culture. It’s a continuous process that requires us to learn by iterating. Customers need to understand, from a business perspective, what their SLA requirements are, and from a technical perspective, how to meet them with their architecture. In this post, we share resources to help you build resilience into your AWS architecture.
Building a resilient architecture is not only about the technical implementation of the system, but also about the solutions for observability, operations, and people.
This video shows the Amazon approach for designing resilient systems, where individual teams build and own a service. This way, everyone has operational responsibility. You’ll learn how to deploy often, move fast, and design solutions for automatic rollback, which allows teams to revert their workload to a previous iteration if needed.
Resilience is an important consideration for developers. For instance, if a downstream service is not available, how can the software handle the situation? Which mechanisms should you use to implement retries? How can you prevent overloading the downstream service?
This video focuses on five strategies and design patterns that developers can use to build resilient applications. You’ll learn how to add timeouts, retries, exponential backoff with randomness, and circuit breakers into your code. These patterns are powerful because they can be abstracted and implemented in different scenarios.
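As a rough illustration of the patterns named above, here is a minimal Python sketch of retries with capped exponential backoff and random jitter, plus a very small circuit breaker. It is not the code from the video: the names `backoff_delays`, `call_with_retries`, and `CircuitBreaker`, and all default values, are assumptions chosen for this example. The wrapped function is expected to enforce its own timeout, for example through the timeout setting of whatever client it uses.

```python
import random
import time


def backoff_delays(max_attempts, base=0.1, cap=5.0, rng=random.random):
    """Yield one sleep duration per retry: capped exponential backoff with full jitter."""
    for attempt in range(max_attempts - 1):
        # Pick a random delay between 0 and the capped exponential ceiling.
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_retries(func, max_attempts=4, base=0.1, cap=5.0, sleep=time.sleep):
    """Call func(); on any exception, wait and retry, up to max_attempts calls in total."""
    delays = backoff_delays(max_attempts, base, cap)
    while True:
        try:
            return func()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # attempts exhausted: surface the last error
            sleep(delay)


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures, then cools down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of adding load to a struggling dependency.
                raise RuntimeError("circuit open: skipping downstream call")
            # Cooldown elapsed: let one trial call through (half-open).
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller could combine the two, for example `breaker.call(lambda: call_with_retries(fetch))`, where `fetch` is a placeholder for the downstream request. The jitter spreads retries from many clients over time, and the breaker stops calls entirely while the dependency is unhealthy; both aim to avoid overloading the downstream service.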
This blog post shows you how AWS Resilience Hub can help you evaluate the resilience of your architecture. It gives you a central place to monitor, track, and evaluate your application’s resilience based on your business goals. For example, after you define your recovery point objective (RPO) and recovery time objective (RTO) targets, Resilience Hub evaluates your current architecture against them and shows you whether you’ve met your goals. If you haven’t, it recommends changes to help you meet them.
Resilience encompasses a broad range of considerations, including infrastructure, application patterns, data management, and application building and monitoring. And after you incorporate resilience, it is essential to continuously maintain it.
This video provides useful principles for building continuous resilience in your applications. It explores considerations for implementing processes that drive continuous improvement through a DevOps methodology, and it shows you services you can use to build resilience into the development process on an ongoing basis.
See you next time!
Thanks for joining our discussion on resilient architecture! See you in a couple of weeks with our content about governance in the cloud!
Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!
Other posts in this series
- Let’s Architect! Using open-source technologies on AWS
- Let’s Architect! Architecting for Sustainability
- Let’s Architect! Architecting for Machine Learning
- Let’s Architect! Architecting for Security
- Let’s Architect! Tools for Cloud Architects
- Let’s Architect! Architecting for Blockchain
- Let’s Architect! Architecting microservices with containers
- Let’s Architect! Serverless architecture on AWS