Guidance for Multi-Region Resilient Microservice on AWS

Launch a failover sequence deployment across multiple AWS Regions to protect workloads

Overview

This Guidance demonstrates how to build highly resilient web applications that can withstand disruptions, minimizing application downtime and its impact on revenue. By combining a multi-Region architecture, automated failover orchestration, and comprehensive monitoring, this Guidance helps keep critical web applications available and consistent, even during significant impairments. You can reduce the blast radius of affected users, maintain data integrity, and make informed decisions about when to fail over between primary and standby Regions to maximize uptime and protect business continuity.

How it works

Active/Active State

This architecture diagram shows the active/active state across two AWS Regions.

A detailed architecture diagram illustrating a multi-region AWS deployment for a microservices application. The diagram features two AWS regions (us-east-1 and us-west-2), with Amazon ECS clusters running various microservices (UI, Carts, Checkout, Catalog, Assets, Orders), using resources such as Amazon CloudWatch, AWS Systems Manager, Amazon DynamoDB, ElastiCache for Redis, Amazon Aurora, and Amazon MQ for RabbitMQ Broker. Connectivity and redundancy are provided by Amazon Route 53, with global databases and tables for high availability.

Failover Sequence

This architecture diagram shows the failover sequence when the workload fails over from the us-east-1 AWS Region to the us-west-2 AWS Region.

A detailed architecture diagram illustrating a highly available e-commerce application deployed across two AWS regions (us-east-1 and us-west-2). The design shows integration with various AWS services such as Amazon ECS, ALB, Route 53, Amazon DynamoDB, Amazon Aurora, ElastiCache for Redis, Amazon MQ, AWS Systems Manager, and Amazon CloudWatch, with global table and database replication for fault tolerance and redundancy.

Get started

Deploy this Guidance

Use sample code to deploy this Guidance in your AWS account

Go to sample code on GitHub

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

AWS X-Ray traces application calls from Amazon ECS tasks, visualizing communication flows of microservices and analyzing user requests as they travel through the UI to underlying microservices. CloudWatch Synthetics generates traffic to the application, creating metrics for setting thresholds and alerting if issues arise. Systems Manager runbooks automate failover and failback processes, minimizing human error and ensuring the application meets recovery time objective (RTO) and recovery point objective (RPO) requirements.
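
As a rough illustration of how canary metrics can feed a threshold-based alert, the sketch below evaluates a series of canary success percentages the way an "M out of N datapoints" alarm would. It is plain Python, not the Guidance's sample code. The names `AlarmConfig` and `should_alarm` and the default threshold values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlarmConfig:
    """Hypothetical alarm settings; the values are illustrative, not from the Guidance."""
    threshold_percent: float = 90.0   # a datapoint below this counts as breaching
    evaluation_periods: int = 5       # N: how many recent datapoints to inspect
    datapoints_to_alarm: int = 3      # M: breaching datapoints needed to alarm


def should_alarm(success_percent: Sequence[float],
                 config: AlarmConfig = AlarmConfig()) -> bool:
    """Return True when at least M of the last N datapoints breach the threshold."""
    window = success_percent[-config.evaluation_periods:]
    breaching = sum(1 for value in window if value < config.threshold_percent)
    return breaching >= config.datapoints_to_alarm


if __name__ == "__main__":
    # Three of the last five canary runs fall below 90%, so the alarm fires.
    print(should_alarm([100.0, 100.0, 50.0, 40.0, 100.0, 30.0]))
```

Requiring several breaching datapoints rather than one helps avoid reacting to a single failed canary run, which matters when the alert informs a decision as costly as a Region failover.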

Read the Operational Excellence whitepaper

AWS Identity and Access Management (IAM) roles and policies secure microservices' interactions with AWS services, enforcing robust security through meticulously defined permissions. AWS Key Management Service (AWS KMS) encrypts data at rest across services, including Aurora and DynamoDB.

Read the Security whitepaper

Elastic Load Balancing (ELB) routes traffic requests from the application's web interface to healthy Amazon ECS tasks, while Amazon ECS replaces unhealthy tasks and adds more tasks to handle increased load. Amazon Application Recovery Controller reliably enables and disables AWS Regions based on application traffic. DynamoDB global tables and Aurora global databases keep application data consistent within the RPO requirements across multiple AWS Regions. Systems Manager runbooks orchestrate components that need to be changed when shifting traffic from one AWS Region to another. Together, these services help ensure the application experiences minimal service interruptions.
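
The orchestration idea behind a failover runbook can be sketched as an ordered list of steps that halts at the first failure. This is a minimal plain-Python sketch under stated assumptions: the function `run_failover`, the step names, and their ordering are hypothetical and are not taken from the Guidance's Systems Manager runbooks, which would call AWS APIs where this sketch uses placeholder callables.

```python
from typing import Callable, List, Tuple

# A step is a human-readable name plus a callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_failover(steps: List[Step]) -> List[str]:
    """Run steps in order; stop and raise at the first step that fails."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            raise RuntimeError(
                f"failover step failed: {name!r}; completed so far: {completed}"
            )
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Hypothetical step names and ordering; each lambda stands in for a real action.
    example_steps: List[Step] = [
        ("shift traffic away from us-east-1", lambda: True),
        ("promote Aurora global database secondary in us-west-2", lambda: True),
        ("verify application health in us-west-2", lambda: True),
    ]
    print(run_failover(example_steps))
```

Stopping at the first failed step and reporting what already completed gives an operator a clear point from which to resume or roll back, instead of leaving the two Regions in an unknown state.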

Read the Reliability whitepaper

ELB distributes incoming traffic across multiple targets, preventing any single target from becoming overwhelmed and maintaining high performance. Aurora read replicas offload read traffic from the primary database instance, distributing the workload and improving overall performance. Aurora global databases extend the benefits of read replicas across multiple Regions, enabling read scaling and improved performance for geographically distributed applications. DynamoDB global tables replicate DynamoDB tables across multiple AWS Regions, enabling low-latency data access for users worldwide.

Read the Performance Efficiency whitepaper

Auto scaling automatically adjusts the number of Amazon ECS tasks based on demand, so that you only pay for the resources needed. AWS Fargate for Amazon ECS eliminates the need to provision and manage servers, allowing you to run containers without the overhead of managing Amazon Elastic Compute Cloud (Amazon EC2) instances, leading to improved efficiency and reduced costs.
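
To make "adjusts the number of tasks based on demand" concrete, the sketch below applies a simplified proportional rule: scale the task count by the ratio of the observed metric to its target, then clamp to configured bounds. This is an assumption-laden approximation for illustration, not the exact algorithm the auto scaling service uses. The function name `desired_task_count` and its default bounds are hypothetical.

```python
import math


def desired_task_count(current_tasks: int,
                       metric_value: float,
                       target_value: float,
                       min_tasks: int = 1,
                       max_tasks: int = 10) -> int:
    """Simplified proportional scaling: tasks * (metric / target), clamped to bounds."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    if current_tasks < 0 or min_tasks > max_tasks:
        raise ValueError("invalid task counts")
    # Round up so the fleet is never sized below what the ratio calls for.
    proposed = math.ceil(current_tasks * metric_value / target_value)
    return max(min_tasks, min(max_tasks, proposed))


if __name__ == "__main__":
    # 4 tasks at 90% average CPU with a 60% target suggests 6 tasks.
    print(desired_task_count(4, 90.0, 60.0))
```

The lower bound keeps a minimum number of tasks running even when traffic is idle, and the upper bound caps spend during a spike, which is where the cost benefit of paying only for needed capacity comes from.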

Read the Cost Optimization whitepaper

Auto scaling and DynamoDB on-demand capacity mode add capacity when needed and scale down when it is not required. These on-demand services minimize the environmental impact of the workload by using only the resources necessary to meet the application's demands.

Read the Sustainability whitepaper

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.