[SEO Subhead]
This Guidance helps you achieve near-zero recovery point objective (RPO) for your applications, minimizing data loss during potential Amazon Aurora failovers. You can improve data durability by using a persistent message queue that temporarily stores application data until it can safely be committed to the database. With this Guidance, you can design highly resilient databases for applications, helping to ensure minimal data loss and maintain data integrity.
Please note: [Disclaimer]
Architecture Diagram
[Architecture diagram description]
Step 1
A user generates a request to write to the Amazon Aurora database. AWS WAF, configured with standard rules to protect against common web exploits, evaluates this request.
Step 2
If the request complies with the enacted AWS WAF policies, it is routed to Amazon API Gateway.
Step 3
API Gateway forwards HTTPS requests to an Amazon Simple Queue Service (Amazon SQS) queue.
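API Gateway can forward the request to Amazon SQS through an AWS service integration, mapping the JSON request body into an SQS SendMessage call. The following sketch shows what such an integration configuration might look like; the region, account ID, and queue name are placeholder assumptions, not values from this Guidance.

```python
# Hypothetical sketch of an API Gateway -> Amazon SQS service integration.
# REGION, ACCOUNT_ID, and QUEUE_NAME are assumptions for illustration only.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"
QUEUE_NAME = "guidance-ingest-queue"

integration = {
    # AWS service integration that proxies the HTTPS request to SQS
    "type": "AWS",
    "httpMethod": "POST",
    "uri": f"arn:aws:apigateway:{REGION}:sqs:path/{ACCOUNT_ID}/{QUEUE_NAME}",
    # SQS expects form-encoded parameters, so the Content-Type header is
    # overridden and the JSON body is mapped into a SendMessage action.
    "requestParameters": {
        "integration.request.header.Content-Type":
            "'application/x-www-form-urlencoded'",
    },
    "requestTemplates": {
        "application/json":
            "Action=SendMessage&MessageBody=$util.urlEncode($input.body)",
    },
}
```

This direct service integration writes the request to the queue without any intermediate compute, which is what allows the message to be durably stored before database processing begins.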
Step 4
In the background, an event source mapping in AWS Lambda continuously polls the Amazon SQS queue for new messages. The queue is configured to allow up to 25 processing attempts per message. If a message fails all attempts, it's sent to an Amazon SQS dead-letter queue (DLQ).
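The 25-attempt limit is expressed as a redrive policy on the source queue. Below is a minimal sketch of that configuration; the DLQ ARN is a placeholder, and the boto3 call that would apply it is shown in a comment rather than executed.

```python
import json

# Hedged sketch: a redrive policy matching the 25 processing attempts
# described above. The DLQ ARN is a placeholder assumption.
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:guidance-dlq"

redrive_policy = {
    "deadLetterTargetArn": DLQ_ARN,   # where exhausted messages land
    "maxReceiveCount": 25,            # receives allowed before the DLQ
}

# With boto3, this would be applied to the source queue roughly as:
#   sqs.set_queue_attributes(
#       QueueUrl=queue_url,
#       Attributes={"RedrivePolicy": json.dumps(redrive_policy)},
#   )
attributes = {"RedrivePolicy": json.dumps(redrive_policy)}
```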
Step 5
Upon receiving a new message, Lambda retrieves the database credentials stored in AWS Secrets Manager to connect to the Aurora database.
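Inside the Lambda function, retrieving the credentials typically amounts to a Secrets Manager GetSecretValue call followed by parsing the secret's JSON payload. The sketch below assumes a common field layout (host, port, username, password); your secret's name and fields may differ. The boto3 import is deferred into the function so the parsing helper can be exercised without AWS access.

```python
import json


def parse_secret(secret_string: str) -> dict:
    """Extract the connection fields a database driver typically needs.

    Assumes the secret stores JSON with host/port/username/password keys,
    which is a common but not universal layout.
    """
    secret = json.loads(secret_string)
    return {
        "host": secret["host"],
        "port": int(secret.get("port", 3306)),  # MySQL-compatible default
        "user": secret["username"],
        "password": secret["password"],
    }


def get_db_credentials(secret_id: str) -> dict:
    """Fetch and parse Aurora credentials from AWS Secrets Manager."""
    import boto3  # imported lazily so parse_secret is testable offline

    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    return parse_secret(response["SecretString"])
```

Keeping secret retrieval in code, rather than baking credentials into environment variables, means Lambda always picks up rotated credentials on the next fetch.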
Step 6
Lambda writes the message retrieved from Amazon SQS to the primary Aurora instance in Availability Zone 1. This instance serves as the writer instance, with a reader instance deployed in Availability Zone 2.
Step 7
In the event of a primary instance failure, Aurora automatically promotes the reader instance to become the new primary, a process known as failover. Throughout this failover process, Lambda continues writing data to the Aurora cluster.
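During a failover, writes can transiently fail while connections are re-established. A common pattern, sketched below under that assumption, is to retry the write with exponential backoff and, if every attempt fails, let the exception propagate so the SQS message is redelivered (and eventually dead-lettered). The `write_fn` callable is hypothetical; it stands in for whatever INSERT your handler performs.

```python
import time


def write_with_retry(write_fn, retries: int = 5, base_delay: float = 0.5):
    """Retry a database write while an Aurora failover completes.

    `write_fn` is a hypothetical callable performing the actual write.
    Transient connection errors are retried with exponential backoff; if
    every attempt fails, the exception propagates so SQS redelivers the
    message instead of losing it.
    """
    for attempt in range(retries):
        try:
            return write_fn()
        except ConnectionError:
            if attempt == retries - 1:
                raise  # surface the failure; SQS will redeliver
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
```

Because the message stays in the queue until the write succeeds, the retry loop and SQS redelivery together are what make the near-zero RPO claim hold across a failover window.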
Get Started
Deploy this Guidance
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
This Guidance helps ensure business continuity through failover to a secondary Availability Zone, achieving near-zero RPO. Amazon CloudWatch captures comprehensive metrics from all the services employed in the Guidance. Customizable CloudWatch dashboards provide a unified view for monitoring resources, enabling proactive identification and resolution of potential issues.
The Guidance is designed to automatically respond to Availability Zone failures, eliminating the need for manual interventions. CloudWatch offers insights into critical metrics pertaining to services and application dependencies, facilitating informed decision-making and enhancing operational resilience.
-
Security
AWS WAF and Amazon CloudFront safeguard the application against vulnerabilities, such as controlling malicious bot traffic and blocking common attack patterns, including SQL injection or cross-site scripting (XSS). AWS WAF monitors HTTP(S) requests to your protected web application resources, enabling granular control over access to your content.
Amazon Cognito authenticates user interface (UI) and API calls to the application, simplifying the implementation of customer identity and access management (CIAM) and establishing a strong identity foundation. Additionally, Secrets Manager securely stores database credentials. This service streamlines the management, retrieval, and rotation of database credentials, API keys, and other sensitive information throughout their lifecycles, enabling tight access control and auditing of secrets.
-
Reliability
This Guidance incorporates Amazon SQS, a message queuing service, to help ensure data integrity and prevent message loss during failover events when the Amazon Relational Database Service (Amazon RDS) Proxy re-establishes a connection to the standby Aurora database instance.
Amazon SQS enables reliable and scalable message delivery, allowing software components to send, store, and receive messages at any volume—without the risk of message loss or dependencies on the availability of other services. This resilient messaging infrastructure allows for seamless communication and data consistency, even in the face of transient failures or component restarts.
-
Performance Efficiency
In this Guidance, Aurora seamlessly scales the database capacity to accommodate growing data volumes within the cluster volume for optimal performance and effective resource utilization. As a managed service, Aurora dynamically scales resources based on demand, providing the necessary elasticity to handle fluctuating workloads. Specifically, Aurora storage capacity automatically increases to accommodate growing data within the cluster volume, eliminating the need for manual intervention or capacity planning.
-
Cost Optimization
API Gateway and Lambda are serverless managed services that eliminate the need for provisioned compute resources. These cloud-native services adopt a pay-per-use billing model, avoiding redundant costs associated with maintaining infrastructure when the application is not in active use. Through API Gateway and Lambda, resources are dynamically allocated and charged based on actual usage, optimizing operational expenses and enabling efficient resource utilization.
-
Sustainability
To help ensure efficient resource utilization and minimize environmental impact, this Guidance uses the Amazon SQS DLQ. If a request fails to be processed after 25 attempts, it is automatically redirected to the DLQ, avoiding infinite processing loops.
Analyzing the DLQ requests enables the identification and rectification of potential issues in the data source, preventing the recurrence of the same errors and wasted processing power in the future. The DLQ limits the number of re-processing attempts per message so that Lambda resources are not indefinitely spun up to process requests that are unlikely to succeed. This minimizes the environmental impact of the Guidance by reducing unnecessary compute operations.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of working with the Guidance, from deployment through usage and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
[Title]
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.