[SEO Subhead]
This Guidance helps you achieve near-zero recovery point objective (RPO) for your applications, minimizing data loss during potential Amazon Aurora failovers. You can improve data durability by using a persistent message queue that temporarily stores application data until it can safely be committed to the database. With this Guidance, you can design highly resilient databases for applications, helping to ensure minimal data loss and maintain data integrity.
Please note: [Disclaimer]
Architecture Diagram
[Architecture diagram description]
Step 1
A user generates a request to write to the Amazon Aurora database. AWS WAF, configured with standard rules to protect against common web exploits, evaluates this request.
Step 2
If the request complies with the enacted AWS WAF policies, it is routed to Amazon API Gateway.
Step 3
API Gateway forwards HTTPS requests to an Amazon Simple Queue Service (Amazon SQS) queue.
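API Gateway can forward the request to Amazon SQS through an AWS service integration, mapping the JSON request body into an SQS SendMessage call. The following sketch shows what such an integration configuration might look like; the region, account ID, and queue name are placeholder assumptions, not values from this Guidance.

```python
# Hypothetical sketch of an API Gateway -> Amazon SQS service integration.
# REGION, ACCOUNT_ID, and QUEUE_NAME are assumptions for illustration only.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"
QUEUE_NAME = "guidance-ingest-queue"

integration = {
    # AWS service integration that proxies the HTTPS request to SQS
    "type": "AWS",
    "httpMethod": "POST",
    "uri": f"arn:aws:apigateway:{REGION}:sqs:path/{ACCOUNT_ID}/{QUEUE_NAME}",
    # SQS expects form-encoded parameters, so the Content-Type header is
    # overridden and the JSON body is mapped into a SendMessage action.
    "requestParameters": {
        "integration.request.header.Content-Type":
            "'application/x-www-form-urlencoded'",
    },
    "requestTemplates": {
        "application/json":
            "Action=SendMessage&MessageBody=$util.urlEncode($input.body)",
    },
}
```

This direct service integration writes the request to the queue without any intermediate compute, which is what allows the message to be durably stored before database processing begins.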
Step 4
In the background, an event source mapping in AWS Lambda continuously polls the Amazon SQS queue for new messages. The queue is configured to allow up to 25 processing attempts per message. If a message fails all attempts, it's sent to an Amazon SQS dead-letter queue (DLQ).
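The 25-attempt limit is expressed as a redrive policy on the source queue. Below is a minimal sketch of that configuration; the DLQ ARN is a placeholder, and the boto3 call that would apply it is shown in a comment rather than executed.

```python
import json

# Hedged sketch: a redrive policy matching the 25 processing attempts
# described above. The DLQ ARN is a placeholder assumption.
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:guidance-dlq"

redrive_policy = {
    "deadLetterTargetArn": DLQ_ARN,   # where exhausted messages land
    "maxReceiveCount": 25,            # receives allowed before the DLQ
}

# With boto3, this would be applied to the source queue roughly as:
#   sqs.set_queue_attributes(
#       QueueUrl=queue_url,
#       Attributes={"RedrivePolicy": json.dumps(redrive_policy)},
#   )
attributes = {"RedrivePolicy": json.dumps(redrive_policy)}
```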
Step 5
Upon receiving a new message, Lambda retrieves the database credentials stored in AWS Secrets Manager to connect to the Aurora database.
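Inside the Lambda function, retrieving the credentials typically amounts to a Secrets Manager GetSecretValue call followed by parsing the secret's JSON payload. The sketch below assumes a common field layout (host, port, username, password); your secret's name and fields may differ. The boto3 import is deferred into the function so the parsing helper can be exercised without AWS access.

```python
import json


def parse_secret(secret_string: str) -> dict:
    """Extract the connection fields a database driver typically needs.

    Assumes the secret stores JSON with host/port/username/password keys,
    which is a common but not universal layout.
    """
    secret = json.loads(secret_string)
    return {
        "host": secret["host"],
        "port": int(secret.get("port", 3306)),  # MySQL-compatible default
        "user": secret["username"],
        "password": secret["password"],
    }


def get_db_credentials(secret_id: str) -> dict:
    """Fetch and parse Aurora credentials from AWS Secrets Manager."""
    import boto3  # imported lazily so parse_secret is testable offline

    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    return parse_secret(response["SecretString"])
```

Keeping secret retrieval in code, rather than baking credentials into environment variables, means Lambda always picks up rotated credentials on the next fetch.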
Step 6
Lambda writes the message retrieved from Amazon SQS to the primary Aurora instance in Availability Zone 1. This instance serves as the writer instance, with a reader instance deployed in Availability Zone 2.
Step 7
In the event of a primary instance failure, Aurora automatically promotes the reader instance to become the new primary, a process known as failover. Throughout this failover process, Lambda continues writing data to the Aurora cluster.
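During a failover, writes can transiently fail while connections are re-established. A common pattern, sketched below under that assumption, is to retry the write with exponential backoff and, if every attempt fails, let the exception propagate so the SQS message is redelivered (and eventually dead-lettered). The `write_fn` callable is hypothetical; it stands in for whatever INSERT your handler performs.

```python
import time


def write_with_retry(write_fn, retries: int = 5, base_delay: float = 0.5):
    """Retry a database write while an Aurora failover completes.

    `write_fn` is a hypothetical callable performing the actual write.
    Transient connection errors are retried with exponential backoff; if
    every attempt fails, the exception propagates so SQS redelivers the
    message instead of losing it.
    """
    for attempt in range(retries):
        try:
            return write_fn()
        except ConnectionError:
            if attempt == retries - 1:
                raise  # surface the failure; SQS will redeliver
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
```

Because the message stays in the queue until the write succeeds, the retry loop and SQS redelivery together are what make the near-zero RPO claim hold across a failover window.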
Get Started
Deploy this Guidance
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
This Guidance helps ensure business continuity through failover to a secondary Availability Zone, achieving near-zero RPO. Amazon CloudWatch captures comprehensive metrics from all the services employed in the Guidance. Customizable CloudWatch dashboards provide a unified view for monitoring resources, enabling proactive identification and resolution of potential issues.
The Guidance is designed to automatically respond to Availability Zone failures, eliminating the need for manual interventions. CloudWatch offers insights into critical metrics pertaining to services and application dependencies, facilitating informed decision-making and enhancing operational resilience.
-
Security
AWS WAF and Amazon CloudFront safeguard the application against vulnerabilities, such as controlling malicious bot traffic and blocking common attack patterns, including SQL injection or cross-site scripting (XSS). AWS WAF monitors HTTP(S) requests to your protected web application resources, enabling granular control over access to your content.
Amazon Cognito authenticates user interface (UI) and API calls to the application, simplifying the implementation of customer identity and access management (CIAM) and establishing a strong identity foundation. Additionally, Secrets Manager securely stores database credentials. This service streamlines the management, retrieval, and rotation of database credentials, API keys, and other sensitive information throughout their lifecycles, enabling tight access control and auditing of secrets.
-
Reliability
This Guidance incorporates Amazon SQS, a message queuing service, to help ensure data integrity and prevent message loss during failover events when the Amazon Relational Database Service (Amazon RDS) Proxy re-establishes a connection to the standby Aurora database instance.
Amazon SQS enables reliable and scalable message delivery, allowing software components to send, store, and receive messages at any volume—without the risk of message loss or dependencies on the availability of other services. This resilient messaging infrastructure allows for seamless communication and data consistency, even in the face of transient failures or component restarts.
-
Performance Efficiency
In this Guidance, Aurora seamlessly scales the database capacity to accommodate growing data volumes within the cluster volume for optimal performance and effective resource utilization. As a managed service, Aurora dynamically scales resources based on demand, providing the necessary elasticity to handle fluctuating workloads. Specifically, Aurora storage capacity automatically increases to accommodate growing data within the cluster volume, eliminating the need for manual intervention or capacity planning.
-
Cost Optimization
API Gateway and Lambda are serverless managed services that eliminate the need for provisioned compute resources. These cloud-native services adopt a pay-per-use billing model, avoiding redundant costs associated with maintaining infrastructure when the application is not in active use. Through API Gateway and Lambda, resources are dynamically allocated and charged based on actual usage, optimizing operational expenses and enabling efficient resource utilization.
-
Sustainability
To help ensure efficient resource utilization and minimize environmental impact, this Guidance uses the Amazon SQS DLQ. If a request fails to be processed after 25 attempts, it is automatically redirected to the DLQ, avoiding infinite processing loops.
Analyzing the DLQ requests enables the identification and rectification of potential issues in the data source, preventing the recurrence of the same errors and wasted processing power in the future. The DLQ limits the number of re-processing attempts per message so that Lambda resources are not indefinitely spun up to process requests that are unlikely to succeed. This minimizes the environmental impact of the Guidance by reducing unnecessary compute operations.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of working with the Guidance, from deployment through usage and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
[Title]
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.