AWS Storage Blog
Boost testing confidence with automated Amazon RDS data replication from production to non-production environment
Automated testing in a pre-production environment is crucial for verifying the reliability and stability of software releases in any organization. However, for many applications, writing and executing these tests requires data from the production system. This production data is valuable for testing and development because it represents real-world scenarios, usage patterns, and edge cases that might not be captured in synthetic test data. It enables builders to create comprehensive test scenarios, conduct thorough performance evaluations, and gain greater confidence in system behavior under real-world conditions. Separating production and pre-production environments into isolated, unconnected accounts adds complexity to the process. Furthermore, production environments often have strict access controls in place as a recommended best practice, which makes it challenging to run traditional data sync tools across these environments. Continuous replication between production and pre-production environments is also undesirable, because application teams need to modify the database contents for pre-production testing and do not want the pre-production database to be automatically overwritten until the tests are complete.
One effective approach to copying production data is through a backup and restore strategy, leveraging cloud-native tools. AWS Backup is a fully managed backup service that allows you to create backups of your databases, such as Amazon Relational Database Service (Amazon RDS) instances, and store them in a centralized location. AWS Step Functions allows you to create workflows to build distributed applications, automate processes, orchestrate microservices, etc. Combining these two services allows you to create a workflow that periodically copies data from the production environment to the pre-production environment. This gives your application teams access to up-to-date production data for testing purposes without compromising security or compliance requirements.
In this post, we present a way to periodically copy data from a production environment to a pre-production environment using AWS Backup. We focus on RDS instances, although the solution could be extended to other database types supported by AWS Backup, such as Amazon DynamoDB and Amazon Aurora DB clusters. Not all databases are supported equally: some supported databases may lack specific capabilities, such as incremental backups, so check the AWS Backup feature availability for your chosen database or service. It is important to note that this solution assumes that the data to be copied from the production environment does not contain personally identifiable information (PII) or any other data that is considered sensitive. Copying sensitive production data can be a security risk. You can follow this AWS Security post to enable data classification using Amazon Macie.
Solution overview
AWS Backup is a fully managed, cost-effective, policy-based Backup as a Service offering from AWS. It enables you to centralize and automate backups across AWS services and AWS Regions. It also helps you support your regulatory compliance obligations and meet your business continuity goals. Let’s see how you can leverage AWS Backup to copy data from a production to a pre-production environment.
The following figure outlines the high-level architecture of the solution. In this example, you create a backup vault and a backup plan in the production account. The backup plan defines the backup schedule, the database resources to back up, and the location to which backups (called recovery points in AWS Backup) are copied after the backup completes. For example, you might configure a backup plan to take a snapshot of an RDS instance every Sunday at 9 AM and copy the snapshot to the pre-production account.
When the backup is copied to the backup vault of the pre-production account, an event is sent to the default event bus using Amazon EventBridge. We use this event to trigger a Step Functions state machine that restores the backup in the pre-production account in the specified subnet.
Figure 1: Architecture diagram of the solution
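To make the event-driven part of the architecture more concrete, the following is a minimal sketch of an EventBridge rule that matches AWS Backup recovery point events on the default event bus. In the solution itself, the rule and its Step Functions target are created by Terraform, so the exact pattern in the repository may differ. The function names and the rule name are hypothetical, and the choice of the "Recovery Point State Change" detail type is our assumption about which AWS Backup event the rule listens for.

```shell
# Print an EventBridge event pattern for AWS Backup recovery point events.
# Assumption: the rule listens for the "Recovery Point State Change" detail type.
recovery_point_event_pattern() {
  cat <<'EOF'
{
  "source": ["aws.backup"],
  "detail-type": ["Recovery Point State Change"]
}
EOF
}

# Hypothetical helper: create (or update) a rule on the default event bus.
# $1 is a rule name of your choosing. A target (the state machine) still has
# to be attached separately, for example with "aws events put-targets".
create_restore_trigger_rule() {
  aws events put-rule \
    --name "$1" \
    --event-pattern "$(recovery_point_event_pattern)"
}

# Example call (run in the pre-production account):
# create_restore_trigger_rule "my-restore-trigger"
```

In practice you would likely narrow the pattern further (for example, to a single vault or resource type) so that unrelated recovery points do not start a restore.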
A note on AWS Backup restore testing
AWS Backup provides a native feature called restore testing that enables automated database restoration without the need for Step Functions orchestration. Although this feature offers a streamlined approach to database restoration, it retains restored databases for a maximum of 7 days before deleting them. If you refresh pre-production data more frequently than every 7 days, so that no restored database needs to live longer than that, then restore testing could be a suitable alternative to the solution presented in this post.
The Step Functions solution enables customizable retention periods beyond 7 days. It provides complete control over the lifecycle of restored databases while maintaining the benefits of automated restoration.
For more information about AWS Backup Restore Testing and its capabilities, refer to the restore testing documentation.
Prerequisites
The following prerequisites are necessary to complete this solution.
A production account
This is the AWS account where your production RDS database instances reside. The backup is initiated from this account on a configurable cron schedule and copied to another (pre-production) account. You need admin access (or a role with adequate AWS Identity and Access Management (IAM) permissions) in this account to apply the Terraform configurations provided in this solution.
A pre-production account
This is the account where you would like to send a copy of your production data. Both the pre-production and production accounts must be part of the same AWS Organization. You need admin access (or a role with adequate IAM permissions) in this account to apply the Terraform configurations provided in this solution.
Cross-account backups enabled from the management account of the Organization
Before we copy backups from the production to the pre-production account, we must enable cross-account backup. This can only be done in the management account of your Organization. Go to the AWS Backup console in your Organization’s management account and choose Settings under My account. Then, under Cross-account management, enable Cross-account backup. For more information, follow this documentation.
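If you prefer the AWS CLI over the console for this prerequisite, the following sketch shows one way to do it. It assumes that your CLI credentials belong to the Organization management account. The two function names are hypothetical wrappers around the `aws backup update-global-settings` and `aws backup describe-global-settings` commands.

```shell
# Enable cross-account backup for the Organization.
# Run with credentials for the Organization management account.
enable_cross_account_backup() {
  aws backup update-global-settings \
    --global-settings isCrossAccountBackupEnabled=true
}

# Print the current global settings so you can confirm the change.
show_backup_global_settings() {
  aws backup describe-global-settings
}

# Example calls:
# enable_cross_account_backup
# show_backup_global_settings
```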
An RDS instance in the production account for testing the solution
Finally, to test our solution, create one RDS instance in the production account. You can follow this documentation to create an RDS instance. Make sure that you add the tag “CopyToPreProd=true” to the RDS database instance. The solution uses this tag to identify the RDS instances to copy to the pre-production account. You can use the following AWS Command Line Interface (AWS CLI) command to add the tag to an existing RDS instance.
IAM permission errors are among the most common errors encountered during cross-account backup. Therefore, make sure that you have all of the appropriate permissions in both the production and pre-production accounts.
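A minimal sketch of the tagging command, wrapped in a small helper so that the instance ARN is the only input. The function name and the ARN in the example call are placeholders, not values from the solution repository.

```shell
# Hypothetical helper: tag one RDS instance so that the backup plan selects it.
# $1 is the instance ARN: arn:aws:rds:<region>:<account-id>:db:<instance-id>
tag_for_preprod_copy() {
  aws rds add-tags-to-resource \
    --resource-name "$1" \
    --tags Key=CopyToPreProd,Value=true
}

# Example call (placeholder ARN, replace with your own):
# tag_for_preprod_copy "arn:aws:rds:us-east-1:111122223333:db:my-prod-db"
```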
Walkthrough
The following steps walk you through this solution. We will use the Terraform CLI to create and configure the required resources in the production and pre-production accounts to test this solution. The Terraform configurations are provided in the git repo.
Set up the production account and configure the backup schedule
- Clone the solution repository and go to the prod account directory:
- Set the values for the relevant variables in the terraform.tfvars file:
Testing tip: To expedite solution testing, you can schedule a backup to run shortly after resource creation. Set the backup frequency to a custom schedule in your production account and choose a time approximately 30 minutes after your deployment (to allow for resource creation).
This configuration makes sure that your first backup occurs shortly after the infrastructure deployment completes, which allows you to validate the solution without waiting for the next scheduled backup window. Remember to adjust the backup schedule to your preferred production frequency (daily, weekly, or monthly) after testing is complete.
- Initialize and plan the Terraform configuration:
terraform init
terraform plan
- Check the output of terraform plan, then apply the Terraform configuration:
terraform apply
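For the testing tip above, the following sketch computes a one-off schedule roughly 30 minutes in the future. It assumes GNU `date`, that the schedule is interpreted in UTC, and that the relevant variable in terraform.tfvars accepts a full AWS cron expression of the form `cron(minutes hours day-of-month month day-of-week year)`. Check the variable description in the repository before pasting the value. The function names are hypothetical.

```shell
# Print an AWS cron expression for a given Unix timestamp, in UTC.
# Fields: minutes hours day-of-month month day-of-week year.
# Assumes GNU date (for -d "@<epoch>" and the %-M style no-padding flags).
cron_at() {
  date -u -d "@$1" '+cron(%-M %-H %-d %-m ? %Y)'
}

# Print a one-off expression roughly 30 minutes (1800 seconds) from now.
cron_in_30_minutes() {
  cron_at "$(( $(date +%s) + 1800 ))"
}

# Example call:
# cron_in_30_minutes
```

Because the expression pins the day, month, and year, it matches only once, which is why the schedule needs to be changed back to a recurring one after testing.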
Set up the pre-production account and configure the restore Step Functions workflow
- Navigate to the pre-production directory:
cd sample-copy-rds-production-to-pre-production-automation/pre-prod
- Set the values for the relevant variables in the terraform.tfvars file (pre-production):
- Initialize and plan the Terraform configuration:
terraform init
terraform plan
- Check the output of terraform plan, then apply the Terraform configuration:
terraform apply
The preceding steps set up the necessary AWS Backup infrastructure in both accounts. This enables automated database copying from the production to the pre-production environment. The production account handles backup creation and copying, while the pre-production account manages the restore process through Step Functions workflows.
Solution Walkthrough
In this section we walk through how the automated solution works and validate each step of the process.
Backup creation in production:
When the scheduled backup time arrives, AWS Backup initiates a snapshot of your RDS instance. For RDS instances with storage sizes of a few gigabytes, this process typically takes around 30 minutes. However, backup duration depends on various factors, such as database size and network speed, and there is no SLA for it, so your RDS database instance may take less or more time. When the backup is finished, you should see the recovery point appear in your AWS Backup vault in the production account.
Figure 2: AWS Backup Vault in production account after RDS snapshot is created
Cross-account copy:
After the recovery point is available in the production account, AWS Backup automatically initiates a copy job to copy the RDS snapshot to your pre-production account. You can monitor this process in the Jobs section under the Copy jobs tab. Wait for the copy job to complete successfully.
Figure 3: Completed Copy jobs in the Production account
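As an alternative to the console, you can check copy jobs from the AWS CLI in the production account. The following sketch lists completed copy jobs with their source and destination recovery points. The function name is hypothetical, and the `--query` projection is only one possible selection of fields.

```shell
# Hypothetical helper: list completed AWS Backup copy jobs.
# Run with credentials for the production account.
list_completed_copy_jobs() {
  aws backup list-copy-jobs \
    --by-state COMPLETED \
    --query 'CopyJobs[].{Job:CopyJobId,Source:SourceRecoveryPointArn,Destination:DestinationRecoveryPointArn,State:State}' \
    --output table
}

# Example call:
# list_completed_copy_jobs
```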
Recovery point verification in pre-production:
When the copy job completes, verify that the RDS snapshot appears as a recovery point in your pre-production account’s backup vault. This indicates successful cross-account copy of your RDS snapshot.
Figure 4: Copied RDS snapshot as Recovery point in the AWS Backup Vault of pre-production account
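The same verification can be sketched with the AWS CLI in the pre-production account. The vault name is a placeholder because it depends on the values in your terraform.tfvars file, and the function name is hypothetical.

```shell
# Hypothetical helper: list recovery points in a backup vault.
# $1 is the backup vault name in the pre-production account.
list_recovery_points() {
  aws backup list-recovery-points-by-backup-vault \
    --backup-vault-name "$1" \
    --query 'RecoveryPoints[].{Arn:RecoveryPointArn,Status:Status,Created:CreationDate}' \
    --output table
}

# Example call (placeholder vault name, replace with your own):
# list_recovery_points "<your-pre-prod-vault-name>"
```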
Automated restore process:
The availability of a new recovery point in the pre-production account’s backup vault triggers an EventBridge event, which launches our restore Step Functions workflow. This state machine orchestrates the database restoration process and sends status notifications to the “notify-db-restore-result” SNS topic. You can subscribe to this topic to receive success or failure notifications for the restore operations.
Figure 5: Step Functions state machine execution in the pre-production account that restores the backup copied from the production account
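To receive the restore notifications by email, you can subscribe to the “notify-db-restore-result” topic. The following sketch uses the `aws sns subscribe` command. The function name, the topic ARN, and the email address in the example call are placeholders. Email subscriptions stay pending until the recipient confirms them through the link that Amazon SNS sends.

```shell
# Hypothetical helper: subscribe an email address to the restore result topic.
# $1 is the ARN of the notify-db-restore-result topic, $2 is the email address.
subscribe_to_restore_results() {
  aws sns subscribe \
    --topic-arn "$1" \
    --protocol email \
    --notification-endpoint "$2"
}

# Example call (placeholder values, replace with your own):
# subscribe_to_restore_results \
#   "arn:aws:sns:us-east-1:444455556666:notify-db-restore-result" \
#   "team@example.com"
```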
Database validation:
After successful completion of the Step Functions workflow, you should see your restored RDS instance running in the pre-production account. The restored database keeps the same authentication credentials as the source database in the production account, which allows you to validate the data immediately. If you have enabled IAM authentication in the production account for the supported database engines, then you can securely access the restored database without additional credential configuration.
Figure 6: RDS DB Instance that is restored in pre-production account
Now you can run tests in the pre-production environment against the production data. When you are done with your testing, delete the RDS instance. The solution then automatically copies fresh data from production and creates a new RDS instance according to the schedule that you have specified.
Cleaning up
- Clean up the resources created by Terraform in both accounts.
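Pre-production account (from the pre-production directory):
terraform destroy
Production account (from the prod account directory):
terraform destroy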
- Delete the restored RDS resource in the pre-production account.
WARNING: Before proceeding with this step, carefully review all RDS instances tagged with ‘copied-from-prod=true’ in your pre-production account. Make sure that these instances are safe to delete and no critical workloads depend on them.
- Optionally, delete the test RDS resource in the production account.
If you created an RDS instance in the production account for testing this solution (as mentioned in the Prerequisites section), then remove it:
aws rds delete-db-instance --db-instance-identifier <your-test-rds-instance> --skip-final-snapshot
Important:
- Wait for each deletion operation to complete before proceeding to the next step.
- Monitor the AWS Management Console or use AWS CLI to verify the deletion status.
- Make sure that you make appropriate backups if needed before deletion.
- Double-check that you’re operating in the correct AWS account for each step.
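For the restored instances in the pre-production account, the following sketch first lists the RDS instances tagged ‘copied-from-prod=true’ so that you can review them, and then deletes one instance at a time. It uses the Resource Groups Tagging API to find the instances and the same `aws rds delete-db-instance` command shown above. The function names are hypothetical, and skipping the final snapshot is an assumption that suits disposable test copies only.

```shell
# Hypothetical helper: print the ARNs of RDS instances tagged copied-from-prod=true.
# Run with credentials for the pre-production account.
list_copied_from_prod_instances() {
  aws resourcegroupstaggingapi get-resources \
    --resource-type-filters rds:db \
    --tag-filters Key=copied-from-prod,Values=true \
    --query 'ResourceTagMappingList[].ResourceARN' \
    --output text
}

# Hypothetical helper: delete one reviewed instance without a final snapshot.
# $1 is the instance ARN; the DB instance identifier is its last segment.
delete_restored_instance() {
  aws rds delete-db-instance \
    --db-instance-identifier "${1##*:}" \
    --skip-final-snapshot
}

# Example calls:
# list_copied_from_prod_instances
# delete_restored_instance "arn:aws:rds:us-east-1:444455556666:db:<restored-instance-id>"
```

Keeping the listing and the deletion as separate steps leaves room for the manual review that the warning above asks for.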
Conclusion
In this post, we demonstrated how to use AWS Backup to securely copy production databases to lower environments for testing purposes. This solution enables development teams to automatically copy production data to a pre-production environment on a schedule, so that they can test their applications against production-like data and increase confidence in releases. Additionally, the solution maintains separation between environments with a controlled data flow and orchestrates the data restore with automated processes.
Although we focused on Amazon RDS instances in this walkthrough, you can extend this solution to other AWS services supported by AWS Backup, such as DynamoDB and Aurora clusters.
This solution is recommended only for non-sensitive production data. If your databases contain sensitive information, then make sure that the data is properly classified and sanitized before copying it to lower environments.
This solution can help you improve your testing practices while maintaining security and compliance requirements across your AWS environments.