AWS Database Blog

Replicate and transform data in Amazon Aurora PostgreSQL across multiple Regions using AWS DMS

Global organizations that operate and do business in many countries need to be compliant with data sovereignty and other compliance rules like GDPR. For example, you may want to replicate data to other Regions while at the same time removing certain columns to adhere to privacy laws within a country.

In this post, we demonstrate how you can replicate data across Regions while transforming it using AWS Database Migration Service (AWS DMS) and Amazon Aurora PostgreSQL-Compatible Edition.

Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open-source databases. This solution also works with Amazon Relational Database Service (Amazon RDS) for PostgreSQL.

AWS DMS helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, which minimizes downtime to applications that rely on the database. AWS DMS can migrate your data to and from most widely used commercial and open-source databases. For more information, see Sources for data migration.

Solution overview

For our use case, we have a global product catalog with a requirement to replicate across multiple Regions while at the same time removing various columns due to data privacy compliance. Also, when the catalog is updated, the target copies need to be updated. We can use AWS DMS with ongoing replication to solve this challenge.

The following diagram illustrates our architecture used in this post.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Deploying Aurora PostgreSQL and AWS DMS and replicating across Regions incurs costs in your account.

Replicate and transform

Now that the services are configured, we can create the AWS DMS migration task, which is where the data can be transformed via a transformation rule. For this post, we use the action of removing a column based on a compliance use case. You can use transformations to rename a schema or table, add a prefix or suffix to a schema or table, or include a specific table column. For more information about the transformation actions supported, see Transformation rules and actions.

For our demonstration, we use the following table.

Source
us-east-1
column_name
employee_id
first_name
last_name
email
phone_number
social_security
hire_date
age
job_id
salary
manager_id
department_id
  1. When you create the migration task, select Migrate existing data and replicate ongoing changes for the Migration type for ongoing replication.
  2. Under Table mappings, specify the database schema and tables from the source database.
  3. Under Transformation rules, choose Add transformation rule.
  4. For Target, choose Column.
  5. For Schema name, enter a name.
  6. For Table name, enter the table to transform.
  7. For Column name, enter the column to remove.
  8. For Action, choose Remove Column.
  9. Choose Create task.

Because this is a CDC (change data capture) task, the task continues to run with a status of Load complete, replication ongoing until the task is manually stopped. This allows all changes on the source to be replicated continuously.

You can repeat these steps for other Regions with different transformation rules based on your requirements.

Transformation results

For this demonstration, we removed the social_security column from both target Regions and age only in eu-west-1, as depicted in the following table.

Source Target 1 Target 2
us-east-1 eu-west-2 eu-west-1
column_name column_name column_name
employee_id employee_id employee_id
first_name first_name first_name
last_name last_name last_name
email email email
phone_number phone_number phone_number
social_security
hire_date hire_date hire_date
age age
job_id job_id job_id
salary salary salary
manager_id manager_id manager_id
department_id department_id department_id

In the output from the source Region, us-east-1, the employee table has 12 rows.

In the output from the target Regions, eu-west-2 has the social_security column removed, and eu-west-1 has the social_security and age column removed via the AWS DMS transformation rule.

Clean up

To clean up any provisioned resources used in this post, delete the following items:

  • The CloudFormation stack if you used the Quick Start
  • Any Aurora PostgreSQL databases created if you didn’t use the Quick Start
  • AWS DMS resources in the following order:
    • Any ongoing migration tasks
    • The migration tasks once it has stopped
    • The replication instance

Conclusion

In this post, we showed you how to use AWS DMS to replicate an Aurora PostgreSQL database to multiple Regions while performing a transformation on a specific table. Also, by using an AWS DMS CDC migration task, all changes on the source database are replicated to the target endpoints.

Let us know what you think by sharing your comments.

For further reading, see Best practices with Amazon Aurora PostgreSQL.


About the Authors

Raj Dandu is a Solutions Architect at Amazon Web Services. He helps customers build solutions using AWS services. His favorite technical domains are data management and containers. In his spare time Raj likes running and spending time with his family.

Tom Santuccio is a Solutions Architect at Amazon Web Services and is based out of Charlotte, NC. He has a focus on security and enjoys leveraging technology to solve business problems for large enterprise organizations. When he’s not working, he enjoys running and spending time with his family.

Suresh Patnam is a Solutions Architect at AWS. He helps customers innovate on the AWS platform by building highly available, scalable, and secure architectures on Big Data and AI/ML. In his spare time, Suresh enjoys playing tennis and spending time with his family.