AWS Database Blog
Migrate to an Amazon Aurora PostgreSQL instance from another PostgreSQL source
January 2024: This post was reviewed and updated for accuracy.
Amazon Aurora PostgreSQL-Compatible Edition combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. Aurora provides this by scaling storage across three Availability Zones in the same Region, and supports up to 15 read replica instances for scaling out read workloads and high availability within a single Region. With Amazon Aurora Global Database, you can now replicate PostgreSQL databases in up to five Regions for remote read access and disaster recovery in the event of failure of an entire Region.
Due to these features, many customers are migrating their databases to Aurora PostgreSQL-Compatible. This post talks about migrating an Aurora PostgreSQL-Compatible instance from another PostgreSQL source, which can be an existing Amazon RDS for PostgreSQL instance or a self-managed PostgreSQL instance installed on an Amazon Elastic Compute Cloud (Amazon EC2) instance or on-premises.
Prerequisites
To reduce the time of the migration process, you can possibly reduce the size of the database before you start the migration. You have the following options:
- Run VACUUM FULL/REINDEX or
pg_repack
on the database - Archive or remove any old or temporary data
Running VACUUM FULL or pg_repack on the database
You can check the bloat on the database using a bloat query, which shows the sizes of tables and indexes with actual size and the size that can be reduced by running VACUUM FULL or REINDEX on a particular table or index. These maintenance activities need the ACCESS EXCLUSIVE lock on the objects, which doesn’t allow any SQL on the object, including SELECT.
You can use the pg_repack
extension to remove bloat from tables and indexes. This extension is supported on Amazon RDS for PostgreSQL versions 9.6.3 and later. For more information, see Working with the pg_repack Extension.
Also, you can remove duplicate or unused indexes in your database, to identify unused indexes use the following query:
To identify duplicate indexes use the following query:
Archiving or removing old or temporary data
If you notice that some data has been created temporarily for testing purposes during some development process and not cleaned up yet, you can remove that old or temporary data, reducing the size of the database and therefore snapshot time. You can use the DROP or TRUNCATE table options to get rid of unused or temporary tables.
An efficient archiving/purging strategy can be implemented leveraging the aws_s3 extension or the AWS Database Migration Service (AWS DMS). Describe the different archiving or purging strategies is beyond the scope of this post, but you can find details in the following blog posts:
- Archive and Purge Data for Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL Compatibility using pg_partman and Amazon S3
- Amazon Aurora PostgreSQL backups and long-term data retention methods
- How to archive data from relational databases to Amazon Glacier using AWS DMS
Migrating from RDS PostgreSQL to Aurora PostgreSQL
RDS offers you have two native options to migrate from RDS PostgreSQL to Aurora PostgreSQL-Compatible:
- Use an RDS for PostgreSQL snapshot
- Use an Aurora Read Replica for RDS for PostgreSQL
Your migration option depends on your use case and allowed application downtime. In this section, we discuss the purposes of these options and how to complete the migration.
Using an RDS PostgreSQL snapshot
In this option, you use the snapshot created for your existing RDS for PostgreSQL instance to create an Aurora PostgreSQL-Compatible instance. You may want to use this migration strategy with the following use cases:
- You want your Aurora PostgreSQL-Compatible instance in a different Region than your existing RDS PostgreSQL instance
- You want your Aurora PostgreSQL-Compatible instance in a different AWS account than your existing account
- Your existing RDS for PostgreSQL is version 9.5 or older
For our use case, you have the RDS for PostgreSQL instance in a us-east Region and want to create an Aurora PostgreSQL-Compatible instance in a us-west Region. For this, you need to complete the following high-level steps:
- Create a snapshot of RDS for PostgreSQL.
- Copy the snapshot.
- Share the snapshot to the required AWS Region or account.
- Create Aurora PostgreSQL-Compatible using the snapshot.
Creating a snapshot for RDS PostgreSQL
To create your snapshot, complete the following steps:
- On the Amazon RDS console, choose Databases.
- Select the RDS for PostgreSQL instance that you want to migrate.
- From the Actions drop-down menu, choose Take snapshot.
- For Snapshot name, enter a name for your snapshot (for example,
to-create-aurorapg
). - Choose Take Snapshot.
To see your snapshot, choose Snapshots from the navigation pane. The snapshot is listed on the Manual tab.
For full instructions, see Creating a DB Snapshot.
Copying the snapshot
In this step, you copy an encrypted or unencrypted DB snapshot, in the same Region or across Regions.
- On the Amazon RDS console, choose Snapshots.
- On the Manual tab, select the snapshot you want to copy.
- From the Actions drop-down menu, choose Copy Snapshot.
For this post, we copy the snapshot from us-east
to us-west
. The destination Region must have the same database engine version available as the source Region.
- For Destination Region, choose US West (N. California).
- For New DB Snapshot Identifier, enter
to-create-aurorapg-from-useast
. - For Encryption, select Disable encryption, or choose Enable encryption if you want to encrypt the snapshot irrespective of if your source snapshot is encrypted or not.
- Then for Master Key, specify the AWS Key Management Service (AWS KMS) key identifier to use to encrypt the DB snapshot copy.
- Choose Copy snapshot.
You can view the progress of the snapshot on the Snapshots page. In the following screenshot, the snapshot is in the process of being copied in us-west
.
Sharing a DB snapshot
You can share a DB snapshot (which is taken manually) to 20 AWS accounts, and can also stop sharing the snapshot for one or more accounts.
- On the Amazon RDS console, choose Snapshots.
- Select the snapshot you want to share.
- From the Actions drop-down menu, choose Share Snapshot.
- For DB snapshot visibility, select Public or Private.
- For AWS account ID, enter the account you want to share the snapshot with and choose Add.
- Repeat to include additional AWS accounts, up to 20.
- Choose Save.
To stop sharing a manual DB snapshot with an AWS account, repeat steps 1-3. On the Snapshot permissions page, enter the ID of the account you want to delete and choose Delete.
Creating an Aurora PostgreSQL instance using the snapshot
After you create the snapshot of your RDS for PostgreSQL instance and share with the Region or account that you want, you can create an Aurora PostgreSQL-Compatible instance using that snapshot.
- On the Amazon RDS console, choose Snapshots.
- Select the snapshot you want to migrate to an Aurora PostgreSQL-Compatible DB cluster.
- From the Actions drop-down menu, choose Migrate snapshot.
- For Migrate to DB Engine, choose your database engine (for this post, aurora-postgresql).
- For DB Engine Version, enter the version number of your database engine (for this post, 4).
For more information about setting those values, see Migrating an RDS PostgreSQL DB Snapshot to an Aurora PostgreSQL DB Cluster.
Using an Aurora Read Replica for RDS PostgreSQL
As a second option, you can migrate from an RDS for PostgreSQL DB instance to an Aurora PostgreSQL-Compatible DB cluster by using an Aurora Read Replica. This is the best approach if you want to migrate your RDS for PostgreSQL to Aurora PostgreSQL-Compatible within the same Region and account. However, this migration may take some time depending on the data size, roughly several hours per tebibyte (TiB) of data. Large databases under heavy load take a long time to sync. While the migration is in progress, your RDS for PostgreSQL instance accumulates write ahead log (WAL) segments for the initial sync and catch-up to happen. Make sure that your database instance has enough free storage available to hold these WAL files.
In this case, Amazon RDS uses the PostgreSQL DB engine’s streaming replication functionality to create a special type of DB cluster for the source PostgreSQL DB instance. This type of DB cluster is called an Aurora Read Replica. Updates made to the source PostgreSQL DB instance are asynchronously replicated to the Aurora Read Replica.
This migration option has two steps:
- Create an Aurora Read Replica for RDS for PostgreSQL.
- Promote the Aurora Read Replica to an Aurora PostgreSQL-Compatible cluster.
Creating an Aurora Read Replica
To create an Aurora Read Replica for a PostgreSQL DB instance, complete the following steps:
- On the Amazon RDS console, choose Databases.
- Select the PostgreSQL DB instance you want to use as the source for your read replica.
- From the Actions drop-down menu, choose Create Aurora read replica.
- On the Create Aurora Read Replica page, choose your options accordingly.
- Choose Create read replica.
For full instructions, see Creating an Aurora Read Replica.
Promoting the Aurora PostgreSQL Read Replica to an Aurora PostgreSQL-Compatible Cluster
You’re now ready to promote your read replica to an Aurora PostgreSQL-Compatible cluster.
- On the Amazon RDS console, choose Instances.
- Select the DB instance for the read replica.
- From the Actions drop-down menu, choose Promote.
- Choose Promote Read Replica.
Migrating from a self-managed PostgreSQL instance to an Aurora PostgreSQL-Compatible instance
If you have an existing PostgreSQL deployment on an EC2 instance or on-premises that you want to move to an Aurora PostgreSQL-Compatible instance, then the complexity of your task depends on the size of your database and the types of database objects that you’re transferring. For example, consider a database that contains datasets on the order of gigabytes, along with stored procedures and triggers. Such a database is more complicated than a simple database with only a few megabytes of test data and no triggers or stored procedures. For this use case, you can consider using the following options:
pg_dump
andpg_restore
- Logical replication
- AWS Database Migration Service (AWS DMS) replication
Using pg_dump and pg_restore
pg_dump and pg_restore are PostgreSQL native tools to create and restore a backup, respectively. You can create this backup on a production environment because it doesn’t block readers and writers. However, because this isn’t a continuous backup, you have to stop the application or stop writing to the database from the application before you initiate the backup. For instructions, see Importing a PostgreSQL Database from an Amazon EC2 Instance.
Using Logical Replication
Logical Replication is another way to migrate data from one PostgreSQL instance to another. Logical Replication is a method of replicating data objects and their changes, based upon their replication identity (usually a primary key). It is introduced in PostgreSQL 10, but you can use the pglogical extension if you have an older version than PostgreSQL 10.
For more information about enabling and using logical replication, see Using PostgreSQL Logical Replication with Aurora. For instructions on setting up logical replication, see Migrating PostgreSQL from on-premises or Amazon EC2 to Amazon RDS using logical replication.
Using AWS DMS replication
You can use AWS DMS to migrate data from your database that is on-premises, on an RDS DB instance, or in a database on an EC2 instance to a database on an AWS service like Aurora PostgreSQL-Compatible. Because you can replicate the changes continuously using AWS DMS, you don’t need to stop your application. For instructions, see Using DMS to migrate data from PostgreSQL to PostgreSQL.
For more information about AWS DMS, see the AWS DMS User Guide.
Summary
This post talked about the different available options to import data to an Aurora PostgreSQL-Compatible instance. Although you can choose from multiple options, each one has a specific purpose or use case:
- If you want to migrate from RDS for PostgreSQL to an Aurora PostgreSQL-Compatible instance in a different AWS Region or AWS account, you can consider using a snapshot.
- If you want to migrate from RDS for PostgreSQL to Aurora PostgreSQL-Compatible within the same Region, the best option is to create an Aurora Read Replica for your RDS for PostgreSQL instance.
- If you want to migrate your RDS for PostgreSQL with a different Region or account or PostgreSQL on Amazon EC2 or on-premises instances with minimal downtime, consider using AWS DMS.
If you have comments or questions about this solution, please submit them in the comments section.
About the Authors
Baji Shaik is a Consultant with AWS ProServe, GCC India. His background spans a wide depth and breadth of expertise and experience in SQL/NoSQL database technologies. He is a Database Migration Expert and has developed many successful database solutions addressing challenging business requirements for moving databases from on-premises to Amazon RDS and Aurora PostgreSQL. He is an eminent author, having written several books on PostgreSQL. A few of his recent works include “PostgreSQL Configuration“, “Beginning PostgreSQL on the Cloud”, and “PostgreSQL Development Essentials“. Furthermore, he has delivered several conference and workshop sessions.
Domenico di Salvia is a Senior Database Specialist Solutions Architect at AWS. In his role, Domenico works with customers in EMEA region to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using or migrating to AWS, designing scalable, secure, performant, sustainable, cost-effective and robust database architectures in the AWS Cloud.
Audit History
Last reviewed and updated in January 2024 by Domenico di Salvia | Sr. RDS PostgreSQL Solutions Architect