AWS Database Blog

Cross-account Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL migration with reduced downtime using AWS DMS

Migrating an Amazon Aurora PostgreSQL-Compatible Edition or Amazon Relational Database Service (Amazon RDS) for PostgreSQL database from one AWS account to another is a common use case in scenarios when you acquire, sell, or merge operations, or if you’re reorganizing your AWS accounts and organizational structure. In this post, we discuss the steps and best practices to migrate an Amazon Aurora PostgreSQL-Compatible Edition and Amazon RDS for PostgreSQL from one AWS account to another with reduced downtime using AWS Database Migration Service (AWS DMS). You can also apply the concepts covered in this post for other use cases such as cross-VPC replication, cross-subnet replication, and converting unencrypted to encrypted database instances.

This migration approach involves two phases: the initial data load and ongoing replication (change data capture). The majority of the migration steps mentioned in this post are applicable for both Amazon Aurora PostgreSQL-Compatible Edition and Amazon RDS for PostgreSQL. For simplicity, we focus on the migration steps for Amazon Aurora PostgreSQL-Compatible Edition, but we will call out the differences for Amazon RDS for PostgreSQL in a later section of this post.

Phase 1: Initial data load

There are multiple options to perform an initial data load, such as the following:

  • Snapshot restore – Restoring from a DB cluster snapshot for Amazon Aurora PostgreSQL-Compatible Edition or restoring from a DB snapshot for Amazon RDS for PostgreSQL
  • Database cloning – Aurora cloning for Amazon Aurora PostgreSQL-Compatible Edition
  • Logical restore – Using a native or third-party database migration utility such as pg_dump and pg_restore
  • Managed service – Using a managed service such as AWS DMS

In this post, we use Aurora cloning to perform the initial load for Amazon Aurora PostgreSQL-Compatible Edition and RDS DB snapshot restore for Amazon RDS for PostgreSQL. Aurora cloning and RDS DB snapshot restore are fast and cost-effective options compared to other approaches.

Phase 2: Change data capture

AWS DMS is a fully managed service that supports change data capture (CDC) using PostgreSQL logical replication. AWS DMS uses logical replication slots to retain write-ahead logs (WALs) for replication until the logs are decoded. AWS DMS supports test_decoding and pglogical plugins for logical decoding.

Unlike pglogical, which can decode selectively based on the replication scope of an AWS DMS task, the test_decoding plugin decodes changes for all tables in all schemas. For use cases that replicate only a subset of tables, pglogical therefore consumes less network bandwidth, CPU, and I/O than test_decoding. Because all of the tables must be replicated in our scenario, we use test_decoding in this post.

You can only replicate specific data definition language (DDL) statements, such as CREATE, ALTER, and DROP, with an AWS DMS CDC-enabled task. For unsupported DDL statements and workarounds, see Limitations on using a PostgreSQL database as a DMS source. AWS DMS also allows the replication of large objects (LOBs) and validates the replicated data for tables with a primary key or unique key.

Solution overview

In this post, we use Aurora cloning for the initial data load and an AWS DMS CDC-only task for ongoing replication. The following diagram illustrates this architecture.

[Diagram: Solution overview]

To implement this solution, we complete the following steps:

  1. Configure logical replication.
  2. Perform the initial data load and capture the log sequence number (LSN).
  3. Configure ongoing replication using AWS DMS.
  4. Monitor replication.
  5. Validate the data.
  6. Perform cutover.

For this demonstration, the source AWS account is 6144xxxxxxxx and the target account is 2634xxxxxxxx. Both AWS accounts are part of the same AWS Organization.

The source Aurora PostgreSQL DB cluster is named aurora-pgsql-source-01 and the target is aurora-pgsql-target-01. Both DB clusters run PostgreSQL engine version 13.7 and use an AWS Key Management Service (AWS KMS) customer managed key (CMK) for encryption.

For ongoing replication, we use an AWS DMS replication instance of version 3.4.7 in the target AWS account. In this post, we also use dms_sample as a sample schema and the default PostgreSQL database postgres for migration.

Prerequisites

Make sure you meet the following prerequisites:

  • The encrypted Aurora PostgreSQL DB cluster must not use the default Amazon RDS AWS KMS key (aws/rds) to support cross-account Aurora cloning and RDS DB snapshot sharing.
  • Your source and target AWS accounts must have connectivity using either VPC peering or AWS Transit Gateway. For more information, see Create a VPC peering connection.
  • You should use the primary user account for the PostgreSQL DB instance as the user account for the PostgreSQL source endpoint for AWS DMS. If you don’t use the primary user account for the DB instance, see Prerequisites for using an AWS-managed PostgreSQL database as a DMS source.
  • The VPC security group associated with the source and target Aurora PostgreSQL DB cluster should allow inbound connections from the AWS DMS replication instance. The security group associated with the replication instance should also allow all outbound connections. For more information, see Setting up a network for a replication instance.
  • Amazon Aurora PostgreSQL-Compatible Edition must be version 3.4 with PostgreSQL 11.9 compatibility (or higher) to have AWS DMS CDC support. Make sure to apply any minor version updates and patches to your Aurora PostgreSQL DB cluster before configuring replication. The aurora_volume_logical_start_lsn() function that’s used in this post is built into the highest minor version of each major Amazon Aurora PostgreSQL-Compatible Edition release.
  • Amazon RDS for PostgreSQL must be on the available versions of PostgreSQL 11.x or higher to have AWS DMS CDC support.
  • You should have the latest version of the AWS Command Line Interface (AWS CLI) installed and configured on your host such as Amazon Elastic Compute Cloud (Amazon EC2), AWS Cloud9, or a similar instance.

Limitations

Note the following limitations:

  • You can’t create a clone in a different AWS Region from the source Aurora DB cluster. For more information on additional limitations, see Limitations of cross-account cloning.
  • If your tables use sequences, then the sequences will not be advanced on the target even though data is being copied from the source during ongoing replication with AWS DMS. During cutover, you need to update the NEXTVAL of the sequences in the target database after stopping replication from the source database.
  • AWS DMS supports CDC for PostgreSQL tables with primary keys. If a table doesn’t have a primary key, the write-ahead logs don’t include a prior image of the database row, and AWS DMS ignores DELETE and UPDATE record operations for that table during ongoing replication. You can use the table-level parameter replica identity as a workaround. For more information, see Additional configuration settings when using a PostgreSQL database as a DMS source.

For more information about the limitations of using a PostgreSQL database as a source and target with AWS DMS, see Limitations on using a PostgreSQL database as a DMS source and Using PostgreSQL as a target for AWS Database Migration Service.

Prepare the source database for replication

Before you start the initial data load, you must enable logical replication and create a replication slot for the source database. For more details, see Enabling CDC with an AWS-managed PostgreSQL DB instance with AWS DMS.

Enable logical replication

By default, logical replication is turned off in AWS-managed PostgreSQL DB instances. You need to set rds.logical_replication to 1 in the custom parameter group associated with the source Aurora PostgreSQL DB cluster. This static parameter requires a reboot of the writer instance of the Aurora PostgreSQL DB cluster to take effect. For more information, see Setting up logical replication for your Aurora PostgreSQL DB cluster.
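If you manage parameter groups with the AWS CLI, the following is a hedged sketch of setting the parameter and rebooting the writer; the parameter group name and writer instance identifier are placeholder values for this example:

source-aws-account>aws rds modify-db-cluster-parameter-group \
--db-cluster-parameter-group-name aurora-pgsql-source-custom-pg \
--parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"

source-aws-account>aws rds reboot-db-instance \
--db-instance-identifier aurora-pgsql-source-01-instance-1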

You can verify the status of logical replication using the following command:

source|postgres=> show rds.logical_replication;
rds.logical_replication
-------------------------
on
(1 row)

Create a PostgreSQL replication slot

Create a PostgreSQL replication slot using the test_decoding plugin for each database that requires ongoing replication. This allows you to retain the WAL logs needed for ongoing replication using AWS DMS. Before you create a replication slot, we recommend that you commit or cancel any long-running transactions. Use the following command to check for open transactions:

source|postgres=> SELECT pid,datname,usename,query FROM pg_stat_activity where state <> 'idle' AND pid<>pg_backend_pid();
pid | datname | usename | query
-----+---------+---------+-------
(0 rows)

In this post, you create a replication slot named replication_slot with the test_decoding plugin using the following command:

source|postgres=> SELECT * FROM pg_create_logical_replication_slot('replication_slot', 'test_decoding');
slot_name        |   lsn      
-----------------+------------
replication_slot | 1/87F5A6A0 

We recommend creating the replication slot during a low-traffic window because the source database generates WAL logs for all changes. Until the generated WAL logs are consumed by replication, they take up storage space, and the transaction volume on the source database determines how much storage is needed. We recommend having enough storage on the source database instance and monitoring disk usage with Amazon CloudWatch. You can also create an alarm.
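For example, the following hedged sketch creates a CloudWatch alarm on the Aurora cluster volume usage; the threshold, evaluation settings, and SNS topic ARN are placeholder values you should adapt to your environment:

source-aws-account>aws cloudwatch put-metric-alarm \
--alarm-name aurora-pgsql-source-01-volume-usage \
--namespace AWS/RDS \
--metric-name VolumeBytesUsed \
--dimensions Name=DBClusterIdentifier,Value=aurora-pgsql-source-01 \
--statistic Average \
--period 300 \
--evaluation-periods 3 \
--threshold 500000000000 \
--comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:6144xxxxxxxx:dba-alerts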

Perform the initial data load and capture the LSN

In this section, you perform the initial data load in the target database and capture the LSN of the WAL records to configure ongoing replication. Performing the initial data load for Amazon Aurora PostgreSQL-Compatible Edition consists of several steps.

Allow the target AWS account to clone the DB cluster

For Aurora PostgreSQL DB cluster cloning in the target AWS account, you must grant permission to the target AWS account. Use AWS Resource Access Manager (AWS RAM) to set the sharing permissions to allow the target AWS account to clone the Aurora PostgreSQL DB cluster that is owned by source AWS account. For more information, see Share resources owned by you. In this post, we use the create-resource-share AWS CLI command to grant permission to clone the DB cluster in the target AWS account:

source-aws-account>aws ram create-resource-share --name aurora_share \
--region us-east-1 \
--resource-arns arn:aws:rds:us-east-1:6144xxxxxxxx:cluster:aurora-pgsql-source-01 \
--principals 2634xxxxxxxx

Open the AWS RAM console in the target AWS account, and in the navigation pane, choose Shared with me and Resource shares to accept the invitation. For more details, see Accepting invitations to share clusters owned by other AWS accounts.
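If you prefer the AWS CLI over the console for this step, the following sketch lists the pending invitations and accepts one; the invitation ARN comes from the output of the first command:

target-aws-account>aws ram get-resource-share-invitations --region us-east-1

target-aws-account>aws ram accept-resource-share-invitation \
--resource-share-invitation-arn <invitation-arn-from-previous-output> \
--region us-east-1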

You also need to share the AWS KMS key that is used to encrypt the source DB cluster with the target AWS account by adding the target account to the KMS key policy. For details on updating a key policy, see Key policies in AWS KMS. For an example of creating a key policy, see Allowing access to an AWS KMS key.
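As one scripted alternative to editing the key policy directly, you can create a KMS grant for the target account. This is a sketch; the key ID is a placeholder:

source-aws-account>aws kms create-grant \
--key-id <source-account-cmk-key-id> \
--grantee-principal arn:aws:iam::2634xxxxxxxx:root \
--operations Decrypt Encrypt GenerateDataKey GenerateDataKeyWithoutPlaintext ReEncryptFrom ReEncryptTo CreateGrant DescribeKey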

Create the target Aurora PostgreSQL DB cluster using an Aurora clone

Create an Aurora clone of the source DB cluster in the target AWS account. For more information, see Cloning an Aurora cluster that is owned by another AWS account. The source DB cluster remains available for all the transactions during the cloning process with no impact to performance.
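If you prefer to script the clone instead of using the console steps that follow, a hedged AWS CLI sketch looks like the following; the KMS key ARN, security group, subnet group, and instance class are placeholders, and a cross-account clone of an encrypted cluster must reference a KMS key owned by the target account:

target-aws-account>aws rds restore-db-cluster-to-point-in-time \
--restore-type copy-on-write \
--use-latest-restorable-time \
--source-db-cluster-identifier arn:aws:rds:us-east-1:6144xxxxxxxx:cluster:aurora-pgsql-source-01 \
--db-cluster-identifier aurora-pgsql-target-01 \
--kms-key-id <target-account-kms-key-arn> \
--vpc-security-group-ids <target-security-group-id> \
--db-subnet-group-name <target-db-subnet-group>

target-aws-account>aws rds create-db-instance \
--db-cluster-identifier aurora-pgsql-target-01 \
--db-instance-identifier aurora-pgsql-target-01-instance-1 \
--db-instance-class db.r6g.large \
--engine aurora-postgresql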

To create the target Aurora PostgreSQL DB cluster, complete the following steps:

  1. On the Amazon RDS console, choose Databases in the navigation pane.
  2. Select your DB cluster (shared from the source account) and on the Actions menu, choose Create clone.

[Screenshot: Create clone]

  3. Provide a unique name (for example, aurora-pgsql-target-01). Most of the remaining settings are defaults from the source instance.
  4. After you make the required changes for the target instance, choose Create clone.

While the target cluster is being created, the Status column of the writer node displays Creating. When the cluster is ready, the status changes to Available.

[Screenshot: Aurora cluster status]

Capture the LSN

When your Aurora PostgreSQL DB cluster is in the Available state, connect to your new target Aurora PostgreSQL DB cluster to obtain the LSN at which the target cluster is consistent:

target|postgres=> SELECT aurora_volume_logical_start_lsn();
aurora_volume_logical_start_lsn
---------------------------------
1/87F75990

Save the output to use for ongoing replication at a later step.

Drop the replication slot from the target Aurora PostgreSQL cluster

The target Aurora PostgreSQL cluster doesn’t need the replication slots that were cloned from the source. Drop them, and turn off logical replication by setting rds.logical_replication to 0 in the target DB cluster’s parameter group. In this post, there is one replication slot, called replication_slot. You can use the following commands to list and drop it:

target|postgres=> SELECT slot_name,plugin FROM pg_replication_slots;
slot_name        | plugin
-----------------+--------------
replication_slot | test_decoding

target|postgres=> SELECT pg_drop_replication_slot('replication_slot');
pg_drop_replication_slot
-------------------------------------

Configure ongoing replication

In this step, you prepare the source and target PostgreSQL databases for ongoing replication and configure ongoing replication using AWS DMS.

Create a schema for AWS DMS control tables in the source and target database

Control tables provide information about an AWS DMS task. In this post, you create a schema called dms_control_schema in both the source and target databases. While creating the AWS DMS migration task, you specify this schema in the AWS DMS task setting ControlSchema. You don’t need to replicate the control tables in the target database. Use the following code to create a schema in both the source and the target databases:

postgres=> CREATE SCHEMA dms_control_schema;
CREATE SCHEMA

Disable all the triggers in the target PostgreSQL database schema

Disable the triggers of the schema that you intend to replicate in the target database. Alternatively, you can set the database parameter session_replication_role to replica at the global level in a custom parameter group, or at the session level using an AWS DMS endpoint setting on the target endpoint.
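If you choose the session-level approach, one hedged way to apply it is through the AfterConnectScript endpoint setting when creating the AWS DMS target endpoint (endpoint creation is covered in a later section); the server name, user, and password here are placeholders:

target-aws-account>aws dms create-endpoint \
--endpoint-identifier target-ep \
--endpoint-type target \
--engine-name aurora-postgresql \
--server-name <target-cluster-writer-endpoint> \
--port 5432 \
--username <primary-user> \
--password <password> \
--database-name postgres \
--postgre-sql-settings '{"AfterConnectScript": "SET session_replication_role=replica"}'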

In this post, you generate DDL statements to disable and enable the triggers of the dms_sample schema using the following sample script. The script is an example; test it and adapt it to your requirements before running it. It only generates DDL statements for triggers that are currently enabled:

-- Generates ALTER TABLE ... DISABLE/ENABLE TRIGGER statements for a schema.
-- _triggerState = true generates DISABLE statements; false generates ENABLE statements.
CREATE OR REPLACE FUNCTION disable_trigger_sql(_schema TEXT, _triggerState BOOLEAN) RETURNS SETOF TEXT AS $$
DECLARE
  triggerRecord RECORD;
  triggerSQL TEXT;
  triggerStatus CHARACTER VARYING;
BEGIN
  IF (_triggerState IS TRUE) THEN
    triggerStatus = 'DISABLE';
  ELSE
    triggerStatus = 'ENABLE';
  END IF;
  FOR triggerRecord IN
    SELECT tgname AS trigger_name,
           nspname AS schema_name,
           relname AS table_name
    FROM pg_trigger
    JOIN pg_class ON pg_trigger.tgrelid = pg_class.oid
    JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
    WHERE pg_namespace.nspname = _schema
    AND   pg_trigger.tgisinternal IS FALSE  -- skip internal (system) triggers
    AND   pg_trigger.tgenabled != 'D'       -- only triggers that are currently enabled
    AND   pg_class.relkind = 'r'            -- ordinary tables only
  LOOP
    triggerSQL := 'ALTER TABLE "' || triggerRecord.schema_name ||'"."'|| triggerRecord.table_name || '" ' || triggerStatus || ' TRIGGER "' || triggerRecord.trigger_name||'";' ;
    RETURN NEXT triggerSQL;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

To generate the DDL statements that disable the triggers of the sample schema dms_sample, call the function with _triggerState set to true:

target|postgres=> select disable_trigger_sql('dms_sample',true) as SQL;
SQL
---------------------------------------------------------------------
alter table dms_sample.emp DISABLE trigger emp_update_audit_trigger;

Before disabling the triggers, call the function again with _triggerState set to false to generate the DDL statements that re-enable the triggers of the sample schema dms_sample. Save the output, because you use it to enable the triggers during the cutover stage.

target|postgres=> select disable_trigger_sql('dms_sample',false) as SQL;
SQL
------------------------------------------------------------------------
alter table dms_sample.emp ENABLE trigger emp_update_audit_trigger;

Now, disable the triggers using the generated DDL statements.

Create an AWS DMS replication instance

Create an AWS DMS replication instance called dms-repl-instance-pgsql-01 in the target AWS account. For more information on replication instance sizing, see Choosing the right AWS DMS replication instance for your migration.
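A hedged CLI sketch for creating the replication instance follows; the instance class, storage, security group, and subnet group are placeholder choices you should size for your workload:

target-aws-account>aws dms create-replication-instance \
--replication-instance-identifier dms-repl-instance-pgsql-01 \
--replication-instance-class dms.r5.large \
--engine-version 3.4.7 \
--allocated-storage 100 \
--vpc-security-group-ids <dms-security-group-id> \
--replication-subnet-group-identifier <dms-subnet-group> \
--no-publicly-accessible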

Create AWS DMS source and target endpoints

Create an AWS DMS source endpoint called source-ep in the target AWS account for the source database. You can use extra connection attributes to configure the source PostgreSQL database while creating the source endpoint. For more information, see Endpoint settings when using PostgreSQL as a DMS source. In this post, we use the following endpoint settings for the AWS DMS source endpoint source-ep:

{
  "DdlArtifactsSchema": "dms_control_schema",
  "HeartbeatSchema": "dms_control_schema",
  "HeartbeatEnable": true,
  "HeartbeatFrequency": 1,
  "SlotName": "replication_slot",
  "PluginName": "test_decoding"
}

Create an AWS DMS target endpoint called target-ep for the target PostgreSQL database.
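A hedged sketch of creating both endpoints with the AWS CLI follows; the server names and credentials are placeholders, and source-ep-settings.json contains the endpoint settings JSON shown above. If you already created target-ep with the AfterConnectScript setting shown earlier, skip the second command:

target-aws-account>aws dms create-endpoint \
--endpoint-identifier source-ep \
--endpoint-type source \
--engine-name aurora-postgresql \
--server-name <source-cluster-writer-endpoint> \
--port 5432 \
--username <primary-user> \
--password <password> \
--database-name postgres \
--postgre-sql-settings file://source-ep-settings.json

target-aws-account>aws dms create-endpoint \
--endpoint-identifier target-ep \
--endpoint-type target \
--engine-name aurora-postgresql \
--server-name <target-cluster-writer-endpoint> \
--port 5432 \
--username <primary-user> \
--password <password> \
--database-name postgres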

Create an AWS DMS migration task

Create an AWS DMS database migration task in the target AWS account to specify the schema for ongoing replication. The task details depend on your migration project. For more information, see Working with AWS DMS tasks. Note the following key steps while creating the AWS DMS database migration task in this post:

  1. For Task identifier, enter migration-task-cdc-01.
  2. For Replication instance, choose dms-repl-instance-pgsql-01.
  3. For Source database endpoint, choose source-ep.
  4. For Target database endpoint, choose target-ep.
  5. For Migration type, choose Replicate data changes only (CDC).
  6. For CDC start mode for source transactions, select Enable custom CDC start mode.
  7. For Custom CDC start point for source transactions, choose Specify a log sequence number and specify the LSN for System change number. We provide detailed guidance in the next section on how to acquire the LSN.
  8. For Custom CDC stop mode for source transactions, choose Disable custom CDC stop mode.
  9. For Target table preparation mode, leave as default.
  10. For Include LOB columns in replication, choose Limited LOB mode.
  11. For Turn on validation, choose Enable.
  12. In the Advanced task settings section, under Control table settings, specify dms_control_schema for Create control table in target using schema.

The AWS DMS control tables, such as awsdms_apply_exceptions and awsdms_validation_failures_v1, will reside in the dms_control_schema schema of the target database. If you don’t enter any information for this option, AWS DMS uses the public schema of the database for control tables.

[Screenshot: DMS advanced task settings]

  13. For Table mappings, specify the source database schema under Selection rules. In this post, we migrate all tables from the dms_sample schema.

[Screenshot: DMS table mappings]
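The same task can also be created with the AWS CLI; the following is a sketch under the assumption that the endpoint and replication instance ARNs are placeholders, and the LSN is the one acquired in the next section:

target-aws-account>aws dms create-replication-task \
--replication-task-identifier migration-task-cdc-01 \
--source-endpoint-arn <source-ep-arn> \
--target-endpoint-arn <target-ep-arn> \
--replication-instance-arn <replication-instance-arn> \
--migration-type cdc \
--cdc-start-position "1/87F75B28" \
--table-mappings '{"rules":[{"rule-type":"selection","rule-id":"1","rule-name":"1","object-locator":{"schema-name":"dms_sample","table-name":"%"},"rule-action":"include"}]}' \
--replication-task-settings '{"ControlTablesSettings":{"ControlSchema":"dms_control_schema"},"ValidationSettings":{"EnableValidation":true}}'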

Custom CDC start point for source transactions

Connect to the source database to check whether the LSN you captured earlier is still available in the replication slot. If the specified LSN is present in the replication slot, use it as the system change number for the CDC start point.

If you don’t find the captured LSN in the replication slot, use pg_logical_slot_peek_changes() to inspect the slot and use the first LSN that is greater than the one you captured. We don’t recommend using pg_logical_slot_get_changes() to view changes in a replication slot, because it consumes the changes and causes the AWS DMS task to fail.

In this post, we captured the LSN 1/87F75990. Use the following query in the source database for verification:

source|postgres=> SELECT * FROM pg_logical_slot_peek_changes('replication_slot', null, null) where lsn='1/87F75990'; -- Specify the LSN captured in step 2
lsn | xid | data
-----+-----+------
(0 rows)

We don’t have the captured LSN in the replication slot. Use the following query in the source database to find the LSN greater than the captured LSN:

source|postgres=> select * from pg_logical_slot_peek_changes('replication_slot', null, null) where lsn>='1/87F75990'; -- Specify the LSN captured in step 2
lsn         | xid    |  data                                                          
------------+--------+----------------------------------------------------------------
1/87F75B28 | 152158 | BEGIN 152158
1/87F75B28 | 152158 | table dms_sample.xxxx
1/87F75B98 | 152158 | COMMIT 152158
<Data Truncated for Post>

In this post, the captured LSN 1/87F75990 is not in the source database replication slot, and 1/87F75B28 is the first LSN available in the slot that is greater than 1/87F75990. Therefore, we use 1/87F75B28 as the system change number for the CDC start point.

Monitor replication

You can monitor your AWS DMS task using CloudWatch metrics. AWS DMS provides statistics for replication tasks, including incoming and committed changes, and latency between the replication host and both the source and target databases. For a complete list of the available metrics, see Replication task metrics. You can determine the total latency, or replica lag, for a task by combining the CDCLatencySource and CDCLatencyTarget metric values.
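For example, the following hedged sketch retrieves CDCLatencySource over a sample one-hour window; the ReplicationTaskIdentifier dimension expects the task’s external resource ID, shown here as a placeholder:

target-aws-account>aws cloudwatch get-metric-statistics \
--namespace AWS/DMS \
--metric-name CDCLatencySource \
--dimensions Name=ReplicationInstanceIdentifier,Value=dms-repl-instance-pgsql-01 Name=ReplicationTaskIdentifier,Value=<task-external-resource-id> \
--statistics Average \
--start-time 2023-04-08T15:00:00Z \
--end-time 2023-04-08T16:00:00Z \
--period 60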

You can also use the following command on the source PostgreSQL database to monitor replication lag:

source|postgres=> SELECT now() AS CURRENT_TIME,
slot_name,
active,
active_pid,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),
confirmed_flush_lsn)) AS diff_size,
pg_wal_lsn_diff(pg_current_wal_lsn(),
confirmed_flush_lsn) AS diff_bytes
FROM pg_replication_slots WHERE slot_type = 'logical';
current_time | slot_name | active | active_pid | diff_size | diff_bytes
-------------------+------------------+--------+------------+-----------+------------
2023-04-08 16:09:18.255294+00 | replication_slot | t | 5685 | 2351 kB | 2407352

Validate the data

You can use AWS DMS data validation to make sure that all the pre-existing data in the tables was migrated accurately. Validation also compares the incremental changes for a CDC-enabled task as they occur. During data validation, AWS DMS compares each row in the source with its corresponding row at the target, verifies that the rows contain the same data, and reports any mismatches. See AWS DMS data validation for more details.

You can increase ThreadCount so that AWS DMS completes validation faster. With a higher ThreadCount, AWS DMS runs more simultaneous queries and consumes more resources on the source and the target. We recommend monitoring memory, CPU, and database performance metrics to make sure the source, target, and replication instance have sufficient resources when increasing ThreadCount.
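A hedged sketch of raising ThreadCount with the AWS CLI follows; note that the task must be stopped before you modify its settings, and the value 8 is only an illustrative choice:

target-aws-account>aws dms modify-replication-task \
--replication-task-arn <task-arn> \
--replication-task-settings '{"ValidationSettings":{"ThreadCount":8}}'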

To troubleshoot the tables shown as Mismatched records, Suspended records, Table error, or Error, check the awsdms_validation_failures_v1 table in the target database schema dms_control_schema. For more information, see Troubleshooting. You can also look for any error or warning messages in the migration task log.

Perform cutover

You can plan the cutover of the application to the target Aurora PostgreSQL database when the replication lag is minimal. We recommend planning your cutover during a low-traffic window. The following are some of the key cutover steps:

  1. Stop your application.
  2. Stop accepting all connections except the AWS DMS connection on the source Aurora PostgreSQL database by updating the VPC security group associated with the source and removing the inbound rules that allow application connectivity.
  3. Make sure that there are no change events waiting to be applied to the target. Check pg_stat_activity and pg_prepared_xacts on the source PostgreSQL database to confirm that there are no open transactions, and use the CDCIncomingChanges metric to monitor your AWS DMS migration task.
  4. Make sure that CDC latency is 0. You can use CDCLatencySource and CDCLatencyTarget metrics to monitor the replication task.
  5. Make sure that there are no data validation errors in AWS DMS.
  6. Stop the AWS DMS database migration task (a CLI sketch covering steps 3, 4, and 6 follows this list). For more information, see Stop Replication Task.
  7. Update the sequences in the target PostgreSQL database based on the last_value of those sequences in the source PostgreSQL database. In this post, we use the following sample script in the source PostgreSQL database to generate the SQL statements that update the sequences in the target PostgreSQL database. This script is an example; test it and adapt it to your requirements before running it.
-- Generates SETVAL statements that align target sequences with the
-- last_value of the corresponding sequences in the source database.
CREATE OR REPLACE FUNCTION generate_recede_sequence_sql(_schema TEXT) RETURNS SETOF TEXT AS $$
DECLARE
 sequenceRecord RECORD;
 recedeSQL TEXT;
 start_with BIGINT; -- BIGINT to accommodate the full range of PostgreSQL sequences
BEGIN
 FOR sequenceRecord IN
   SELECT sequencename,schemaname
   FROM pg_sequences
   WHERE schemaname=_schema
 LOOP
   EXECUTE 'select last_value from "'|| sequenceRecord.schemaname||'"."'||sequenceRecord.sequencename||'"' into start_with;
   recedeSQL := 'SELECT SETVAL(''"' || sequenceRecord.schemaname ||'"."' || sequenceRecord.sequencename ||'"'','|| start_with||');';
   RETURN NEXT recedeSQL;
 END LOOP;
END;
$$ LANGUAGE plpgsql;

Call the function to generate the SQL statements that update the sequences in the dms_sample schema:

source|postgres=> select generate_recede_sequence_sql('dms_sample') as SQL;
SQL
------------------------------------------------------------------
select setval('dms_sample.audit_record_seq',103);

Review the generated SQL statements and run them in the target database to update the sequences.

  8. Enable the triggers in the target PostgreSQL database schema that you disabled before starting the AWS DMS CDC-only task. In this post, we use the DDL statements we saved earlier to enable the triggers in the dms_sample schema. We recommend reviewing the SQL statements before running them in the target database.
  9. Update the application configuration or DNS CNAME records with the target Aurora PostgreSQL endpoints.
  10. After the application has successfully connected to the target DB instances, you can temporarily stop or delete the Amazon RDS resources in the source AWS account. To stop the Aurora cluster, refer to Stopping an Aurora DB cluster; to delete it, refer to Deleting Aurora DB clusters and DB instances. Aurora allows you to delete a source cluster that has one or more clones associated with it without affecting the clones.
  11. Start your application.
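The following is a hedged CLI sketch for steps 3, 4, and 6: checking incoming changes, stopping the task, and confirming the task state. The task ARN, external resource ID, and time window are placeholders:

target-aws-account>aws cloudwatch get-metric-statistics \
--namespace AWS/DMS \
--metric-name CDCIncomingChanges \
--dimensions Name=ReplicationInstanceIdentifier,Value=dms-repl-instance-pgsql-01 Name=ReplicationTaskIdentifier,Value=<task-external-resource-id> \
--statistics Average \
--start-time 2023-04-08T15:55:00Z \
--end-time 2023-04-08T16:00:00Z \
--period 60

target-aws-account>aws dms stop-replication-task \
--replication-task-arn <task-arn>

target-aws-account>aws dms describe-replication-tasks \
--filters Name=replication-task-id,Values=migration-task-cdc-01 \
--query "ReplicationTasks[0].Status"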

Amazon RDS for PostgreSQL

We can use the migration steps mentioned in this post for Amazon RDS for PostgreSQL as well. In this example, we use an RDS for PostgreSQL DB snapshot for the initial data load and an AWS DMS CDC-only task for ongoing replication. The following diagram illustrates the architecture.

[Diagram: RDS solution overview]

In this section, we only highlight the steps that are different for Amazon RDS for PostgreSQL.

Enable logical replication

For Amazon RDS for PostgreSQL, modify rds.logical_replication to 1 in the custom DB parameter group associated with your DB instance and reboot the DB instance.

Perform the initial data load and capture the LSN

We perform the initial data load into the target RDS for PostgreSQL DB instance by restoring an RDS DB snapshot. Performing the initial data load for Amazon RDS for PostgreSQL consists of the following steps:

  1. Create a DB snapshot of the RDS for PostgreSQL DB instance in the source AWS account.
  2. Share the DB snapshot with the target AWS account.
  3. Copy the DB snapshot, because the shared DB snapshot is encrypted with an AWS KMS CMK in this post. This step is optional if the DB snapshot is not encrypted.
  4. Restore the DB instance using the copy of the shared DB snapshot to create the target RDS for PostgreSQL DB instance.

When you restore the snapshot, provide the DB instance identifier rds-pgsql-target-01 and DB instance settings based on your workload. For more information, see Tutorial: Restore an Amazon RDS DB instance from a DB snapshot.
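A hedged CLI sketch of steps 1–4 follows; the source DB instance identifier rds-pgsql-source-01 and the snapshot names are hypothetical, and the copy must use a KMS key owned by the target account:

source-aws-account>aws rds create-db-snapshot \
--db-instance-identifier rds-pgsql-source-01 \
--db-snapshot-identifier rds-pgsql-source-01-snap

source-aws-account>aws rds modify-db-snapshot-attribute \
--db-snapshot-identifier rds-pgsql-source-01-snap \
--attribute-name restore \
--values-to-add 2634xxxxxxxx

target-aws-account>aws rds copy-db-snapshot \
--source-db-snapshot-identifier arn:aws:rds:us-east-1:6144xxxxxxxx:snapshot:rds-pgsql-source-01-snap \
--target-db-snapshot-identifier rds-pgsql-target-01-snap \
--kms-key-id <target-account-kms-key-arn>

target-aws-account>aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier rds-pgsql-target-01 \
--db-snapshot-identifier rds-pgsql-target-01-snap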

[Screenshot: RDS DB instance status]

  5. When your RDS for PostgreSQL DB instance is in the Available state, obtain the LSN from the instance log of the target RDS for PostgreSQL instance.

In this post, we use the following dblogs.sh script to list the database log files for the target instance and find the LSN value following the redo done log entry:

echo "Enter DB Instance Name:"
read db_instance_name
for logfileName in $(aws rds describe-db-log-files --db-instance-identifier $db_instance_name --query DescribeDBLogFiles[*].LogFileName --output text);
do
	aws rds download-db-log-file-portion --db-instance-identifier $db_instance_name  --log-file-name ${logfileName} --output text | egrep -i "redo (starts|done)"
done

target>> ./dblogs.sh
Enter DB Instance Name:
rds-pgsql-target-01

2023-04-12 17:24:34 UTC::@:[383]:LOG:  redo starts at 15/600001F8
2023-04-12 17:24:34 UTC::@:[383]:LOG:  redo done at 15/600002A8 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s

Save this output to determine the custom CDC start point for source transactions, as described in the earlier section on AWS DMS ongoing replication. In this post, the LSN captured from the instance log of the target is 15/600002A8:

source|postgres=> select * from pg_logical_slot_peek_changes('replication_slot', null, null) where lsn >='15/600002A8';
lsn | xid | data
-------------+------+----------------------------------------------------------
15/600002E0 | 7147 | BEGIN 7147
15/600002E0 | 7147 | table dms_sample.xxxxxxxx
15/60000788 | 7147 | COMMIT 7147
15/600007C0 | 7148 | BEGIN 7148
<Data Truncated for Post>

In this post, we don’t find the captured LSN in the source database replication slot, so we use 15/600002E0 as the system change number for the CDC start point in the AWS DMS migration task for ongoing replication. Follow the steps described earlier to start the AWS DMS task and perform the cutover.

Clean up

As part of this migration, you have deployed AWS DMS-related resources in your AWS account to replicate data. These resources will incur costs as long as they are in use. Be sure to remove any resources you no longer need.

Conclusion

In this post, we discussed the various steps involved in migrating your Aurora PostgreSQL or RDS for PostgreSQL database from one AWS account to another. We also used AWS DMS for ongoing replication to reduce the downtime. With the complexity involved in database migrations, we highly recommend testing the migration steps in non-production environments prior to making changes in production.

We welcome your feedback. If you have any questions or suggestions, leave them in the comments section.


About the Authors


Alok Srivastava is a Senior Database Consultant at AWS. He works with AWS customers and partners in their journey to the AWS Cloud with a focus on database migration and modernization programs.

Wanchen Zhao is a Senior Database Specialist Solutions Architect at AWS. Wanchen specializes in Amazon RDS and Amazon Aurora, and is a subject matter expert for AWS DMS. Wanchen works with ISV partners to design and implement database migration and modernization strategies and provides assistance to customers for building scalable, secure, performant, and robust database architectures in the AWS Cloud.