Encrypt Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL database with minimal downtime

Recently one of our customers, asked us to help them encrypt their unencrypted Amazon Relational Database Service (Amazon RDS) for PostgreSQL. In this post, we show a solution to create an encrypted database from their existing unencrypted database and cut over with the least disruption to applications. This solution uses database Snapshot and PostgreSQL logical replication.

Amazon RDS offers two types of snapshots automatic and manual. This solution uses a manual snapshot. PostgreSQL logical replication uses a publish and subscribe model with one or more subscribers, subscribing to one or more publications on a publisher node. Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition support logical replication with decoding plugins such as pglogical and pgoutput. Logical replication is commonly used for data migration from on premises to Amazon RDS or Amazon Aurora PostgreSQL-Compatible Edition.

As an Amazon RDS and Aurora security best practice, you must encrypt your databases and snapshots at rest and in transit. The encryption of database at rest provides an additional layer of protection from unauthorized access to data.

Converting an unencrypted RDS for PostgreSQL or Amazon Aurora PostgreSQL database to encrypted

For an encrypted database instance data on storage, transaction logs, backups, and snapshots are encrypted. There are two ways you can accomplish. One is by using the Amazon RDS console, which is a straightforward approach but requires some down time. If your applications are sensitive to down time duration, then you can follow the second approach and use logical replication to minimize downtime.

You can use the same solution for switching the encryption key from an AWS managed key to a customer managed key for an existing encrypted Amazon RDS for PostgreSQL or Amazon Aurora database.

Overview of solution

This solution carries out the conversion in four high-level steps:

Enable native PostgreSQL replication on the unencrypted source database and take a manual snapshot.
Restore the snapshot to a new database instance with encryption enabled.
Perform logical replication on the restored database for change data capture.
Cut over your application to the encrypted database and retire (or delete) the unencrypted database.

Prerequisites

You can try out the hands-on steps using Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition. For this post, we demonstrate the steps with Amazon RDS for PostgreSQL.

You must complete the following steps before starting:

Have an AWS account with permissions to create resources.
Have an unencrypted RDS for PostgreSQL DB cluster and PostgreSQL client utility to try out the steps discussed in this post. You can use an existing unencrypted database or create a new database. Make sure that Enable Encryption is unchecked (see the following screenshot).

Set up an Amazon Elastic Compute Cloud (Amazon EC2) instance with connectivity to the DB instance. For more information, refer to Creating and connecting to a PostgreSQL DB instance.

Set up publication on the source database

In this step, you set up a publication, which is set of changes generated from a table or group of tables. The source database is referred to as the publisher node.

To enable logical replication in Amazon RDS for PostgreSQL, modify a custom parameter group to set rds.logical_replication to 1 and attach the custom group to the database.

This step requires a reboot of the database instance. If your database instance is already setup with a custom parameter group, modify the aforementioned parameter and reboot the instance.

Verify the setup by running the following SQL statement:

# Verify the publication
SELECT name, setting FROM pg_settings WHERE name IN ('wal_level','rds.logical_replication');
# You should see:
rds.logical_replication | on
wal_level | logical

Now you can create the publication. You can select the tables for which you would like to capture the changes on the replica. For this use case, you capture changes for all tables. Because a publication is associated with a single database, you need to create one publication per database. In this post, we are assuming that there is a single database.

Start a client session to the database using any PostgreSQL client application (such as psql) and run the following command:

# This need to be run using SQL client connected to the source DB

CREATE PUBLICATION encryptdb_post FOR ALL TABLES;

# Verify the publication

SELECT * FROM pg_publication;

Next, you create the replication slot, which is a where the publisher nodes retain the WAL logs that are needed by the replicas. The subscriber nodes read the streamed changes from the replication slot to update the data. You run the system function pg_replication_logical_replication_slot() to create the replication slot. For logical decoding, you use the pgoutput decoding plugin that comes prepackaged with PostgreSQL.

Create the replication slot with the following code:

# Run the System function for creating the slot

SELECT * FROM pg_create_logical_replication_slot('encryptdb_post','pgoutput' );

# Verify the publication

SELECT * FROM pg_replication_slots;

Backup and restore to create an encrypted database

In this step, take a backup and restore it to a new database instance. The new instance will have encryption enabled. Note that there is a minor difference between Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition from a restoration perspective. To create an encrypted Aurora cluster, you can restore from an unencrypted snapshot with encryption enabled. In the case of Amazon RDS, you need to carry out an additional step before restoration. This step involves creation of an encrypted snapshot from the unencrypted snapshot. The difference is depicted in the following figure.

Note that the steps listed here are for Amazon RDS. The following diagram depicts the steps carried out in this phase.

Complete the following steps:

Create an RDS snapshot using the Amazon RDS console or the following code via the AWS Command Line Interface (AWS CLI):

# Wait for snapshot to be created

aws rds create-db-snapshot \
--db-instance-identifier <<DB Identifier>> \
--db-snapshot-identifier unencrypted-snapshot;

In this next step, you need to provide the Amazon Key Management Service (AWS KMS) key for database encryption. The KMS key you use can be AWS managed key or a customer managed key. In this post, you use the AWS managed default RDS encryption key (alias – aws/rds).

Create an encrypted snapshot using the following code:

# Get the ARN for default RDS KMS key

aws kms list-aliases --output text \
--query 'Aliases[?AliasName == `alias/aws/rds`].AliasArn'

# Create the encrypted copy of the snapshot

aws rds copy-db-snapshot \
--source-db-snapshot-identifier unencrypted-snapshot \
--target-db-snapshot-identifier encrypted-snapshot \
--kms-key-id <<Paste KMS Key ARN from last command>>

Now you can restore the database from the encrypted snapshot. Make sure to enable the PostgreSQL logs; you need the log sequence number (LSN) in a later step. You can carry out this step using the Amazon RDS console or the AWS CLI, as documented in this post.

Restore the database with the following code. Specify the VPC, subnet group and security group for your encrypted database. To ensure that the rds.logical_replication parameter is set up in the new database, use the same custom parameter group as your source database.

# This will create the encrypted database
# You can use the parameter group used for encrypted database as it already has the replication parameters setup

aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier encrypted-db \
--db-snapshot-identifier encrypted-snapshot \
--db-parameter-group-name <<Paste name of parameter group>> \
--vpc-security-group-ids <<security group>> \
--db-subnet-group-name <<subnet group name>>

Set up replication between unencrypted and encrypted databases

At this stage your database is updated to the latest LSN available in the snapshot. Now you set up the subscription on the restored database instance to receive the database changes. This will replicate the changes that occurred after the snapshot creation and will continue to stream all changes. The following diagram depicts the steps taken in this phase.

Start by verifying the replication setup on the encrypted database:

# Verify the publication

SELECT name, setting FROM pg_settings WHERE name IN ('wal_level','rds.logical_replication');

# You should see:
rds.logical_replication | on
wal_level | logical

Create a subscription that will receive changes from the encrypted database:

CREATE SUBSCRIPTION encryptdb_post
CONNECTION 'host=<<Unencrypted DB endpoint>> user=<<User name>> password=<<DB password>> dbname=<<DB name>>
PUBLICATION encryptdb_post
WITH (
       copy_data = false,
       create_slot = false,
       enabled = false,
       synchronous_commit = false,
       connect = true,
       slot_name = 'encryptdb_post'
);

By default, the subscription will be disabled; you enable it in a later step.

Get the LSN from the PostgreSQL logs

You can get the LSN by opening the PostgreSQL logs in the Amazon RDS console, Amazon CloudWatch console, or AWS CLI. After the instance is restored, navigate to the AWS console and go to the Logs and Events tab. Scroll down to the most recent log file. Open the log file and look for “:LOG: invalid record length”. The following example snippet shows how to find the LSN.

# OR search the pattern in CloudWatch logs

# Double check the name of the database in the log-group-name below

aws logs filter-log-events \
--log-group-name /aws/rds/instance/encrypted-db/postgresql \
--filter-pattern 'invalid record length'

You now use the PostgreSQL system administration function pg_replication_origin_advances to advance the current confirmed position of the replication slot.

Connect to the encrypted database using a PostgreSQL client such as psql and run the following commands:

# First get the replication origin.
# You will use the attribute roname in the next statement

SELECT * FROM pg_replication_origin;

# Advance the slot

SELECT pg_replication_origin_advance(<<Replace with roname>>', '<<Replace with LSN>>’);

# E.g.,

SELECT pg_replication_origin_advance ('pg_2457', '0/20000110’);

Enable the replication:

ALTER SUBSCRIPTION encryptdb_post ENABLE;

To Verify that replication is working, use the SQL client to ensure that changes are showing up in the encrypted database:

SELECT slot_name, confirmed_flush_lsn AS flushed, pg_current_wal_lsn(), (pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance FROM pg_catalog.pg_replication_slots WHERE slot_type = 'logical';

You should see lsn_distance = 0.


slot_name      | flushed    | pg_current_wal_lsn | lsn_distance

---------------+------------+--------------------+--------------

encryptdb_post | 1/18000358 | 1/18000358         | 0

Keep in mind that the duration of full synchronization will depend on the amount of data changes made between the creation of the snapshot and enabling of replication.

When your new encrypted database is in sync with old unencrypted database, you’re ready for cutover.

Switch over applications to the encrypted database

At this time, you can repoint your applications to encrypted database using the guidance below.

Disconnect your applications from the database and stop the unencrypted database to prevent accidental writes.
Configure your applications to connect to the encrypted database instance
Restart and verify the applications functionality
Keep the unencrypted instance for a period of time before removing it.

Clean up

To avoid incurring future charges, delete the resources you created.

Remove the publication

DROP PUBLICATION encryptdb_post;

Remove the replication slot:

DROP SUBSCRIPTION encryptdb_post;

Disable the PostgreSQL logs and log publishing to CloudWatch (optional).
Delete the CloudWatch group:

# Double check the name of the database in the log-group-name below

aws logs delete-log-group --log-group-name /aws/rds/instance/encrypted-db/postgresql

Delete the unencrypted database if you no longer need it.

Conclusion

In this post, we showed how to convert an unencrypted RDS for PostgreSQL database to an encrypted database with minimum downtime. The process involves using an encrypted snapshot for restoration to an encrypted database instance followed by using native replication for CDC between the unencrypted and encrypted database instances. The benefit of this approach is that the overall time taken for this activity for especially large database is less. After the encrypted database has caught up with the changes on the unencrypted database, you can switch your applications to the encrypted database with minimal downtime.

If you have questions, comments, or feedback, leave them in the comments section.

About the Authors

Santosh Bhupathi is a Senior Database Specialist Solution Architect based in Philadelphia. With a focus on Relational and No-SQL databases, he provides guidance and technical assistance to customers to help them design, deploy, and optimize database workloads on AWS.

Rajeev Sakhuja is a Solution Architect based out of New York City. He enjoys partnering with customers to solve complex business problems using AWS services. In his free time, he likes to hike, and create video courses on application architecture & emerging technologies; check him out on Udemy.

AWS Database Blog

Encrypt Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL database with minimal downtime

Overview of solution

Prerequisites

Set up publication on the source database

Backup and restore to create an encrypted database

Set up replication between unencrypted and encrypted databases

Switch over applications to the encrypted database

Clean up

Conclusion

About the Authors

Resources

Blog Topics

Follow

Learn

Resources

Developers

Help