AWS Database Blog

Disaster recovery on Amazon RDS for Oracle using AWS DMS

AWS Database Migration Service (AWS DMS) helps you migrate data from on-premises databases to Amazon Relational Database Service (Amazon RDS). You can also use it to migrate data between homogeneous or heterogeneous database engines. Businesses of all sizes use AWS to enable faster disaster recovery (DR) of their critical IT systems without having to set up a second physical site. The right DR solution depends on your recovery time objective (RTO) and recovery point objective (RPO). For more information about best practices, see New Whitepaper: Using AWS for Disaster Recovery.

This post explores how to use AWS DMS to set up a DR solution for Oracle databases running on the RDS platform.

Why AWS DMS for disaster recovery?

A DR solution in which the primary database instance runs on AWS requires a cross-Region replication mechanism. AWS DMS supports live migration of data from Amazon RDS to anywhere, including a different Region. You can use this capability to set up a separate RDS instance in a different Region to serve as a DR database. Although there are other options for setting up a DR site for Oracle on Amazon RDS, such as Oracle GoldenGate, AWS DMS provides a low-cost, simple, all-native approach, with no dependency on the Oracle edition (Enterprise or Standard). AWS DMS supports full load migration and change data capture (CDC), which enables continuous data replication. Also, unlike physical replication, which works at the database instance level, AWS DMS is a logical replication solution, so you can migrate a subset of your database, such as specific schemas or a set of tables. For Oracle sources, AWS DMS determines and tracks data changes by reading the transaction log using the Oracle LogMiner API or the binary reader API. AWS DMS reads ongoing changes from the online or archive redo logs based on the system change number (SCN). For more information, see Working with an Amazon-Managed Oracle Database as a Source for AWS DMS.

Enable database-level supplemental logging to capture changes for replication. Similarly, you must enable supplemental logging for each table that you want to migrate: PRIMARY KEY supplemental logging for tables that have a primary key, and ALL COLUMN supplemental logging for tables without a primary key or unique key. The database must be in ARCHIVELOG mode to provide redo information to LogMiner. By default, AWS DMS uses LogMiner to read the archive logs and capture changes, but you can also choose Binary Reader to read the redo logs directly. Both approaches have merits and drawbacks, so the choice depends heavily on your replication requirements. For more information, see Using Oracle LogMiner or Oracle Binary Reader for Change Data Capture (CDC).

AWS DMS allows you to do a full-load migration, CDC only, or a full-load migration with CDC. However, when setting up a DR RDS instance, it can be more efficient to preload the target instance using RDS snapshots. This saves costs because you don't need a replication instance for the full load, which matters especially when the database is large. You can copy an RDS database snapshot across Regions. This post outlines the steps required to set up a DR RDS instance and perform continuous replication to achieve near-zero RTO/RPO using a sample RDS for Oracle instance.
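
For example, a minimal boto3 sketch of the snapshot copy might look like the following. The source Region, instance and snapshot names, and account ID are assumptions for illustration; the console walkthrough in Step 3 of this post performs the same operation.

import boto3

# Assumptions for illustration: source Region us-east-1, DR Region ap-southeast-2,
# snapshot name "source-snapshot", and placeholder instance name and account ID.
src = boto3.client("rds", region_name="us-east-1")
dst = boto3.client("rds", region_name="ap-southeast-2")

# Create a manual snapshot of the primary instance (after quiescing the database).
src.create_db_snapshot(
    DBInstanceIdentifier="source-oracle-instance",   # hypothetical instance name
    DBSnapshotIdentifier="source-snapshot",
)
src.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="source-snapshot")

# Copy the snapshot to the DR Region; the call is made against the destination Region.
dst.copy_db_snapshot(
    SourceDBSnapshotIdentifier="arn:aws:rds:us-east-1:111122223333:snapshot:source-snapshot",
    TargetDBSnapshotIdentifier="source-snapshot",
    SourceRegion="us-east-1",   # lets boto3 generate the pre-signed URL for the cross-Region copy
)
dst.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="source-snapshot")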

Things to remember before using DMS as a DR solution

AWS DMS captures ongoing changes to the source data store and applies those on the target data store. While this feature can be leveraged to fulfill a DR use case, there are some points to note.

Performance: Performance of your DR solution using AWS DMS may vary for each database, as there are a number of factors that affect the performance of your migration. These include resource availability on the source, available network throughput, resource capacity of the replication server, ability of the target to ingest changes, type and distribution of source data, number of objects to be migrated, and so on. The performance of your migration will be limited by one or more bottlenecks you encounter along the way. Please refer to AWS Database Migration Service Best Practices for more information on performance considerations.

Also note that because AWS DMS is a logical replication solution, performance depends on the nature of the data and transactions on the source. By default, AWS DMS uses Oracle LogMiner for change data capture (CDC). Alternatively, you can use Oracle Binary Reader, which in many scenarios greatly improves performance and reduces the load on the Oracle server compared with LogMiner.

Limitations of source/target: AWS DMS doesn't replicate data changes resulting from partition or subpartition operations such as ADD, DROP, EXCHANGE, or TRUNCATE. There are also limitations on using the ALTER TABLE command. For a complete list of limitations, see Limitations on Using Oracle as a Source for AWS DMS.

Similarly, AWS DMS doesn't create the schema on the target RDS for Oracle database. So, in a DR scenario, it is recommended that you create the target database instance by restoring an RDS snapshot, which also takes care of the one-time data load. Alternatively, you can use Oracle Data Pump. This post uses the snapshot approach. For a complete list of limitations, see Limitations of Oracle as a target for AWS DMS.

Latency: AWS DMS uses either the Oracle LogMiner API or the binary reader API to read ongoing changes. A busy source database that generates a large number of redo logs can result in significant latency. Binary Reader bypasses LogMiner and reads the logs directly, so it is important to weigh the two options. Additionally, long-running transactions can lead to source latency: AWS DMS reads incoming changes from the transaction logs as they occur, but forwards only committed changes to the target during ongoing replication. The other consideration is LOB data. AWS DMS migrates LOB data during ongoing replication in two phases: first, it creates a new row in the target table with all columns except those that have LOBs; then, it updates the rows that have LOBs. If your source database frequently updates tables that have LOB columns, you might see source latency.
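
One mitigation is to tune the LOB handling in the task settings. The following is a minimal sketch of the TargetMetadata portion of the AWS DMS task settings, using limited LOB mode with an assumed 32 KB cap; the right values depend on your data.

import json

# Sketch of the LOB-related task settings (TargetMetadata section only).
# Limited LOB mode truncates LOBs above LobMaxSize (KB) but is much faster
# than full LOB mode; 32 KB here is an assumption, not a recommendation.
task_settings = {
    "TargetMetadata": {
        "SupportLobs": True,
        "FullLobMode": False,
        "LimitedSizeLobMode": True,
        "LobMaxSize": 32,
    }
}
# Pass json.dumps(task_settings) as ReplicationTaskSettings when creating the task.
print(json.dumps(task_settings, indent=2))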

Latency is also possible due to the target database characteristics, for example when there are no primary keys or indexes on the target. Because AWS DMS uses logical replication, if the required indexes aren't in place, changes such as UPDATEs and DELETEs can result in full table scans, which can cause performance issues on the target and result in target latency.

Additional considerations are resource bottlenecks on the source and the target database instances as well as the replication instance.

In situations where there are defined SLAs around RTO/RPO requirements, latency is a key consideration. We encourage you to test this solution to arrive at your own benchmarks, and to weigh the different options for replicating data from the primary instance to the DR instance, such as Oracle GoldenGate or cross-Region read replicas, to see what works best for you. You can use the CloudWatch metrics CDCLatencySource and CDCLatencyTarget to monitor the replication latency of an AWS DMS task.
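
As a sketch, you could pull these metrics with boto3 as follows. The dimension values are assumptions; depending on how the task was created, the ReplicationTaskIdentifier dimension may be the task's generated resource ID rather than the name you chose, so verify the dimensions in the CloudWatch console first.

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-2")  # Region of the replication instance (assumption)
end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

dimensions = [
    {"Name": "ReplicationInstanceIdentifier", "Value": "oracle-dr-repl-instance"},  # hypothetical
    {"Name": "ReplicationTaskIdentifier", "Value": "oracle-dr-cdc-task"},           # hypothetical
]

for metric in ("CDCLatencySource", "CDCLatencyTarget"):
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/DMS",
        MetricName=metric,
        Dimensions=dimensions,
        StartTime=start,
        EndTime=end,
        Period=300,
        Statistics=["Average"],
    )
    points = sorted(stats["Datapoints"], key=lambda p: p["Timestamp"])
    print(metric, [round(p["Average"]) for p in points])  # latency in seconds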

Monitoring: You can use Amazon CloudWatch metrics to monitor the progress of your AWS DMS task, the resources used, and the network activity. You can monitor replication host metrics, replication task metrics, and table metrics by enabling Amazon CloudWatch logging. Basic CloudWatch statistics for each task, including the task status, percent complete, elapsed time, and table statistics, are available on the AWS DMS console. Replication instance metrics include CPUUtilization, FreeStorageSpace, FreeableMemory, and many more. Replication task metrics include incoming and committed changes and the latency between the replication host and the source and target databases. Table metrics include the tables that are in the process of being migrated and the number of inserts, updates, and deletes for each.
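
You can also read the same task and table statistics through the AWS DMS API. The following is a minimal boto3 sketch; the task name is a placeholder.

import boto3

dms = boto3.client("dms", region_name="ap-southeast-2")  # Region of the replication instance (assumption)

# Look up the task by its identifier (placeholder name) and print its status.
task = dms.describe_replication_tasks(
    Filters=[{"Name": "replication-task-id", "Values": ["oracle-dr-cdc-task"]}]
)["ReplicationTasks"][0]
print(task["Status"], task.get("ReplicationTaskStats", {}))

# Per-table counters for the ongoing replication.
tables = dms.describe_table_statistics(ReplicationTaskArn=task["ReplicationTaskArn"])
for t in tables["TableStatistics"]:
    print(t["SchemaName"], t["TableName"], t["Inserts"], t["Updates"], t["Deletes"], t["TableState"])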

You can set up Amazon CloudWatch alarms and events to track the migration more closely. For detailed guidance on debugging your migrations, see Debugging Your AWS DMS Migrations.
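
For example, a hedged sketch of a CloudWatch alarm on target latency might look like the following; the threshold, SNS topic, and dimension values are assumptions to adapt to your RPO.

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-2")  # Region of the replication instance (assumption)

cloudwatch.put_metric_alarm(
    AlarmName="dms-cdc-target-latency-high",           # hypothetical alarm name
    Namespace="AWS/DMS",
    MetricName="CDCLatencyTarget",
    Dimensions=[
        {"Name": "ReplicationInstanceIdentifier", "Value": "oracle-dr-repl-instance"},  # hypothetical
        {"Name": "ReplicationTaskIdentifier", "Value": "oracle-dr-cdc-task"},           # hypothetical
    ],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=3,
    Threshold=600,                                      # seconds; choose based on your RPO
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:ap-southeast-2:111122223333:dms-alerts"],  # hypothetical SNS topic
)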

Architecture overview

This architecture involves a primary RDS for Oracle instance running in Region 1 in its own VPC and subnet. AWS DMS, which is used to set up the replication to the DR Region, runs on a replication instance in the same VPC and runs the database migration tasks that replicate the data.

The target RDS for Oracle instance runs in Region 2 in its own VPC and a public subnet. This instance is created from a snapshot of the primary RDS instance and kept in sync with the primary in real time using CDC. The following diagram shows the architecture of this solution, which sets up multi-Region DR on RDS for Oracle using AWS DMS.

Solution overview

This solution contains the following steps:

  1. Set up the primary RDS for Oracle instance in the Region that you choose. This step is optional. In a real-world scenario, the primary database is already set up.
  2. Configure a user for AWS DMS to use with appropriate permissions on the source database. Additionally, enable the supplemental logging on the objects.
  3. Make sure that the database is quiesced so that there are no updates. One way to do this is to perform this step during a scheduled maintenance window. Alternatively, instead of RDS snapshots, you can use the Oracle export and import utilities. This post uses the RDS snapshot approach and assumes that you perform the steps during the maintenance window.
  4. Before creating the manual snapshot, obtain the SCN of the database. You use the snapshot to perform the initial load of the DR instance, and you can copy the manual snapshot to the DR Region. For more information, see Copying a Snapshot.
  5. After the snapshot is copied, provision a new RDS instance in the DR Region using this snapshot.
  6. Set up the AWS DMS replication to use CDC with the SCN captured previously as the starting point.

This post uses a Single-AZ deployment of both RDS and AWS DMS. In production environments, it is recommended that you choose a Multi-AZ configuration for both. Also, the replication instance is placed in a public subnet to access the target RDS instance. Avoid this in production scenarios, and explore other options such as peering.

Step 1: Setting up the sample primary RDS for Oracle instance

To begin, run the following CloudFormation template. This template is an example of how to set up DR on RDS for Oracle with AWS DMS.

This script provisions a new VPC and subnets along with the RDS for Oracle instance in the Region in which you execute the script. The script also provisions a Windows instance with SQL Developer installed to use with the Oracle database. The following table summarizes the parameters required for running the script.

Parameter label (name) Default Description
Stack Name Requires input Any name
RDSInstanceType Requires input Select from the drop-down box as per your needs
EC2ServerInstanceType Requires input Select from the drop-down box as per your needs
KeyName Requires input Name of the existing key pair to enable access to the instance. Select from the drop-down box.
VpcCIDR 10.0.0.0/16 CIDR block for the VPC
Subnet1CIDR (PrivateSubnet1CIDR) 10.0.0.0/19 CIDR block for the private subnet 1
Private Subnet 2 CIDR (PrivateSubnet2CIDR) 10.0.32.0/19 CIDR block for the private subnet 2

The following screenshot shows the parameters entry screen.
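
If you prefer to launch the stack programmatically rather than through the console, a minimal boto3 sketch is shown below. The template URL and parameter values are placeholders; the parameter names are those listed in the preceding table.

import boto3

cloudformation = boto3.client("cloudformation", region_name="us-east-1")  # source Region (assumption)

cloudformation.create_stack(
    StackName="oracle-dr-source",                                              # any name
    TemplateURL="https://example-bucket.s3.amazonaws.com/source-stack.yaml",   # placeholder URL for this post's template
    Parameters=[
        {"ParameterKey": "RDSInstanceType", "ParameterValue": "db.r5.large"},      # example value
        {"ParameterKey": "EC2ServerInstanceType", "ParameterValue": "t3.large"},   # example value
        {"ParameterKey": "KeyName", "ParameterValue": "my-key-pair"},              # existing key pair
        {"ParameterKey": "VpcCIDR", "ParameterValue": "10.0.0.0/16"},
        {"ParameterKey": "PrivateSubnet1CIDR", "ParameterValue": "10.0.0.0/19"},
        {"ParameterKey": "PrivateSubnet2CIDR", "ParameterValue": "10.0.32.0/19"},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],   # assumed, because the template creates IAM resources
)
cloudformation.get_waiter("stack_create_complete").wait(StackName="oracle-dr-source")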

This post provides a sample database that you can use to test the configuration. You can also make modifications to the database by inserting rows in any of the tables.

You are now ready to create a DR instance and set up a real-time replication mechanism to keep both databases closely in sync.

Navigate to the RDS instance that you created. To obtain the endpoint name of the instance, look at the SourceOracleEndpoint key in the CloudFormation Outputs section. Similarly, obtain the DNS endpoint of this instance from SourceEC2PublicDNS.

Use Remote Desktop software to connect to the Windows instance provisioned by the CloudFormation script; this instance already has SQL Developer installed. Next, configure a new database connection in SQL Developer to connect to the RDS for Oracle instance you created, using the following details:

  • Name: NewOracleSource
  • Database type: Oracle
  • Username: dbmaster
  • Password: dbmaster123
  • Connection Type: Basic
  • Hostname: <<DNS endpoint above>>
  • Port: 1521
  • SID: OracleDB

Now choose Test. Success should be displayed at the bottom of the screen. The following screenshot shows the connection details.
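
If you'd rather verify connectivity from a script than from SQL Developer, a minimal sketch using the python-oracledb driver might look like the following. The endpoint is the SourceOracleEndpoint value from the stack outputs, and connecting by SID is assumed here.

import oracledb  # pip install oracledb

# Build a DSN from the RDS endpoint and SID (OracleDB), then run a sanity query.
dsn = oracledb.makedsn("<<SourceOracleEndpoint>>", 1521, sid="OracleDB")
with oracledb.connect(user="dbmaster", password="dbmaster123", dsn=dsn) as connection:
    with connection.cursor() as cursor:
        cursor.execute("select name, log_mode from v$database")
        print(cursor.fetchone())   # expect the database name and ARCHIVELOG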

Step 2: Configuring the user for AWS DMS to use

After you are connected, enter the following code to grant the privileges that the AWS DMS user (DMS_USER) needs to access the source Oracle endpoint:

GRANT SELECT ANY TABLE to DMS_USER;
GRANT SELECT on ALL_VIEWS to DMS_USER;
GRANT SELECT ANY TRANSACTION to DMS_USER;
GRANT SELECT on DBA_TABLESPACES to DMS_USER;
GRANT SELECT on ALL_TAB_PARTITIONS to DMS_USER;
GRANT SELECT on ALL_INDEXES to DMS_USER;
GRANT SELECT on ALL_OBJECTS to DMS_USER;
GRANT SELECT on ALL_TABLES to DMS_USER;
GRANT SELECT on ALL_USERS to DMS_USER;
GRANT SELECT on ALL_CATALOG to DMS_USER;
GRANT SELECT on ALL_CONSTRAINTS to DMS_USER;
GRANT SELECT on ALL_CONS_COLUMNS to DMS_USER;
GRANT SELECT on ALL_TAB_COLS to DMS_USER;
GRANT SELECT on ALL_IND_COLUMNS to DMS_USER;
GRANT SELECT on ALL_LOG_GROUPS to DMS_USER;
GRANT LOGMINING TO DMS_USER;

The following screenshot shows this code in the SQL Developer console.

Additionally, enter the following code:

exec rdsadmin.rdsadmin_util.grant_sys_object('V_$ARCHIVED_LOG','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOG','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOGFILE','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$DATABASE','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$THREAD','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$PARAMETER','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$NLS_PARAMETERS','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$TIMEZONE_NAMES','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$TRANSACTION','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('DBA_REGISTRY','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('OBJ$','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_ENCRYPTED_COLUMNS','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOGMNR_LOGS','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOGMNR_CONTENTS','DMS_USER','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('DBMS_LOGMNR','DMS_USER','EXECUTE');

To retain archived redo logs of the source Oracle database instance for 24 hours, enter the following query:

exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours',24);

To enable database-level supplemental logging, enter the following query:

exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');

To enable PRIMARY KEY logging for tables that have primary keys, enter the following query:

exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD','PRIMARY KEY');

To add supplemental logging for tables that don’t have primary keys, enter the following queries:

alter table dms_sample.nfl_stadium_data add supplemental log data (ALL) columns;
alter table dms_sample.mlb_data add supplemental log data (ALL) columns;
alter table dms_sample.nfl_data add supplemental log data (ALL) columns;

Step 3: Creating a manual snapshot of the source RDS instance

Make sure that the database is quiesced, meaning that no updates are possible on the database. To get the SCN number of the database, enter the following query:

select CURRENT_SCN from v$database;

Note the output of the preceding query. It is in the form of a number, such as 3829791. This step is required to set a starting point for the CDC while configuring the AWS DMS replication task. For more information, see Creating Tasks for Ongoing Replication Using AWS DMS.
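
A CloudFormation template later in this post creates the AWS DMS task and accepts this SCN as a parameter. For illustration only, a minimal boto3 sketch of the equivalent API call might look like the following; the ARNs, task name, and schema filter are assumptions.

import json
import boto3

dms = boto3.client("dms", region_name="ap-southeast-2")  # Region of the replication instance (assumption)

# Replicate everything in the DMS_SAMPLE schema (adjust to your schemas).
table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-dms-sample",
        "object-locator": {"schema-name": "DMS_SAMPLE", "table-name": "%"},
        "rule-action": "include",
    }]
}

task = dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-dr-cdc-task",          # hypothetical name
    SourceEndpointArn="<source-endpoint-arn>",
    TargetEndpointArn="<target-endpoint-arn>",
    ReplicationInstanceArn="<replication-instance-arn>",
    MigrationType="cdc",                                     # ongoing replication only; the full load came from the snapshot
    TableMappings=json.dumps(table_mappings),
    CdcStartPosition="3829791",                              # the SCN captured from v$database
)["ReplicationTask"]

dms.start_replication_task(
    ReplicationTaskArn=task["ReplicationTaskArn"],
    StartReplicationTaskType="start-replication",
)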

You can now set up the DR instance and load the initial data. To do that, use the RDS Snapshots feature. Complete the following steps:

  1. On the AWS Management Console, go to the source RDS database you created.
  2. Choose Snapshots.
  3. From the Actions drop-down, choose Copy Snapshot.
  4. Under Source DB Snapshot, for Destination Region, choose your DR Region.
    This post chooses Asia Pacific (Sydney).
  5. For New DB Snapshot Identifier, enter a name. This post uses source-snapshot.
  6. Leave defaults for all the other fields.
  7. Choose Copy Snapshot.
    RDS automatically shows the snapshot in the destination Region.
  8. Make a note of the ARN of the snapshot.

Step 4: Creating an RDS instance in the DR Region

Run the following CloudFormation template. This sets up an RDS instance for DR purposes in a different Region, using the snapshot you created.
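
The template performs the restore for you. For reference, a minimal boto3 sketch of the underlying call might look like this; the instance identifier, class, subnet group, and security group are assumptions.

import boto3

rds = boto3.client("rds", region_name="ap-southeast-2")  # DR Region

rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="oracle-dr-target",        # hypothetical instance name
    DBSnapshotIdentifier="source-snapshot",         # the snapshot copied to this Region
    DBInstanceClass="db.r5.large",                  # example class; size for DR, scale up at failover
    PubliclyAccessible=True,                        # matches this post's setup; avoid in production
    DBSubnetGroupName="<target-db-subnet-group>",   # placeholder
    VpcSecurityGroupIds=["<security-group-id>"],    # placeholder
)
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="oracle-dr-target")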

The following table summarizes the parameters needed for the template.

Parameter label (name) Default Description
Stack Name Requires input Any name
ClientIP Requires input The IP address of the client machine that is used to connect to the RDS for Oracle Database
DBSnapshotId Requires input The RDS snapshot that is available within the Region; created by the previous steps

The following screenshot shows the parameters entry screen.

You can obtain the endpoint details of the target RDS instance from the Outputs section of the CloudFormation screen on the AWS Management Console.

After you provision the RDS instance in the DR Region, verify the connection to the new instance from the same Windows client instance that you provisioned in Step 1. The SID is TargetDB, and the username and password remain the same as for the source instance. Also check that both databases are in sync by running some test SQL statements.

The next step is to capture and replicate the changes from the source to the target using AWS DMS. The following CloudFormation template takes the details of the source and target RDS instances as parameters and creates the AWS DMS replication resources required to perform the CDC.

You can set up the replication instance either in the Region in which the primary database instance is running or in the DR Region. If you are migrating or replicating only a subset of data using filters or transformations, keep the replication instance on the source side, because less data then travels over the network to the DR Region. For this post, and in other cases of full database migration and ongoing replication, you can keep the AWS DMS replication instance on either side.
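
The CloudFormation template in this step creates the replication instance for you. For orientation, a minimal boto3 sketch of that call might look like the following; the identifier, class, subnet group, and security group are assumptions.

import boto3

dms = boto3.client("dms", region_name="ap-southeast-2")  # DR Region in this example (assumption)

dms.create_replication_instance(
    ReplicationInstanceIdentifier="oracle-dr-repl-instance",   # hypothetical name
    ReplicationInstanceClass="dms.c5.large",                    # example class; size to your change volume
    AllocatedStorage=100,                                       # GB, example value
    MultiAZ=False,                                              # use True in production
    PubliclyAccessible=True,                                    # matches this post's setup; avoid in production
    ReplicationSubnetGroupIdentifier="<dms-subnet-group>",      # placeholder
    VpcSecurityGroupIds=["<security-group-id>"],                # placeholder
)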

The following table summarizes the parameters needed for the template.

Parameter label (name) Default Description
Stack Name Requires input Any name
ExistsDMSCloudwatchRole N If the dms-cloudwatch-logs-role is already present in your account, enter Y; otherwise leave the default
ExistsDMSVPCRole N If the dms-vpc-role exists in your account, enter Y; otherwise leave the default
OracleRDSSourceEndpoint Requires input The endpoint of the source RDS for Oracle Database
OracleRDSSourcePort 1521 The port of the source RDS for Oracle Database
OracleRDSSourceUser Requires input The user for the source RDS for Oracle Database
OracleRDSSourcePassword Requires input The password for the preceding user
OracleRDSTargetEndpoint Requires input The endpoint of the target RDS for Oracle Database
OracleRDSTargetPort 1521 The port of the target RDS for Oracle Database
OracleRDSTargetUser Requires input The user for the target RDS for Oracle Database
OracleRDSTargetPassword Requires input The password for the preceding user
SCNNumber Requires input System change number of Oracle; follow the instructions in the document to obtain this
SourceDatabase Requires input The source RDS for Oracle Database name
TargetDatabase Requires input The target RDS for Oracle Database name
VPC Requires input VPC ID of the existing Virtual Private Cloud that you set up in the previous step
Subnets Requires input Select the subnets for the DMS instance from the drop-down

The following screenshot shows the parameters entry screen.

After the script runs successfully, you have created all the AWS DMS resources you need to perform the CDC, and the replication task should be running. You can find these resources by navigating to the AWS DMS section of the console. It takes a few minutes for the status of the replication task to change from Ready to Replication ongoing. If it doesn't change automatically, complete the following steps:

  1. On the DMS console, select the task.
  2. From the Actions drop-down, choose Restart/Resume.
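
You can perform the same check and restart programmatically; the following is a minimal sketch using the task name assumed earlier in this post.

import boto3

dms = boto3.client("dms", region_name="ap-southeast-2")  # Region of the replication instance (assumption)

task = dms.describe_replication_tasks(
    Filters=[{"Name": "replication-task-id", "Values": ["oracle-dr-cdc-task"]}]  # hypothetical name
)["ReplicationTasks"][0]
print("Task status:", task["Status"])

# Equivalent of Actions -> Restart/Resume on the console.
if task["Status"] == "stopped":
    dms.start_replication_task(
        ReplicationTaskArn=task["ReplicationTaskArn"],
        StartReplicationTaskType="resume-processing",
    )
elif task["Status"] == "ready":
    dms.start_replication_task(
        ReplicationTaskArn=task["ReplicationTaskArn"],
        StartReplicationTaskType="start-replication",
    )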

Step 5: Testing the CDC

You can now test whether your replication is working. To insert some sample rows on the source and verify that they are replicated to the target, run the following SQL statements:

INSERT ALL
  INTO dms_sample.sport_type (name,description) VALUES ('hockey', 'A sport in which two teams play against each other by trying to move a puck into the opponents goal using a hockey stick')
  INTO dms_sample.sport_type (name,description) VALUES ('basketball', 'A sport in which two teams of five players each that oppose one another shoot a basketball through the defenders hoop')
  INTO dms_sample.sport_type (name,description) VALUES ('soccer','A sport played with a spherical ball between two teams of eleven players')
  INTO dms_sample.sport_type (name,description) VALUES ('volleyball','two teams of six players are separated by a net and each team tries to score by grounding a ball on the others court')
  INTO dms_sample.sport_type (name,description) VALUES ('cricket','A bat-and-ball game between two teams of eleven players on a field with a wicket at each end')
SELECT * FROM dual;
COMMIT;
SELECT * FROM dms_sample.sport_type;

The AWS DMS task keeps the target Oracle database up to date with source database changes.

The latency is close to zero when the target has caught up to the source.

Triggering the DR

With the preceding setup in place, in case of disaster, you would typically make the following changes so that applications can reach the secondary database (the new primary) and serve the requests efficiently:

  1. Modify DNS configurations or use the Amazon Route 53 Active-Passive failover feature. When failover occurs, the secondary database instance in a DR Region is now a new primary database instance.
  2. Scale up the new primary RDS instance to the capacity required by the application (consider licensing restrictions that apply for your Oracle database if you are using a BYOL model).
  3. Turn on the Multi-AZ option for the new primary database instance (steps 2 and 3 are sketched in the code example after this list).
  4. Copy the RDS snapshot to another Region and note the SCN number, using the technique discussed previously.
  5. Launch the CloudFormation template to create a new secondary database in a target Region.
  6. Launch the CloudFormation template to set up an AWS DMS replication instance, and create and run AWS DMS tasks starting from the SCN you captured on the new primary database instance.
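
A minimal boto3 sketch of steps 2 and 3 (scaling up the new primary and enabling Multi-AZ) might look like the following; the instance identifier and target class are assumptions.

import boto3

rds = boto3.client("rds", region_name="ap-southeast-2")  # the former DR Region, now primary

rds.modify_db_instance(
    DBInstanceIdentifier="oracle-dr-target",    # the new primary (hypothetical name)
    DBInstanceClass="db.r5.2xlarge",            # example of scaling up to production capacity
    MultiAZ=True,                               # turn on Multi-AZ for the new primary
    ApplyImmediately=True,
)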

For more information about reducing replication lag and making DMS replication highly available in case of AZ failure, see Best Practices for AWS Database Migration Service.

Summary

This post discussed how to set up a cost-effective DR solution for your RDS for Oracle databases using AWS DMS. To familiarize yourself with AWS Database Migration Service, see Getting started with AWS Database Migration Service. Please share your thoughts in the comments section on how this approach worked for you or your customers.

 


About the Authors

Madhuri Susarla is a Solutions Architect in the partner team of AWS. She is passionate about working with large Global System Integrators to create value in the era of cloud computing. She dabbled in multiple cloud platforms. She believes in enabling reusability by way of content creation and public speaking. She enjoys playing with her three boys, while struggling to keep up with their soccer, knowledge, and gastronomic choices.

 

 

Ejaz Sayyed is a Partner Solutions Architect with the Global System Integrator (GSI) team at Amazon Web Services. His focus areas include AWS database services as well as database and data warehouse migrations on AWS. Recently, he has also been supporting GSIs in building data lakes on AWS for our customers.