AWS Database Blog
A quick introduction to migrating from an Oracle database to an Amazon Aurora PostgreSQL database
In this post, we build an AWS CloudFormation stack to deploy resources to help demonstrate the process of migrating from an Oracle database to an Amazon Aurora PostgreSQL database. Because this is a heterogeneous migration, we follow a two-phase approach similar to what is detailed in How to Migrate Your Oracle Database to PostgreSQL.
The post focuses on building a migration stack to help you get familiar with AWS Schema Conversion Tool (AWS SCT) and AWS Data Migration Service (AWS DMS) core concepts. We also show you how to disable triggers in the target Amazon Aurora PostgreSQL database for migration using database parameter groups.
We use AWS CloudFormation to deploy an Amazon EC2 instance with the Oracle database (HRDATA) pre-installed, an Amazon Aurora PostgreSQL cluster, and a replication instance in AWS DMS. We use other necessary components like Amazon Virtual Private Cloud (Amazon VPC), Amazon Simple Storage Service (Amazon S3), and AWS Identity and Access Management (IAM) roles and policies to aid in the migration process.
Considerations
Before you proceed with building the components for this migration, note the following key points:
- During conversion, AWS SCT converts the schemas to uppercase in the target Aurora PostgreSQL database. Because the Aurora PostgreSQL engine is case sensitive, you use transformation rules during the migration step.
- For bulk loading, you must disable all triggers in the target Aurora PostgreSQL database. Otherwise, bulk loading fails due to foreign key violations. We show how to do this by using a database parameter group and setting the
session_replication_role
parameter toreplica
. This action disables all triggers. While you are creating the Aurora cluster, the AWS CloudFormation stack sets theDBParameterGroup
resource as needed. Ideally, after the bulk loading is complete, you change thesession_replication_role
parameter to the original value to enable triggers again on the target database.
Note: Do not execute any DDL statements on the target database while foreign key checks are disabled (for example, throughout the entire AWS DMS migration). Doing so can lead to data dictionary corruption.
The following code snippet is from the AWS CloudFormation script for the database parameter group:
Section 1: Use AWS CloudFormation to deploy resources needed for migration
Note: The script works in the Oregon (us-west-2) and Ohio (us-east-2) Regions only.
Prerequisites
To get started with this migration, do the following:
- Download and install AWS SCT.
- Download the Oracle and Aurora PostgreSQL database drivers.
- Follow the instructions in Installing, Verifying, and Updating the AWS Schema Conversion Tool to configure the AWS SCT global settings.
- Locate an Amazon EC2 key pair to use in the US West-2 (Oregon) or US East-2 (Ohio) Region. Keys are Region-specific. If you don’t have one, create an Amazon EC2 key pair.
- Navigate to Identity and Access Management (IAM) in the console. Under Roles, check whether a role named dms-vpc-role This is an API/AWS CLI-specific role that is needed for launching AWS DMS tasks programmatically. If the role is not present, then during the AWS CloudFormation launch section of this post, choose no for the DMSVPCRoleExists exists parameter so that AWS CloudFormation can create this role.
Launch the stack
Note: Some of the resources deployed by the stack incur costs while they are in use.
To get started with deploying the AWS CloudFormation template:
- Choose Launch Stack. This button automatically launches the AWS CloudFormation service in your AWS account with a template to launch. You are prompted to sign in if needed.
- Ensure that you choose the Ohio (us-east-2) or Oregon (us-west-2) Region to create the stack. Then choose Next.
- For the DMSVPCRoleExists parameter, choose yes or no based on step 5 of the PrerequisitesNote: If you have used AWS DMS previously in the account, the dms-vpc-role is already created in IAM. To prevent the AWS CloudFormation stack from failing because the role already exists, choose yes for the DMSVPCRoleExists input parameter. Otherwise, leave it at the default no.
- For the MyIP parameter, the default 0.0.0.0/0 allows the port 1521 (for Oracle) and 5432 for open access. Hence, we highly recommend that you use the public IP of your machine where you are launching the stack from. (You can check your IP using org.)
- Specify a key pair for the EC2 instance, and other parameters as needed. All parameters are required, and most have default values. Then choose Next.
- On the Review page, acknowledge that AWS CloudFormation will create IAM roles as a result of launching the stack. Choose Create.
- To see the stack creation progress, choose Refresh. Then select the stack to see the launch events in the Events section below.The stack creation takes 7–10 minutes to finish. When the stack is successfully launched, the Status changes from CREATE_IN_PROGRESS to CREATE_COMPLETE. The Outputs tab contains information about the resources that are deployed by AWS CloudFormation that you will need later in this post. You can copy these values into a file for easy lookup.At this point, optionally, you can check the Aurora PostgreSQL database parameter group settings.
- Sign in to the AWS Management Console and open Amazon Relational Database Service (Amazon RDS). In the left navigation pane, choose Parameter groups. Then choose the instance parameter group (not the DB cluster parameter group) that was created by the AWS CloudFormation stack, as shown following:
- In the search box, type session_replication_role. The value appears as replica for that parameter.
Section 2: Use AWS SCT to convert the source database schema
To proceed with this section, you must install the AWS Schema Conversion Tool (AWS SCT) as stated in the Prerequisites section. AWS SCT should have access to the internet to connect to resources deployed by AWS CloudFormation.
- Start AWS SCT.
- Choose File, New Project Wizard.
- In the wizard, provide a name for the project. Choose Oracle as the source, and choose Next.
- Provide the Oracle database connection information. You can find all the required parameters for this step in the Output of the AWS CloudFormation stack that you deployed previously. Choose Test Connection to ensure that AWS SCT can connect to the Oracle source successfully. Then choose Next.
- Choose HRDATA as the schema to be migrated, and choose Next.
- Next, you are presented with a Database Migration Assessment Report. The report provides insights into the migration challenges for different target database engines. This report can help you understand the challenges and choose a target database that is most suitable for this migration.In this post, we are using Aurora PostgreSQL as the target engine. You can download the report as needed. Choose Next.
- In the Target section, provide the Aurora PostgreSQL cluster endpoint in the server name. Specify 5432 as the port, and provide the user name and password of the Aurora cluster that was provisioned by the AWS CloudFormation stack.
- Choose Test Connection to ensure that AWS SCT can connect to the Aurora PostgreSQL target successfully. Then choose Finish.After the wizard finishes, you are shown the main view with two panes. The left pane is the source Oracle DB, and the right is the target Aurora PostgreSQL DB. To generate a more specific and detailed assessment report, open the context (right-click) menu for the HRDATA schema in the source, and choose Create Report.The report provides Action Items that can help your database administrator (DBA) resolve the schema discrepancies in the target for a successful migration. For example: In the source panel, choose Views, and then choose EMP_DETAILS_VIEW. There are two action items for resolving issues during the automatic creation of EMP_DETAILS_VIEW in the target database. AWS SCT is a powerful tool that helps by converting the schema, functions, and stored procedures from the source database to the target and provides action items.For the scope of this post, the database is simple, and no major action needs to be taken.
- Open the context (right-click) menu for the HRDATA schema in the source Oracle database, and choose Convert Schema. The HRDATA schema is created in the target Aurora MySQL database. You are prompted with a warning that objects might already exist in the target and be replaced. Choose Yes to proceed.
- In the target Aurora PostgreSQL database pane, open the context (right-click) menu for the HRDATA Choose Apply to database.
At this point, you have successfully used AWS SCT to generate a database migration assessment report, and you’ve written the HRDATA schema in the target Aurora PostgreSQL database.
AWS SCT automatically converts the source database schema and most of the custom code to a format that is compatible with the target database. Any code that the tool cannot convert automatically is clearly marked so that you can convert it yourself. For best results, save your converted schema to a file as a SQL script, and review the converted schema with a target database expert. The schema should be reviewed and optimized for the target database before you proceed with data migration.
For the purposes of this post, we did not explore the optimization steps for the target database objects. We did a simple conversion of schema from source to target in AWS SCT.
Now you can proceed with the data migration.
Section 3: Use AWS DMS to migrate your data to the target database
In this section, you use AWS DMS to do bulk loading from Oracle to Amazon Aurora PostgreSQL. AWS DMS now offers free use for 6 months per instance if you’re migrating to Amazon Aurora, Amazon Redshift, or Amazon DynamoDB.
- Sign in to the AWS Management Console, and open the AWS DMS console.
- Ensure that you are in the same Region (us-east-2 or us-west-2) as where you deployed the AWS CloudFormation stack in Section 2.
- In the left navigation pane, choose Replication instances. Notice that a new replication instance has already been created for you by the AWS CloudFormation stack. Note the virtual private cloud (VPC) of the replication instance. It is the same VPC that was created for this post by the AWS CloudFormation stack.
- In the left navigation pane, choose Endpoints.
- Choose Create endpoint.
- On the Create endpoint page, choose Source, and provide the necessary information about the Oracle source database. The AWS CloudFormation Output section has the necessary information to complete this step. In the Test endpoint connection section, choose the VPC replication instance that was created by the AWS CloudFormation stack. Ensure that the test is successful, and then choose Create endpoint to save the endpoint details.Note: Don’t skip the Test endpoint step. It helps ensure that AWS DMS has the required connectivity to the endpoints to refresh the schemas.
- Similarly, on the Create endpoint page, choose Target, and provide the necessary information about the Aurora PostgreSQL target database. Provide the Aurora PostgreSQL cluster endpoint, port, user name, password, and database name.
- In the Test endpoint connection section, choose the AWS DMS replication instance that was created by the AWS CloudFormation stack. Ensure that the test is successful, and then choose Create endpoint to save the endpoint details.Note: Don’t skip the Test endpoint step. It helps ensure that AWS DMS has the required connectivity to the endpoint.The AWS CloudFormation Output section has the necessary information to complete this step.
- In the left navigation pane, choose Tasks, and then choose Create task.
- In the task settings:
- Choose the replication instance deployed by the AWS CloudFormation stack.
- Choose the Oracle endpoint as the source.
- Choose the Aurora PostgreSQL endpoint as the target.
- For Migration type, choose Migrate existing data. This option allows you to do bulk loading without change data capture (CDC). You can select other options, but you need to enable supplemental logging to allow for CDC. For more information, see Using an Oracle Database as a Source for AWS DMS.
- Select the Start task on create check box to start loading as soon as the task is created.
- For Target table preparation mode, choose Do nothing, as we already used AWS SCT to convert the schemas.
- Select Enable validation to compare each row in the source with its corresponding row at the target and verify that those rows contain the same data.
- Optionally, select Enable logging to log the task in CloudWatch for analysis if needed.
- In the Table mappings section, choose the source HRDATA schema to be migrated. For the Table name option, use the % wildcard for all tables in the schema. Choose Add selection rule to add the table mapping rule.
- After you add the selection rule, add two transformation rules:
- The first rule is to change the schema name HRDATA from uppercase to lowercase. This step is necessary because the Aurora PostgreSQL engine is case sensitive, and during schema conversion, AWS SCT created the schema and tables in uppercase. Choose Add transformation rule to add the rule to the task.
- The second rule is to change the table names in the HRDATA schema from uppercase to lowercase. Choose Add transformation rule to add the rule to the task.After you add the selection and transformation rules, your Table mappings section should look like this:
- Choose Create task to proceed.After the task is created, it automatically starts loading data from the source to the target database.
- Select the task, and in the properties pane below, choose the Table statistics tab to see the progress. The validation state should change from Pending records to Validated. Also, ensure that the Total count matches the Full Load Rows
You can optionally check the result of the migration in Aurora PostgreSQL by using a PostgreSQL client like pgAdmin.
Use the Aurora PostgreSQL connection information provided in the output of the AWS CloudFormation stack to establish a connection in a SQL client.
At this point, you have successfully migrated the data from Oracle to an Amazon Aurora PostgreSQL database.
Cleanup steps
In this section, you execute clean up tasks to remove the AWS resources deployed as part of the post. You can continue to run the resources, but note that resources continue to accrue charges as long as they are running.
Follow these steps to delete the resources that were created:
- In the console, open AWS DMS. In the left navigation pane, choose Tasks. Select the task that you created earlier in Section 3, and choose Delete.
- After the task is deleted, choose Endpoints in the left navigation pane. Delete the Oracle source and Aurora PostgreSQL target endpoints that you created earlier in Section 2.
- In the console, open AWS CloudFormation. Select the stack that was created in Section 1, and choose Delete Stack.
The deletion process will take 5–10 minutes to complete. It deletes all the resources created by the stack. The status shows DELETE_IN_PROGRESS. After the deletion is successful, the stack disappears from the stack list.
At this point, all the resources deployed as part of this post have been successfully deleted.
Conclusion
In this post, we walked you through the core steps of a heterogeneous database migration in AWS. We focused on a migration from Oracle to Amazon Aurora PostgreSQL using AWS Schema Conversion Tool and AWS Database Migration Service. You can customize the AWS CloudFormation script to deploy other database engine types easily and test other migrations as well.
About the author
Sona Rajamani is a solutions architect at AWS. She lives in the San Francisco Bay area and helps customers architect and optimize applications on AWS. In her spare time, she enjoys hiking and traveling.
Ballu Singh is a solutions architect at AWS. He lives in the San Francisco Bay area and helps customers architect and optimize applications on AWS. In his spare time, he enjoys reading and spending time with his family.