AWS Database Blog
How AEVI substituted Oracle materialized views fast refresh with AWS DMS when migrating to Amazon Aurora
AEVI took the journey to migrate to AWS. In their own words, “AEVI provides merchant-facing businesses—banks, acquirers, PSPs, ISOs, ISVs, and others—with a platform that enables merchants to run their business smarter by giving them easy access to any payment technology and business solution for the best in-store customer experience. AEVI’s Platform for Smart Merchant Enablement empowers them to upgrade to more agile and data-driven propositions, driving greater efficiency, more innovation, and an improved customer journey across all channels.”
AEVI is always looking to innovate, improve efficiency, and optimize its cost savings. In this post, we describe the solution AEVI used to migrate the data from Oracle Database with fast refresh materialized views to Amazon Aurora PostgreSQL-Compatible Edition with AWS Database Migration Service (AWS DMS). We share a specific test case scenario we encountered during AWS DMS testing to help prevent data loss. This scenario covers a configuration of complex materialized views with a deferred primary key.
Architecture overview
AEVI uses an Oracle Online Back Office (BO) database, which replicates some data to an Oracle reporting database using materialized views with fast refresh. A materialized view is a table segment whose contents are periodically refreshed based on a query, either against a local or remote table. The Oracle BO database was using commercially available off-the-shelf (COTS) products, which were only supported on Oracle, so this database was moved to Amazon Relational Database Service (Amazon RDS) for Oracle Standard Edition. For the transactional database, they chose Aurora PostgreSQL-Compatible. Both of the original databases required Oracle Enterprise Edition (EE) licenses so one of the goals was to minimize costs by removing the need for these licenses.
The following diagram illustrates the original architecture.
The following diagram illustrates the updated architecture.
The challenges
The first step during the migration was to find a solution for the materialized views fast refresh feature that was used in the Oracle database. PostgreSQL has the materialized views complete refresh option, but not the fast refresh feature. AEVI decided to move materialized views out from the transactional database to the BO RDS for Oracle database. Then they used AWS DMS to replicate materialized views to the Aurora PostgreSQL destination database, where they created the tables with the same names as the source materialized views. This solution worked perfectly, but some scenarios required specific configurations to make sure that there was no data loss or corruption.
Scenarios tested
Schemas of the same name in both the RDS for Oracle and RDS for PostgreSQL databases were created. For testing purposes, 13 base tables and 3 materialized views were created within the RDS for Oracle database. In the RDS for PostgreSQL database, three tables were created, which were based on the three materialized views created in the Oracle database.
Of the materialized views, one is a simple materialized view created with a primary key and several outer-joined tables. The other two are identically created complex materialized views, one of which has a deferred primary key and the other has a deferred unique key based on the same Not Null
column.
The issue
When you create an AWS DMS task against a base table, or materialized view, that has a primary key, AWS DMS chooses to use primary key supplemental logging, which is what ultimately causes the issue when you’re using a deferred primary key. This is because there is a short period of time during which multiple records with the same primary key exist, because of the way deferred primary keys work with complex materialized views.
With a complex materialized view, an update is translated into an update and an insert followed by a delete. AWS DMS faithfully replicates this behavior in the destination database (RDS for PostgreSQL in this case), but because it’s using a primary key for record identification, the updated and inserted record are both deleted by the final delete, resulting in data loss.
Therefore, to avoid this issue, if you create a complex materialized view, don’t use a deferred primary key and enable all column supplemental logging.
This means that when the updated materialized view record is transferred to the destination database by AWS DMS, it’s as an update followed by an insert and then followed by a delete. The delete only removes the original record, leaving the modified record behind as expected and required.
Solution test case
The following diagram illustrates the architecture for our solution test case.
Create the environment with AWS CloudFormation
Before you can run the test case to reproduce the materialized views solution, you need to create an environment for it within your AWS account. The following AWS CloudFormation template, Oracle-PG-MV-blog.json, creates an Amazon Elastic Compute Cloud (Amazon EC2) instance used as a bastion host, and RDS for Oracle and RDS for PostgreSQL databases. This template also creates an AWS DMS instance and copies the MaterializedViewTestCase.zip
folder to the bastion host.
Deploy the CloudFormation stack in your account. For instructions, refer to Get started. On average, it takes around 15 minutes to deploy the CloudFormation template.
The MaterialisedViewTestCase
folder has all the commands shown in this post, which are issued from the bastion host. To connect to the bastion host, you need to obtain the public IP address. This is provided as an output parameter in the CloudFormation template, or you can locate it on the Amazon EC2 console. After you connect to the bastion host, make sure that you can log in to both the Oracle and PostgreSQL databases. The clients for the Oracle and PostgreSQL databases are already installed by the CloudFormation template. You also need to review the Oracle tnsnames.ora
file.
The /home/oracle/amazon-oracle-postgres-mv-blog/MaterializedViewTestCase
directory contains the following folders and files:
- Oracle – The folder that stores scripts to run on the Oracle database (create user, tables, and so on)
- Postgres – The folder that stores PostgreSQL scripts (create user, tables, and so on)
- DMS – The folder that stores JSON files to create the AWS DMS task with the correct mappings and settings
- RunOrcltestsLocal.sh – The control script to run the test case
- RunOrcltestCleanup.sh – The control script to remove created resources during the test
Run the test case automated script
To reproduce the test case, run the runOrclTestsLocal.sh
command script with the -h
flag.
A help message appears with explanation about the input parameters. Another way to run this script is to modify the constants in the script so that you don’t have to provide parameters at runtime. Make sure you specify the correct hardcoded values, because it can potentially cause an issue if it’s run on the wrong environment.
The script first creates Oracle and PostgreSQL AWS DMS endpoints and an AWS DMS replication task to copy data from Oracle to PostgreSQL.
Then it creates a new schema in both databases. On the Oracle side, it creates 13 base tables with associated sequences and primary keys with three materialized views based on those tables. On the PostgreSQL side, it creates three new tables to accept the data from the Oracle materialized views.
Next, it enables supplemental logging on the Oracle database. Make sure that the ARCHIVELOG MODE is enabled on Amazon RDS for Oracle. Enable automatic backup on the Oracle database, which enables the archive log mode.
The next step is to start the AWS DMS replication task, which populates the PostgreSQL tables. The following screenshot shows that data is replicated from Oracle materialized views to PostgreSQL tables. The queries are all run against a nominated row that we use to demonstrate the problem and solution. The value in the t10_order_date
column is updated in all cases.
The next step updates the value in the order_date
column in the t10 base table in the Oracle source database to 01‑Jan‑2021
and refreshes the materialized views.
AWS DMS replicates the effect of this to PostgreSQL. Both the simple materialized view and the materialized view based on the unique key have updates as expected, but the materialized view based on the primary key failed to update with the following error: ERROR: duplicate key value violates unique constraint
. In the first run, this is caused by the primary key on the table in PostgreSQL, because that is the way AWS DMS creates the table, if you don’t create it yourself.
The next step is to replace the primary key with an index on the PostgreSQL table in the destination database and perform a full refresh of the materialized view in the source Oracle database to restore the data before rerunning the test, this time setting the value in the t10_order_date
column to 02‑Jan‑2021
.
In this test, again the simple materialized view and the materialized view based on the unique key have refreshed as expected, but the materialized view based on the primary key is again blank. The only difference this time is that no errors were recorded in the AWS DMS error table.
The scripts we provided also include a debug option. This option creates a debug table in PostgreSQL and associated triggers on the tables used to receive the complex materialized view data. This table contains the data in which the DML activities are applied to the PostgreSQL tables by AWS DMS. We used this script to work out the root cause of the issue, and turned on enhanced logging in the PostgreSQL database, which allowed us to view all DML activities taking place along with the values used.
This use case highlighted a specific situation with complex materialized views where deferred primary keys were used. After we removed the deferred primary keys and enabled all column supplemental logging, AWS DMS successfully replicated materialized views with fast refresh to the destination database.
Clean up
To avoid incurring ongoing charges, delete the resources for this solution. You can do this by removing the CloudFormation stack from your account.
Conclusion
In this post, we discussed the architecture AEVI used to move an Oracle database with materialized views fast refresh to an Aurora PostgreSQL database to optimize their license costs.
We walked you through the process of replicating Oracle materialized views to an Aurora PostgreSQL table using AWS DMS and potential specific configurations that help prevent data loss or corruption. This solution helped AEVI accelerate their database migration journey and take advantage of AWS services.
If you have any questions or comments about this post, please use the comments section.
About the Authors
Nataliya Birch is a Database Migration Solutions Architect working on the DMA team at AWS. She has over 15 years of experience in relational and non-relational databases. She helps AWS customers with their cloud database migration strategy and migration journey.
Mike Revitt is a Database Migration Specialist working on the Database Freedom team at AWS, with over 30 years of experience in database and application migration, modernization, design, development, and integration.