AWS Database Blog

Validating database objects after migration using AWS SCT and AWS DMS

Database migration can be a complicated task. It presents all the challenges of changing your software platform: understanding source data complexity, checking for data loss, thoroughly testing existing functionality, comparing application performance, and validating your data.

AWS offers several tools and services that provide a pre-migration checklist and migration assessments. You can use the AWS Schema Conversion Tool (AWS SCT) to convert your existing database schema from one database engine to another. AWS Database Migration Service (AWS DMS) makes it easy to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores. You can use AWS DMS to migrate your data into the AWS Cloud, between on-premises instances (through an AWS Cloud setup), or between combinations of cloud and on-premises setups.

In addition, AWS offers a wide range of documentation to guide you through your database migration. For more information, see Migrating an Oracle Database to PostgreSQL.

AWS SCT helps convert your schema objects and builds a report showing the percentage of Oracle code that it converted to PostgreSQL and how much code requires manual conversion effort. During database object migration, you always risk missing objects at the target database, inadvertently creating new ones, or intentionally ignoring an entire source object. Validation is the process of demonstrating to your customers that everything intended for migration was migrated successfully.

This post outlines how to validate objects between a source Oracle database and a target PostgreSQL database after completing database object migration and code conversion.

Validating objects

The first question is what to validate. To answer it, you must know the different types of Oracle database objects and their equivalent PostgreSQL database object types.

The following table shows Oracle (source) database objects and the corresponding PostgreSQL (target) object types. To confirm that your database conversion succeeded, you must thoroughly validate these objects.

Oracle Objects  Migrated to PostgreSQL as
TABLE TABLE
VIEW VIEW
MATERIALIZED VIEW MATERIALIZED VIEW
SEQUENCE SEQUENCE
SYNONYM Not available in PostgreSQL. Can be partially emulated through views; SET search_path can also serve as a workaround (see the sketch after this table).
TYPE DOMAIN / TYPE
FUNCTION FUNCTION
PROCEDURE Not available in PostgreSQL. Achieved through PostgreSQL Functions.
PACKAGE Not available in PostgreSQL. Achieved through PostgreSQL Schemas to organize functions into groups.
TRIGGER TRIGGER
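
Where an application relied on an Oracle synonym to reference another schema's objects without qualifying them, the PostgreSQL search path can often substitute. The following is a minimal sketch; the schema and table names are hypothetical:

-- Oracle: a synonym lets the application reference sales.orders without a prefix
-- CREATE SYNONYM orders FOR sales.orders;

-- PostgreSQL: adding the schema to the search path achieves a similar effect
SET search_path TO app_user, sales, public;

-- Unqualified references now resolve through the search path
SELECT count(1) FROM orders;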

How to perform schema validation

Database migration involves many tasks, tools, and processes. The size and type of Oracle database migration you want to perform largely determines the tools you should use. For example, a heterogeneous migration, where you are migrating from an Oracle database to a different database engine on AWS, is best accomplished using AWS DMS. A homogeneous migration, where you are migrating from an Oracle database to an Oracle database on AWS, is best accomplished using native Oracle tools.

The following task list shows when to conduct schema validation during your migration process.

  • Task 1: Configure Your Oracle Source Database
  • Task 2: Configure Your PostgreSQL Target Database
  • Task 3: Use the AWS SCT to Convert the Oracle Schema objects to PostgreSQL
  • Task 4: Manually convert the Oracle schema objects that AWS SCT couldn’t convert
  • Task 5: Use AWS DMS to migrate data from Oracle to PostgreSQL
  • Task 6: Perform the schema object validation

To validate the schema conversion, compare the objects found in the source Oracle database and the target PostgreSQL database using any query editor.

The standard validation method compares object counts in the source database against the counts in the target database. You can perform count verification for any schema object, but count validation alone does not always satisfy end users. Users often look for object definition-level verification. This is why you must write custom queries to retrieve the data definition language (DDL) from each database and compare it.
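
Both engines expose built-in functions for retrieving object definitions that you can feed into such a comparison. The following is a minimal sketch; the object names are hypothetical placeholders:

Oracle

SELECT DBMS_METADATA.GET_DDL('TABLE', 'YOUR_TABLE', upper('your_schema')) AS table_ddl
  FROM dual;

PostgreSQL

-- Views and functions have built-in definition functions; table DDL does not
SELECT pg_get_viewdef('your_schema.your_view'::regclass, true);
SELECT pg_get_functiondef('your_schema.your_function()'::regprocedure);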

Schema objects validation queries

Oracle uses the LONG data type to store large character data, with a maximum size of up to 2 GB. The LONG data type appears in many Oracle metadata tables and views, including a few of the metadata tables used in the validation queries explained later in this topic. Unlike other data types, the LONG data type doesn’t behave the same way as ordinary character types and is subject to several restrictions.

The limitations to the LONG type include the following:

  • LONG columns can’t appear in WHERE clauses.
  • You can’t specify LONG data in regular expressions.
  • A stored function can’t return a LONG value.
  • You can declare a variable or argument of a PL/SQL program unit using the LONG data type. However, you can’t then call the program unit from SQL.

The Oracle validation queries described in this topic use the DBMS_XMLGEN package to handle the Oracle LONG data type differently and bypass some of these restrictions. The DBMS_XMLGEN package takes an arbitrary SQL query as input, converts the entire record set into XML format, and returns the result as a CLOB, which is much easier to process and manipulate.
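
As a minimal illustration of the pattern, the following query wraps a SELECT that reads a LONG column (ALL_VIEWS.TEXT) in DBMS_XMLGEN, so the record set comes back as XML:

SELECT XMLTYPE(
         DBMS_XMLGEN.GETXML('SELECT view_name, text FROM all_views WHERE owner = upper(''your_schema'')')
       ) AS xml_result
  FROM dual;

Note that GETXML can return NULL when the inner query produces no rows, which makes the XMLTYPE conversion fail. The index validation query later in this post guards against this by appending a UNION ALL of NULL rows.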

Step 1: Validate Oracle packages

PostgreSQL schemas let you group related functions in the same way that a package groups functions in Oracle. Migrating Oracle packages as PostgreSQL schemas helps avoid unnecessary application changes, because applications can access the individual PostgreSQL functions in the same way they accessed the Oracle package functions. For example (illustrated in the sketch after this list):

  • Source Oracle database before migration: The application accesses Oracle package functions as PackageName.FunctionName.
  • Target PostgreSQL database after migration: The application can access PostgreSQL functions as SchemaName.FunctionName.
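
The following sketch uses a hypothetical package and function name to show that the call syntax keeps the same shape:

-- Oracle: calling a function that belongs to a package
SELECT billing_pkg.get_invoice_total(1001) FROM dual;

-- PostgreSQL: the package became a schema of the same name
SELECT billing_pkg.get_invoice_total(1001);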

Assume that you have one user in Oracle that contains 10 packages and other objects. You migrate all 10 packages as PostgreSQL schemas, and the Oracle user itself becomes another schema in PostgreSQL that stores all the other objects. So, after migration, PostgreSQL has 11 schemas (10 packages = 10 schemas; 1 user = 1 schema).

Use the following query to validate whether all the Oracle packages migrated as schemas in PostgreSQL:

Oracle

SELECT object_name AS package_name
  FROM all_objects 
 WHERE object_type = 'PACKAGE'
   AND owner = upper('your_schema')
ORDER BY upper(object_name);

PostgreSQL

SELECT upper(schema_name) AS package_name
  FROM information_schema.schemata
 WHERE schema_name not in('pg_catalog','information_schema', lower('your_schema'))
ORDER BY upper(schema_name);
  • Put your Oracle user name in place of 'your_schema' in the example queries.
  • If you excluded any object from migration at the source or introduced a new one at the target, exclude those objects from the counts by adding a WHERE clause filter in the preceding and subsequent queries, as shown in the sketch after this list.
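
The following is a minimal sketch of such a filter on the Oracle side, with hypothetical table names to exclude; apply the analogous filter to the PostgreSQL query:

SELECT count(1) AS tables_cnt
  FROM all_tables
 WHERE owner = upper('your_schema')
   AND table_name NOT IN ('STAGING_TEMP', 'MIG_AUDIT_LOG'); -- hypothetical exclusions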

Step 2: Validate tables

AWS SCT converts Oracle tables with the proper structure and provides the option to either deploy the result to a target PostgreSQL database or save the converted code as .sql files. Verify that all tables were migrated to the target PostgreSQL database by checking the counts on both the source and target databases. The following script performs count verification for all tables:

Oracle

SELECT count(1) AS tables_cnt
  FROM all_tables
 WHERE owner = upper('your_schema');

PostgreSQL

SELECT count(1) AS tables_cnt
  FROM pg_tables
 WHERE schemaname = lower('your_schema');

Step 3: Validate views

AWS SCT helps migrate Oracle view code to views in PostgreSQL. Validate the count of views in the source Oracle and target PostgreSQL databases after migration using the following script:

Oracle

SELECT count(1) AS views_cnt
  FROM all_views
 WHERE owner = upper('your_schema');

PostgreSQL

SELECT count(1) AS views_cnt
  FROM pg_views
 WHERE schemaname = lower('your_schema');

Step 4: Validate sequences

The sequence counts must match at both the source and target to be confident that the migration didn’t miss any sequences. Validate the sequence count after migration into PostgreSQL using the following script:

Oracle

SELECT count(1) AS sequence_cnt
  FROM all_sequences
 WHERE sequence_owner = upper('your_schema');

PostgreSQL

SELECT count(1) AS sequence_cnt
  FROM information_schema.sequences
 WHERE sequence_schema = lower('your_schema');

Step 5: Validate triggers

You can perform count validation for any database object. But because database triggers are distinctly important, it is essential to verify all attributes of each trigger. Validate the triggers after migration into PostgreSQL using the following script:

Oracle

SELECT 
       owner AS schema_name, 
       trigger_name, 
       table_name, 
       triggering_event, 
       trigger_type
  FROM ALL_TRIGGERS
 WHERE owner = upper('your_schema')
ORDER BY trigger_name;

PostgreSQL

SELECT
       upper(trigger_schema)     AS schema_name,
       upper(trigger_name)       AS trigger_name,
       upper(event_object_table) AS table_name,
       string_agg(upper(event_manipulation), ' OR ' ORDER BY CASE WHEN  event_manipulation = 'INSERT' THEN 1 WHEN event_manipulation = 'UPDATE' THEN 2 ELSE 3 END) AS triggering_event,
       upper(action_timing) || ' ' || CASE WHEN action_orientation = 'ROW' THEN 'EACH ROW' ELSE action_orientation END AS trigger_type
  FROM information_schema.triggers
 WHERE trigger_schema = lower('your_schema')
GROUP BY trigger_schema, trigger_name, event_object_table, action_timing, action_orientation
ORDER BY upper(trigger_name);

Step 6: Validate primary keys

Constraints help to maintain data integrity in a relational database. During database migration, it is crucial to validate constraints carefully.

You can validate primary keys by checking whether you created a similar primary key constraint on the target table, which means the columns and column order must be the same on both the source Oracle and target PostgreSQL databases. The validation explained here does not check the primary key columns and column order. Primary keys create a UNIQUE index by default in both Oracle and PostgreSQL, so during index validation you can verify the primary key columns and column order. Validate the primary keys after migration into PostgreSQL using the following script:

Oracle

SELECT owner           AS schema_name, 
       table_name,
       constraint_name AS object_name,
       'PRIMARY KEY'   AS object_type
  FROM all_constraints
 WHERE owner = upper('your_schema') 
   AND constraint_type = 'P';

PostgreSQL

SELECT upper(n.nspname)        AS schema_name, 
       trim(upper(split_part(conrelid::regclass::varchar, '.', 2)), '"') AS table_name,
       upper(conname::varchar) AS object_name,
       'PRIMARY KEY'           AS object_type
  FROM pg_constraint c
       JOIN pg_namespace n ON n.oid = c.connamespace
 WHERE contype in ('p') 
   AND n.nspname = lower('your_schema');

Step 7: Validate indexes

Indexes speed up database searches, and a bad index migration or a missed index creation can severely impact target database queries. Comparing indexes is not simple because the order of columns in each index matters. If the column order changes, the index behaves differently and is effectively a completely new index.
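
For example, the following two indexes cover the same columns but are not interchangeable; the table and index names are hypothetical:

-- Serves lookups that filter on customer_id first
CREATE INDEX idx_orders_cust_date ON orders (customer_id, order_date);

-- Serves lookups that filter on order_date first
CREATE INDEX idx_orders_date_cust ON orders (order_date, customer_id);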

The primary key in a relational database creates an index internally. While validating indexes, you must include both the indexes created implicitly for primary keys and the indexes created explicitly through CREATE INDEX syntax. The following query validates the index type and column order for all indexes on all tables:

Oracle

WITH cols AS (
    SELECT idx.owner AS schema_name, idx.table_name, idx.index_name, cols.column_name, cols.column_position, idx.uniqueness, decode(cols.descend, 'ASC', '', ' '||cols.descend) descend
      FROM ALL_INDEXES idx, ALL_IND_COLUMNS cols
     WHERE idx.owner = cols.index_owner AND idx.table_name = cols.table_name AND idx.index_name = cols.index_name
       AND idx.owner = upper('your_schema')
),
expr AS (
    SELECT extractValue(xs.object_value, '/ROW/TABLE_NAME')         AS table_name
    ,      extractValue(xs.object_value, '/ROW/INDEX_NAME')         AS index_name
    ,      extractValue(xs.object_value, '/ROW/COLUMN_EXPRESSION')  AS column_expression
    ,      extractValue(xs.object_value, '/ROW/COLUMN_POSITION')    AS column_position
    FROM (
         SELECT XMLTYPE(
              DBMS_XMLGEN.GETXML( 'SELECT table_name, index_name, column_expression, column_position FROM ALL_IND_EXPRESSIONS WHERE index_owner = upper(''your_schema'') '
                                ||' union all SELECT null, null, null, null FROM dual '
              )
           ) AS xml FROM DUAL
       ) x
    , TABLE(XMLSEQUENCE(EXTRACT(x.xml, '/ROWSET/ROW'))) xs
)
SELECT 
       cols.schema_name, 
       cols.table_name, 
       cols.index_name    AS object_name, 
       'INDEX'            AS object_type,
       replace('CREATE'|| decode(cols.uniqueness, 'UNIQUE', ' '||cols.uniqueness) || ' INDEX ' || cols.index_name || ' ON your_schema.' || cols.table_name || ' USING BTREE (' ||
            listagg(CASE WHEN cols.column_name LIKE 'SYS_N%' THEN expr.column_expression || cols.descend ELSE cols.column_name || cols.descend END, ', ') within group(order by cols.column_position) || ')', '"', '') AS condition_column
FROM cols
     LEFT OUTER JOIN expr ON cols.table_name = expr.table_name
           AND cols.index_name      = expr.index_name
           AND cols.column_position = expr.column_position
GROUP BY cols.schema_name, cols.table_name, cols.index_name, cols.uniqueness;

PostgreSQL

SELECT upper(schemaname) AS schema_name, 
       upper(tablename)  AS table_name, 
       upper(indexname)  AS object_name, 
       'INDEX'           AS object_type, 
       upper(replace(indexdef, '"', ''))   AS condition_column
  FROM pg_indexes
 WHERE schemaname = lower('your_schema');

Step 8: Validate check constraints

When you perform CHECK constraint validation, make sure that you include both constraints created with an explicit check condition and columns declared NOT NULL during table creation. You can validate check constraints by verifying the specific condition on the respective table column. Validate the check constraints after migration into PostgreSQL using the following script:

Oracle

WITH ref AS (
 SELECT   extractValue(xs.object_value, '/ROW/OWNER')            AS schema_name
   ,      extractValue(xs.object_value, '/ROW/TABLE_NAME')       AS table_name
   ,      extractValue(xs.object_value, '/ROW/CONSTRAINT_NAME')  AS object_name
   ,      extractValue(xs.object_value, '/ROW/SEARCH_CONDITION') AS condition_column
   ,      extractValue(xs.object_value, '/ROW/COLUMN_NAME')      AS column_name
  FROM (
         SELECT XMLTYPE(
              DBMS_XMLGEN.GETXML('SELECT cons.owner, cons.table_name, cons.constraint_name, cons.search_condition, cols.column_name
                           FROM ALL_CONSTRAINTS cons, ALL_CONS_COLUMNS cols
                           WHERE cons.owner =  cols.owner AND cons.table_name =  cols.table_name AND cons.constraint_name = cols.constraint_name
                             AND cons.owner = upper(''your_schema'') AND cons.constraint_type = ''C'' '
                             )
           ) AS xml FROM DUAL
       ) x
   , TABLE(XMLSEQUENCE(EXTRACT(x.xml, '/ROWSET/ROW'))) xs
)
SELECT schema_name||'.'||table_name             AS table_name,
       object_name,
       constraint_type,
       trim(upper(replace(check_condition, '"', ''))) AS check_condition
  FROM (
         SELECT 
                schema_name, 
                table_name, 
                object_name, 
                'CHECK'                AS constraint_type, 
                condition_column       AS check_condition
         FROM ref
         UNION
         SELECT 
                owner                  AS schema_name, 
                table_name, 
                'SYS_C0000'||column_id AS object_name, 
                'CHECK'                AS constraint_type, 
                '"'||column_name||'" IS NOT NULL' AS check_condition
         FROM all_tab_columns tcols where owner = upper('your_schema') and nullable = 'N'
         AND NOT EXISTS ( SELECT 1 FROM ref WHERE ref.table_name = tcols.table_name
                                    AND ref.schema_name = tcols.owner
                                    AND ref.column_name = tcols.column_name
                                    AND ref.condition_column = '"'||tcols.column_name||'" IS NOT NULL')
        /* ALL_TAB_COLUMNS contains Tables and Views. Add below to exclude Views NOT NULL constraints */
        AND NOT EXISTS ( SELECT 1 FROM ALL_VIEWS vw WHERE vw.view_name = tcols.table_name
                                   AND vw.owner = tcols.owner
                       )
);

PostgreSQL

SELECT 
       table_name,
       object_name,
       constraint_type,
       upper(trim(replace(replace(replace(replace(check_condition, '"', ''), '(', ''), ')', ''), 'CHECK', ''))) check_condition
  FROM (
    SELECT 
           upper(n.nspname)||'.'||upper(c.relname) AS table_name, 
           'SYS_C0000'||attnum                     AS object_name, 
           'CHECK'                                 AS constraint_type, 
           a.attname||' IS NOT NULL' AS check_condition
      FROM pg_attribute a, pg_class c, pg_namespace n
     WHERE a.attrelid = c.oid and c.relnamespace = n.oid AND n.nspname = lower('your_schema')
       AND attnotnull AND attstattarget <> 0
     UNION
    SELECT 
           upper(conrelid::regclass::varchar)     AS table_name, 
           upper(conname::varchar)                AS object_name, 
           'CHECK'                                AS constraint_type,
           pg_get_constraintdef(c.oid)            AS check_condition
      FROM pg_constraint c
           JOIN pg_namespace n ON n.oid = c.connamespace
     WHERE  contype in ('c')
       AND conrelid::regclass::varchar <> '-'
       AND n.nspname = lower('your_schema')
) a;

Step 9: Validate foreign keys

Foreign keys help maintain links between relational tables by referring to the primary or unique key of another table. PostgreSQL allows users to create foreign keys with the NOT VALID option. If the constraint is marked NOT VALID, the potentially lengthy initial check to verify that all rows in the table satisfy the constraint is skipped. The constraint will still be enforced against subsequent inserts or updates (that is, they will fail unless there is a matching row in the referenced table).

Migrate your foreign keys with the NOT VALID option in PostgreSQL if you are sure that the existing source data conforms to the foreign key rules and you don’t want to repeat the same checks in the target PostgreSQL database, as shown in the sketch below.
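
This is a minimal sketch of the pattern, with hypothetical table and constraint names:

-- Add the foreign key without scanning the existing rows
ALTER TABLE your_schema.orders
  ADD CONSTRAINT orders_customer_fk
  FOREIGN KEY (customer_id) REFERENCES your_schema.customers (customer_id)
  NOT VALID;

-- Optionally verify the existing rows later, when convenient
ALTER TABLE your_schema.orders VALIDATE CONSTRAINT orders_customer_fk;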

The following foreign key validation query takes both parent and child tables into consideration and assumes that you created foreign keys with the NOT VALID option in PostgreSQL during migration:

Oracle

SELECT 
       c.child_tab_owner           AS schema_name, 
       c.table_name, 
       c.constraint_name           AS object_name, 
       'FOREIGN KEY'               AS object_type,
       'FOREIGN KEY ('|| cc.fk_column || ') REFERENCES ' || p.parent_tab_owner || '.' || p.table_name || '('|| pc.ref_column ||') NOT VALID' AS condition_column
FROM ( SELECT owner child_tab_owner, table_name, constraint_name, r_constraint_name FROM ALL_CONSTRAINTS WHERE owner = upper('your_schema') AND constraint_type = 'R') c,
     ( SELECT owner parent_tab_owner, table_name, constraint_name FROM ALL_CONSTRAINTS WHERE owner = upper('your_schema') AND constraint_type IN('P', 'U') ) p,
     ( SELECT owner, table_name, constraint_name, listagg(column_name, ', ') WITHIN group(ORDER BY position) fk_column
         FROM ALL_CONS_COLUMNS WHERE owner = upper('your_schema') GROUP BY owner, table_name, constraint_name ) cc,
     ( SELECT owner, table_name, constraint_name, listagg(column_name, ', ') WITHIN group(ORDER BY position) ref_column
         FROM ALL_CONS_COLUMNS WHERE owner = upper('your_schema') GROUP BY owner, table_name, constraint_name ) pc
WHERE c.r_constraint_name = p.constraint_name
  AND c.table_name = cc.table_name AND c.constraint_name = cc.constraint_name AND c.child_tab_owner = cc.owner
  AND p.table_name = pc.table_name AND p.constraint_name = pc.constraint_name AND p.parent_tab_owner = pc.owner;

PostgreSQL

SELECT upper(n.nspname::text)     AS schema_name,
       trim(upper(split_part(conrelid::regclass::varchar, '.', 2)), '"') AS table_name,
       upper(conname::varchar)    AS object_name,
       'FOREIGN KEY'       AS object_type,
       CASE contype WHEN 'f' THEN upper(pg_get_constraintdef(c.oid)) END AS  condition_column
  FROM pg_constraint c
       JOIN pg_namespace n ON n.oid = c.connamespace
 WHERE contype in ('f')
   AND n.nspname::text = lower('your_schema');

Useful PostgreSQL catalog tables

The following PostgreSQL catalog tables and views help you build custom queries.

Source Oracle Metadata  Target PostgreSQL Metadata
all_tables  pg_tables, information_schema.tables
all_views  pg_views, information_schema.views
all_sequences  information_schema.sequences
all_triggers  information_schema.triggers
all_indexes  pg_indexes
all_constraints  pg_constraint (joined to pg_namespace)
all_procedures  information_schema.routines
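
This post doesn’t include an explicit step for standalone functions and procedures, but a count comparison follows the same pattern as the earlier steps. The following is a sketch only: if you migrated packages as schemas, the package members live in their own schemas on the target, so adjust the schema filters before comparing counts.

Oracle

SELECT count(1) AS routines_cnt
  FROM all_objects
 WHERE object_type IN ('FUNCTION', 'PROCEDURE')
   AND owner = upper('your_schema');

PostgreSQL

SELECT count(1) AS routines_cnt
  FROM information_schema.routines
 WHERE routine_schema = lower('your_schema');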

Conclusion

Validating database objects is essentially a measure of the migration's accuracy: it confirms that all objects migrated successfully with the proper attributes. Validating the various constraints helps you double-check target database integrity and keep bad data out of your database. Indexes speed up data retrieval operations, and validating indexes prevents existing queries from slowing down on the target database.

You can use the queries in this post as-is from any programming language to retrieve record sets from your Oracle and PostgreSQL databases. You can always write custom validation queries based on your requirements, but this post demonstrates one possible way to conduct post-migration object validation.

If you have any questions or suggestions about this post, feel free to leave a comment.

About the Author

Sashikanta Pattanayak is an Associate Consultant with Amazon Web Services. He works with customers to build scalable, highly available, and secure solutions in the AWS Cloud. His focus area is homogeneous and heterogeneous database migrations.