AWS Database Blog
Migrate Oracle collections to Amazon Aurora PostgreSQL-Compatible Edition or Amazon RDS for PostgreSQL
Migrating from Oracle to PostgreSQL can involve several steps to ensure a smooth transition. A notable example is an Oracle collection, which is a feature that Oracle database developers commonly use. PostgreSQL does not directly support Oracle collections, and there currently isn’t an extension that adds this capability. There are several decisions you have to make when migrating Oracle collections to PostgreSQL to successfully make the transition.
Oracle database developers use collections in many scenarios, such as the following:
- Caching static values to reduce the number of database calls, consequently enhancing speed and efficiency
- Repeatedly accessing data within a block or subprogram, often involving conditional modification and storage for future retrieval within the same transaction
- Utilizing collections as function or procedure parameters or return types to pass multiple values to other calling subprograms
- Passing values to an application where the application layer defines a list, array, or map to process values
In this post, we discuss Oracle simple collections, composite collections, and multi-dimensional collections, and how you can migrate these Oracle features to Amazon Aurora PostgreSQL-Compatible Edition or Amazon Relational Database Service (Amazon RDS) for PostgreSQL using PostgreSQL’s native custom types and arrays while following recommended best practices.
Overview of representing Oracle collections in PostgreSQL
PostgreSQL database provides arrays, which you can use as an option for porting Oracle collections. Each array element is a data type similar to that of the collection. You can create an array for any built-in or user-defined data type.
Declaring an array variable in PostgreSQL is done using the []
(array) operator. Here’s how you can declare arrays and use them in PostgreSQL:
- Simple array creation – You can create a simple array in PostgreSQL using the array
[]
operator of any predefined data type. For example, you can create an array of varchar(20) and assign values using following SQL statements:
- Composite array creation – You can create a composite array in PostgreSQL by first creating a user-defined type and then using the type to create an array. The following statements create a custom type named
person_type
, define an array variable of the custom type created, and assign values to array elements:
You can also use a domain in PostgreSQL for defining array variables. You create a domain when you intend to utilize the same existing structure. For example, you can create a domain for person_type[]
and use it as is, without needing to declare arrays with the operator []
:
Now that you have learned about arrays in PostgreSQL, let’s dive deep into the use cases and challenges while migrating collections from an Oracle database to PostgreSQL.
Prerequisites
To get started with the examples explained in this post, you must have the following prerequisites:
- An active AWS account
- A source Oracle database (on premises or using Amazon RDS for Oracle)
- A target RDS for PostgreSQL or Aurora PostgreSQL database
- A database user with the following privileges on the target database (provide your own schema and user):
Create the following table and insert data using the following script in order to try out the examples described in this post. Run the corresponding statements in Oracle and PostgreSQL databases:
For Oracle, use the following code:
For PostgreSQL, use the following code:
Migrate a simple collection using BULK COLLECT to PostgreSQL
Oracle PL/SQL developers use the BULK COLLECT feature in SELECT statements that retrieve multiple rows with a single fetch, improving the speed of data retrieval. BULK COLLECT is currently not available on PostgreSQL databases. If you convert BULK COLLECT with a traditional row-by-row processing approach, then efficiency might become an issue. Therefore, it’s better to use SQL statements rather than looping each row. Moreover, you may learn more about how to migrate BULK COLLECT in PostgreSQL in this post.
The PostgreSQL database offers the array function array_agg, which enables you to achieve a processing approach similar to Oracle. To illustrate, consider a basic collection scenario involving BULK COLLECT. We can then adjust the SQL query in PostgreSQL to retrieve all rows within a single fetch operation.
The following PL/SQL program fetches student marks data from the Student_Marks
table and stores them in memory as an OUT parameter, which can be passed to other calling routines and applications for further processing. This following Oracle example demonstrates how BULK COLLECT is used to fetch all rows in a single database call.
Let’s run the package procedure through an anonymous block and check the result:
We get the following output:
Let’s now convert this Oracle package function to an equivalent PL/pgSQL code. There are a few things to keep in mind when migrating such code to PostgreSQL:
- PostgreSQL does not support Oracle’s PACKAGE syntax. For this post, we used PostgreSQL schemas to help convert Oracle packages.
- All functions or procedures are standalone in nature that you create in a schema in PostgreSQL.
- It’s not recommended to create types or domains in a procedure or function because they’re created as persistent objects within the database and if you run the PL/pgSQL code a second time, it reports the error message “Object already exists.”
- You should create types and domains in PostgreSQL for collection types defined in the Oracle package specification or body.
- You should declare an array variable as a domain and use the array_agg function to retrieve all rows as a single unit to populate the array.
With this understanding, let’s convert the Oracle code to PostgreSQL.
Let’s run the PostgreSQL function through an anonymous block and check the result:
We get the following output:
Things to note:
- We use the
array_agg
function in PostgreSQL as an alternate to BULK COLLECT in Oracle. - You should plan well if you need to create a domain for the type defined in Oracle. If the type is a global object, you must create a domain or type in PostgreSQL.
Let’s now explore another scenario with composite collections, and understand how to handle them in PostgreSQL.
Migrate a composite collection to PostgreSQL
Oracle Database allows collections to be composite, where the elements of the collection can contain more than one column. Because all elements of a collection always have the same data type, this type of collection always refers to a collection created from a user-defined data type.
The following PL/SQL program declares a record type named R_MarkGradeRec
, comprised of four attributes. An array named MarkGradeArray
of type R_MarkGradeRec
is created that can store student marks. The procedure get_grades
retrieves and populates an output array (IO_MarkGradeArray
) with grade-related information for a specified student ID.
We use the following code in Oracle:
Let’s run the script that retrieves and displays grade information for a specific student (identified by ID 10102) using the custom array type MarkGradeArray
. It prints the SubjectID
and TotalMarks
for each grade entry.
We get the following output:
You can migrate the aforementioned Oracle code to PostgreSQL by completing the following steps:
- Create a user-defined type with the same structure as the Oracle user-defined type.
- Create a domain in PostgreSQL using the created type.
- Declare an array variable and modify the cursor SELECT statement using
array_agg
and the row operator.
We use the following code in PostgreSQL:
Let’s run the script:
We get the following output:
Migrate a nested composite collection to PostgreSQL
Both the Oracle and PostgreSQL database support nested types, but let’s see the operational differences with respect to collections or arrays in both databases.
You can include a record as a field within another record. This is called a nested record. The record that contains the nested record as a field is called the enclosing record. The following code illustrates how you can port nested composite collections to PostgreSQL and carry out the following actions:
- Declare two custom types,
school_info
andstudent_info
:school_info
is a record type that represents an address and all attributes are of built-in types.student_info
is a nested record type because it contains one attribute of theschool_info
record type for student’s school details.
- The
students_address_info
procedure populates theIO_student_info_array
with student information, including IDs, names, and school details.
We use the following code in Oracle:
Let’s run the code:
We get the following output:
The following PostgreSQL code snippet is the equivalent code for the preceding nested composite collection:
Let’s run the code:
We get the following output:
Things to note:
- When values are enclosed in parentheses, PostgreSQL understands them as a single unit and treats them as a record type and implicitly type casts to the proper type. In this case, the ROW constructor is optional.
- The current version of PostgreSQL doesn’t allow multi-level reference even with the record type, which means
v_rec.school_det.city := 'HYD CITY, DC'
is not allowed. Therefore, the correct method is modifying the record type, for examplev_rec.school_det := ROW('F 30, ABC Apartments', 'XYZ Street', 'HYD CITY, DC');
.
Migrate a multi-dimensional collection
You can create a collection whose elements are also collections. For example, you can create a varray of varrays, a nested table of varrays, and so on. PostgreSQL also allows you to create arrays of arrays. This is referred as a multi-dimensional collection or multi-dimensional array.
Let’s take an example of a two-dimensional array, which is the simplest form of a multi-dimensional array. The following anonymous block declares a two-dimensional array CLASS_STUD_JOINDT_ARRAY
, populates the arrays, and provides the array access and modification mechanism.
We use the following code in Oracle:
We get the following output:
To migrate the Oracle multi-dimensional collection code to PostgreSQL’s PL/pgSQL, we create equivalent arrays as domain types. PostgreSQL uses a similar loop structure, but the syntax for accessing and modifying arrays is slightly different.
We use the following code in PostgreSQL:
We get the following output:
Things to note:
- You can fetch multi-dimensional element values in PostgreSQL similar to Oracle. For example,
V_CLASS_STUD_JOINDT_ARRAY(i)(j)
can be converted toV_CLASS_STUD_JOINDT_ARRAY[i][j]
in PostgreSQL when fetching or accessing values. - However, you can’t assign values directly to elements of a multi-dimensional array in PostgreSQL as you do in Oracle. In PostgreSQL, you need another temporary record variable to hold values and assign back the array element.
The following SQL for multi-dimensional collection assignment works fine in Oracle PL/SQL:
The following multi-dimensional array assignment statement throws an error in PL/pgSQL because it’s not possible to write to a multi-dimensional array element directly in PostgreSQL:
In PostgreSQL, you have to break the multi-dimensional array to a single dimensional array and assign values:
Multi-dimensional structures can also be extended when table/type columns become arrays. In this case, each record can have multiple values of each with different data types, which are grouped as a single unit and make the multi-dimensional elements.
Developers often come across such scenarios when migrating Oracle PL/SQL code. Operating with a correct reasonable data structure in PostgreSQL becomes troublesome because the syntax, Oracle functions, and operators don’t work similarly in PostgreSQL.
The following code is an extension of our previous example. The collection discussed in this example is multi-dimensional as well as composite consisting of more than one attribute.
We use the following code in Oracle:
We get the following output:
The following code block in PostgreSQL creates the types and domains required to implement the multi-dimensional composite array structure:
The following code block is equivalent to an Oracle multi-dimensional composite collection and describes how to operate on a multi-dimensional composite array in a PostgreSQL database:
We get the following output:
Clean up
When you’re finished experimenting with the scenarios described in this post, you can clean up your PostgreSQL database by using following code to delete all the objects created in this example.
Conclusion
In this post, we showed how to use PostgreSQL-native features to efficiently migrate Oracle simple collections, composite collections, and multi-dimensional collections to PostgreSQL. It’s important to note that the specifics of the migration process will depend on the complexity of your Oracle collections, the size of your dataset, and the intricacies of your application. We shared instructions for migrating Oracle collections to Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL and also provided code examples, SQL scripts, and best practices that you can use while migrating Oracle collections.
If you have any feedback, comments, or questions, share them in the comments section.
About the Authors
Sashikanta Pattanayak works as a Lead Consultant with the Professional Services team at AWS, specializing in database modernization, optimization, and migration strategies. His role involves collaborating with clients to architect scalable, resilient, and secure solutions within the AWS cloud environment. He specializes in homogeneous and heterogeneous database migrations.
Vinay Paladi is a Lead Database Consultant with the Professional Services team at Amazon Web Services. He works as a database migration specialist, helping and enabling customers to build highly available, cost-effective database solutions and migrate their commercial engines to the AWS Cloud. He is passionate about building innovative solutions to accelerate the database journey to cloud.
Anuradha Chintha is a Lead Consultant with Amazon Web Services. She works with customers to build scalable, highly available, and secure solutions in the AWS Cloud. Her focus area is homogeneous and heterogeneous database migrations.