AWS Database Blog
Integrating Amazon RDS for Oracle with Amazon S3 using S3_integration
Amazon RDS for Oracle gives you the full benefits of a managed service solution. You can use the lift-and-shift approach to migrate your legacy Oracle database to Amazon RDS for Oracle and, as a result, reduce the need to refactor and change existing application components.
Data warehouse (DW) extraction is an integral part of most database environments. For on-premises databases, it is common practice to keep extracted files in a shared file system, such as a Network File System (NFS) mount, between the database host and the DW server.
Migrating from on-premises Oracle to Amazon RDS for Oracle when shared file systems are used for DW loads requires extra attention to transition the existing file transfer mechanisms smoothly. With that in mind, AWS enhanced Amazon RDS for Oracle in February 2019 with Amazon S3 integration. This option enables seamless integration of an Amazon RDS for Oracle database with Amazon Simple Storage Service (Amazon S3).
This post discusses Amazon S3 integration as an alternative to shared file systems between an application host and an Amazon RDS for Oracle host, and how you can use external tables to access data stored in Amazon S3 as an alternate solution. This post uses an Oracle database running on an Amazon EC2 instance in place of an on-premises source Oracle database.
Prerequisites
Make sure that Amazon RDS for Oracle has sufficient space to accommodate your database as well as additional files to demonstrate uploading and downloading files from Amazon S3.
This post uses an Amazon S3 bucket that the RDS instance can access through an IAM role.
Amazon S3 integration as an alternative to shared file systems
Many Oracle database applications use flat files to load data into the database. These files are generally stored in a shared file system volume that is mounted on both the application and Oracle database server.
Amazon S3 integration provides a secure way to use your S3 bucket and share files as an alternative to a shared file system, which you can access from both Amazon RDS for Oracle and other application hosts. Amazon S3 is a durable and reliable object storage service, and its lifecycle rules help you save costs by automating object transition from one storage class to another.
Solution overview
This post demonstrates loading a file from an application server into database landing tables. These landing tables are later transformed, and data is saved as an output flat file on a shared file system volume via an S3 bucket.
The solution contains the following steps:
- Configure Amazon S3 integration on the Amazon RDS for Oracle database.
- Configure an IAM role in the Amazon RDS for Oracle database.
- Load data into a table using SQL*Loader (sqlldr).
- Modify the data into the Oracle table.
- Create a text file on local Amazon RDS for Oracle storage.
- Verify that the text file has the correct data.
- Execute a script to upload the text file from local storage to an S3 bucket and download the file from the S3 bucket to EC2 instance file system.
- Verify that the file downloaded correctly.
The following diagram shows an on-premises solution design using a shared file system. Data is received from end users as an input file and saved in the shared file system. The data is then loaded into the database, transformed, saved as an output file in the shared file system, and finally sent back to the end users.
The following diagram shows an Amazon RDS for Oracle solution design using the Amazon S3 integration option. Data is received from the end user as an input file and saved in the shared file system on the application host. The data is then loaded into the RDS for Oracle database, transformed, and saved as an output file in the RDS for Oracle local file system. The output file is then copied from RDS local storage to the shared file system via the S3 bucket, and the data is sent back to the end users.
Configuring Amazon S3 integration on the Amazon RDS for Oracle database
To configure your Amazon S3 integration, complete the following steps:
- Open the Amazon RDS console.
- Choose Option groups.
- Choose Create group.
- For Name, enter a name for your group. This post enters the name S3-Integration-group.
- For Description, enter a brief description of the group. This post enters the description Integrate RDS-Oracle with S3 to download and upload files from S3 bucket.
- For Engine, choose the Oracle engine from the drop-down menu.
- For Major engine version, choose your database's Oracle version from the drop-down menu.
- Choose Create.
- On the RDS console, choose Option groups.
- Choose the group you made.
- Choose Add option.
- For Option, choose S3_INTEGRATION.
- For Version, choose 0.
- For Apply Immediately, select Yes.
- Choose Add Option.
- Verify that the option group has the correct information: check that the correct Oracle version is listed under Option Group Properties and that the S3_INTEGRATION option appears under Options.
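If you prefer scripting this setup, you can create and configure the same option group with the AWS CLI. The group name, engine name, and engine version below are illustrative; match them to your own instance:

```shell
# Create the option group (engine name and version must match your DB instance)
aws rds create-option-group \
    --option-group-name s3-integration-group \
    --engine-name oracle-ee \
    --major-engine-version 19 \
    --option-group-description "Integrate RDS-Oracle with S3"

# Add the S3_INTEGRATION option and apply it immediately
aws rds add-option-to-option-group \
    --option-group-name s3-integration-group \
    --options OptionName=S3_INTEGRATION \
    --apply-immediately
```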
Adding the option group to the database
- On the RDS dashboard, choose your database by selecting its DB identifier.
- Choose Modify.
- Scroll down to the Database options section and, for Option group, select the S3 integration option group you created.
- Scroll to the end of the page and choose Continue.
- Choose Modify DB Instance.
- When the modification completes and the database is available, choose the Configuration tab to confirm that the option group is listed in the instance configuration.
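The console steps above can also be performed with the AWS CLI; the instance identifier and option group name here are placeholders:

```shell
aws rds modify-db-instance \
    --db-instance-identifier mydbinstance \
    --option-group-name s3-integration-group \
    --apply-immediately
```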
Configuring an IAM role in the Amazon RDS for Oracle database
To configure your IAM role, complete the following steps:
- From the RDS console, choose your DB instance.
- Choose Manage IAM roles.
- For Add IAM roles to this instance, choose rds-s3-integration-role.
- For Feature, choose S3_INTEGRATION.
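Equivalently, you can attach the role from the AWS CLI. The instance identifier and account number below are placeholders; the role itself must already exist and carry a policy allowing s3:PutObject, s3:GetObject, and s3:ListBucket on your bucket:

```shell
aws rds add-role-to-db-instance \
    --db-instance-identifier mydbinstance \
    --role-arn arn:aws:iam::123456789012:role/rds-s3-integration-role \
    --feature-name S3_INTEGRATION
```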
Loading the input data into a table using SQL*Loader
Log in to your EC2 instance as a user with access to the Oracle client binaries, such as sqlplus and sqlldr. The following is a sample input file to load into the database.
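The original input file is not reproduced here, so the following is a hypothetical comma-delimited sample (object name, owner, object type, status) consistent with the columns this post updates later:

```
EMPLOYEES,HR,TABLE,VALID
DEPARTMENTS,HR,TABLE,VALID
EMP_PKG,HR,PACKAGE,VALID
EMP_IDX,HR,INDEX,VALID
```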
Log in to sqlplus and create the sample data load table with the following commands:
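A landing table matching a comma-delimited object listing might look like the following; the table and column names are this post's assumptions, not prescribed by the feature:

```sql
-- Hypothetical landing table for the sample input file
CREATE TABLE data_load (
  object_name VARCHAR2(128),
  owner       VARCHAR2(128),
  object_type VARCHAR2(30),
  status      VARCHAR2(10)
);
```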
Create a SQL*Loader control file that maps the input data file to the table:
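A minimal control file for a comma-delimited input file could look like this; the file, table, and column names are placeholders:

```
LOAD DATA
INFILE 'sample_data.dat'
APPEND
INTO TABLE data_load
FIELDS TERMINATED BY ','
(object_name, owner, object_type, status)
```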
Use the following sqlldr command to load the data into the database:
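A typical sqlldr invocation looks like the following; the user, connect string, and control file name are placeholders:

```shell
sqlldr userid=dbuser@ORCL control=data_load.ctl log=data_load.log
```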
Log in to SQL*Plus and confirm that the data loaded successfully into the table with the following SQL command:
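A quick check is a row count plus a sample of rows; the table name is a placeholder:

```sql
SELECT COUNT(*) FROM data_load;

SELECT object_name, owner, object_type, status
  FROM data_load
 WHERE ROWNUM <= 5;
```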
Update the data in the database table. Set the owner to SYSTEM and the status to INVALID with the following SQL*Plus command:
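Based on the update described above, the statement could look like the following; the table name is a placeholder:

```sql
UPDATE data_load
   SET owner  = 'SYSTEM',
       status = 'INVALID';

COMMIT;
```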
Creating a text file on local Amazon RDS for Oracle storage
Use the following SQL*Plus command to create a local directory on the RDS for Oracle instance:
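On Amazon RDS you cannot run CREATE DIRECTORY directly; the rdsadmin package provides the equivalent. The directory name here is an example:

```sql
BEGIN
  rdsadmin.rdsadmin_util.create_directory(p_directory_name => 'DATA_EXPORT_DIR');
END;
/
```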
Use the following SQL*Plus code to create the output text file on RDS for Oracle local storage using UTL_FILE.
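A minimal UTL_FILE sketch follows; the directory, file, and table names are placeholders:

```sql
DECLARE
  v_file UTL_FILE.FILE_TYPE;
BEGIN
  -- Open the output file for writing in the RDS local directory
  v_file := UTL_FILE.FOPEN('DATA_EXPORT_DIR', 'data_output.txt', 'w');

  -- Write each row as a comma-delimited line
  FOR rec IN (SELECT object_name, owner, object_type, status FROM data_load) LOOP
    UTL_FILE.PUT_LINE(v_file,
      rec.object_name || ',' || rec.owner || ',' ||
      rec.object_type || ',' || rec.status);
  END LOOP;

  UTL_FILE.FCLOSE(v_file);
END;
/
```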
Verifying that the text file is correct
Verify that the text file on RDS for Oracle local storage contains the correct data with the following code:
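Because there is no OS access to RDS storage, you can read the file back through the rdsadmin.rds_file_util.read_text_file table function; the directory and file names are placeholders:

```sql
SELECT text
  FROM TABLE(rdsadmin.rds_file_util.read_text_file(
         p_directory => 'DATA_EXPORT_DIR',
         p_filename  => 'data_output.txt'));
```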
Uploading the text file to an S3 bucket and downloading it to an EC2 instance file system
Execute the following script, which uploads the output text file from RDS for Oracle local storage to the S3 bucket and then downloads that file from the bucket to the EC2 instance file system.
The script first configures the parameter values (database connection, Oracle directory, file name, S3 bucket, and local target directory), and then performs the upload to Amazon S3 and the download to the file system:
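A sketch of such a script follows. All parameter values are placeholders, and the fixed sleep is a simplification: upload_to_s3 runs asynchronously, so a production script should poll the task log instead of sleeping.

```shell
#!/bin/bash
# Placeholder parameter values -- adjust for your environment
DB_USER="dbuser"
DB_PASS="********"
DB_CONN="mydb.xxxxxxxx.us-east-1.rds.amazonaws.com:1521/ORCL"
RDS_DIR="DATA_EXPORT_DIR"
FILE_NAME="data_output.txt"
S3_BUCKET="my-s3-integration-bucket"
LOCAL_DIR="/home/ec2-user/data"

# Upload the file from RDS local storage to the S3 bucket
sqlplus -s "${DB_USER}/${DB_PASS}@${DB_CONN}" <<EOF
SELECT rdsadmin.rdsadmin_s3_tasks.upload_to_s3(
         p_bucket_name    => '${S3_BUCKET}',
         p_prefix         => '${FILE_NAME}',
         p_s3_prefix      => '',
         p_directory_name => '${RDS_DIR}') AS task_id
  FROM dual;
EXIT;
EOF

# The upload task is asynchronous; allow it time to complete
sleep 30

# Download the file from the S3 bucket to the EC2 file system
aws s3 cp "s3://${S3_BUCKET}/${FILE_NAME}" "${LOCAL_DIR}/${FILE_NAME}"
```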
Grant execute permission to the .sh file you created, and then execute it.
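For example, if the script is saved as s3_transfer.sh (a placeholder name):

```shell
chmod +x s3_transfer.sh
./s3_transfer.sh
```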
Uploading to Amazon S3 using Amazon S3 integration is a serialized process: you can upload only one file at a time. If concurrent processes attempt to upload files, one process waits until the other upload is complete.
Verifying that the file is uploaded to Amazon S3 and downloaded correctly to the file system
You can confirm that the file uploaded to Amazon S3 through the S3 console.
To confirm that the file downloaded to your local EC2 file system, enter the following code:
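For example (the local path and file name are placeholders):

```shell
ls -l /home/ec2-user/data/data_output.txt
cat /home/ec2-user/data/data_output.txt
```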
Conclusion
This post described how to use Amazon S3 integration as a secure alternative to shared file systems. The shared script is not restricted to any specific Oracle directory; you can use the same script by changing the directory and file name. For example, you can use it to migrate Amazon RDS for Oracle data pump export dumps to an S3 bucket, and vice versa. As always, AWS welcomes your feedback, so please leave any comments below.
About the Authors
Bhavesh Rathod is an Oracle Database Cloud Architect with the Professional Services team at Amazon Web Services. He works as a database migration specialist, helping internal Amazon customers move their on-premises database environments to AWS cloud database solutions.
Samujjwal Roy is a Database Specialty Architect with the Professional Services team at Amazon Web Services. He has been with Amazon for 15+ years and has led migration projects for internal and external Amazon customers to move their on-premises database environment to AWS cloud database solutions.
Amishi Shah is a DevOps Consultant with the Professional Services team at Amazon Web Services. She works with customers to build scalable, highly available and secure solutions in AWS cloud. Her focus area is leading enterprise customers through large technical, organizational, and SDLC transformations.