Use the DBMS_CLOUD package in Amazon RDS Custom for Oracle for direct Amazon S3 integration

Amazon Relational Database Service (Amazon RDS) Custom is a managed database service for legacy, custom, and packaged applications that require access to the underlying operating system (OS) and database environment. Amazon RDS Custom automates database administration and operational tasks while allowing you as administrator to access and customize the underlying OS and database to support third-party applications that require privileged access.

While working with our customers, we realized that at times they need to directly access data stored in Amazon Simple Storage Service (Amazon S3) from Oracle Database using SQL or PL/SQL. Beginning with Oracle Database 19c (19.9 onwards) and Oracle Database 21c (21.3 onwards), you can install the DBMS_CLOUD package, which provides comprehensive support for working with data in object storage.

In this post, we demonstrate how to use the DBMS_CLOUD package to transfer files between S3 buckets and directories in an RDS Custom for Oracle database. We also show how you can access data from Amazon S3 directly using Oracle features such as external tables and hybrid partition tables. The features provided by DBMS_CLOUD could vary between different Oracle releases, so pay close attention to the steps in the post and make sure you reference DBMS_CLOUD in the Oracle Database 19c documentation. To avoid confusion, the option discussed in this post is for RDS Custom for Oracle, not for RDS for Oracle. RDS for Oracle offers S3 integration.

Solution overview

The high-level steps to implement this solution are as follows:

Install the DBMS_CLOUD package and configure access controls.
Configure credentials for the DBMS_CLOUD package.
Create a sample schema for demonstration purposes.
Upload data to an S3 bucket using dbms_cloud.put_object.
Download data from the S3 bucket using dbms_cloud.get_object.
Load data from the S3 bucket directly using dbms_cloud.copy_data.
Access external data on the S3 bucket using dbms_cloud.create_external_table.
Create a hybrid partition table with data in the S3 bucket.
Delete data in the S3 bucket using dbms_cloud.delete_object.
Use DBMS_CLOUD on RDS Custom for Oracle read replicas.

The following diagram illustrates the solution architecture.

solution overview

Prerequisites

This post assumes you have the following prerequisites in place prior to installing and using the DBMS_CLOUD package:

Oracle 19c (19.9 onwards) multi-tenant database running on RDS Custom for Oracle
An S3 bucket
An AWS Identity and Access Management (IAM) user with access key enabled

The following multitenant RDS Custom for Oracle instance orcl was created for this post. As of this writing, DBMS_CLOUD only supports a multitenant container database (CDB) architecture.

For more information on RDS Custom for Oracle database creation, refer to Working with RDS Custom for Oracle. The database details are as follows:

Database version: 19.16 (July 2022 RU)
CDB name: <your CDB name>
PDB name: <your PDB name>
Admin user: admin
Database endpoint: <your RDS endpoint>
Region: <your region>

The S3 bucket <your bucket name> has been created for this post with Amazon S3 server-side encryption in the <your region> Region. For more information on bucket creation, refer to Step 1: Create your first S3 bucket.

To access the S3 bucket from the database, you must have the proper IAM credentials and an IAM policy. Create an IAM user with an IAM policy attached to it. For more information, refer to Creating an IAM user in your AWS account and Managing access keys for IAM users. The role and policy details used in this post are as follows:

IAM user name: <your iam user>
Access key ID: <your access key id>
Secret access key: <your secret access key>
IAM policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RDSCustomS3Policy",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<your bucket name>",
                "arn:aws:s3:::<your bucket name>/*"
            ]
        }
    ]
}

This solution involves the creation and utilization of AWS resources. Therefore, it will incur costs on your account, such as RDS Custom for Oracle instance and database storage cost, Amazon S3 storage and data transfer cost, and more. Refer to AWS Pricing for more information.

Install the DBMS_CLOUD package and configure access controls

As of this writing, the DBMS_CLOUD package must deployed by manually running scripts provided by Oracle with sysdba privilege, because it’s not natively integrated by default. We cover the installation and configuration required for feature demonstration in this post. Refer to the My Oracle Support (MOS) note How To Setup And Use DBMS_CLOUD Package (Doc ID 2748362.1) for the latest installation and configuration instructions.

Although there is no database or listener restart required for this step, we recommend you pause RDS Custom automation. The pause helps prevent the modifications from interfering with RDS Custom automation.

Prepare the schema creation script

This step creates the user C##CLOUD$SERVICE and grants appropriate privileges. The user is locked by default so that no connections are directly made as this user. Save the following script into a file named dbms_cloud_install.sql and upload it to the Oracle software owner’s home directory in your RDS Custom database server. In this post, the directory is /home/rdsdb.

@$ORACLE_HOME/rdbms/admin/sqlsessstart.sql
set verify off
-- you must not change the owner of the functionality to avoid future issues
define username='C##CLOUD$SERVICE'
create user &username no authentication account lock;
REM Grant Common User Privileges
grant INHERIT PRIVILEGES on user &username to sys;
grant INHERIT PRIVILEGES on user sys to &username;
grant RESOURCE, UNLIMITED TABLESPACE, SELECT_CATALOG_ROLE to &username;
grant CREATE ANY TABLE, DROP ANY TABLE, INSERT ANY TABLE, SELECT ANY TABLE,
CREATE ANY CREDENTIAL, CREATE PUBLIC SYNONYM, CREATE PROCEDURE, ALTER SESSION, CREATE JOB to &username;
grant CREATE SESSION, SET CONTAINER to &username;
grant SELECT on SYS.V_$MYSTAT to &username;
grant SELECT on SYS.SERVICE$ to &username;
grant SELECT on SYS.V_$ENCRYPTION_WALLET to &username;
grant read, write on directory DATA_PUMP_DIR to &username;
grant EXECUTE on SYS.DBMS_PRIV_CAPTURE to &username;
grant EXECUTE on SYS.DBMS_PDB_LIB to &username;
grant EXECUTE on SYS.DBMS_CRYPTO to &username;
grant EXECUTE on SYS.DBMS_SYS_ERROR to &username;
grant EXECUTE ON SYS.DBMS_ISCHED to &username;
grant EXECUTE ON SYS.DBMS_PDB_LIB to &username;
grant EXECUTE on SYS.DBMS_PDB to &username;
grant EXECUTE on SYS.DBMS_SERVICE to &username;
grant EXECUTE on SYS.DBMS_PDB to &username;
grant EXECUTE on SYS.CONFIGURE_DV to &username;
grant EXECUTE on SYS.DBMS_SYS_ERROR to &username;
grant EXECUTE on SYS.DBMS_CREDENTIAL to &username;
grant EXECUTE on SYS.DBMS_RANDOM to &username;
grant EXECUTE on SYS.DBMS_SYS_SQL to &username;
grant EXECUTE on SYS.DBMS_LOCK to &username;
grant EXECUTE on SYS.DBMS_AQADM to &username;
grant EXECUTE on SYS.DBMS_AQ to &username;
grant EXECUTE on SYS.DBMS_SYSTEM to &username;
grant EXECUTE on SYS.SCHED$_LOG_ON_ERRORS_CLASS to &username;
grant SELECT on SYS.DBA_DATA_FILES to &username;
grant SELECT on SYS.DBA_EXTENTS to &username;
grant SELECT on SYS.DBA_CREDENTIALS to &username;
grant SELECT on SYS.AUDIT_UNIFIED_ENABLED_POLICIES to &username;
grant SELECT on SYS.DBA_ROLES to &username;
grant SELECT on SYS.V_$ENCRYPTION_KEYS to &username;
grant SELECT on SYS.DBA_DIRECTORIES to &username;
grant SELECT on SYS.DBA_USERS to &username;
grant SELECT on SYS.DBA_OBJECTS to &username;
grant SELECT on SYS.V_$PDBS to &username;
grant SELECT on SYS.V_$SESSION to &username;
grant SELECT on SYS.GV_$SESSION to &username;
grant SELECT on SYS.DBA_REGISTRY to &username;
grant SELECT on SYS.DBA_DV_STATUS to &username;
alter session set current_schema=&username;
REM Create the Catalog objects
@$ORACLE_HOME/rdbms/admin/dbms_cloud_task_catalog.sql
@$ORACLE_HOME/rdbms/admin/dbms_cloud_task_views.sql
@$ORACLE_HOME/rdbms/admin/dbms_cloud_catalog.sql
@$ORACLE_HOME/rdbms/admin/dbms_cloud_types.sql
REM Create the Package Spec
@$ORACLE_HOME/rdbms/admin/prvt_cloud_core.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_task.plb
@$ORACLE_HOME/rdbms/admin/dbms_cloud_capability.sql
@$ORACLE_HOME/rdbms/admin/prvt_cloud_request.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_internal.plb
@$ORACLE_HOME/rdbms/admin/dbms_cloud.sql
@$ORACLE_HOME/rdbms/admin/prvt_cloud_admin_int.plb
REM Create the Package Body
@$ORACLE_HOME/rdbms/admin/prvt_cloud_core_body.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_task_body.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_capability_body.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_request_body.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_internal_body.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_body.plb
@$ORACLE_HOME/rdbms/admin/prvt_cloud_admin_int_body.plb
-- Create the metadata
@$ORACLE_HOME/rdbms/admin/dbms_cloud_metadata.sql
alter session set current_schema=sys;
@$ORACLE_HOME/rdbms/admin/sqlsessend.sql

Run the install script

Replace the SYS password with your actual SYS password and run the following command.

$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catcon.pl -u sys/<your_sys_password> --force_pdb_mode 'READ WRITE' -b dbms_cloud_install -d /home/rdsdb -l /home/rdsdb dbms_cloud_install.sql

The command runs the dbms_cloud_install.sql script in both the ROOT container and PDBs for Amazon RDS Custom for Oracle. You can retrieve your SYS password from AWS Secrets Manager; the secret name for your SYS user starts with do-not-delete-rds-custom-db-ABCDEFGHIJKLMNOPQRS0123456-, in which db-ABCDEFGHIJKLMNOPQRS0123456 is the RDS resource ID.

After the command has completed successfully, check the logs for any errors. Log in to the CDB ROOT container using the SYS user to verify the package status:

column name format a10
column owner format a20
column object_name format a12
column status format a10
set tab off

select v.name, o.owner, o.object_name, o.status, o.sharing, o.oracle_maintained 
from cdb_objects o join v$containers v on o.con_id=v.con_id 
where o.object_name = 'DBMS_CLOUD' 
order by name;

NAME       OWNER                OBJECT_NAME  STATUS     SHARING            O
---------- -------------------- ------------ ---------- ------------------ -
CDB$ROOT   PUBLIC               DBMS_CLOUD   VALID      METADATA LINK      Y
CDB$ROOT   C##CLOUD$SERVICE     DBMS_CLOUD   VALID      METADATA LINK      Y
CDB$ROOT   C##CLOUD$SERVICE     DBMS_CLOUD   VALID      METADATA LINK      Y
ORCL       PUBLIC               DBMS_CLOUD   VALID      METADATA LINK      Y
ORCL       C##CLOUD$SERVICE     DBMS_CLOUD   VALID      METADATA LINK      Y
ORCL       C##CLOUD$SERVICE     DBMS_CLOUD   VALID      METADATA LINK      Y

6 rows selected

Create an SSL wallet with certificates

DBMS_CLOUD interacts with Amazon S3 using HTTPS protocol, so a wallet is required with these trusted certificates installed. Go to Amazon Trust Services to download Root CAs for the CN name from Amazon Root CA 1 to Amazon Root CA 4, and save them to the Oracle software owner’s home directory ($HOME; the default value is /home/rdsdb) in your RDS Custom database server.

cd $HOME

curl -o AmazonRootCA1.pem https://www.amazontrust.com/repository/AmazonRootCA1.pem
curl -o AmazonRootCA2.pem https://www.amazontrust.com/repository/AmazonRootCA2.pem
curl -o AmazonRootCA3.pem https://www.amazontrust.com/repository/AmazonRootCA3.pem
curl -o AmazonRootCA4.pem https://www.amazontrust.com/repository/AmazonRootCA4.pem

If you have already created an Oracle wallet, skip this step and proceed to the next step to import these Root CAs into the wallet. Otherwise, run following commands to create the wallet using orapki.

You will be prompted with a wallet password. Note down the password; you will need this to modify the wallet in subsequent steps.

$ mkdir -p /rdsdbdata/config/security/wallet/
$ orapki wallet create -wallet /rdsdbdata/config/security/wallet/ -auto_login

Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Enter password:
Enter password again:
Operation is successfully completed.

Run the following commands to import these Amazon Root CAs into the wallet. Input the same wallet password when prompted.

orapki wallet add -wallet /rdsdbdata/config/security/wallet/ -trusted_cert -cert AmazonRootCA1.pem

orapki wallet add -wallet /rdsdbdata/config/security/wallet/ -trusted_cert -cert AmazonRootCA2.pem

orapki wallet add -wallet /rdsdbdata/config/security/wallet/ -trusted_cert -cert AmazonRootCA3.pem

orapki wallet add -wallet /rdsdbdata/config/security/wallet/ -trusted_cert -cert AmazonRootCA4.pem

You can verify these imported certificates as follows:

$ orapki wallet display -wallet /rdsdbdata/config/security/wallet/

Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Trusted Certificates:
Subject:        CN=Amazon Root CA 4,O=Amazon,C=US
Subject:        CN=Amazon Root CA 3,O=Amazon,C=US
Subject:        CN=Amazon Root CA 2,O=Amazon,C=US
Subject:        CN=Amazon Root CA 1,O=Amazon,C=US

Configure the Oracle database to use the SSL wallet

In order for the Oracle database to use this newly create SSL wallet, you need to add the following entries to your $ORACLE_HOME/network/admin/sqlnet.ora file:

WALLET_LOCATION=
  (SOURCE=(METHOD=FILE)(METHOD_DATA=
  (DIRECTORY=/rdsdbdata/config/security/wallet)))

During RDS Custom for Oracle database restoration from snapshot or restoration to a point in time, the content of SQLNET.ORA is reset to default. In these scenarios, follow the instructions in this step to reconfigure SQLNET.ORA.

Configure database ACEs for DBMS_CLOUD

By default, Oracle Database doesn’t allow outgoing communication, and this is controlled by Access Control Entries (ACEs). Save the following script in a file named setup_aces.sql and upload it to the Oracle software owner’s home directory in your RDS Custom database server. Make sure you have modified sslwaletdir to match with your wallet directory.

@$ORACLE_HOME/rdbms/admin/sqlsessstart.sql

-- SSL Wallet directory
define sslwalletdir=/rdsdbdata/config/security/wallet

define clouduser=C##CLOUD$SERVICE

-- Create New ACL / ACEs
begin
    -- Allow all hosts for HTTP/HTTP_PROXY
    dbms_network_acl_admin.append_host_ace(
    host =>'*',
    lower_port => 443,
    upper_port => 443,
    ace => xs$ace_type(
    privilege_list => xs$name_list('http', 'http_proxy'),
    principal_name => upper('&clouduser'),
    principal_type => xs_acl.ptype_db));

    -- Allow wallet access
    dbms_network_acl_admin.append_wallet_ace(
    wallet_path => 'file:&sslwalletdir',
    ace => xs$ace_type(privilege_list =>
    xs$name_list('use_client_certificates', 'use_passwords'),
    principal_name => upper('&clouduser'),
    principal_type => xs_acl.ptype_db));
end;
/

-- Setting SSL_WALLET database property
begin
    if sys_context('userenv', 'con_name') = 'CDB$ROOT' then
        execute immediate 'alter database property set ssl_wallet=''&sslwalletdir''';
    end if;
end;
/

@$ORACLE_HOME/rdbms/admin/sqlsessend.sql

Run the script in the CDB ROOT container with the SYS user:

$ sqlplus sys/<your_sys_password> as sysdba

SQL> @setup_aces.sql

Configure credentials for the DBMS_CLOUD package

In this section, we create a role (<your db role name>) and a user (<your db user name>) with minimal privileges sufficient to demonstrate DBMS_CLOUD functionality. We also configure the user name and password for DBMS_CLOUD to allow it to access Amazon S3 with valid IAM credentials.

Create the database role and user

Create the database role and user with the following SQL commands in the pluggable database (PDB) <your PDB name> with a privileged administrator user, such as ADMIN or SYSTEM. Using a role makes privilege and access control management simpler.

create user <your db user name> identified by <your_password>
default tablespace users;
alter user <your db user name> quota unlimited on users;

create role <your db role name>;
grant create session, create table to <your db role name>;
grant read, write on directory DATA_PUMP_DIR to <your db role name>;
grant execute on dbms_cloud to <your db role name>;

grant <your db role name> to <your db user name>;

Configure ACEs for the demorole in ORCL PDB

DBMS_CLOUD is a package with invoker’s right privilege. You need to enable ACEs as you have done previously for user C##CLOUD$SERVICE. Run the following commands with a privileged administrator user (SYSTEM or SYS) in the <your PDB name> pluggable database:

@$ORACLE_HOME/rdbms/admin/sqlsessstart.sql

-- SSL Wallet directory
define sslwalletdir=/rdsdbdata/config/security/wallet

define cloudrole=<your db role name>

-- Create New ACL / ACEs
begin
    -- Allow all hosts for HTTP/HTTP_PROXY
    dbms_network_acl_admin.append_host_ace(
    host =>'*',
    lower_port => 443,
    upper_port => 443,
    ace => xs$ace_type(
    privilege_list => xs$name_list('http', 'http_proxy'),
    principal_name => upper('&cloudrole'),
    principal_type => xs_acl.ptype_db));

    -- Allow wallet access
    dbms_network_acl_admin.append_wallet_ace(
    wallet_path => 'file:&sslwalletdir',
    ace => xs$ace_type(privilege_list =>
    xs$name_list('use_client_certificates', 'use_passwords'),
    principal_name => upper('&cloudrole'),
    principal_type => xs_acl.ptype_db));
end;
/

@$ORACLE_HOME/rdbms/admin/sqlsessend.sql

Create DBMS_CLOUD credentials

To access data from the S3 bucket, which is not publicly accessible, you need to use a credential to authenticate with Amazon S3. Log in to <your PDB name> with the database user <your db user name> and run the following script to create credentials based on the access key ID and secret access key for the IAM user <your iam user>:

begin
  dbms_cloud.create_credential (
    credential_name => 'cred_demouser',
    username => '<your access key id>',
    password => '<your secret access key>'
  );
end;
/

After the credential is created, you can use the following query to list all credentials:

select credential_name, username, comments from user_credentials;

To verify the setup you have done so far, run the following query. Here, s3.ap-southeast-1.amazonaws.com is the endpoint for the ap-southeast-1 Region. Find the S3 endpoint for your Region and replace it accordingly.

select * from dbms_cloud.list_objects(
    credential_name=> 'cred_demouser',
    location_uri=> 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/'
);

Because our S3 bucket is empty, it returns “no rows selected” without other error messages.

Create a sample schema

Complete the following steps to create a sample schema for demonstration purposes:

Connect with the database user <your db user name> to the <your PDB name> database:

sqlplus <your db user name>/<your_password>@<your RDS endpoint>/<your PDB name>

Run the following script to create the emp table with sample data. We keep the dataset small to produce a brief query output in our examples.

create table emp (id int primary key, name varchar2(10), salary int, last_join date);
insert into emp values(1,'user 1', 1000, to_date('2000-01-01','YYYY-MM-DD'));
insert into emp values(2,'user 2', 2000, to_date('2001-01-01','YYYY-MM-DD'));
commit;

Use the following sample dataset:

select * from emp;

        ID NAME           SALARY LAST_JOIN
---------- ---------- ---------- ---------
         1 user 1           1000 01-JAN-00
         2 user 2           2000 01-JAN-01

Write the data into a CSV file named emp.csv in the DATA_PUMP_DIR directory:

declare
  cursor c is select id,name,salary, to_char(last_join,'YYYY-MM-DD') last_join from emp;
  filename varchar2(50) default 'emp.csv';
  f utl_file.file_type;
begin
  f := utl_file.fopen ('DATA_PUMP_DIR',filename,'w');
  for c1 in c loop
    utl_file.put_line(f, c1.id||','||c1.name||','||c1.salary||','||c1.last_join);
  end loop;
  utl_file.fclose(f);
end;
/

Upload data to the S3 bucket using dbms_cloud.put_object

You can use the dbms_cloud.put_object procedure to upload data to your S3 bucket.

Run the following PL/SQL statements to upload the file emp.csv from the RDS Custom database server to the S3 bucket with path s3://<your bucket name>/orcl/emp.csv:

begin
  dbms_cloud.put_object (
    credential_name => 'cred_demouser',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/emp.csv',
    directory_name  => 'DATA_PUMP_DIR',
    file_name       => 'emp.csv');
end;
/

List the objects uploaded to the S3 bucket with the following SQL statements:

column object_name for a20
column last_modified format a35

select object_name, bytes, last_modified 
from dbms_cloud.list_objects(
    credential_name=> 'cred_demouser',
    location_uri=> 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>'
) 
where object_name = 'orcl/emp.csv';

OBJECT_NAME               BYTES LAST_MODIFIED
-------------------- ---------- -----------------------------------
orcl/emp.csv                 50 25-MAR-23 12.56.56.000000 PM +00:00

You can also write BLOB data directly to the S3 bucket after processing it inside the database, as demonstrated in the following example:

declare
  my_lob blob;
begin
  my_lob := utl_raw.cast_to_raw('some data you want to write to S3 file directly');

  dbms_cloud.put_object (
    credential_name => 'cred_demouser',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/mylob.dat',
    contents	    => my_lob);
end;
/

Download data from the S3 bucket using dbms_cloud.get_object

You can use the dbms_cloud.get_object procedure to download data from the S3 bucket to a directory on the RDS Custom for Oracle database, or read the data directly from Amazon S3 and return BLOB to the database. The procedure has other advanced parameters, including offset and compression options; refer to GET_OBJECT Procedure and Function for more details.

Run the following PL/SQL statements to download the file and save it as emp2.csv on the database server:

begin
  dbms_cloud.get_object (
    credential_name => 'cred_demouser',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/emp.csv',
    directory_name  => 'DATA_PUMP_DIR',
    file_name       => 'emp2.csv');
end;
/

Run the following SQL statements to read mylob.dat from the S3 bucket, which we created earlier:

select to_clob(
  dbms_cloud.get_object (
    credential_name => 'cred_demouser',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/mylob.dat')
) as mylob
from dual;

MYLOB
-------------------------------------------------
some data you want to write to S3 file directly

Load data from the S3 bucket directly using dbms_cloud.copy_data

You can use the dbms_cloud.copy_data procedure to load data from the S3 bucket into existing Oracle tables. In this section, we use the previous uploaded file emp.csv as a source, which is in CSV format. For Avro and Parquet file formats, refer to COPY_DATA Procedure for Avro or Parquet Files.

Create an empty table called emp_copy prior to data loading:

create table emp_copy (id int primary key, name varchar2(10), salary int, last_join date);

Run the following PL/SQL statements to load data from the S3 bucket to the table emp_copy:

begin
  dbms_cloud.copy_data(
    table_name      => 'emp_copy',
    credential_name => 'cred_demouser',
    file_uri_list   => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/emp.csv',
    format          => json_object('type' value 'csv', 'dateformat' value 'YYYY-MM-DD')
  );
end;
/

Multiple format options are available to meet various file specifications and data processing needs. Refer to DBMS_CLOUD Package Format Options for more information.

During the data load, logs and bad data are written to the COPY$N_LOG and COPY$N_BAD tables, in which N is the unique number for this data load.

Query the table to verify the data loaded:

select * from emp_copy;

        ID NAME           SALARY LAST_JOIN
---------- ---------- ---------- ---------
         1 user 1           1000 01-JAN-00
         2 user 2           2000 01-JAN-01

Access external data on the S3 bucket using dbms_cloud.create_external_table

You can use the procedure dbms_cloud.create_external_table to create an external table in the Oracle database, which reads data directly from data stored in the S3 bucket. We continue using emp.csv as example; for Avro and Parquet file formats, refer to CREATE_EXTERNAL_TABLE Procedure for Avro or Parquet Files. You can create external partition tables to use the Oracle partitioning capability if you have multiple data files. Take note that table partitioning is part of the Oracle Partitioning option, which is a separate license option of Oracle Enterprise Edition.

Run the following PL/SQL statements to create the external table emp_ext. You can specify multiple source files using a comma delimited list for file_uri_list. You can also use wildcards in the file names in your URIs. The character * can be used as the wildcard for multiple characters, and the character ? can be used as the wildcard for a single character.

begin
  dbms_cloud.create_external_table(
    table_name      => 'emp_ext',
    credential_name => 'cred_demouser',
    file_uri_list   => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/emp.csv',
    column_list     => 'id        int,
                        name      varchar2(10),
                        salary    int,
                        last_join date',
    format          => json_object('type' value 'csv', 'dateformat' value 'YYYY-MM-DD')
 );
end;
/

Query the external table to verify the data:

select * from emp_ext;

        ID NAME           SALARY LAST_JOIN
---------- ---------- ---------- ---------
         1 user 1           1000 01-JAN-00
         2 user 2           2000 01-JAN-01

Create a hybrid partition table with data in the S3 bucket

With the hybrid partition table feature, you can combine historical data in Amazon S3 for cost-efficient long-term retention, and combine your current data in an Oracle database into a single partitioned table. Take note that partitioning tables are part of the Oracle Partitioning option.

Run the following PL/SQL statements to create the hybrid partition table emp_hpt:

begin
  dbms_cloud.create_hybrid_part_table(
    table_name      => 'emp_hpt',
    credential_name => 'cred_demouser',
    format          => json_object('type' value 'csv', 'dateformat' value 'YYYY-MM-DD'),
    column_list     => 'id        int,
                        name      varchar2(10),
                        salary    int,
                        last_join date',
    partitioning_clause=> 'partition by range(last_join)
                           (partition p1 values less than (to_date(''2002-01-01'', ''YYYY-MM-DD''))
                                external location (''https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/emp.csv''),
                            partition p2 values less than (to_date(''2003-01-01'', ''YYYY-MM-DD''))
                            )'
  );
end;
/

Query the table to verify the record:

select * from emp_hpt;

        ID NAME           SALARY LAST_JOIN
---------- ---------- ---------- ---------
         1 user 1           1000 01-JAN-00
         2 user 2           2000 01-JAN-01

Insert a new record to the partition, in which data is stored inside the database:

insert into emp_hpt values(3,'user 3', 3000, to_date('2002-01-01','YYYY-MM-DD'));
commit;

Query the table again to confirm it contains both records from the S3 bucket and database:

select * from emp_hpt;

        ID NAME           SALARY LAST_JOIN
---------- ---------- ---------- ---------
         3 user 3           3000 01-JAN-02
         1 user 1           1000 01-JAN-00
         2 user 2           2000 01-JAN-01

Delete data in the S3 bucket using dbms_cloud.delete_object

To delete files from the S3 bucket, you can use the procedure dbms_cloud.delete_object to delete the specific object in the object storage.

Drop these tables using emp.csv as the external data source:

drop table emp_ext;
drop table emp_hpt;

Delete emp.csv and mylob.dat from the S3 bucket:

begin
  dbms_cloud.delete_object (
    credential_name => 'cred_demouser',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/emp.csv');
end;
/

begin
  dbms_cloud.delete_object (
    credential_name => 'cred_demouser',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your bucket name>/orcl/mylob.dat');
end;
/

Use DBMS_CLOUD on RDS Custom for Oracle read replicas

With Oracle Active Data Guard, you can open RDS Custom for Oracle replica databases in read-only mode and use them to run reporting applications, ad hoc queries, and data extraction jobs to offload your primary database. Oracle Active Data Guard is a separately licensed option on Oracle Enterprise Edition.

In the previous steps for DBMS_CLOUD package installation, two steps made changes outside the database. These changes are not replicated to replicas via Data Guard Redo Transportation Services:

Create an SSL wallet with certificates
Configure the Oracle database to use the SSL wallet

For your existing read replicas, you must follow the same steps to configure the SSL wallet and Oracle database to use the wallet. You can choose to create a new SSL wallet and import the ROOT CA certificates or copy the SSL wallet folder from the primary database.

In the scenario of read replicas created after the DBMS_CLOUD package is installed on the primary database, you only need to modify the SQLNET.ORA file on the replicas to configure the Oracle database to use the wallet.

You won’t be able to create an external table or hybrid partition table on the read replicas, because databases are in read-only mode. You can create these tables in the primary database and query them from read replicas to access data.

Clean up

To avoid ongoing costs, delete the resources that you created as part of this post.

Conclusion

In this post, we showed you how to install the DBMS_CLOUD package and configure it to allow the least privileged database user to access data in an S3 bucket. With this package, you can perform common operations on the S3 bucket, such as list file, upload file, download file, and delete file, directly from your Oracle database. We also showed how to copy data from the S3 bucket to an existing table, as well as how to access this data in the S3 bucket through an external table or hybrid partition table.

If you have any comments or questions, leave them in the comments section.

About the Author

Donghua Luo is a Senior RDS Specialist Solutions Architect. He works with AWS customers to migrate and achieve higher flexibility, scale, and resiliency with database services in AWS cloud.

AWS Database Blog