AWS Storage Blog

Using AWS Storage Gateway to store Oracle database backups in Amazon S3

Customers running Oracle databases often take database and schema backups using Oracle native tools like Data Pump and RMAN to satisfy data protection, disaster recovery (DR), and compliance requirements. These backups are often stored on NFS shares. Due to the limited scalability and higher cost of storage in on-premises environments, the backups are often deleted after a short retention period, which prevents them from serving their purpose when data must be restored.

Limited retention periods can be avoided by taking database backups directly to Amazon S3 using AWS Storage Gateway, reducing customers’ on-premises storage cost and footprint while gaining the scalability, durability, and cost benefits of cloud storage.

AWS Storage Gateway is a hybrid cloud storage service that provides cost-effective access to virtually unlimited cloud storage for on-premises environments.

Customers use AWS Storage Gateway for three main reasons:

  1. To move backups and archives into cloud storage.
  2. To reduce on-premises storage footprint by using cloud-backed file shares.
  3. To provide low-latency access to data residing in AWS from on-premises applications.

In this post, I describe a method to back up Oracle databases from on-premises or Amazon EC2 to Amazon S3 using AWS Storage Gateway.

Solution walkthrough

Oracle database backups using RMAN or Data Pump tools can be stored directly into Amazon S3 using Storage Gateway by completing the following steps:

  1. Create an Amazon S3 File Gateway using AWS Storage Gateway.
  2. Create an NFS file share and map it to an S3 bucket.
  3. Mount the NFS file share on the database host(s).
  4. Back up the Oracle database(s).
  5. Restore Oracle database(s).

Using AWS Storage Gateway to store Oracle database backups in Amazon S3

Prerequisites

For this walkthrough, the following prerequisites are necessary:

  • An AWS account
  • An S3 bucket where you want to store backups
  • An Oracle database instance running in an on-premises environment or on Amazon EC2

This post lists steps that were tested on Amazon EC2; however, the same setup works for on-premises databases as well.

1. Create an Amazon S3 File Gateway using AWS Storage Gateway

With an Amazon S3 File Gateway, you store and retrieve objects in S3 with a local cache that provides low-latency access to your most recently used data. Create an Amazon S3 File Gateway using the instructions provided in this documentation.

If you create your gateway on-premises, you deploy the hardware appliance or download and deploy a gateway VM, and then activate the gateway. If you create your gateway on an Amazon EC2 instance, you launch an Amazon Machine Image (AMI) that contains the gateway VM image and then activate the gateway.

The Amazon S3 File Gateway caches data locally and uploads it to S3 in the background. The local cache provides low-latency access to frequently used data. The cache can be sized to hold your largest backup file and can scale up to 64 TB.
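If you prefer to script the gateway creation, a rough AWS CLI sketch is shown below; the activation key is obtained from the gateway VM or appliance after it boots, and the gateway name, Region, and timezone are placeholder values.

aws storagegateway activate-gateway \
    --activation-key <activation-key-from-gateway-vm> \
    --gateway-name oracle-backup-gateway \
    --gateway-timezone GMT \
    --gateway-region us-east-1 \
    --gateway-type FILE_S3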

2. Create NFS file share and map it to S3 bucket

Following these instructions, create an NFS file share mapped to the S3 bucket where the backups will be stored.

To achieve low network latency, it is recommended to use an S3 bucket in the AWS Region nearest to the location where the S3 File Gateway is installed.

When you create an NFS file share, you must specify an S3 storage class; the documentation on creating an NFS file share lists the storage classes you can choose from in step 11. For Oracle backups, choosing the S3 Standard storage class lets you access the backup files quickly and without retrieval fees.
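If you create the file share with the AWS CLI instead of the console, a minimal sketch of the call might look like the following; the gateway ARN, IAM role, account ID, and client list are placeholders that you would replace with your own values.

aws storagegateway create-nfs-file-share \
    --client-token oracle-backup-share-1 \
    --gateway-arn arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE \
    --role arn:aws:iam::111122223333:role/StorageGatewayS3AccessRole \
    --location-arn arn:aws:s3:::oracle-export-test \
    --default-storage-class S3_STANDARD \
    --client-list 10.0.0.0/16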

Use S3 Lifecycle policies to transition the backups to a lower-cost storage class and save on storage for rarely accessed audit data. The S3 bucket used in this example is oracle-export-test, as seen in the following screenshot.

The S3 bucket used in this example is oracle-export-test
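As an illustration, a minimal lifecycle rule that moves backup objects to S3 Standard-IA after 30 days could be applied with the AWS CLI; the bucket name matches this example, and the 30-day threshold is an arbitrary value you would tune to your own retention policy.

cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-oracle-backups",
      "Status": "Enabled",
      "Filter": {},
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" }
      ]
    }
  ]
}
EOF

aws s3api put-bucket-lifecycle-configuration \
    --bucket oracle-export-test \
    --lifecycle-configuration file://lifecycle.json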

3. Mount NFS file share on database host

Using the steps described in this documentation, mount the NFS file share that you created on the database host.

You can create a mount point on the database server and use the instructions in the preceding documentation to mount the file share.
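For reference, the mount command on Linux follows this pattern from the Storage Gateway documentation; the gateway IP address is a placeholder, and the bucket name and mount path match this example.

sudo mkdir -p /exports_oracle
sudo mount -t nfs -o nolock,hard <gateway-ip-address>:/oracle-export-test /exports_oracle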

To test the backup and restore of the database, I have mounted the NFS share on the database host as shown in the following screenshot.

To test the backup and restore of database, I have mounted the NFS share on the database host

4. Back up the Oracle database

You can now take backups on the NFS file share that is mounted on the database server using either Oracle Data Pump or RMAN.

Backup using Oracle Data Pump

An Oracle directory is created that points to the new mount point, which in turn maps to the S3 bucket.

In this post, a directory ORACLE_S3_BACKUPS is created, as shown in the following command:

create directory ORACLE_S3_BACKUPS as '/exports_oracle';
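To confirm that the directory points at the intended NFS path, you can query the data dictionary (a quick sanity check that assumes you have the required privileges):

select directory_name, directory_path from dba_directories where directory_name = 'ORACLE_S3_BACKUPS';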

The Data Pump utility is used to take a schema level backup using the following command:

expdp <username>/<password> schemas=<schema_name> dumpfile=<dumpfile_name> logfile=<logfile_name> directory=ORACLE_S3_BACKUPS

The following screenshot shows the output of the preceding command, with a successfully taken Data Pump backup.

Successfully taken Data Pump backup - command output
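As an illustration only, a hypothetical invocation for a schema named HR might look like the following; the schema, credentials, and file names are placeholders.

expdp system/<password> schemas=HR dumpfile=hr_backup.dmp logfile=hr_backup.log directory=ORACLE_S3_BACKUPS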

Go to your Amazon S3 bucket, and verify that the backup has been uploaded into the bucket.

Verify that the backup has been uploaded into your Amazon S3 bucket
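If you prefer the command line over the console, a quick listing of the example bucket also confirms that the dump file has arrived:

aws s3 ls s3://oracle-export-test/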

Backup using Oracle RMAN

In this section, I take a database backup using Oracle RMAN; the database is restored in a later section.

You can use the Oracle RMAN utility to take backups of Oracle databases. For the backups that you store in Amazon S3, the db_recovery_file_dest parameter must be set on the database to point to the NFS mount point created in the earlier steps. To set a hard limit on the space used by files created in the db_recovery_file_dest location, the db_recovery_file_dest_size parameter must also be defined.

More details on the types of backups that can be taken using RMAN are given in the documentation on RMAN basic concepts.

As a first step, set the db_recovery_file_dest parameter to point to the NFS share mounted in the earlier steps, and define db_recovery_file_dest_size, using the following commands:

alter system set db_recovery_file_dest='<nfs share mount point>' scope=both;
alter system set db_recovery_file_dest_size='<size limit for nfs share mount point>' scope=both;
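For example, pointing the fast recovery area at the mount point used earlier with a hypothetical 200 GB limit would look like this; adjust the size so that it comfortably fits your largest expected backup.

alter system set db_recovery_file_dest='/exports_oracle' scope=both;
alter system set db_recovery_file_dest_size=200G scope=both;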

The following screenshot shows an example output after setting up the parameters db_recovery_file_dest and db_recovery_file_dest_size.

Example output after setting up the parameters 'db_recovery_file_dest' and 'db_recovery_file_dest_size.'

Run RMAN backup on the database server using the following command:

rman target / <<EOF
run {
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
CONFIGURE DEVICE TYPE DISK PARALLELISM 3;
BACKUP DATABASE PLUS ARCHIVELOG;
}
EOF
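To confirm what RMAN wrote to the recovery area, you can optionally run a quick summary listing:

rman target / <<EOF
LIST BACKUP SUMMARY;
EOF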

Go to your Amazon S3 bucket, and verify that the backup has been uploaded into the bucket.

Verify that the backup has been uploaded into the bucket

When backing up an Oracle database with parallel processing enabled in RMAN or Data Pump, configure multiple cache disks when creating your Amazon S3 File Gateway rather than a single disk; scaling the cache out across disks improves performance by taking advantage of that parallelism.

The documentation on configuring local disk storage describes the steps to add Amazon EBS volumes to an Amazon S3 File Gateway, enabling the use of multiple cache disks to enhance performance.
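After attaching additional EBS volumes to the gateway instance, a sketch of adding them as cache with the AWS CLI is shown below; the gateway ARN is a placeholder, and the disk IDs come from the list-local-disks output.

aws storagegateway list-local-disks \
    --gateway-arn arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE

aws storagegateway add-cache \
    --gateway-arn arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE \
    --disk-ids <disk-id-1> <disk-id-2>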

All the backup files are uploaded to Amazon S3 in the background as the Storage Gateway writes the files to the shares. Transport Layer Security encryption protects the data while in transit, and Content-MD5 headers are used to verify data integrity.

An Amazon CloudWatch notification can be set up using the NotifyWhenUploaded API to notify you after the backup job completes and uploads to S3. For more information, refer to this documentation on getting notified about file operations.
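For example, the notification can be requested from the AWS CLI once the backup files have been written to the share; the file share ARN below is a placeholder. The call returns a notification ID, and Storage Gateway emits an event when all files written to the share before the call have been uploaded to S3.

aws storagegateway notify-when-uploaded \
    --file-share-arn arn:aws:storagegateway:us-east-1:111122223333:share/share-EXAMPLE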

Depending on the backup file size and network bandwidth between your Amazon S3 File Gateway and AWS, the backup file can take time to transfer from the local cache of the S3 File Gateway to S3 bucket. For example, with a dedicated 1-GB/s network link, it would take approximately 10 minutes to upload a 500-GB backup file to S3. You can use CloudWatch metrics to monitor the traffic.
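For instance, you can pull the CloudBytesUploaded metric for the gateway over a time window to watch upload progress; the gateway ID, gateway name, and time range below are placeholders.

aws cloudwatch get-metric-statistics \
    --namespace AWS/StorageGateway \
    --metric-name CloudBytesUploaded \
    --dimensions Name=GatewayId,Value=sgw-EXAMPLE Name=GatewayName,Value=oracle-backup-gateway \
    --statistics Sum \
    --period 300 \
    --start-time 2021-01-01T00:00:00Z \
    --end-time 2021-01-01T01:00:00Z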

5. Restore Oracle database

You can now restore the database from the backup that was taken using either Oracle Data Pump or RMAN from the NFS file share that is mounted on the database server.

Using Oracle Data Pump

Once the backup completes successfully, it can be restored to the same database instance or to a separate database instance using the Data Pump import utility.

To restore the backup that was taken earlier using Data Pump, you can use the following command:

impdp <username>/<password> remap_schema=<schema_name>:<new_schema_name> 
dumpfile=<dumpfile_name> logfile=<logfile_name> directory=ORACLE_S3_BACKUPS

The following screenshot shows example output of a Data Pump restore that completed successfully:

Example of output of Data Pump restore that was successfully completed
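As an illustration only, restoring the hypothetical HR schema from the earlier export into a new schema named HR_RESTORED might look like the following; all names and credentials are placeholders.

impdp system/<password> remap_schema=HR:HR_RESTORED dumpfile=hr_backup.dmp logfile=hr_restore.log directory=ORACLE_S3_BACKUPS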

Using Oracle RMAN

Using RMAN restore, databases can be restored on the same server or on a different server. RMAN backups can also be used for disaster recovery scenarios across different AWS Regions.

In this section, the restore was completed onto the same database instance with the following command:

sqlplus / as sysdba<<EOF
startup mount
exit
EOF

rman target / <<EOF
run {
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
CONFIGURE DEVICE TYPE DISK PARALLELISM 3;
RESTORE DATABASE;
RECOVER DATABASE;
alter database open;
}
EOF

Depending on how recently you created the backup and its file size, some or all of the file data might still be in the Amazon S3 File Gateway cache. As a best practice, you should size your cache to be greater than the size of your largest backup.

The backups can also be restored across Regions onto a different database instance by one of the two following methods:

  1. Enable Cross-Region Replication on the Amazon S3 bucket, which replicates the files across Regions and provides the ability to restore the backups in the target Region (a sample replication configuration follows the diagram below).
  2. Configure a second Amazon S3 File Gateway in the DR Region that is pointed at the original bucket, as shown in the following diagram:

configuring a second Amazon S3 File Gateway in the DR Region that is pointed at the original bucket
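For the first option, a minimal sketch of enabling Cross-Region Replication with the AWS CLI is shown below, assuming that versioning is already enabled on both the source and destination buckets and that an IAM replication role already exists; the destination bucket name, account ID, and role ARN are placeholders.

cat > replication.json <<'EOF'
{
  "Role": "arn:aws:iam::111122223333:role/s3-crr-replication-role",
  "Rules": [
    {
      "ID": "replicate-oracle-backups",
      "Prefix": "",
      "Status": "Enabled",
      "Destination": { "Bucket": "arn:aws:s3:::oracle-export-test-dr" }
    }
  ]
}
EOF

aws s3api put-bucket-replication \
    --bucket oracle-export-test \
    --replication-configuration file://replication.json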

When restoring data onto a server in a different Region, the time to restore a database might increase because the data is not in the local cache of the Amazon S3 File Gateway and must be read from S3. The restoration time depends on the file size and network bandwidth.

Cleaning up

The services involved in this solution incur costs. To avoid additional charges, clean up the example resources created while following this walkthrough. For additional information on how to clean up resources associated with AWS Storage Gateway, refer to this documentation.

Summary

In this blog post, I showed how Oracle database backups from on-premises or Amazon EC2 database instances can be stored in Amazon S3 using AWS Storage Gateway.

Database backups are generally large and require significant disk space when data must be retained for long durations. Due to high cost and limited scalability, backups in on-premises environments often have a short retention period.

With the approach described in this blog, you get a cost-effective solution with virtually unlimited, durable storage for your database backups and much longer retention than what is typically possible in an on-premises environment. This method also gives you the flexibility to store the backups across multiple Regions and use them for disaster recovery.

Thank you for reading this blog post on using AWS Storage Gateway to store Oracle database backups in Amazon S3. If you have any comments or questions, please leave them in the comments section.