AWS Database Blog

Implement Oracle GoldenGate high availability in the AWS Cloud

The need to move data from one location to another in an asynchronous manner is a goal for many enterprises. Use cases might include migrating data to a reporting database, moving applications from on premises to the cloud, storing a redundant copy in another data center, configuring active/active databases across geographic locations, and performing heterogeneous database migrations (migrating data from Oracle to PostgreSQL). For many of these use cases you can use database replication tools such as AWS Database Migration Service (AWS DMS) or Oracle GoldenGate.

Failures are a given and everything will eventually fail over time. In addition to the databases being highly available, the database replication tool should be highly available. In this post, we show you how to configure a high availability GoldenGate environment to replicate data between two Amazon Relational Database Service (Amazon RDS) for Oracle instances.

Oracle GoldenGate

Oracle GoldenGate asynchronously replicates data between databases. These databases can reside on premises or in the cloud. The databases can be self-managed or a cloud managed service, such as Amazon Relational Database Service (Amazon RDS). Oracle GoldenGate supports many different databases, including, but not limited to, Oracle, Amazon Aurora MySQL, Amazon Aurora PostgreSQL, Amazon RDS for Oracle, and Microsoft SQL Server.

In addition to migrating data and keeping two or more databases in sync, you can use GoldenGate to upgrade to a new major version of Oracle. GoldenGate bi-directional replication can enable database migrations and database upgrades with near-zero downtime.

This post focuses on creating a high availability, bi-directional GoldenGate hub environment running on Amazon Elastic Compute Cloud (Amazon EC2) to replicate data between two RDS for Oracle instances. You must create the Amazon RDS for Oracle instances prior to creating the GoldenGate hubs. Additionally, because this is a bi-directional replication environment, you must set up each RDS for Oracle instance as a source and a target database.

For instructions on installing and configuring the GoldenGate hubs, see Using Oracle GoldenGate with Amazon RDS. You must perform the GoldenGate installation on both EC2 instances.

You can use GoldenGate within an AWS Region or across multiple Regions. The following are some relevant use cases:

  • Replicate between an on-premises instance and RDS instance
  • Replicate between an on-premises instance and EC2-hosted instance
  • Replicate between an EC2-hosted instance and RDS instance
  • Replicate between two RDS instances
  • Replicate between two EC2-hosted instances

Solution Overview

The following diagram shows an active/passive GoldenGate high availability environment. GoldenGate binaries reside on an Amazon Elastic Block Store (Amazon EBS) volume. GoldenGate shared files are stored on an Amazon Elastic File System (Amazon EFS) volume.

During normal operating procedures, the GoldenGate EC2 instance in Availability Zone 1 processes bi-directional replication between the two Oracle databases (Flow 1). In the event that the GoldenGate EC2 instance in Availability Zone 1 is unresponsive, the instance in Availability Zone 2 should be enabled to process bi-directional replication (Flow 2).

All production data should be encrypted at rest. You should enable EBS volume encryption, Amazon EFS volume encryption, and Amazon RDS for Oracle encryption using server-side encryption (SSE) with AWS Key Management Service (AWS KMS). Encrypting the Amazon EFS volumes allows you to encrypt GoldenGate trail files. You can also encrypt Amazon RDS for Oracle with Oracle Advanced Security Option if Oracle Enterprise Edition is deployed.

Data in transit should also be encrypted to the RDS for Oracle instances. Amazon RDS for Oracle supports Transport Layer Security (TLS) versions 1.0 and 1.2. To use the Oracle Secure Sockets Layer (SSL) option, add the Oracle SSL option to the option group associated with the Oracle instance.

The RDS for Oracle instances should have Multi-AZ enabled to provide high availability for the RDS instances. In case of an infrastructure failure, Amazon RDS performs an automatic failover to the standby database, so you can resume database operations as soon as the failover is complete. Because the endpoint for the database instance remains the same after a failover, your application can resume database operation without the need for manual administrative intervention.

Oracle GoldenGate high availability architectures

If the node or instance that is hosting GoldenGate goes down, bi-directional replication between the databases halts. The source and target databases continue to service user requests; however, data isn’t replicated between the databases. The GoldenGate node or instance can fail due to inadvertently stopping a node or instance, inadvertently removing files from the GoldenGate home directory, the root file system reaches capacity, network access and connectivity problems, mistakenly killing a GoldenGate process, unknowingly unmounting an EBS volume, or other compute-related problems.

Amazon EFS is required for this solution. You can use Amazon EFS to store GoldenGate files (parameter files, trail files, report files, checkpoint files, process status files, temporary files, credential files, SQL script files, wallet files) that are shared between the two EC2 instances.

The following screenshot shows the required Amazon EFS file system associated with a GoldenGate high availability environment. The mount target ID is required when mounting Amazon EFS on the EC2 instances. Additionally, the security group listed on the Amazon EFS Network tab should point to the security group of the GoldenGate EC2 instances.

GoldenGate creates required subdirectories through the GoldenGate Software Command Interface (GGSCI). The create subdirs command is issued from GGSCI and subdirectories are created in the GoldenGate home directory (in this case, an Amazon EBS volume). However, because these files must be shared via Amazon EFS, you shouldn’t run the create subdirs GGSCI command. Instead, you must create symbolic links (also known as soft links) that point to Amazon EFS. There is a two-step process to create the required 15 symbolic links.

The following code creates the Amazon EFS mount point for shared GoldenGate files:

$ sudo mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-nccccnnn.efs.us-east-1.amazonaws.com:/ /u01/apps/oracle/gg-mount-point

$ cd /u01/apps/oracle/gg-mount-point

$ df -h .
Filesystem                                Size  Used Avail Use% Mounted on
fs-nccccnnn.efs.us-east-1.amazonaws.com:/ 8.0E  3.0M 8.0E  1%     /u01/apps/oracle/gg-mount-point

First, create the subdirectories on Amazon EFS. The following code shows the commands to create the subdirectories. You should run these commands using the operating system user that installs GoldenGate (typically named oracle). This operating system user is not the root user and doesn’t require root privileges. For more information about required privileges, see Operating System Privileges.

$ mkdir /u01/apps/oracle/gg-mount-point/gg/BR
$ mkdir /u01/apps/oracle/gg-mount-point/gg/br
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirchk
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirrpt
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirpcs
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirdat
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirbdb
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirprm
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirwlt
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dircrd
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirsql
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirdef
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirtmp
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirdmp
$ mkdir /u01/apps/oracle/gg-mount-point/gg/dirdsc

Second, create the symbolic links from the GoldenGate home directory:

$ ln -s /u01/apps/oracle/gg-mount-point/gg/BR BR
$ ln -s /u01/apps/oracle/gg-mount-point/gg/br br
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirchk dirchk
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirrpt dirrpt
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirpcs dirpcs
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirdat dirdat
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirbdb dirbdb
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirprm dirprm
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirwlt dirwlt
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dircrd dircrd
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirsql dirsql
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirdef dirdef
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirtmp dirtmp
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirdmp dirdmp
$ ln -s /u01/apps/oracle/gg-mount-point/gg/dirdsc dirdsc

The symbolic links appear as illustrated in the following screenshot.

$ ls -l | grep mount
lrwxrwxrwx.  1 oracle oinstall     37 Jul 19  2019 br ->  /u01/apps/oracle/gg-mount-point/gg/br
lrwxrwxrwx.  1 oracle oinstall     37 Jul 19  2019 BR -> /u01/apps/oracle/gg-mount-point/gg/BR
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirbdb -> /u01/apps/oracle/gg-mount-point/gg/dirbdb
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirchk -> /u01/apps/oracle/gg-mount-point/gg/dirchk
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dircrd -> /u01/apps/oracle/gg-mount-point/gg/dircrd
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirdat -> /u01/apps/oracle/gg-mount-point/gg/dirdat
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirdef -> /u01/apps/oracle/gg-mount-point/gg/dirdef
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirdmp -> /u01/apps/oracle/gg-mount-point/gg/dirdmp
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirdsc -> /u01/apps/oracle/gg-mount-point/gg/dirdsc
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirpcs -> /u01/apps/oracle/gg-mount-point/gg/dirpcs
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirprm -> /u01/apps/oracle/gg-mount-point/gg/dirprm
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirrpt -> /u01/apps/oracle/gg-mount-point/gg/dirrpt
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirsql -> /u01/apps/oracle/gg-mount-point/gg/dirsql
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirtmp -> /u01/apps/oracle/gg-mount-point/gg/dirtmp
lrwxrwxrwx.  1 oracle oinstall     41 Jul 19  2019 dirwlt -> /u01/apps/oracle/gg-mount-point/gg/dirwlt

The higher level gg directory can’t be symbolically linked. The symbolic links must only be created for the dir* directories shown in the preceding screenshot. There are files specific to the GoldenGate installation that must not be shared across GoldenGate environments, specifically, GoldenGate error logs, GoldenGate libraries, and GoldenGate executables.

Symbolic links are supported for GoldenGate subdirectories. Oracle has published a My Oracle Support (MOS) document validating this configuration. The MOS Doc ID is 1366597.1 and is titled “How to Modify the Location for Subdirectories Created by CREATE SUBDIRS In GoldenGate?”

Get started

You need to run the GoldenGate hub on EC2 instances. The instances must have adequate CPU, memory, and storage to handle the anticipated replication volume. After you determine the CPU and memory requirements, select a current generation EC2 instance type for the GoldenGate hub. Current generation instance types, such as instances built on the AWS Nitro System, support hardware virtual machines (HVM). HVM Amazon Machine Images (AMI) are required to take advantage of enhanced networking and also offer increased security. The following virtualized instances are built on the Nitro System and are recommended for GoldenGate hubs: C5, M5, R5, R5b, T3 and z1d.

After you create your EC2 instances in a private subnet for the GoldenGate active/passive hub, restrict network access for inbound and outbound traffic to a least privilege model. GoldenGate requires a port for the GoldenGate manager process and up to 250 dynamic ports for the process to assign to the GoldenGate collector and GoldenGate replicat processes. You should set up three inbound rules for the EC2 instance. The first port is 22 for SSH access. The second port is for the GoldenGate manager process. The third port range is for the 250 dynamic ports list. The security group source should refer to a security group in the AWS account, prefix lists, or a specific set of IP addresses (using the x.x.x.x/32 format). The security group source shouldn’t use Classless Inter-Domain Routing (CIDR).

If the minimum memory requirement for GoldenGate is not met, when a GoldenGate process starts, the process abends and errors may not be logged. Therefore, ensure that enough memory has been allocated using the following calculations:

  • The GoldenGate manager process consumes at least 75 MB of memory
  • Each GoldenGate extract process consumes at least 550 MB of memory
  • Each GoldenGate replicat process consumes at least 110 MB of memory

Therefore, to set up GoldenGate bi-directional replication on an EC2 instance, at least 1.4 GB of memory is required for GoldenGate. This calculation includes one manager process, two extract processes, and two replicat processes. Because the hub is handling the extract and replicat processes, the pump process isn’t required.

Remember, don’t run the create subdirs GGSCI command. Instead, create symbolic links that point to Amazon EFS, as we described in the previous section.

To replicate a single table (jobcode) in a specific schema (hr) using GoldenGate bi-directional replication, refer to the following DDL statements and GoldenGate parameter files. You should run all DDL and DML statements using the hr Oracle database user. Database administrative privileges are not required to create and manipulate the HR.JOBCODE table. You can connect to the Amazon RDS for Oracle instance using Oracle SQL*Developer, SQL*Plus or your preferred tool.

Create a table in the database using the following commands:

CREATE TABLE hr.jobcode
    ( job_id       number(4)
    , job_title    VARCHAR2(35) CONSTRAINT job_title_nn NOT NULL
    , min_salary   NUMBER(6)
    , max_salary   NUMBER(6)
    ) ;

ALTER TABLE hr.jobcode
         ADD ( CONSTRAINT job_id_pk
   PRIMARY KEY (job_id)
   ) ; 

On the GoldenGate EC2 instances, edit the GLOBALS parameter file in the GoldenGate home directory. The user can edit the file directly from the OS. The GLOBALS file stores parameters that relate to the Oracle GoldenGate instance as a whole. The GoldenGate admin user and GoldenGate checkpoint table must be defined. The GoldenGate admin user should have been created while installing Oracle GoldenGate.

GGSCHEMA oggadm1
CheckpointTable oggadm1.ggchkpt

Edit the GoldenGate manager parameter file from the GoldenGate home directory. The user can edit the file directly from the OS or using GGSCI with “edit param mgr”. These parameters identify the valid ports and allows GoldenGate to delete old trail files after confirming that Manager has processed all of the data within the trail file.

PORT 7809
dynamicportlist 9001-9250
PurgeOldExtracts ./dirdat/*, UseCheckpoints, MINKEEPDAYS 5

Create the first extract parameter file for the first RDS for Oracle instance. Parameter descriptions can be found in GoldenGate documentation.

EXTRACT extrds1
USERIDALIAS rds1
DDL EXCLUDE ALL
DDLOPTIONS REPORT
TRANLOGOPTIONS EXCLUDEUSER OGGADM1
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 1024)
EXTTRAIL ./dirdat/r1
TABLE hr.jobcode;

Create the second extract parameter file for the second RDS for Oracle instance.

EXTRACT extrds2
USERIDALIAS rds2
DDL EXCLUDE ALL
DDLOPTIONS REPORT
TRANLOGOPTIONS EXCLUDEUSER OGGADM1
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 1024)
EXTTRAIL ./dirdat/r2
TABLE hr.jobcode;

Create the first replicat parameter file for the first RDS for Oracle instance.

REPLICAT reprds1
USERIDALIAS rds2
ASSUMETARGETDEFS
DDLOPTIONS REPORT
DDLERROR DEFAULT IGNORE
DISCARDFILE ./dirdsc/reprds2.dsc, APPEND
DISCARDROLLOVER at 00:00
MAP hr.jobcode, TARGET hr.jobcode;

Create the second replicat parameter file for the second RDS for Oracle instance.

REPLICAT reprds2
USERIDALIAS rds1
ASSUMETARGETDEFS
DDLOPTIONS REPORT
DDLERROR DEFAULT IGNORE
DISCARDFILE ./dirdsc/reprds1.dsc, APPEND
DISCARDROLLOVER at 00:00
MAP hr.jobcode, TARGET hr.jobcode;

We use the USERIDALIAS command in the parameter files. We recommend using a GoldenGate credential store as an alternative to storing usernames and passwords in plain text parameter files.

Also, we use the TRANLOGOPTIONS EXCLUDEUSER OGGADM1 command to exclude transactions in the extract process if the listed user performs DML statements. This prevents continuous replication looping. This also confirms that the OGGADM1 database user DML statements in the source database aren’t replicated to the target database. However, all other database user DML statements against the source tables are replicated to the target database.

You can also configure GoldenGate automatic conflict detection and resolution (CDR) using the latest timestamp method.

GoldenGate has an OBEY command. Use OBEY to process a file that contains a list of GoldenGate commands. OBEY is useful for running commands that are frequently used in sequence. You can use OBEY to start the GoldenGate manager process and subsequent processes. You can use the following OBEY file to start the bi-directional replication processes on the GoldenGate hub.

The following code is the OBEY GoldenGate startup script (startreplication.oby). GGSCI should have already been leveraged to issue the add trandata, add extract, add exttrail, register extract and add replicat commands.

start mgr
start extract extrds1
start extract extrds2
start replicat reprds1
start replicat reprds2

You can run the OBEY script using GGSCI. However, prior to starting the GoldenGate processes, make sure GoldenGate processes aren’t running on the other GoldenGate EC2 instance (passive). You can accomplish this by running the command ps -fe | grep PARAMFILE | grep -v color | wc -l on the other GoldenGate EC2 instance (passive). If the result of this command is 0, GoldenGate isn’t running on the EC2 instance. Run the OBEY script with the following code:

GGSCI> OBEY ./dirsql/startreplication.oby

To test the active/passive GoldenGate bi-directional replication solution, perform the following steps:

  1. Verify Amazon EFS is mounted on the first GoldenGate EC2 instance (GG1).
  2. Start the replication processes (startreplication.oby) on the first GoldenGate EC2 instance (GG1):
GGSCI> OBEY ./dirsql/startreplication.oby
  1. Issue multiple DML statements (inserts, updates, deletes) against the hr.jobcode table on database one. Don’t use the GoldenGate administration user (OGGADM1) to issue the DML statements.
insert into hr.jobcode values (1,'President',20080, 40000);
insert into hr.jobcode values (2,'Administration Vice President',15000, 30000);
insert into hr.jobcode values (3,'Administration Assistant',3000, 6000);
insert into hr.jobcode values (4,'Finance Manager',8200, 16000);
insert into hr.jobcode values (5,'Accountant',4200, 9000);
commit;
  1. Issue multiple DML statements (inserts, updates, deletes) against the hr.jobcode table on database two:
insert into hr.jobcode values (6,'Accounting Manager',8200, 1600);
insert into hr.jobcode values (7,'Public Accountant',4200, 9000);
insert into hr.jobcode values (8,'Sales Manager',10000, 20800);
insert into hr.jobcode values (9,'Sales Representative',6000, 12008);
insert into hr.jobcode values (10,'Purchasing Manager',8000, 15000);
update hr.jobcode set max_salary = 50000 where job_id = 1;
commit;
  1. Validate all modifications were properly synchronized between the databases.

The following diagram shows the expected records for the hr.jobcode table in both databases.

  1. Stop the first GoldenGate EC2 instance (GG1) via the AWS Management Console, AWS API, or AWS Command Line Interface (AWS CLI).

At this point, GoldenGate isn’t running on either EC2 instance. If data is modified in either RDS for Oracle instance while the GoldenGate instances are down, the data is synchronized after the GoldenGate processes are started on one of the EC2 instances. The GoldenGate trail files are stored in Amazon EFS. Amazon EFS file systems are elastic, and automatically grow and shrink as you add and remove files. You do not provision file system size up front, and you pay only for what you use. GoldenGate knows which files were processed and trail files that have not been processed will not be purged. If GoldenGate is not started within a specific period, the trail files that have not been fully processed will still reside in Amazon EFS.

  1. Issue additional DML statements (inserts, updates, deletes) against the hr.jobcode table on database one:
    insert into hr.jobcode values (11,'Oracle DBA',88000, 140000);
    commit;
    
  1. Issue additional DML statements (inserts, updates, deletes) against the hr.jobcode table on database two:
    insert into hr.jobcode values (12,'AWS Developer',89000, 150000);
    delete from hr.jobcode where job_id = 6;
    delete from hr.jobcode where job_id = 8;
    commit;
  1. Validate that none of the modifications are synchronized.
  2. Start the second GoldenGate EC2 instance (GG2), mount Amazon EFS, and start the GoldenGate processes (startreplication.oby).
  3. Validate the DML statements in Steps 7 and 8 are synchronized in both databases.

The following diagram shows the expected records for the hr.jobcode table in both databases.

If GoldenGate abends due to data or configuration problems, don’t fail over to the passive GoldenGate hub. Resolve the problem on the active GoldenGate hub. If the extract or replicat abends, it can be automatically restarted on the GoldenGate hub. You can configure this using the AUTORESTART parameter in the GoldenGate parameter file.

Additionally, when a GoldenGate hub instance is started, Amazon EFS should be automatically mounted and the GoldenGate processes should be automatically started.

You can determine GoldenGate latency via the Oracle database (GV$% views) or via GGSCI. The following diagram illustrates GoldenGate extract and replicat latency for an Oracle database via GGSCI.

You can configure, manage, and monitor GoldenGate automatic conflict detection and resolution. For more information, see Configuring Conflict Detection and Resolution.

The GoldenGate hub should undergo a stress test to ensure that it can scale easily up to twice the amount of expected traffic. Additionally, you should monitor the GoldenGate EC2 instances via Amazon CloudWatch and publish notifications via Amazon Simple Notification Service (Amazon SNS). If CPU utilization exceeds 80% or I/O throughput is throttled, consider changing the EC2 instance type, EBS volume type, or Provisioned IOPS setting. If the EC2 instance system status check isn’t set to running, an SNS notification should be sent to the GoldenGate administrator to fail over the GoldenGate processes to the secondary EC2 instance.

You should also set up CloudWatch alarms for AWS CloudTrail events. For example, a CloudWatch alarm should be triggered when configuration changes occur on security groups. This alerts the operations team when someone attempts to gain access to the GoldenGate EC2 instances.

We recommend using CloudWatch Logs to store logs (including the GoldenGate error logs). For security purposes, make sure the HOSM (execution) role used is following the least privilege design. We suggest only giving access to PutLogEvents to the EC2 instances.

Periodically (every 2 weeks at minimum), run a scan against your EC2 instances and verify compliance. You can accomplish this by using Amazon Inspector. Amazon Inspector is an automated security assessment service that helps improve the security and compliance of applications deployed on AWS. Amazon Inspector automatically assesses applications for exposure, vulnerabilities, and deviations from best practices. After performing an assessment, Amazon Inspector produces a detailed list of security findings prioritized by level of severity. You can review these findings directly or as part of detailed assessment reports that are available via the Amazon Inspector console or API.

Conclusion

In this post, we created an active/passive GoldenGate hub environment. This architecture illustrates how to replicate two RDS for Oracle instances asynchronously and bi-directionally. If the active GoldenGate hub experiences an interruption or is inadvertently shut down, no transactions are omitted from the replication stream when the passive GoldenGate hub is activated.

If you want to try out GoldenGate with Amazon RDS, refer to Using Oracle GoldenGate with Amazon RDS.


About the Author

 

Marvin Vinson is an Oracle Database Specialist for AWS. He works in the Worldwide Public Sector (WWPS) organization. He is focused on migrating Oracle workloads to AWS.