AWS Database Blog

Automate switchover of Oracle E-Business Suite on Amazon RDS Custom for Oracle

Oracle E-Business Suite is a business application suite that includes financial, supply chain, human resources, and customer relationship management modules. Due to the application’s critical nature, it’s important to architect Oracle E-Business Suite for high availability and develop the ability to switch over the application and database as quickly as possible in order to minimize downtime and maintain business continuity.

One method for achieving high availability is to perform an application switchover. In this post, we demonstrate how to perform an Oracle E-Business Suite application and database switchover on Amazon RDS Custom for Oracle.

Amazon RDS Custom for Oracle is a managed database service for legacy, custom, and packaged applications that require access to the underlying operating system and database environment. It automates database administration tasks and operations while making it possible for database administrators to access and customize the database environment and operating system. AWS takes care of the heavy lifting such as backups and high availability, allowing database administrators to focus on maintaining their Oracle E-Business Suite application and functionality.

The following are some common use cases when an application switchover might be performed:

  • During planned maintenance of the primary database and application servers, in order to minimize application downtime
  • In the event that the primary server’s hardware fails, in order to keep the application running
  • To avoid downtime during a software upgrade on the primary database server

Switchover

A switchover is a planned role reversal between the primary and the standby database. It is typically used when there is a planned outage on the primary server or database, with the aim of reducing downtime for the service.

Failover

In an actual disaster or severe outage, where the primary server is unreachable or unrecoverable, a failover must be performed. A failover turns the standby database into the primary and cannot be reversed, so the decision to fail over should be made carefully. With minor changes to the provided code, such as using Data Guard failover commands, you can achieve a database failover.

In this post, we focus on the switchover use cases; we do not cover failover.

Key considerations for high availability configuration

Setting up a high availability configuration using Amazon RDS Custom with Oracle E-Business Suite involves the following:

Solution overview

For the solution we assume the application and database are set up to use logical hostnames, which reduces the number of steps required after the database switchover. Logical hostnames remove the reliance on physical hostnames, reducing the overall complexity. A read replica of RDS Custom for Oracle is part of the design, which uses Oracle Data Guard under the covers for the replication between the primary and standby database.

We use the following AWS native service offerings to automate the switchover process for our Oracle E-Business Suite application and database:

The application is configured for high availability, as depicted in the following diagram, and we develop an automation to trigger the switchover process based on the architectural configuration.

This post guides you through the process of setting up and configuring your environment for automated switchover. The high-level implementation steps are as follows:

  1. Create AWS Identity and Access Management (IAM) roles or use an existing role that has Lambda run permissions and state machine run permissions.
  2. Add an additional IAM policy statement action to an existing IAM role for Amazon EC2 instances.
  3. Create a parameter store and an AWS Secrets Manager secret.
  4. Create business logic using shell scripts.
  5. Create a Lambda function that calls the shell scripts to perform the desired task in the switchover process.
  6. Create the Step Functions state machine and workflow that calls the Lambda functions in sequence.
  7. Trigger the switchover automation.

Instructions on how to deploy this automation are documented in the GitHub repo.

Prerequisites

This post assumes you have the following set up:

  • Oracle E-Business Suite running on AWS with Amazon RDS Custom for Oracle. See Migrate Oracle E-Business Suite to Amazon RDS Custom for more details.
    • Application version Oracle E-Business Suite 12.2.x.
  • A source database running on Amazon RDS Custom for Oracle.
    • These steps can also be adapted for an Oracle database running on Amazon EC2.
  • Multiple Oracle E-Business Suite application tiers, across two Availability Zones. See Set up an HA/DR architecture for Oracle E-Business Suite on Amazon RDS Custom with an active standby database for more details.
  • A database OS user and an application OS user. You can use whatever names you have defined; in this example we use rdsdb for the database OS user and applmgr for the application OS user. If your environment uses different OS user names, consult the README in the AWS Cloud Development Kit (AWS CDK) GitHub repository to edit the Lambda functions so that they reflect your environment's OS user standards.
  • For the RDS Custom EC2 primary and standby instances, a tag with the key as Name and the value as a unique hostname.
  • For the application EC2 primary and standby instances, a tag with the key as Name and the value as a unique hostname.
  • SSM Agent installed and configured in all the EC2 instances. For instructions, refer to Working with SSM Agent.
  • An SSH trust relationship established between the Oracle E-Business Suite primary and secondary application nodes using applmgr or an equivalent user.

The services used to create this automation are charged on a per-use basis. Depending on usage, the automation can be run at little or no cost. For additional information on the costs of these services, see the following:

Create a new IAM role or use an existing role

For the automation to work, we need a role with Lambda run permissions and state machine run permissions. You can either use an existing role or create a new role with these permissions.

Add an additional policy statement to the IAM role for Amazon EC2 instances

Add the following policy statement actions to the role that is assigned to all Amazon EC2 instances of the Oracle E-Business Suite database and application. These actions are used in automation scripts to perform switchover activities.

  • elasticloadbalancing:DescribeLoadBalancers
  • elasticloadbalancing:ModifyListener
  • secretsmanager:GetSecretValue
  • elasticloadbalancing:DescribeListeners
  • ec2:DescribeAvailabilityZones
  • secretsmanager:PutSecretValue
  • elasticloadbalancing:DescribeTargetGroups
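
As an illustration only, the actions above could be assembled into a single policy statement as sketched below. The statement ID and the "*" resource are assumptions that are not taken from the GitHub repo; in practice you would likely scope the Secrets Manager actions to the specific secret ARN.

```python
import json

# Actions listed in this post for the EC2 instance role.
SWITCHOVER_ACTIONS = [
    "elasticloadbalancing:DescribeLoadBalancers",
    "elasticloadbalancing:DescribeListeners",
    "elasticloadbalancing:DescribeTargetGroups",
    "elasticloadbalancing:ModifyListener",
    "ec2:DescribeAvailabilityZones",
    "secretsmanager:GetSecretValue",
    "secretsmanager:PutSecretValue",
]


def build_policy_document(resources="*"):
    """Return an IAM policy document with one Allow statement."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EbsSwitchoverAutomation",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": SWITCHOVER_ACTIONS,
                "Resource": resources,  # assumption: narrow this where possible
            }
        ],
    }


def render_policy(resources="*"):
    """Serialize the policy document as JSON for the IAM console or CLI."""
    return json.dumps(build_policy_document(resources), indent=2)
```

Calling render_policy() produces JSON that can be pasted into the IAM policy editor and adjusted to your own resource restrictions.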

Create a parameter store for the metadata

You must have a parameter store in the same account and Region where the primary and standby environments are configured. The naming convention of the parameter store should be R12-<EnvName>-Nodetab with parameter store type SecureString. For example, if the environment name is VIS, the parameter store should be named R12-VIS-Nodetab.

The values in the parameter store should contain the following information in the specified order for both the primary and standby environments, as well as for each Oracle E-Business Suite virtual hostname. The values are separated by a colon (:). Color coding is used for illustrative purposes to distinguish each entry in the parameter store.

EnvName:Standby DB Unique Name:Standby DB EC2 Unique Hostname:Standby App EC2 Unique Hostname:ALB Target Group Name For Standby App EC2 Instances:ALB Name:ALB Listener Port:Application Logical Hostname:Application Base Path:(P)rimary/(S)econdary Application Node

The following are sample entries for a two-node application server configured with Amazon RDS Custom for both primary and standby configuration:

VIS:VIS_P:EBIZ-P-VIS-DB-AZ1:EBIZ-P-VIS-APP-01:TGP-VIS-PRM:ALB-VIS-PRM:443:EBSAPP01:/fh01/VIS:P
VIS:VIS_P:EBIZ-P-VIS-DB-AZ1:EBIZ-P-VIS-APP-01:TGP-VIS-PRM:ALB-VIS-PRM:443:EBSAPP02:/fh01/VIS:S
VIS:VIS_S:EBIZ-S-VIS-DB-AZ2:EBIZ-S-VIS-APP-01:TGP-VIS-STBY:ALB-VIS-STBY:443:EBSAPP01:/fh01/VIS:P
VIS:VIS_S:EBIZ-S-VIS-DB-AZ2:EBIZ-S-VIS-APP-01:TGP-VIS-STBY:ALB-VIS-STBY:443:EBSAPP02:/fh01/VIS:S

During the switchover process, the automation reads the EC2 tag keys and parameter store to retrieve the application hostname information. As a result, the automation’s successful run depends on the accuracy of the details specified in the parameter store.
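
Because the automation depends on these entries being well formed, it can help to validate them before a switchover. The following is a minimal Python sketch of such a check, not the parsing code used by the automation itself. The field names and the assumption that entries are separated by newlines are ours.

```python
from typing import List, NamedTuple


class NodetabEntry(NamedTuple):
    """One colon-separated entry; field names are illustrative."""
    env_name: str
    db_unique_name: str
    db_ec2_hostname: str
    app_ec2_hostname: str
    target_group_name: str
    alb_name: str
    listener_port: int
    app_logical_hostname: str
    app_base_path: str
    node_role: str  # "P" (primary) or "S" (secondary) application node


def parameter_name(env_name: str) -> str:
    """Build the parameter store name, for example R12-VIS-Nodetab."""
    return f"R12-{env_name}-Nodetab"


def parse_nodetab(value: str) -> List[NodetabEntry]:
    """Parse the parameter value, assuming one entry per line."""
    entries = []
    for line_no, line in enumerate(value.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        parts = line.split(":")
        if len(parts) != 10:
            raise ValueError(f"line {line_no}: expected 10 fields, got {len(parts)}")
        if parts[9] not in ("P", "S"):
            raise ValueError(f"line {line_no}: node role must be P or S")
        entries.append(NodetabEntry(*parts[:6], int(parts[6]), *parts[7:]))
    return entries
```

For example, parsing the first sample entry above yields an entry whose listener_port is 443 and whose node_role is "P".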

Create a secret in Secrets Manager for instance passwords

The Oracle E-Business Suite environment passwords are stored in Secrets Manager. The secret is created in the same account and Region where the primary and standby environments are configured. The naming convention of the secret should be R12-<EnvName>-Secret with the secret type Other type of secret. For example, if the environment name is VIS, the secret name should be R12-VIS-Secret.

The automation reads the secret during the switchover process to stop and restart the application services. The automation logic assumes that the first password is always the system database user password, the second is always the apps password, the third is always the sysadmin user password, and the fourth is always the WebLogic user password. As a result, the order in which the user names and passwords are stored is critical for the automation.

User names should be stored in the following order in the Secret Key field:

system:apps:sysadmin:weblogic:ebs_system:custom

Passwords for the aforementioned users should be stored in the Secret Value field in the same order. The following are some sample passwords to demonstrate how they are stored in the same order as the user name:

system_123:apps_234:sysadmin_121:weblogic_324:ebs_system_3123:custom_123

The user ebs_system is introduced after AD TXK 11 patching in Oracle E-Business Suite. If you are on a prior version of AD TXK, you may skip that user in the sequence.
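
To make the positional pairing concrete, the following Python sketch matches user names to passwords by position and checks that the first four users appear in the order the automation expects. It is an illustration under our own assumptions rather than the code in the repository, and it assumes that passwords do not themselves contain a colon.

```python
from typing import Dict

# Order that the automation logic relies on, as described in this post.
REQUIRED_ORDER = ("system", "apps", "sysadmin", "weblogic")


def secret_name(env_name: str) -> str:
    """Build the secret name, for example R12-VIS-Secret."""
    return f"R12-{env_name}-Secret"


def pair_credentials(secret_key: str, secret_value: str) -> Dict[str, str]:
    """Pair colon-separated user names with passwords by position."""
    users = secret_key.split(":")
    passwords = secret_value.split(":")
    if len(users) != len(passwords):
        raise ValueError(
            f"{len(users)} user names but {len(passwords)} passwords"
        )
    if tuple(users[:4]) != REQUIRED_ORDER:
        raise ValueError(f"first four users must be {':'.join(REQUIRED_ORDER)}")
    return dict(zip(users, passwords))
```

With the sample values above, pair_credentials returns a dictionary in which "apps" maps to "apps_234" and "weblogic" maps to "weblogic_324".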

Create business logic using shell scripts

In addition to the services we’ve discussed in this post, we use Unix shell scripting to implement the business logic for this automation. The following table lists the scripts we use. These scripts are located in the OS user home for both the database and the application, in our case /home/rdsdb/auto_switch/ for the database user and /home/applmgr/auto_switch/ for the application user. These OS user names are used for simplicity; replace them with whichever users you have defined.

You have full access to these scripts, and you can modify them to suit your use case, for example by changing the command from ‘switchover’ to ‘failover’ in atsdbctl. Test and verify any changes in a test system first.

Script Name | Hosting Server | Location | Purpose
common_env | Amazon EC2 DB Server and EC2 App Server | $HOME/auto_switch | Common functions used by other Unix shell scripts
atsdbctl | Amazon EC2 DB Servers | $HOME/auto_switch | Identify primary database, perform switchover, and mount the standby database
atsappctl | Amazon EC2 APP Server Node 1 | $HOME/auto_switch | Stop and start application services, update the Application Load Balancer target group

The scripts are placed in the $HOME/auto_switch folder on all the RDS Custom for Oracle EC2 instances and on the application server EC2 instances.

The shell scripts contain the following business logic:

  • To display and write to log files, the common_env file has a function called message, which is used by both atsdbctl and atsappctl.
  • atsdbctl is stored in the RDS Custom EC2 DB servers. The script has the functions fetch_primary and switchover_db, which are used to identify the database role (primary or standby), switch over the database, and mount the standby database post-switchover. Although the switchover command ensures the changes are replicated, we highly recommend that you check that the standby database is not lagging far behind the primary before you run the switchover. This prevents unnecessary delays during the switchover. Review the following AWS documentation on how to check replication lag.
  • atsappctl is stored in the Oracle E-Business Suite application Amazon EC2 instance. The script has functions stop_apps, start_apps, and elbswitch. These functions stop and start application services by identifying the correct primary and secondary application servers from the parameter store. The elbswitch function switches the Elastic Load Balancer listener to the appropriate primary or secondary target group after the database switchover.
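
The elbswitch logic lives in a shell script, but the sequence of Elastic Load Balancing calls it needs can be sketched in Python with boto3. The following is an illustration of that sequence under our own assumptions and is not a transcription of atsappctl. The function name and parameters are hypothetical, and the client is passed in so the logic can be exercised without AWS access.

```python
def switch_listener(elbv2, alb_name, listener_port, target_group_name):
    """Forward the ALB listener on listener_port to the named target group.

    elbv2 is a boto3 "elbv2" client (or any object with the same methods).
    Returns the ARN of the listener that was modified.
    """
    load_balancers = elbv2.describe_load_balancers(Names=[alb_name])
    lb_arn = load_balancers["LoadBalancers"][0]["LoadBalancerArn"]

    listeners = elbv2.describe_listeners(LoadBalancerArn=lb_arn)["Listeners"]
    listener_arn = next(
        (item["ListenerArn"] for item in listeners if item["Port"] == listener_port),
        None,
    )
    if listener_arn is None:
        raise LookupError(f"no listener on port {listener_port} for {alb_name}")

    target_groups = elbv2.describe_target_groups(Names=[target_group_name])
    tg_arn = target_groups["TargetGroups"][0]["TargetGroupArn"]

    # Replace the listener's default action so traffic goes to the new group.
    elbv2.modify_listener(
        ListenerArn=listener_arn,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
    )
    return listener_arn
```

In a real run you might call it as switch_listener(boto3.client("elbv2", region_name="eu-west-1"), "ALB-VIS-STBY", 443, "TGP-VIS-STBY"), using the values stored in the parameter store entry for the environment that is becoming primary.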

Create a Step Functions state machine

The following diagram shows the overall design and flow of the automation.

The Step Functions state machine orchestrates the complete flow of this switchover automation process. The program logic is written in shell scripts that are housed on Amazon EC2 Linux servers. The Lambda functions, written in Python, invoke the shell scripts using Systems Manager on their respective Amazon EC2 instances to perform the required operations. Parameter Store is used to hold metadata information like container database (CDB) name, pluggable database (PDB) name, Amazon EC2 instance names of the database, base path of APPL_TOP, application server logical hostnames, the ARN of the Application Load Balancer listener, the ARN of the Application Load Balancer target group, and a flag to identify the first application node. The information stored in Parameter Store is used by the automation process to start and stop services in the correct order. Secrets Manager stores the environment credentials.
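
As a rough sketch of how a Python Lambda function might invoke a shell script through Systems Manager, the following helper uses the Run Command API with the AWS-RunShellScript document and targets the instance by its Name tag. The helper names, the use of su to switch OS user, and targeting by tag are our assumptions; the functions in the GitHub repository may be structured differently.

```python
import shlex


def build_shell_command(os_user, script_command):
    """Wrap a script invocation so it runs as the given OS user."""
    return f"su - {shlex.quote(os_user)} -c {shlex.quote(script_command)}"


def run_script(ssm, hostname, os_user, script_command):
    """Send the script to the EC2 instance whose Name tag equals hostname.

    ssm is a boto3 "ssm" client (or any object with a compatible
    send_command method). Returns the Systems Manager command ID, which
    can be polled later for status and output.
    """
    response = ssm.send_command(
        Targets=[{"Key": "tag:Name", "Values": [hostname]}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [build_shell_command(os_user, script_command)]},
    )
    return response["Command"]["CommandId"]
```

For example, run_script(boto3.client("ssm"), "EBIZ-P-VIS-DB-AZ1", "rdsdb", "/home/rdsdb/auto_switch/atsdbctl") would submit the database script to the primary database host from the sample parameter store entries. Any arguments the script expects are omitted here because they depend on the script version you deploy.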

The following JSON string payload is passed as input to the Step Functions state machine. The JSON string is appended through the flow of the automation to capture the details required for the subsequent steps in the automation. The Lambda functions read the payload to identify the environment details to perform the desired task in the process of switchover.

{
  "Comment": "ERP-Autoswitch-Payload",
  "Region": "eu-west-1",
  "Target": "VIS"
}
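
To illustrate how the payload can grow as it moves through the flow, the following Python sketch validates the incoming payload and returns a copy with extra details added. The key name PrimarySid and the helper name are hypothetical; the actual keys appended by the Lambda functions are defined in the GitHub repository.

```python
REQUIRED_KEYS = ("Region", "Target")


def append_to_payload(event, **details):
    """Return a copy of the payload with extra details for later steps.

    Raises KeyError if the payload is missing a key that every step needs.
    """
    missing = [key for key in REQUIRED_KEYS if key not in event]
    if missing:
        raise KeyError(f"payload is missing: {', '.join(missing)}")
    updated = dict(event)  # leave the caller's dictionary untouched
    updated.update(details)
    return updated


def example_handler(event, context=None):
    """Hypothetical Lambda handler: add a placeholder primary SID."""
    # In the real flow this value would come from the database host.
    return append_to_payload(event, PrimarySid="EXAMPLE_SID")
```

Because a Lambda task in Step Functions passes its return value to the next state, returning the enlarged dictionary is enough to make the new detail available to the subsequent functions.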

The switchover process triggers several Lambda functions after the state machine is triggered with the payload:

  1. The first Lambda function identifies the target database by reading the JSON string payload and identifies the respective parameter store to determine the Amazon EC2 instance of RDS Custom for Oracle database. The function connects to the RDS Custom for Oracle EC2 host to determine the SID of the primary database by invoking the shell script, which calls the dgmgrl command internally. The primary database SID returned by the function is appended to the JSON string payload.
  2. The second function stops the application services of the primary database. The application Amazon EC2 hostname is pulled by reading the parameter store. The function then connects to the Amazon EC2 application server and initiates the shell script to stop the application services by reading the virtual application host details from the parameter store.
  3. The third function switches over the database. The standby RDS Custom EC2 instance is identified by reading the parameter store. The function then invokes the shell script to switch over the database.
  4. The fourth function starts the application services. The function takes a similar approach to starting application services as it did to stopping application services.
  5. The final function repoints the target group in the Application Load Balancer by fetching the details from the parameter store.
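
The five steps above map naturally onto a linear chain of Task states in Amazon States Language. The following Python sketch builds such a definition as a dictionary. The state names and the idea of passing Lambda ARNs in a dictionary are illustrative assumptions, and the definition in the AWS CDK stack may include additional states such as error handling.

```python
import json

# Illustrative state names, in the order described in this post.
STEPS = [
    "FetchPrimaryDatabase",
    "StopApplicationServices",
    "SwitchoverDatabase",
    "StartApplicationServices",
    "SwitchLoadBalancerTargetGroup",
]


def build_state_machine(lambda_arns):
    """Build a linear state machine definition from a name-to-ARN mapping."""
    states = {}
    for index, name in enumerate(STEPS):
        state = {"Type": "Task", "Resource": lambda_arns[name]}
        if index + 1 < len(STEPS):
            state["Next"] = STEPS[index + 1]  # hand the payload to the next step
        else:
            state["End"] = True  # the last step ends the execution
        states[name] = state
    return {
        "Comment": "Sketch of the switchover flow",
        "StartAt": STEPS[0],
        "States": states,
    }


def render_definition(lambda_arns):
    """Serialize the definition as the JSON string Step Functions expects."""
    return json.dumps(build_state_machine(lambda_arns), indent=2)
```

Each Task state receives the output of the previous one as its input, which is how the appended JSON payload reaches every function in sequence.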

Trigger the switchover automation

As previously stated, the automation is built with the Step Functions state machine. We’ve included two popular options for triggering this automation.

The first method is to start the state machine by choosing Start execution on the Step Functions console and passing the JSON string payload.

The second method is to schedule the state machine using Amazon EventBridge rules. You can create a schedule-based event rule with the target API as StartExecution and select the automation state machine. The following screenshot shows an example of the target details and input payload from an EventBridge rule definition.
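
Both options ultimately call the Step Functions StartExecution API, so the same trigger can also be scripted. The following is a minimal Python sketch using a boto3 Step Functions client. The function name is hypothetical, and the state machine ARN must be supplied from your own deployment.

```python
import json


def start_switchover(sfn, state_machine_arn, region, target):
    """Start the switchover state machine with the payload from this post.

    sfn is a boto3 "stepfunctions" client (or a compatible object).
    Returns the ARN of the execution that was started.
    """
    payload = {
        "Comment": "ERP-Autoswitch-Payload",
        "Region": region,
        "Target": target,
    }
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),  # the API expects the input as a JSON string
    )
    return response["executionArn"]
```

A call such as start_switchover(boto3.client("stepfunctions", region_name="eu-west-1"), arn, "eu-west-1", "VIS") would be equivalent to choosing Start execution on the console with the sample payload, where arn is the ARN of your deployed state machine.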

Cleanup

To avoid charges for resources you no longer need, follow the instructions to destroy the CDK stack in the GitHub repo.

Troubleshooting

To troubleshoot issues with the commands run by the underlying scripts, check the log directory, which is located in the OS home of the database and application users. For the database user, this is /home/rdsdb/auto_switch/logs, and for the application user, /home/applmgr/auto_switch/logs. Change rdsdb and applmgr to whichever names you have defined for your users. For example, the following log shows that the primary database is not running:

$ cat /home/rdsdb/auto_switch/logs/erp_app_switch_log_202304171244.log
Mon Apr 17 12:44:52 UTC 2023 :  INFO: Fetching Primary Database Information
Mon Apr 17 12:44:52 UTC 2023 :  ERROR: Database VIS is not running.
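
When a log is long, it can be quicker to pull out only the error lines. The following Python sketch does this for the line layout shown above (timestamp, level, message). The regular expression is inferred from this single sample, so treat it as an assumption and adjust it if your logs differ.

```python
import re

# Matches lines such as:
# "Mon Apr 17 12:44:52 UTC 2023 :  ERROR: Database VIS is not running."
LOG_LINE = re.compile(r"^(?P<timestamp>.+?) :\s+(?P<level>[A-Z]+): (?P<message>.*)$")


def find_errors(log_text):
    """Return (timestamp, message) tuples for every ERROR line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") == "ERROR":
            errors.append((match.group("timestamp"), match.group("message")))
    return errors


def find_errors_in_file(path):
    """Convenience wrapper that reads a log file from disk."""
    with open(path, encoding="utf-8") as handle:
        return find_errors(handle.read())
```

Applied to the sample log above, find_errors returns a single entry with the message "Database VIS is not running."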

Conclusion

In this post, we walked through automating the Oracle E-Business Suite switchover process using AWS native service offerings, without needing to step outside your AWS Cloud environment.

The complete code for this automation, which includes the related IAM policies and roles, Lambda functions, and Step Functions state machine, is packaged in the form of an AWS CDK stack. For deployment instructions, refer to the GitHub repository.

We welcome your thoughts and comments in the comments section.


About the authors

Simon Cunningham holds multiple OCP and AWS certifications, and has over 20 years’ experience supporting Oracle applications and migrating to AWS.

Sridhar Mahadevan has over 19 years of experience administering Oracle Database and applications, with expertise in migrating and modernizing Oracle applications to AWS.

Saurabh Verma is a Sr Cloud Architect with Amazon Web Services. Saurabh has a keen interest in industry and cloud trends and is passionate about designing and developing Cloud native solutions for AWS customers.