AWS Partner Network (APN) Blog

Comparing Oracle Exadata Database Performance with Amazon RDS for Oracle

By Indranil Banerjee, Transformation Lead – TCS
By Sanjay Gupta, Sr. Partner Solutions Architect – AWS
By Debashish Pradhan, AWS Architect – TCS


Many organizations are looking for agility, elasticity, and scalability so they can quickly roll out new capabilities to their end customers. Database migration to Amazon Web Services (AWS) is one such priority on digital agendas and often includes Oracle Exadata migration to AWS. This is driven by digital projects addressing the following challenges and limitations:

  • CapEx model where customers plan for future capacity and pay for it from day one, even if they're not using it.
  • Overhead of managing annual licensing audits.
  • Multi-year lock-ins.
  • Limitations of buying capacity for peaks.

Customers are often worried that moving away from Oracle Exadata to AWS will impact database performance. They are looking for proof points for migrating on-premises Oracle databases running on Exadata to AWS for their mission-critical applications.

In this post, we will explore how Tata Consultancy Services (TCS) supported a UK-based utility provider’s digitalization journey by comparing performance of their on-premises Oracle Exadata database with Amazon RDS for Oracle.

TCS is an AWS Premier Tier Services Partner and Managed Cloud Services Provider (MSP) with Migration Competency.

Use Case of a Utility Customer

A UK-based utility provider running its business on an old version of Oracle Utilities needed to upgrade to meet its growth and performance aspirations. The upgrade entailed moving to Oracle Utilities Customer to Meter (C2M), a solution that runs on the Oracle-native Exadata database.

While AWS was the preferred cloud platform for the customer, moving away from Oracle Exadata to Amazon RDS for Oracle was proving difficult due to concerns about degraded performance. In such scenarios, TCS proposes a time-bound Proof of Concept (PoC) to expedite decision-making.

The objective of the PoC was to prove the current workload could be supported on Amazon RDS for Oracle, and to give the customer confidence that the same level of database performance could be achieved after moving away from Oracle Exadata.

Key principles for the PoC were:

  • The production workload should be carefully analyzed to identify simulation criteria.
  • A proper sizing analysis should be done to baseline the target environment.
  • The workload simulation needs to achieve a high degree of precision.
  • Success criteria need to be well-defined and measurable.
  • Results should be comparable in a like-for-like setting.

In this scenario, only database performance was compared between the Oracle Exadata database and the target Amazon RDS for Oracle instance; the application layer was not compared.

Success Criteria

For this PoC, the following performance parameters were used to compare Amazon RDS for Oracle with the on-premises Oracle Exadata database.

| # | Parameter | Target Measure | Tools Used |
|---|-----------|----------------|------------|
| 1 | System performance of Amazon RDS for Oracle | Avg. CPU < 75%, avg. memory < 75%, IOPS less than the maximum assigned for the storage | Amazon CloudWatch |
| 2 | Replay reliability/divergence | < 5%; the number of rows returned by each call is compared and the divergence percentage reported | RAT replay report |
| 3 | Replay run time | 7 hrs. 16 min. | RAT replay report |
| 4 | SQL efficiency – % DB change | Same as on premises or improved | RAT replay report and AWR report |
| 5 | SQL efficiency – common/long running | Same as on premises or improved | RAT replay report and AWR report |
| 6 | DB instance efficiency | Same as on premises or improved | Performance Insights |
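As an illustration, the threshold-based criteria above can be expressed as a simple pass/fail check. The following Python sketch is hypothetical: the function name and the sample metric values are invented for illustration and are not actual PoC data.

```python
# Hypothetical sketch: evaluate the threshold-based success criteria
# (CPU, memory, IOPS, replay divergence) against measured values.
# Sample numbers below are illustrative, not actual PoC results.

def meets_success_criteria(avg_cpu_pct, avg_mem_pct, iops, max_iops,
                           divergence_pct):
    """Return a dict mapping each criterion to pass/fail."""
    return {
        "cpu": avg_cpu_pct < 75.0,           # avg. CPU < 75%
        "memory": avg_mem_pct < 75.0,        # avg. memory < 75%
        "iops": iops < max_iops,             # below storage maximum
        "divergence": divergence_pct < 5.0,  # replay divergence < 5%
    }

result = meets_success_criteria(62.4, 55.1, 18000, 40000, 1.8)
print(all(result.values()))  # True when every criterion is met
```

A check like this makes it easy to summarize each iteration's CloudWatch and RAT replay report figures against the agreed targets.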

Approach

The approach was to capture the exact production workload during peak hours and replay it on the target, rather than using a synthetic workload generated by available market tools.

TCS leveraged the Oracle-provided Real Application Testing (RAT) tool for performance benchmarking. RAT provides granular testing, analysis, and validation of a specific aspect of the technology stack (disk performance, CPU speed) or a critical business process (a payroll or billing run that must complete within a specific timeframe).

With RAT, specific workloads are stress-tested for performance, capacity, speed, or other criteria. A specified process was recorded in the on-premises production environment and then replayed on Amazon RDS for Oracle with real-world demands to ensure it meets specific technical and user experience requirements.


Figure 1 – High-level steps involved in the PoC.

Performance Testing Considerations

The source Exadata appliance was a quarter rack with 48 CPU cores (equivalent to 96 vCPUs) for the database servers, configured to host multiple databases. The workload to be tested consumed 33% of the resources. Therefore, the Amazon RDS instance type db.r5.12xlarge (24 CPU cores, 384 GB memory) was picked for this PoC.
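The sizing arithmetic can be sketched as follows. Note that the headroom factor below is an assumption for illustration; the post itself only states the 33% resource share and the chosen 24-core instance.

```python
# Sketch of the sizing arithmetic described above. The ~1.5x peak
# headroom factor is a hypothetical assumption, not from the PoC.
import math

source_cores = 48        # quarter-rack Exadata database server cores
workload_share = 0.33    # tested workload used ~33% of resources
headroom = 1.5           # assumed buffer for peaks and growth

required_cores = math.ceil(source_cores * workload_share * headroom)
print(required_cores)  # 24, matching the 24-core db.r5.12xlarge
```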

Steps Executed on Premises

Using the Data Pump utility, an export of the Oracle database was taken and uploaded to an Amazon Simple Storage Service (Amazon S3) bucket. After the backup completed, RAT was used to capture the workload for users, programs, and sessions using inclusion/exclusion filters.

Steps Executed on AWS

An Amazon RDS for Oracle db.r5.12xlarge instance was created, and the database dump and RAT replayable files were downloaded to the Amazon RDS database directory. Next, the database dump was imported into Amazon RDS for Oracle, and a snapshot of the RDS instance was taken. This snapshot was restored for iterations 2 through 5.

A pre-defined SQL script was used to warm up the RDS instance, and then RAT replay clients replayed the RAT replayable files on RDS. Amazon CloudWatch, Automatic Workload Repository (AWR), Active Session History (ASH), and Automatic Database Diagnostic Monitor (ADDM) were used to analyze the captured output. For each subsequent iteration, the Amazon RDS snapshot was restored and parameters were tweaked to fine-tune the database replay runtime.

Iterative Performance Testing

Throughout testing, the Amazon RDS for Oracle 19c instance size was db.r5.12xlarge. Application tuning was deliberately avoided during all iterations, as the objective was to benchmark the on-premises Oracle database on Exadata against the Oracle database on Amazon RDS.

For this PoC, five testing iterations were conducted within a four-week period on Oracle 19c.

Configuration Setup and Fine Tuning

RAT replay can be configured with three key settings:

  1. Synchronization: For synchronization=SCN, replay follows the COMMIT order captured in the source workload. This method may introduce significant delays for some workloads; for those, it's recommended to use TIME as the synchronization parameter. For synchronization=TIME, replay follows the wall-clock timings captured at the source instead of the commit order. All database session login times are replayed exactly as captured, and all timing between transactions within database sessions is preserved. This method produces good replays for most workloads.
  2. Think time: This controls the think time between database calls. A value of 100% preserves the exact think time between calls, while values below 100% drive a higher request rate. Think time was 100% for all iterations except the last, which simulated reduced think time and cut execution time significantly.
  3. Replay clients and instance size: The replay driver is a special client program that consumes the processed workload and sends requests to the replay system, preserving the timing, concurrency, and dependencies of the captured system. The TCS project team experimented with between one and four replay clients on instance sizes r4.xl and r4.2xl. For high-concurrency workloads, it's beneficial to run multiple clients in parallel to drive the workload.

The workload had a high number of batches with intricate dependencies, so the TCS project team relied primarily on results from sync mode SCN for better quality. Customers simulating more independent transactions can use sync mode TIME.
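The difference between the two synchronization modes can be illustrated with a toy Python sketch: the same captured calls are ordered by commit SCN in one mode and by wall-clock capture time in the other. The events below are invented for illustration, not taken from the actual capture.

```python
# Toy illustration of the two RAT synchronization modes: ordering
# captured calls by commit SCN versus by wall-clock capture time.
# Event data is hypothetical.

events = [
    {"call": "UPDATE meter", "scn": 102, "captured_at": 0.50},
    {"call": "INSERT bill",  "scn": 101, "captured_at": 0.75},
    {"call": "COMMIT batch", "scn": 103, "captured_at": 0.60},
]

# synchronization=SCN: replay preserves the captured commit order
by_scn = [e["call"] for e in sorted(events, key=lambda e: e["scn"])]

# synchronization=TIME: replay preserves the captured wall-clock timing
by_time = [e["call"] for e in sorted(events, key=lambda e: e["captured_at"])]

print(by_scn)
print(by_time)
```

When transactions depend on one another, as in this workload's batches, SCN ordering avoids replaying a dependent call before the commit it relies on.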

| Description | Iteration #1 | Iteration #2 | Iteration #3 | Iteration #4 | Iteration #5 |
|---|---|---|---|---|---|
| SGA & PGA | SGA: 110 GB, PGA: 50 GB | SGA: 180 GB, PGA: 70 GB | SGA: 180 GB, PGA: 70 GB | SGA: 180 GB, PGA: 70 GB | SGA: 180 GB, PGA: 70 GB |
| Sync | SCN | TIME | SCN | SCN | SCN |
| Think time | 100% | 100% | 100% | 100% | 50% |
| Replay clients & instance size | 1 client (r4.2xl) | 4 clients (1 × r4.2xl, 3 × r4.xl) | 1 client (r4.2xl) | 4 clients (1 × r4.2xl, 3 × r4.xl) | 4 clients (1 × r4.2xl, 3 × r4.xl) |

Below are the parameters that were fine-tuned between iterations to achieve the desired results:

  • Increasing SGA and PGA improved SQL efficiency, which was further tuned by creating indexes for the worst-performing SQLs.
  • Changing sync=TIME improved replay run time drastically but increased replay divergence, so this parameter was reverted in subsequent iterations.
  • Increasing the replay client count from 1 to 4 helped improve replay run time.
  • Changing think time to 50% improved replay run time to 4 hrs. 55 mins.
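Using the replay run times reported in this post, the cumulative effect of the tuning can be quantified as a simple before/after calculation:

```python
# Compute the overall replay run time improvement from the first to
# the last iteration, using the times reported in this post.

def to_minutes(hours, minutes):
    return hours * 60 + minutes

first = to_minutes(18, 6)   # iteration #1: 18 hrs. 6 mins
last = to_minutes(4, 55)    # iteration #5: 4 hrs. 55 mins

improvement_pct = round((first - last) / first * 100, 1)
print(improvement_pct)  # 72.8: the final run was ~73% faster
```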

Iteration Outcomes

Five iterations were carried out, and the outcomes were measured against the parameters defined in the success criteria section.

| Success Criteria Parameters | Iteration #1 | Iteration #2 | Iteration #3 | Iteration #4 | Iteration #5 |
|---|---|---|---|---|---|
| 1. System performance | Met target | Met target | Met target | Met target | Met target |
| 2. Replay divergence | Met target | Did not meet target (new errors during replay) | Met target | Met target | Met target |
| 3. Replay run time | 18 hrs. 6 mins | 7 hrs. 16 mins | 10 hrs. 26 mins | 7 hrs. 39 mins | 4 hrs. 55 mins |
| 4. SQL efficiency – % DB change | Higher SQL execution time | Better, but still above source time | Met target | Met target | Met target |
| 5. SQL efficiency – common/long running | Higher SQL execution time | Better, but still above source time | Met target | Met target | Met target |
| 6. DB instance efficiency | Did not meet target | Met target | Met target | Met target | Met target |
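The replay run times can also be checked against the 7 hrs. 16 min. target (the on-premises baseline) with a few lines of arithmetic on the reported figures:

```python
# Compare each iteration's replay run time against the target of
# 7 hrs. 16 min., using the times reported for the five iterations.

target = 7 * 60 + 16  # minutes

runs = {
    1: 18 * 60 + 6,   # 18 hrs. 6 mins
    2: 7 * 60 + 16,   # 7 hrs. 16 mins
    3: 10 * 60 + 26,  # 10 hrs. 26 mins
    4: 7 * 60 + 39,   # 7 hrs. 39 mins
    5: 4 * 60 + 55,   # 4 hrs. 55 mins
}

met = {i: t <= target for i, t in runs.items()}
print(met)  # by this arithmetic, iterations 2 and 5 matched or beat it
```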

Conclusion

Migrating Exadata-based Oracle databases to AWS can provide organizations with significant benefits in agility, elasticity, and cost savings.

As organizations run mission-critical databases on Oracle Exadata, businesses are increasingly looking for proof points of like-for-like performance of these databases on AWS. This post can be used as a reference point to prove the performance of Amazon RDS for Oracle through a quick PoC and pave the way for enterprise-wide migration.

TCS has a proven record of migrating mission-critical databases from Exadata to AWS, with associates who are trained and certified in implementing AWS services. For more information about migrating databases from Oracle Exadata to AWS, contact the TCS team.



TCS – AWS Partner Spotlight

TCS is an AWS Premier Tier Services Partner and MSP. An IT services, consulting, and business solutions organization, TCS has been partnering with many of the world’s largest businesses in their transformation journeys for the last 50 years.

Contact TCS | Partner Overview