Scaling and Managing TIBCO DataSynapse GridServer on AWS
By Dilip Rajan, Sr. Partner Solutions Architect – AWS
By Carlos Manzanedo Rueda, Principal HPC Solutions Architect – AWS
By Ravindra Gupta, Principal EC2 Spot GTM Specialist – AWS
Financial services customers face new industry regulations, such as the Fundamental Review of the Trading Book (FRTB) and IFRS 17, that can require up to 10 times more compute capacity to meet regulatory demands.
The same demand for scale is a trend in other growing fields such as life sciences, electronic design automation (EDA), computational fluid dynamics (CFD), weather, and autonomous vehicles, as customers need to perform scale-out computing with larger datasets to get new and valuable insights.
In this post, we describe new service integrations and functionality that make TIBCO DataSynapse GridServer on Amazon Web Services (AWS) a great match to run grid computing workloads.
These integrations help customers scale their grid workloads to take advantage of the compute elasticity and flexibility of AWS and GridServer without up-front capital expenditure.
Grid computing is one of the solutions customers deploy to analyze vast datasets in parallel and at scale, and to perform complex computations. TIBCO GridServer plays a key role in running large-scale grid computing workloads.
To illustrate this, in 2018 AWS Chief Evangelist Jeff Barr blogged about successfully testing the TIBCO GridServer scheduler on AWS Spot instances totaling 1.3 million vCPUs in under 10 minutes in a single AWS Region. The test demonstrated that enterprise customers can create a competitive advantage by provisioning a large number of Amazon Elastic Compute Cloud (Amazon EC2) instances to analyze their data in a short time frame.
TIBCO is an AWS Partner with Competencies in Data and Analytics, High Performance Computing (HPC), and Machine Learning. In addition to TIBCO GridServer, AWS has a close working relationship with other products in the TIBCO family such as TIBCO Spotfire, TIBCO Data Science, TIBCO Messaging, and TIBCO Cloud Integration.
TIBCO GridServer Overview
Before we explore the integrations, let’s go through a quick overview of TIBCO GridServer.
TIBCO GridServer is an infrastructure platform for grid and elastic computing. With GridServer, customers can run millions of tasks in parallel. By dynamically scaling services and allocating computing resources, GridServer can process multiple grid workloads simultaneously.
The GridServer architecture is comprised of three components:
- Client component that submits tasks using a flexible and performant API, allowing for synchronous and asynchronous task submissions.
- Manager component that can be configured either as a director or as a broker. GridServer Managers manage the cluster configuration, provide orchestration of workers, and schedule tasks.
- Engine or Worker component that can execute tasks in an optimized and secure way.
Figure 1 – TIBCO GridServer on AWS.
The diagram above depicts a TIBCO GridServer cluster deployed on AWS, with the client component submitting tasks from on-premises systems. Services such as AWS Certificate Manager, AWS Key Management Service (AWS KMS), AWS CloudTrail, AWS CloudFormation, and Amazon CloudWatch provide the functionality needed to support a Well-Architected GridServer deployment on AWS.
When running on AWS, GridServer uses Auto Scaling Groups to run pools of workers that can scale dynamically according to the workload needs. Auto Scaling Groups allow customers to run a mix of On-Demand and Spot instance capacity.
Customers can, for example, run 25% of the capacity on On-Demand instances so the workload completes within a time limit, and 75% on Spot instances to speed up execution while reducing the overall cost of the run.
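As a rough sketch of the 25%/75% split described above, the following Python builds the MixedInstancesPolicy structure accepted by the EC2 Auto Scaling CreateAutoScalingGroup API. The launch template name and the instance types are placeholders chosen for illustration, not values taken from a GridServer deployment.

```python
def build_mixed_instances_policy(launch_template_name, instance_types, on_demand_percent=25):
    """Build a MixedInstancesPolicy dict for an EC2 Auto Scaling group.

    launch_template_name and instance_types are hypothetical inputs;
    on_demand_percent is the share of capacity placed on On-Demand instances.
    """
    if not 0 <= on_demand_percent <= 100:
        raise ValueError("on_demand_percent must be between 0 and 100")
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # One override per instance type the group is allowed to launch.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            # No fixed On-Demand base; the percentage applies to all capacity.
            "OnDemandBaseCapacity": 0,
            "OnDemandPercentageAboveBaseCapacity": on_demand_percent,
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }


# Placeholder template name and instance types (assumptions for this sketch).
policy = build_mixed_instances_policy(
    "gridserver-engine-template",
    ["c5.4xlarge", "c5a.4xlarge", "m5.4xlarge"],
)
```

The resulting dictionary could be passed as the MixedInstancesPolicy argument of the boto3 Auto Scaling client's create_auto_scaling_group call, alongside the group name, size limits, and subnets.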
TIBCO GridServer AWS Integrations
AWS and TIBCO GridServer integrations have focused on adopting architectural best practices for grid computing workloads running on AWS. Integration improvement areas include the High-Performance Computing Cloud Adapter (HPCCA) and GridServer EC2 Spot integration.
Amazon EC2 Spot instances offer discounts of up to 90% compared to On-Demand pricing, with the same per-second billing model. Customers can launch Spot instances on spare EC2 capacity at a steep discount in exchange for returning them when Amazon EC2 needs the capacity back. When EC2 reclaims a Spot instance, we call this event a Spot instance interruption, and it comes with a two-minute warning before the instance is taken away.
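To make the two-minute warning concrete: EC2 exposes the interruption notice through the instance metadata service at the path shown below, which returns a 404 until a notice has been issued. The sketch only parses a notice body and computes the time remaining; the sample payload and timestamps are made up for illustration, and fetching the document from the metadata service is left out.

```python
import json
from datetime import datetime, timezone

# Instance metadata path where EC2 publishes the Spot interruption notice.
INTERRUPTION_PATH = "/latest/meta-data/spot/instance-action"


def seconds_until_interruption(notice_body, now):
    """Return the seconds left before the action named in a Spot notice.

    notice_body is the JSON text served at INTERRUPTION_PATH;
    now is a timezone-aware datetime.
    """
    notice = json.loads(notice_body)
    action_time = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (action_time - now).total_seconds()


# Illustrative payload and clock reading (assumed values, not real output).
sample_notice = '{"action": "terminate", "time": "2024-01-01T12:02:00Z"}'
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
remaining = seconds_until_interruption(sample_notice, now)
```

A worker process could poll this path every few seconds and, once a notice appears, use the remaining time to stop accepting new work and checkpoint or hand back what it can.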
TIBCO DataSynapse HPCCA is a Broker component that allows customers to dynamically scale the number of grid workers based on the number of tasks submitted to the grid. When there are no tasks pending, HPCCA scales in idle workers.
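The scaling behavior described above can be pictured with a small sketch. This is not HPCCA's actual algorithm, which the post does not detail; it is a minimal illustration of sizing a worker pool from task counts, and the function name and every parameter are hypothetical.

```python
def desired_engine_count(pending_tasks, running_tasks, tasks_per_engine,
                         min_engines, max_engines):
    """Illustrative demand-based sizing of a worker pool.

    All names here are hypothetical. The pool grows with outstanding tasks
    and shrinks back to min_engines when nothing is pending or running.
    """
    if tasks_per_engine <= 0:
        raise ValueError("tasks_per_engine must be positive")
    demand = pending_tasks + running_tasks
    # Ceiling division: engines needed to hold all outstanding tasks.
    needed = -(-demand // tasks_per_engine)
    # Clamp to the configured pool limits.
    return max(min_engines, min(max_engines, needed))


busy = desired_engine_count(1000, 200, 8, 0, 100)   # demand exceeds the cap
steady = desired_engine_count(50, 14, 8, 2, 100)    # 64 tasks / 8 per engine
idle = desired_engine_count(0, 0, 8, 0, 100)        # no work outstanding
```

In this sketch an empty queue yields the minimum pool size, which mirrors the scale-in of idle workers when no tasks are pending.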
The following list describes key features of the latest integration of TIBCO GridServer with AWS:
- HPCCA offers more than one instance type and size following Spot best practices. By diversifying EC2 instance selection across instance families, different generations and multiple sizes, HPCCA on AWS enables grid workloads to scale horizontally while minimizing Spot instance interruptions that, in turn, reduce time to complete and the cost of grid calculations.
- HPCCA utilizes Auto Scaling Group weights to support instances of different sizes. Configuration attributes such as DesiredCapacity, MinSize, and MaxSize drive the total number of cores required for the grid workload instead of number of servers. This simplifies the right-sizing of GridServer engine pools, further saving costs by reducing any unused portion of compute.
- GridServer HPCCA uses the capacity-optimized Auto Scaling Group Spot allocation strategy by default. Capacity-optimized selects instances from the optimal Spot pools to reduce the frequency of termination, helping grid workloads to achieve scale while lowering cost.
- In 2021, AWS launched the EC2 instance rebalance recommendation, a signal that notifies the workload when a Spot instance is at elevated risk of interruption. The signal can arrive sooner than the two-minute Spot instance interruption notice, giving the workload the opportunity to proactively manage the Spot instance.
GridServer now integrates with EC2 Spot termination notifications and rebalance recommendations. GridServer Brokers avoid scheduling new tasks on engines that have received a Spot termination or rebalance recommendation notification. Tasks that are already in flight continue running; if the Spot instance is terminated before they finish, GridServer retries them on a different engine.
- Demand for grid computing is unpredictable. While most customers have a clear idea of their baseline demand, they want the option to scale dynamically with a pay-as-you-go model. Customers now have the option of using AWS Marketplace for TIBCO GridServer to burst capacity to AWS.
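The core-based capacity described in the list above can be illustrated with a short calculation. The sketch assumes each instance type is weighted by its vCPU count, so that DesiredCapacity is expressed in cores; the pool of instance types is a hypothetical example, and WeightedCapacity is written as a string because that is how the Auto Scaling API represents it.

```python
import math

# Hypothetical engine pool: instance type -> vCPU count used as its weight.
VCPU_WEIGHTS = {"c5.2xlarge": 8, "c5.4xlarge": 16, "c5.9xlarge": 36}


def weighted_overrides(weights):
    """Build launch template overrides carrying one weight per instance type."""
    return [
        {"InstanceType": itype, "WeightedCapacity": str(weight)}
        for itype, weight in weights.items()
    ]


def instances_needed(desired_cores, instance_type, weights):
    """Instances of a single type required to reach desired_cores."""
    return math.ceil(desired_cores / weights[instance_type])


overrides = weighted_overrides(VCPU_WEIGHTS)
small = instances_needed(1000, "c5.2xlarge", VCPU_WEIGHTS)
medium = instances_needed(1000, "c5.4xlarge", VCPU_WEIGHTS)
large = instances_needed(1000, "c5.9xlarge", VCPU_WEIGHTS)
```

With a target of 1,000 cores, the same request can be met by many small instances or fewer large ones, which is what lets the group draw from several Spot pools. When weights do not divide the target evenly, the launched capacity can slightly exceed the requested number of cores.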
The integrations and improvements on TIBCO GridServer create a unique opportunity for customers to move grid workloads to AWS.
With the robust processing power and scalable infrastructure of TIBCO DataSynapse GridServer on AWS, you can run millions of tasks using tens of millions of data points in parallel for faster results and more precise analysis, supported by a high volume of short-running tasks and a fast asynchronous API.
The demand for grid computing capacity in financial services is being driven by the need to meet increased reporting requirements for new regulations like FRTB, including the requirement to complete a 12-month production run prior to go-live.
The prevailing view of our financial services customers is that there is no appetite to keep adding new capacity on-premises, and they are looking to AWS to offer a comprehensive and dynamic HPC offering.
For further information or to discuss your workload migration requirements, please reach out to us via email@example.com.
TIBCO – AWS Partner Spotlight
TIBCO is an AWS Competency Partner that enables faster, better decisions and smarter actions by capturing data in real time and augmenting business intelligence through analytical insights.