This Quick Start sets up an AWS architecture and deploys TIBCO Data Science (TDS) on the Amazon Web Services (AWS) Cloud, using Amazon Elastic File System (Amazon EFS) for shared storage, Amazon Aurora, Application Load Balancers, and an Amazon Elastic Compute Cloud (Amazon EC2) Auto Scaling group.
TIBCO Data Science is a big data analytics platform for enterprises. The collaborative user interface allows data scientists, data engineers, and business users to work together on data science projects. These cross-functional teams can build machine learning workflows in an intuitive web interface with a minimum of code, while still leveraging the power of big data platforms.
TIBCO Data Science provides an array of tools (from visual workflows to Jupyter Python Notebooks) for the data scientist to work with data of any magnitude, and it connects natively to most data sources, including Apache Hadoop, Spark, Hive, and relational databases. The advanced analytics platform provides security and governance. It also enables the analytics team to share and deploy predictive analytics and machine learning insights with the rest of the organization.
This Quick Start was developed by TIBCO Software in collaboration with AWS. TIBCO is an
What you'll build
Use this Quick Start to automatically set up the following TIBCO Data Science environment on AWS:
- A virtual private cloud (VPC) that spans two Availability Zones and includes two public and two private subnets, for security and high availability.*
- An internet gateway to allow access to the internet.*
- In the public subnets, managed NAT gateways to allow outbound internet access for resources in the private subnets.*
- In a public subnet, a Linux bastion host to provide Secure Shell (SSH) access to the TDS instance. The bastion host is in an Auto Scaling group of one, ensuring that there will always be one host available.*
- In a private subnet, a TDS 6.4 instance in an Auto Scaling group of one, ensuring that there will always be one host available.
- Amazon EFS automatically mounted on the TDS instance to ensure high availability. If the TDS instance fails in one Availability Zone, a new server is created in the second Availability Zone and automatically connected to the existing data. Failover typically takes 3-5 minutes, but can be longer.
- Amazon Aurora (PostgreSQL-compatible, version 9.6.8), automatically connected as the internal database for the TDS instance.
- An Application Load Balancer to automatically distribute connections to the active TDS instance.
- An AWS Identity and Access Management (IAM) instance role with fine-grained permissions for access to AWS services necessary for the deployment process.
- Appropriate security groups for each instance or function to restrict access to only necessary protocols and ports.
- (Optional) Amazon Route 53 as your public Domain Name System (DNS) for resolving your TIBCO Data Science site’s domain name.
* The template that deploys the Quick Start into an existing VPC skips the tasks marked by asterisks and prompts you for your existing VPC configuration.
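The network layout described above (two Availability Zones, each with one public and one private subnet) can be sketched as follows. This is only an illustration: the `10.0.0.0/16` address range, the zone labels, and the `plan_subnets` helper are assumptions made for the example and are not taken from the Quick Start templates, which may size and number subnets differently.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones):
    """Split a VPC CIDR into one public and one private subnet per zone.

    Illustrative helper (not part of the Quick Start). It supports exactly
    two zones, matching the architecture described in the text.
    """
    if len(zones) != 2:
        raise ValueError("this sketch expects exactly two zones")
    vpc = ipaddress.ip_network(vpc_cidr)
    # Two extra prefix bits give four equal, non-overlapping blocks.
    blocks = list(vpc.subnets(prefixlen_diff=2))
    plan = []
    for i, zone in enumerate(zones):
        # The first two blocks are public, the last two are private.
        plan.append({"zone": zone, "tier": "public", "cidr": str(blocks[i])})
        plan.append({"zone": zone, "tier": "private", "cidr": str(blocks[2 + i])})
    return plan


# Assumed address range and placeholder zone labels.
for subnet in plan_subnets("10.0.0.0/16", ["zone-a", "zone-b"]):
    print(subnet["zone"], subnet["tier"], subnet["cidr"])
```

When you deploy into an existing VPC, a plan like this is a quick way to check that the subnets you supply cover both Availability Zones and do not overlap.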
How to deploy
To build your TIBCO Data Science environment on AWS, follow the instructions in the deployment guide. The deployment process includes these steps:
- If you don't already have an AWS account, sign up at https://aws.amazon.com.
- Subscribe to the Amazon Machine Image (AMI) for TIBCO Data Science for AWS in AWS Marketplace.
- Launch the Quick Start. Each deployment takes about 80 minutes. You can choose from two options: deploy into a new VPC, or deploy into an existing VPC.
- Test the deployment by verifying that TIBCO Data Science is running and accessible, and configure the Amazon EMR data source.
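The final verification step can be scripted. The sketch below polls a URL until it answers, which is one simple way to confirm that TIBCO Data Science is reachable through the load balancer. The helper names, the retry counts, and the example URL are assumptions made for illustration; the deployment guide remains the authoritative procedure.

```python
import time
import urllib.request


def http_probe(url, timeout=10):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


def wait_until_reachable(probe, attempts=30, delay_seconds=10, sleep=time.sleep):
    """Call probe() until it returns True.

    Returns the number of the successful attempt, or None if every
    attempt failed. `sleep` is injectable so the loop can be tested
    without waiting.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return None


# Example call (hypothetical URL; use the address of your own deployment):
# wait_until_reachable(lambda: http_probe("https://tds.example.com/"))
```

Because a failover to the second Availability Zone can take several minutes, a generous number of attempts is a reasonable default when you run this kind of check after an instance replacement.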
Cost and licenses
You are responsible for the cost of the AWS services used while running this Quick Start reference deployment. There is no additional cost for using the Quick Start.
The AWS CloudFormation templates for this Quick Start include configuration parameters that you can customize. Some of these settings, such as instance type, will affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you will be using. Prices are subject to change.
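As an illustration of customizing template parameters, the sketch below converts a plain dictionary of overrides into the `ParameterKey`/`ParameterValue` list form that AWS CloudFormation accepts for stack parameters. The parameter names and values shown are hypothetical; check the templates or the deployment guide for the names this Quick Start actually uses.

```python
import json


def to_cfn_parameters(overrides):
    """Convert a dict into a list of ParameterKey/ParameterValue entries."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(overrides.items())
    ]


# Hypothetical parameter names and values, for illustration only.
overrides = {"InstanceType": "m5.xlarge", "KeyPairName": "my-key-pair"}
print(json.dumps(to_cfn_parameters(overrides), indent=2))
```

Keeping overrides in one place like this makes it easier to see which cost-relevant settings, such as instance type, differ from the template defaults.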
Tip: After you deploy the Quick Start, we recommend that you enable the AWS Cost and Usage Report to track costs associated with the Quick Start. This report delivers billing metrics to an S3 bucket in your account. It provides cost estimates based on usage throughout each month, and finalizes the data at the end of the month. For more information about the report, see the AWS documentation.
This Quick Start requires a subscription to the Amazon Machine Image (AMI) for TIBCO Data Science for AWS, which is available from AWS Marketplace. The AMI includes a 7-day free trial of the TDS software. Unless you cancel before the trial ends, it converts to a paid subscription after 7 days. The TDS software is then charged at a flat rate per hour of use, as described in the listing.
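To get a rough sense of the software charge, the arithmetic can be sketched as below. It assumes the instance runs continuously from the moment you subscribe and that billing starts exactly when the 7-day trial ends; the hourly rate is a placeholder, not a real price. The AWS Marketplace listing determines the actual rate and metering rules, and AWS infrastructure costs are separate.

```python
from datetime import datetime, timedelta

TRIAL = timedelta(days=7)  # trial length stated in the text


def billable_hours(subscribed_at, now, trial=TRIAL):
    """Hours elapsed after the trial ends (0.0 while the trial is running)."""
    paid_from = subscribed_at + trial
    if now <= paid_from:
        return 0.0
    return (now - paid_from).total_seconds() / 3600


def software_charge(subscribed_at, now, hourly_rate):
    """Flat hourly rate multiplied by the hours used after the trial."""
    return billable_hours(subscribed_at, now) * hourly_rate


start = datetime(2024, 1, 1)
today = datetime(2024, 1, 10)  # two days after the trial ends
print(billable_hours(start, today))         # 48.0
print(software_charge(start, today, 1.50))  # 72.0, using a placeholder rate
```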