Powering Enterprise Analytics at Scale Using Teradata Vantage on AWS
By Prashanth Sharma, Product Manager at Teradata
By Jobin George, Sr. Partner Solutions Architect at AWS
The amount and variety of existing and newly-generated data in today’s connected world is unparalleled. As this growth continues, so does the opportunity for organizations to extract real value from their data.
Unfortunately, most enterprise analytics solutions are assembled from multiple, disparate systems like analytics platforms, data lakes, and data warehouses. Each of these systems requires time-consuming and expensive data movement, involves multiple management teams, and typically relies on different programming skills.
What we need is an enterprise-ready, ultra-scalable platform that enables business leaders to operationalize data analytics in a way that drives real and timely business outcomes for a reasonable cost.
An end-to-end, unified analytics platform that supports mission-critical workloads and minimizes data movement would enable business leaders to shift their focus from the mechanics of analytics to the meaning and opportunities behind the answers.
In this post, we explain how Teradata Vantage leverages the scalability, speed, and power of Amazon Web Services (AWS) to help you offload infrastructure complexity and focus on answers to your business questions.
About Teradata Vantage
Teradata Vantage is a modern analytics platform that combines open source and commercial analytic technologies. It can drive autonomous decision-making by helping you to operationalize insights, solve complex business problems, and enable descriptive, predictive, and prescriptive analytics.
Vantage supports R, Python, Teradata Studio, Jupyter, RStudio, and any SQL-based tool. You can deploy Vantage across public clouds, on-premises, on optimized or commodity infrastructure, or as a service.
Teradata has decades of experience building and helping customers deploy Massively Parallel Processing (MPP) analytics databases. These can solve large business challenges involving massive size, significant concurrent usage, and strict performance requirements that other technologies can’t solve.
Teradata Vantage on AWS
Teradata Vantage on AWS delivers real-time business intelligence at scale through a comprehensive solution that combines analytics, data lakes, and data warehouse technologies.
Vantage consists of a high-speed data store mated via high-speed network fabric to:
- High-performing, massively parallel Advanced SQL Engine (formerly known as Teradata Database) running on Amazon EC2 M5 instances. M5 instances offer a balance of compute, memory, and networking resources for a broad range of workloads, including small and mid-sized databases.
- Machine learning and graph engines with 180+ advanced analytics functions running as containerized applications on Amazon EC2 M5d instances in a Kubernetes cluster on Amazon Elastic Kubernetes Service (Amazon EKS).
M5d instances are ideal for workloads that require a balance of compute and memory resources with high-speed, low-latency local block storage. They also offer up to 3.6 TB of local NVMe-based solid-state drive (SSD) storage.
Figure 1 shows Vantage deployed on Amazon Elastic Block Store (Amazon EBS) for the primary data store.
Figure 1 – Teradata Vantage integrates with many AWS first-party services.
Amazon EBS is block-level storage that attaches to Amazon Elastic Compute Cloud (Amazon EC2) instances. Teradata Vantage runs on general purpose SSD (gp2) volumes, which are network-attached storage provisioned with high input/output (IO) bandwidth.
The Vantage analytical engines query not only data in primary storage, but also JSON, CSV, and Parquet data stored in object store data lakes in Amazon Simple Storage Service (Amazon S3) as if located in the data store itself.
What makes this possible is Teradata’s Native Object Store (NOS) capability. NOS, currently in preview mode, lets you read data where it lives without having to load the data into Vantage ahead of time.
You can access Teradata’s NOS through:
- Open Database Connectivity (ODBC) API that enables applications to access databases through SQL.
- Java Database Connectivity (JDBC) API that enables Java applications to use SQL for database access.
ODBC and JDBC are among the most commonly used standard APIs for accessing data and running SQL queries.
Additionally, Vantage is compatible with the most popular data science tools and languages, including SQL, R, Python, and others, plus third-party tools commonly used for data analysis and visualization.
You can deploy Vantage on AWS using two different models.
Under the software-as-a-service (SaaS) model, Teradata manages the Vantage service for you. Simply bring your data and applications to Vantage, and use the service to perform analytics and get answers to your business questions.
In the SaaS model, Teradata Cloud Operations creates an instance of Vantage in a single-tenant Amazon Virtual Private Cloud (Amazon VPC) and manages software deployment, infrastructure, monitoring, high-availability, software upgrades, and security for that VPC. Teradata provides a 99.9 percent availability SLA for its SaaS model.
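To put the 99.9 percent figure in perspective, the short sketch below converts an availability target into a downtime budget. It is only illustrative arithmetic: the function name and the 30-day and 365-day measurement windows are assumptions made for this example, not terms of the Teradata SLA.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float) -> float:
    """Return the downtime budget, in minutes, implied by an availability target.

    The measurement window (period_days) is an assumption for illustration;
    the actual SLA defines its own measurement period and exclusions.
    """
    period_minutes = period_days * 24 * 60
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return period_minutes * unavailable_fraction


if __name__ == "__main__":
    monthly = allowed_downtime_minutes(99.9, 30)
    yearly = allowed_downtime_minutes(99.9, 365)
    print(f"30-day budget:  {monthly:.1f} minutes")
    print(f"365-day budget: {yearly / 60:.2f} hours")
```

Under these assumptions, 99.9 percent availability leaves roughly 43 minutes of downtime in a 30-day window, or a little under 9 hours over a year.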
Under the SaaS model, all three analytics engines in Vantage are deployed on the Nitro-based “M5” family of AWS general purpose instances.
Under the do-it-yourself (DIY) model, you subscribe to Teradata Vantage through AWS Marketplace, pay only for what you use, and deploy it in an Amazon VPC.
Amazon VPC lets you provision a logically-isolated section of the AWS Cloud where you can launch Vantage in a virtual network that you define. You have complete control over your virtual networking environment, including which IP address range you select, subnets you create, and the route tables and network gateways you configure.
You can use both IPv4 and IPv6 for secure and easy access to resources and applications.
Of course, this flexibility means you are also responsible for the configuration and management of networking, infrastructure, backups, and security management, among others.
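As a rough illustration of the address planning this involves, the sketch below uses Python's standard `ipaddress` module to carve a VPC address range into equally sized subnets. The CIDR blocks and the `plan_subnets` helper are hypothetical values chosen for this example; they are not Teradata or AWS recommendations.

```python
import ipaddress
from itertools import islice
from typing import List


def plan_subnets(vpc_cidr: str, new_prefix: int, count: int) -> List[str]:
    """Return the first `count` subnets of size /new_prefix inside vpc_cidr."""
    network = ipaddress.ip_network(vpc_cidr)
    # subnets() yields the child networks in address order.
    return [str(subnet) for subnet in islice(network.subnets(new_prefix=new_prefix), count)]


if __name__ == "__main__":
    # Hypothetical IPv4 range for the VPC, split into /24 subnets.
    print(plan_subnets("10.0.0.0/16", 24, 3))
    # Hypothetical IPv6 range (documentation prefix), split into /64 subnets.
    print(plan_subnets("2001:db8::/56", 64, 1))
```

Planning the ranges up front helps avoid overlapping address space between the subnets you create and any networks you later connect to the VPC.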
You can use the DIY model to deploy Vantage software for the Advanced SQL Engine on the “m” family of AWS general purpose instances.
SaaS Pricing Models
Vantage on AWS (SaaS) can be priced in two different ways:
With the first option, you pay only for successfully executed queries, tracked as Vantage Units, plus storage.
Provisioned Fixed Capacity allocates a dimensioned amount of compute resources and IO capacity for deploying and running your instances of the Advanced SQL Engine. These resources and IO capacity are measured as TCores.
If, instead of dimensioned capacity, you prefer to resize your compute resources and IO dynamically, your Flex Capacity resource allocation is tracked as TCore-Hours.
You can resize both the Advanced SQL Engine compute and storage resources through the Vantage Console, a single GUI for monitoring and controlling your Vantage on AWS (SaaS) service.
You can also use the Vantage Console to view the number of TCore-Hours you have used to date, referred to as a drawdown. You continue to draw down from your Flex Capacity if your system remains provisioned.
Although you can resize compute resources for the Advanced SQL Engine up, down, out, or in based on needed query capacity, you can only increase the capacity of your persistent data store.
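The TCore-Hours drawdown described above can be sketched as simple arithmetic. The example assumes that one TCore-Hour corresponds to one TCore kept provisioned for one hour; that definition, the class and function names, and all of the numbers are assumptions made for illustration and should be checked against your actual Teradata agreement.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ProvisionedInterval:
    tcores: float  # size of the system during this interval (assumed unit: TCores)
    hours: float   # time the system stayed provisioned at that size


def tcore_hours_drawdown(intervals: Iterable[ProvisionedInterval]) -> float:
    """Sum TCores x hours over every interval the system remained provisioned.

    Idle time still counts, because drawdown continues while the system
    remains provisioned.
    """
    return sum(interval.tcores * interval.hours for interval in intervals)


def remaining_entitlement(entitlement: float, intervals: Iterable[ProvisionedInterval]) -> float:
    """Return the Flex Capacity left after the drawdown (hypothetical helper)."""
    return entitlement - tcore_hours_drawdown(intervals)


if __name__ == "__main__":
    usage = [
        ProvisionedInterval(tcores=24, hours=10),  # baseline size
        ProvisionedInterval(tcores=48, hours=4),   # scaled out for a heavy batch
    ]
    print(tcore_hours_drawdown(usage))
    print(remaining_entitlement(1000, usage))
```

A calculation like this makes the cost of leaving a resized system provisioned visible before the hours accumulate in the Vantage Console.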
Vantage Self-Service Console
You can manage Vantage on AWS (SaaS) instances through the Vantage Console, a single pane of glass that provides the following functionality:
Analytical Engine Monitoring and Scaling
The console gives you a picture of your Vantage instances, the different analytical engines they are using, and their size. From the console, you can resize the instances, both up and down, as well as out and in.
You can also stop and restart the Advanced SQL Engine from the console, and monitor the entitlement and drawdown of your Flex Capacity-based systems.
From the console, you can scale up persistent data storage independent of the analytical compute engines.
Each Vantage on AWS (SaaS) subscription includes two full daily backups. Schedule these backups and monitor their status via the console. They are encrypted and stored in Amazon S3.
Vantage on AWS supports multiple connectivity options that depend on where your connections originate and the type and number of applications that need connectivity.
Figure 2 – Vantage on AWS connectivity options.
To receive a Service Ready designation, APN Partners must undergo service-specific technical validation by AWS Partner Solutions Architects, including review of architecture, customer documentation, and customer case study details to ensure they follow AWS best practices.
As an AWS PrivateLink Ready Partner, Teradata connects services to its VPC through the AWS private network. This approach enhances the security and privacy of customers’ workloads by keeping all of their network traffic within the AWS network.
AWS PrivateLink is the preferred connectivity method for accessing applications that do not need to initiate connection from the Teradata VPC into a customer VPC. AWS PrivateLink also simplifies IP address planning and has higher network speeds.
One AWS PrivateLink connection is included in the price of a Vantage on AWS (SaaS) subscription.
Virtual Private Network (VPN)
VPN is the preferred option when a Vantage instance in the Teradata VPC needs to initiate connections to multiple entities in a customer’s VPC. This could be, for example, to enable access through Lightweight Directory Access Protocol (LDAP) or to reach other data sources.
You can use VPN to connect to Vantage either from the customer’s VPC or directly from their on-premises environment.
One VPN connection is included in the price of a Vantage on AWS (SaaS) subscription.
AWS Direct Connect
Because you can partition a dedicated AWS Direct Connect connection into multiple virtual interfaces, you can use the same connection to access public and private resources while maintaining network separation between the public and private environments.
Use the AWS Direct Connect option only when your data center needs to connect to the Vantage on AWS instance.
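The guidance for the three connectivity options can be summarized as a small decision helper. This is a simplified reading of the text above, not an official Teradata or AWS selection tool; the function name, its two inputs, and the ordering of the results are assumptions made for this sketch.

```python
from enum import Enum
from typing import List


class Connectivity(Enum):
    PRIVATELINK = "AWS PrivateLink"
    VPN = "VPN"
    DIRECT_CONNECT = "AWS Direct Connect"


def candidate_options(from_data_center: bool, vantage_initiates: bool) -> List[Connectivity]:
    """List the connectivity options that fit a scenario, preferred option first.

    from_data_center:  connections originate in an on-premises data center
                       rather than in a customer VPC.
    vantage_initiates: Vantage in the Teradata VPC must open connections
                       into the customer's environment (for example, LDAP).
    """
    options: List[Connectivity] = []
    # PrivateLink is preferred when Vantage never has to initiate connections
    # into the customer VPC.
    if not from_data_center and not vantage_initiates:
        options.append(Connectivity.PRIVATELINK)
    # VPN works from a customer VPC or from on premises, and is the preferred
    # choice when Vantage initiates connections.
    options.append(Connectivity.VPN)
    # Direct Connect is considered only when a data center needs to connect.
    if from_data_center:
        options.append(Connectivity.DIRECT_CONNECT)
    return options


if __name__ == "__main__":
    for option in candidate_options(from_data_center=False, vantage_initiates=False):
        print(option.value)
```

Real deployments usually weigh more factors, such as bandwidth, cost, and the number of applications, so treat the output as a starting point for discussion.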
How Teradata Saved a Rental Car Company Millions
A Fortune 500 company with more than 10,000 locations had a huge inventory of roughly 500,000 cars worth about $21 billion.
To secure cash on hand, they use cars as collateral against bank loans. In turn, the banks request timely reporting to verify the location, value, and usage of hundreds of cars at a time.
Initially, the customer tracked this information using spreadsheets and was only able to deliver reports once a month. However, if they could report their inventory to the banks daily, the banks were willing to lower the interest rate and provide larger loans.
Teradata deployed an on-premises solution for this company’s treasury department in 2011 to deliver daily reports, instead of monthly reports, to its banks about the location of its cars and whether they were rented or sitting on the lot. This speedup in the reporting process gave the company the ability to borrow tens of millions of additional dollars against its fleet.
However, when data analysts from other departments began to run their own queries against the Teradata solution, reporting time slowed by at least four hours, delaying the daily reports to the banks from 8 a.m. to noon.
To address the performance impact of additional departments accessing Teradata, the customer’s treasury department evaluated two options:
- Add more capacity to the on-premises Teradata system to keep up with demand.
- Move the application to a dedicated cloud instance to enable more control over which teams accessed the data.
The company’s treasury department opted for its own AWS instance in November 2017 because it could fund the solution from its own budget, as opposed to the IT budget. As a result, the treasury department was able to speed up the generation of reports for the banks and even deliver them early.
With more disk space in the cloud, the treasury department could maintain more data history for improved compliance.
The company’s technical team advocated for running the Teradata solution on AWS because of the technical support they were already receiving from AWS. Teradata and AWS teams worked closely together to build the cloud platform and migrate the on-premises Teradata solution to AWS.
Teradata Vantage provides an end-to-end, unified analytics platform for mission-critical workloads. It enables business leaders to shift focus from the mechanics of analytics to the meaning behind the data.
Running Vantage on AWS leverages the scalability, speed, and power of AWS to help you offload infrastructure and complexity, allowing you to focus instead on getting answers to your business questions.
Vantage on AWS is available in AWS Marketplace through both public and private listings.
Learn more about how Vantage on AWS can help you achieve better performance through data.
The content and opinions in this blog are those of the third party author and AWS is not responsible for the content or accuracy of this post.
Teradata – APN Partner Spotlight
Teradata is an AWS Data & Analytics Competency Partner. Teradata leverages all of the data, all of the time, so you can analyze anything, deploy anywhere, and deliver analytics that matter.