AWS Partner Network (APN) Blog
Teradata Vantage Real-Time API Integration with Amazon SageMaker Endpoints
By Mike Riordan, Director Product Management Data Science and Analytics – Teradata
By Doug Mbaya, Partner Solutions Architect – AWS
By Vjay Pawar, Solutions Architect ISV – AWS
Teradata |
Teradata, an AWS Data and Analytics Competency Partner, has expanded its collaboration with Amazon Web Services (AWS) by adding integration capabilities for Teradata Vantage, the data platform for enterprise analytics and AWS cloud services.
Vantage, with its native object store (NOS) read/write connector to Amazon Simple Storage Service (Amazon S3) data, already provides data integration with S3 data and Vantage enterprise data.
Now, Teradata introduces an API integration with Amazon SageMaker and Amazon Forecast. With this integration, customers can connect data from Vantage to these external services and return analytic results all using a Vantage query.
By integrating with Amazon SageMaker, Teradata gives customers the tools to operationalize analytic workflows on their Vantage platforms. This enables them to connect integrated enterprise data with AWS analytics in real-time queries, putting analytic results in the hands of business and analytic teams to drive outcomes.
For example, business analysts can retrieve the latest scores from Amazon Forecast to help plan next month’s delivery schedules. Support teams can have the latest churn-risk scores from sophisticated SageMaker models while still on the phone talking to customers.
Figure 1 – Teradata Vantage API integration.
This Vantage integration with deployed AWS services puts the results of those analytics into the hands of Vantage users and Vantage workflows across the enterprise, bringing more insights and analytic value to customers’ data.
Teradata High-Level Architecture
The following diagram illustrates an end-to-end data flow from data generation to value analytics using Teradata Vantage on AWS.
Figure 2 – Modern Teradata enterprise architecture on AWS.
Model Inference/Scoring
The primary use case for the Vantage API integration is model scoring of data with AWS analytic services for business analysts. For these users, connecting enterprise data in Vantage to these already trained and deployed AWS services is crucial for driving their workflows and delivering insights and outcomes.
With the existing Teradata NOS connector to Amazon S3, data science teams have been able to use integrated enterprise data from Vantage for model training. These teams leverage the SageMaker platform to create the best performing model and finally deploy it as an AWS endpoint.
Financial Fraud Demo
In this example notebook flow, the analytic model has already been created and is deployed on the AWS Cloud. Vantage business users have access to the analytic service by passing data from Vantage to the deployed endpoint through a query using the in-database API_Request function, with calculated results returned as the query’s result set.
This Jupyter notebook uses the Teradata SQL Kernel, a Jupyter notebook kernel that allows you to connect and query Teradata using SQL language within the notebook.
The Vantage table NEW_FINANCIAL_TRANS contains the analytic dataset used by the model for detecting possible fraudulent accounts.
The analytic team creates this account behavior profile dataset by combining account history data. Using Teradata NOS to write this analytic data set to S3, a predictive model can be trained using SageMaker.
In [49]:
rec_id | avg_pmt_05_mth | max_utilization_05_mth | times_nsf | credit_util_cur_mth | credit_util_prior_5_mth |
76809 | -1900.2 | 62.4 | 0 | 0 | 21.6 |
597417 | -7424.4 | 34.22222 | 0 | 0 | 13.95741 |
631585 | -5240.4 | 29.37143 | 0 | 0 | 16.41428 |
597417 | -7424.4 | 34.22222 | 0 | 0 | 13.95741 |
45837 | -961 | 0 | 0 | 0 | 0 |
392307 | -2900.2 | 87.26667 | 0 | 0 | 45.49444 |
484301 | -3240 | 9.375 | 0 | 9.375 | 9.375 |
603917 | -7042 | 14.4 | 0 | 6.56 | 9.17167 |
12892 | -4711.8 | 30.18333 | 0 | 0 | 14.33889 |
12862 | -4711.8 | 30.18333 | 0 | 0 | 14.33889 |
Predictive Model with Amazon SageMaker
Once the data science team trains an acceptable model, it’s deployed as an AWS endpoint to be used by business teams to detect and minimize fraud exposure.
Here’s an example Python code from the SageMaker notebook used to deploy this analytic model. Note the AWS endpoint address, as this will be used by the Vantage API integration query to score data.
In this example from the SageMaker Jupyter notebook code, the deployed model is now available at the endpoint address: sagemaker-xgboost-2021-10-20-15-43-44-623
Integrating Vantage Data with AWS Analytic Services
With the new Vantage API_Request function, business analysts have seamless integration of Teradata Vantage with these Amazon SageMaker analytic services, connecting Vantage data for inference and scoring execution.
Vantage API query using a VIEW: BUSTOUT_FRAUD_SCORING
Wrapping the underlying API integration query with a SQL VIEW makes this API integration easy to use. In fact, with a VIEW users can be completely unaware their query is reaching out to these external services.
Below is the VIEW syntax, created with the Vantage API_Request function. You can see as input parameters the SageMaker endpoint address created earlier by the SageMaker deployment, and the proper AWS access credentials, along with the Vantage data to be sent for inference/scoring.
Calling the API/VIEW with this simple SQL query, the Vantage user is able to retrieve model scoring results from the externally deployed SageMaker model from their Vantage data.
SELECT * from BUSTOUT_FRAUD_SCORING sample 10
rec_id | fraud_risk_score |
76809 | 47 |
597417 | 87 |
631585 | 42 |
392307 | 59 |
631585 | 42 |
744410 | 95 |
631585 | 42 |
484301 | 94 |
603917 | 93 |
554445 | 79 |
A Vantage business analyst can use a simple query to join data such as end-of-month balance and account credit limit with a real-time fraud risk score from BUSTOUT_FRAUD_SCORING, the integration with the AWS analytic service.
acct_no | credit_limit | eom_balance_amount | percent_limit | fraud_risk_score |
1094 | 8000 | 735 | 9.19 | 95 |
87 | per9000 | 731 | 8.12 | 95 |
1251 | 4000 | 2026 | 50.65 | 94 |
385 | 10000 | 440 | 4.4 | 94 |
587 | 2000 | 1710 | 85.5 | 93 |
716 | 9000 | 1641 | 18.23 | 88 |
58 | 6000 | 264 | 4.4 | 87 |
500 | 5000 | 734 | 14.68 | 73 |
1016 | 6000 | 2866 | 47.77 | 69 |
206 | 3000 | 535 | 17.83 | 59 |
899 | 7000 | 3177 | 45.39 | 55 |
1087 | 6000 | 463 | 7.72 | 48 |
1134 | 9000 | 3636 | 40.4 | 47 |
641 | 2000 | 401 | 20.05 | 42 |
1093 | 5000 | 0 | 0 | 28 |
This generated report is actionable by business analysts on accounts that are likely fraudulent, using tactical queries like this with Vantage.
This FRAUD_RISK_SCORE can be easily leveraged and joined with Vantage business data. In this way, connecting business users with the analytic models is transparent and used in their daily workflows.
Summary
Teradata understands that customers are leveraging a wide variety of data science tools and platforms to extract new value from their data. With Vantage, Teradata embraces this full ecosystem of tools with new API integration as a key foundation for that strategy.
The Vantage API integration with AWS analytic services enables business users to drive outcomes with real-time analytics.
Learn more about Teradata Vantage on AWS, Amazon SageMaker, and Amazon Forecast.
Teradata – AWS Partner Spotlight
Teradata is an AWS Competency Partner and cloud-first, enterprise analytics solution that delivers customer insights at scale.