AWS for Industries

Fidelity International improves customer experience using Amazon SageMaker

Many companies want to use advanced analytics to improve operational efficiency and the customer experience. Automating machine learning operations (MLOps) and migrating them to the cloud can streamline machine learning (ML), improve data processing, and enhance operational efficiency, leading to substantial cost reductions compared with on-premises deployments.

The challenge

Many enterprises want to adopt data-driven models to streamline their business processes and make better decisions based on data and analytical insights. Executing such advanced analytics projects in traditional on-premises environments has its own challenges, such as a lack of suitable infrastructure (for example, servers with GPU support or sufficient memory), long lead times to availability, and high upfront costs. There are also operational challenges, including redundant effort, manual dependencies, data archival, inconsistent processes, and a lack of standards and best practices for model monitoring and retraining.

For these reasons, the Data Value team at Fidelity International, a UK-based asset manager, explored options for using Amazon Web Services (AWS) to migrate key analytical workloads from on premises to the cloud. One consumer-focused advanced analytics use case, called Resolution Analytics, was identified for redesign and implementation on the cloud.

Resolution Analytics helps deliver a better customer experience by analyzing historical customer data, such as transactions, demographics, and interactions. It builds a machine learning model to identify sub-par experiences that risk undermining customer confidence. The model follows the principles of responsible AI and focuses on predicting key metrics that help identify a potential future complaint (along with its probable causes), which are then used to further improve Fidelity International’s operational processes.

About the customer team implementing the solution

The Data Value team (DVT) at Fidelity International is part of the Enterprise Data team, which is responsible for executing the company’s data strategy. Fidelity International’s data strategy is divided into three pillars:

  1. Data Governance: Wherein we define the data, including its taxonomy, ownership, and quality.
  2. Data Architecture: Which helps us manage the data, covering data stores, data modelling, Fidelity’s data lake, and making data accessible.
  3. Data Consumption: Where we start using data for business benefits such as reporting, KPI dashboards, insights, and analytics.

The Data Value team represents this third pillar, Data Consumption. Its aim is to empower leaders and accelerate business outcomes with data-driven insights, using advanced analytics tools such as Microsoft Power BI, Tableau, Python, and Amazon SageMaker.

The solution

Fidelity International designed a fully engineered and automated advanced analytics ecosystem on AWS following the principles of hybrid cloud architecture to build a unified and flexible distributed computing environment.

By implementing this solution, Fidelity has realized benefits such as:

  • Major reduction in infrastructure cost (approx. 60%).
  • Up to 30% reduction in worker hours by eliminating redundant tasks.
  • Removal of scope for manual errors through end-to-end automation.
  • Resilience through Infrastructure as Code (IaC) implementation.
  • Quick future deployments by using reusable templates for AWS Step Functions.

Fidelity used a range of AWS services, which are detailed in the key features section below.

As part of the Resolution Analytics use case, the following two patterns were explored (see the sketch after this list):

  1. Bring Your Own Container (BYOC): In this pattern, only the core ML service of SageMaker is used, and the user brings all components of a trained model, including the Docker image, data processing scripts, training scripts, and prediction scripts. A model is created through a training job and onboarded into the SageMaker model registry within a build pipeline. This pattern allows a higher degree of control over environments.
  2. Bring Your Own Script (BYOS): In this pattern, SageMaker uses its built-in container system, and the user brings their own script to perform all the operations (such as data processing, training, and prediction). The resulting model depends on the AWS built-in container and is onboarded into the SageMaker model registry within a build pipeline.
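To make the contrast concrete, here is a minimal sketch using the SageMaker Python SDK. The image URI, bucket, script name, and instance types are hypothetical placeholders, not Fidelity’s actual configuration.

```python
# Minimal sketch contrasting the BYOC and BYOS patterns (SageMaker Python SDK).
# All names (image URI, bucket, script) are hypothetical placeholders.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.sklearn.estimator import SKLearn

role = sagemaker.get_execution_role()  # assumes execution inside SageMaker

# BYOC: the user supplies the complete Docker image, with processing,
# training, and prediction logic baked in.
byoc_estimator = Estimator(
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/resolution-analytics:latest",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/models/",  # model artifact destination
)

# BYOS: SageMaker's built-in scikit-learn container runs a user-supplied script.
byos_estimator = SKLearn(
    entry_point="train.py",     # the user's own training script
    framework_version="1.2-1",  # version of the built-in container
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Either estimator is trained the same way; the model artifact lands in S3.
byoc_estimator.fit({"train": "s3://example-bucket/train/"})
```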

Fidelity International chose to build a Bring Your Own Model (BYOM) pattern: a comprehensive machine learning operations (MLOps) solution with end-to-end automation based on the SageMaker BYOC design. In this pattern, an existing, pre-trained model is fetched and onboarded into the SageMaker model registry. As with BYOC, only the core SageMaker ML service is used, and all components (including the model, Docker container, and the data processing, training, and prediction scripts) are custom made and provided by the user.
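To illustrate the onboarding step, here is a minimal sketch (using boto3) of registering a pre-trained model artifact in the SageMaker model registry. The group name, image URI, and S3 paths are hypothetical.

```python
# Minimal sketch: register a pre-trained model in the SageMaker model registry.
# All names, URIs, and paths are hypothetical placeholders.
import boto3

sm = boto3.client("sagemaker")

# Create the model package group once per use case.
sm.create_model_package_group(
    ModelPackageGroupName="resolution-analytics",
    ModelPackageGroupDescription="Pre-trained Resolution Analytics models",
)

# Register the pre-trained model artifact as a new model package version.
sm.create_model_package(
    ModelPackageGroupName="resolution-analytics",
    InferenceSpecification={
        "Containers": [
            {
                "Image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/resolution-analytics:latest",
                "ModelDataUrl": "s3://example-bucket/models/model.tar.gz",
            }
        ],
        "SupportedContentTypes": ["text/csv"],
        "SupportedResponseMIMETypes": ["text/csv"],
    },
    ModelApprovalStatus="Approved",
)
```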

Key features of the solution design

  • SageMaker used to build, validate, train, and deploy ML models.
  • MLOps automation to streamline the process of developing, training, deploying, and monitoring ML models.
  • MLOps pipeline orchestration using Step Functions.
  • Amazon SageMaker notebook instances to perform initial analysis and experiments by data scientists.
  • Terraform to create reusable templates and deploy infrastructure as code to provision all the components required for MLOps and environment setup.
  • Data and instances encrypted using AWS KMS keys for data at rest and Secure Sockets Layer (SSL) for data in transit.
  • CodeBuild to create and Amazon ECR to store custom images for the BYOC pattern.
  • Amazon EventBridge, Lambda, and DynamoDB to trigger the MLOps pipeline and maintain its configurations.
  • Data sourced from Snowflake (ingested tactically into the DVT Curated zone when not available strategically).
  • Custom AWS Identity and Access Management (IAM) policies following principle of least privilege to provide required AWS services access.
  • Tag-based approach to provide a level of segregation between resources created within the same services (such as Amazon S3, AWS KMS, SageMaker notebook instances, and more). Data engineers or data scientists working on a specific use case have access only to the components specific to that use case (see the sketch after this list).
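As a sketch of the tag-based segregation idea, the following boto3 snippet creates an IAM policy that allows SageMaker actions only on resources carrying a given use-case tag. The tag key, tag value, and policy name are hypothetical, and not every SageMaker action supports resource-tag conditions, so treat this as illustrative rather than a complete policy.

```python
# Minimal sketch: IAM policy scoping SageMaker access to one use case via tags.
# The tag key/value and policy name are hypothetical.
import json
import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["sagemaker:*"],
            "Resource": "*",
            "Condition": {
                # Allow access only where the resource carries the use-case tag.
                "StringEquals": {"aws:ResourceTag/use-case": "resolution-analytics"}
            },
        }
    ],
}

iam.create_policy(
    PolicyName="resolution-analytics-sagemaker-access",
    PolicyDocument=json.dumps(policy_document),
)
```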

Architecture

Figure A – Hybrid Cloud Architecture for MLOps Automation

Figure A captures the high-level solution design conforming to a hybrid cloud architecture:

  • Area 1 shows the on-premises components. Apache NiFi and Python-based frameworks are used to load data into Snowflake, with connectors for sources such as Oracle, SQL Server, SharePoint Online, and Secure File Transfer Protocol (SFTP).
  • Area 2 describes the Snowflake/AWS console access by data scientists, the application team and business intelligence (BI) tools.
  • Area 3 represents the central data lake built using Snowflake.
  • The FIL DVT Team AWS account (Area 4) represents the complete AWS architecture and services, along with the SageMaker ecosystem, used for end-to-end MLOps automation.

Snowflake Connectivity Through AWS PrivateLink Setup

Fidelity International uses Snowflake as its data lake platform, which application teams access through an AWS PrivateLink setup (a virtual private cloud endpoint service). PrivateLink can be used by applications to connect resources in their private cloud to Snowflake.

PrivateLink is a highly available, scalable technology used to privately connect an Amazon Virtual Private Cloud (Amazon VPC) to services as if they were part of the same VPC. It doesn’t require an internet gateway, NAT device, public IP address, AWS Direct Connect connection, or AWS Site-to-Site VPN connection to allow communication with the service from private subnets. The application therefore controls the specific API endpoints, sites, and services that are reachable from its VPC over secure, private, low-latency network connections.
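A minimal sketch (using boto3) of creating such an interface endpoint is shown below. The Snowflake PrivateLink service name is account specific and is obtained from Snowflake; all identifiers here are hypothetical.

```python
# Minimal sketch: interface VPC endpoint for a Snowflake PrivateLink service.
# The service name comes from Snowflake (SYSTEM$GET_PRIVATELINK_CONFIG); all
# IDs below are hypothetical.
import boto3

ec2 = boto3.client("ec2")

response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.vpce.eu-west-1.vpce-svc-0123456789abcdef0",
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    # Snowflake PrivateLink typically relies on a private hosted zone for DNS
    # rather than the endpoint's private DNS.
    PrivateDnsEnabled=False,
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```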

MLOps Pipelines
MLOps pipelines are the services that orchestrate the different components, along with error handling and state maintenance. This is achieved by using Step Functions. At a high level, the complete process has been segregated into three pipelines (a skeleton state machine is sketched after the list):

  1. Extraction: Extract source data required for testing, training or predicting
  2. Build: Create or onboard machine learning model
  3. Prediction: Generate prediction
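As a skeleton of how Step Functions stitches these steps together, the sketch below defines a single-step extraction state machine with basic error handling, using the native SageMaker processing-job integration. The role ARNs, image URI, and names are hypothetical, and the parameters are trimmed to the essentials.

```python
# Skeleton Step Functions state machine for the extraction pipeline.
# All ARNs, URIs, and names are hypothetical placeholders.
import json
import boto3

sfn = boto3.client("stepfunctions")

definition = {
    "StartAt": "ExtractSourceData",
    "States": {
        "ExtractSourceData": {
            "Type": "Task",
            # The .sync suffix makes Step Functions wait for job completion.
            "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
            "Parameters": {
                "ProcessingJobName.$": "$.job_name",
                "AppSpecification": {
                    "ImageUri": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/extraction:latest"
                },
                "ProcessingResources": {
                    "ClusterConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.xlarge",
                        "VolumeSizeInGB": 30,
                    }
                },
                "RoleArn": "arn:aws:iam::123456789012:role/mlops-sagemaker-role",
            },
            # Error handling: route any failure to an explicit Fail state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "ExtractionFailed"}],
            "End": True,
        },
        "ExtractionFailed": {"Type": "Fail", "Error": "ExtractionFailed"},
    },
}

sfn.create_state_machine(
    name="mlops-extraction-pipeline",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/mlops-stepfunctions-role",
)
```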

MLOps Pipeline Architecture and Design

  • Extraction Pipeline: An extraction pipeline is used to extract data from Fidelity International’s data lake on Snowflake (see the sketch after this list).
    • A SageMaker processing job is used to fetch and execute scripts stored in an S3 bucket.
    • The extracted raw data is pushed back into Amazon S3 for further processing.
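A minimal sketch of this step with the SageMaker Python SDK might look as follows; the image URI, bucket, and script names are hypothetical.

```python
# Minimal sketch: processing job that runs an extraction script from S3 and
# writes raw data back to S3. All names and URIs are hypothetical.
import sagemaker
from sagemaker.processing import ProcessingOutput, ScriptProcessor

role = sagemaker.get_execution_role()  # assumes execution inside SageMaker

processor = ScriptProcessor(
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/extraction:latest",
    command=["python3"],
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

processor.run(
    code="s3://example-bucket/scripts/extract_from_snowflake.py",
    outputs=[
        ProcessingOutput(
            output_name="raw",
            source="/opt/ml/processing/output",           # container-local path
            destination="s3://example-bucket/raw-data/",  # raw extract lands here
        )
    ],
)
```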

Figure B – MLOps Data Extraction Pipeline

  • Build Pipeline: This differs based on the pattern being used.
  • Bring Your Own Container end-to-end Pattern: In this workflow, the build pipeline is used to train, test, and deploy machine learning models into SageMaker.
    • CodeBuild is used to pull code from Bitbucket to Amazon S3 and create a Docker image with the required dependencies.
    • A SageMaker processing job is used to fetch the raw data (extracted by the extraction pipeline) from Amazon S3, perform pre-processing tasks, and create the train/test datasets.
    • The SageMaker training job is used to train the model and generate a model artifact, which is stored in Amazon S3.
    • The SageMaker processing job is used to perform an evaluation of the model.
    • Step Functions states are used to verify the evaluation report.
    • If the evaluation result is above the defined threshold, a model is created in the SageMaker model registry; otherwise, the same process is repeated until the threshold is met (see the sketch after this list).
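The threshold check maps naturally onto a Step Functions Choice state. The fragment below, expressed as a Python dict in Amazon States Language form, is a hypothetical sketch: the metric path, threshold, and state names are illustrative, and the RetrainModel target would be defined elsewhere in the full state machine.

```python
# Hypothetical fragment of the build state machine: gate model registration
# on the evaluation metric clearing a threshold.
evaluation_gate = {
    "CheckEvaluationReport": {
        "Type": "Choice",
        "Choices": [
            {
                # Proceed only if the evaluation metric clears the threshold.
                "Variable": "$.evaluation.metric_value",
                "NumericGreaterThanEquals": 0.8,
                "Next": "RegisterModel",
            }
        ],
        "Default": "RetrainModel",  # loop back until the threshold is met
    },
    "RegisterModel": {
        "Type": "Task",
        "Resource": "arn:aws:states:::sagemaker:createModel",
        "Parameters": {
            "ModelName.$": "$.model_name",
            "ExecutionRoleArn": "arn:aws:iam::123456789012:role/mlops-sagemaker-role",
            "PrimaryContainer": {
                "Image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/resolution-analytics:latest",
                "ModelDataUrl.$": "$.model_artifact_s3_uri",
            },
        },
        "End": True,
    },
}
```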

Figure C – MLOps Model Build Pipeline for BYOC Pattern

  • Bring Your Own Model workflow using the BYOC Pattern: In this workflow, the build pipeline is used to onboard and deploy an already trained machine learning model into SageMaker. All the steps remain the same as in the BYOC pattern, except that pre-processing and evaluation are skipped and training of the model is not required (see the sketch after this list).
    • CodeBuild is used to pull the code, along with the model file (joblib), from Bitbucket to Amazon S3 and create a Docker image with the required dependencies.
    • A SageMaker training job is used to read the existing model file and generate a model artifact, which is stored in Amazon S3.
    • The model is created in the SageMaker model registry.
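Illustratively, the “training” script in this workflow does no fitting at all. The sketch below assumes hypothetical container paths and simply repackages the existing joblib file so that SageMaker emits it as the model artifact.

```python
# Hypothetical BYOM "training" script: no model fitting takes place; the
# pre-trained joblib file is copied into the directory that SageMaker
# packages as the model artifact. Paths are illustrative.
import shutil
from pathlib import Path

PRETRAINED = Path("/opt/ml/code/model.joblib")  # shipped alongside the code
MODEL_DIR = Path("/opt/ml/model")               # SageMaker tars this directory


def main() -> None:
    MODEL_DIR.mkdir(parents=True, exist_ok=True)
    # The artifact is the existing pre-trained model, unchanged.
    shutil.copy(PRETRAINED, MODEL_DIR / "model.joblib")


if __name__ == "__main__":
    main()
```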

Figure D – MLOps Model Build Pipeline for BYOM Pattern

  • Prediction Pipeline: A prediction pipeline is used to generate predictions using the machine learning model onboarded and deployed into SageMaker (see the sketch after this list).
    • A SageMaker processing job is used to fetch the raw data (extracted by the extraction pipeline) from Amazon S3 and perform pre-processing to create the prediction input dataset.
    • A SageMaker batch transform job is used to fetch the deployed SageMaker model and generate a prediction. The prediction output data is stored back into Amazon S3.
    • A SageMaker processing job is used to perform the required transformations on the prediction output and generate a final prediction report. The final report is also stored in Amazon S3.
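A minimal sketch of the batch prediction step with the SageMaker Python SDK follows; the model name, bucket, and paths are hypothetical.

```python
# Minimal sketch: batch transform against a deployed SageMaker model.
# The model name and S3 paths are hypothetical.
from sagemaker.transformer import Transformer

transformer = Transformer(
    model_name="resolution-analytics-model",  # model created in SageMaker
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/predictions/",
)

transformer.transform(
    data="s3://example-bucket/prediction-input/",  # pre-processed input dataset
    content_type="text/csv",
    split_type="Line",  # one CSV row per inference request
)
transformer.wait()  # block until the batch transform job completes
```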

Figure E – MLOps Model Prediction Pipeline

Conclusion

With this architecture and well-engineered solution design, we overcame the challenges of running ML models on premises. By utilizing the power of the cloud, we built a flexible, robust, resilient, and secure MLOps automation process on AWS, following the necessary best practices and standards.

The infrastructure constraints encountered with on-premises implementations, along with other limitations such as redundant effort and manual intervention, were resolved by this implementation. Also, by leveraging serverless infrastructure and the pay-as-you-go model, we have achieved significant savings in overall infrastructure cost.

Fidelity is in the process of creating further MLOps pipelines for the different types of model monitoring offered by Amazon SageMaker, so look for this topic in an upcoming blog post.

Contact an AWS Representative to learn how we can help accelerate your business.

Acknowledgements

Other core contributors include: Gaurav Shekhar (data science practice lead), Vasant Kumar Vijayaraghavan (delivery manager, enterprise data portfolios) and Ajay Malik (solution architect) from Fidelity International along with Georgios Schinas and Mayur Udernani from AWS.

Customer Highlight

Fidelity International (FIL) offers investment solutions and retirement expertise to institutions, individuals and their advisers around the world. We bring together savings and pensions expertise with world-class investment choices — both our own and those of others — to help our clients build better futures for themselves and generations to come.

FIL manages total client assets of USD 728.6* billion for more than 2.87 million clients across Asia Pacific, Europe, the Middle East, South America, and Canada.

*All data as of 31 March 2023.

Anurag Varshney

Anurag Varshney is an enterprise architect with 20+ years of experience in data architecture, business intelligence, and analytics. He works with Fidelity International, leading the design and architecture for corporate functions along with the Data Value team, a center of excellence for Management Information & Analytics (MI&A) development. Anurag is passionate about data and is currently focused on developing a cloud-native, highly scalable MI&A architecture supporting data science and machine learning at Fidelity International.

Georgios Schinas

Georgios Schinas is a Senior Specialist Solutions Architect for AI/ML in the EMEA region. He is based in London and works closely with customers in the UK and Ireland. Georgios helps customers design and deploy machine learning applications in production on AWS, with a particular interest in MLOps practices and enabling customers to perform machine learning at scale. In his spare time, he enjoys traveling, cooking, and spending time with friends and family.

Gourav Pandey

Gourav Pandey is a data solutions architect with a deep understanding of AWS services, architecture best practices, and industry trends, along with a strong data engineering background. He specializes in designing and implementing highly available, robust, scalable, and cost-effective cloud-based solutions on the AWS platform. His current focus is establishing MLOps principles within the analytics estate, enabling the deployment and management of machine learning models on AWS infrastructure.

Mayur Udernani

Mayur Udernani leads the AWS AI & ML business with commercial enterprises in the UK and Ireland. In his role, Mayur spends the majority of his time with customers and partners, helping to create impactful solutions that solve the most pressing needs of a customer or a wider industry by leveraging AWS Cloud and AI & ML services. Mayur lives in the London area. He has an MBA from the Indian Institute of Management and a Bachelor’s in Computer Engineering from Mumbai University.

Richard Ainley

Richard Ainley is a Senior Solutions Architect. He works with customers to help them craft highly scalable, flexible and resilient cloud architectures that address their business needs. He helps organizations understand best practices around advanced cloud-based solutions, and how to migrate existing workloads to the cloud. He is focused on the UK Financial Services sector, including Asset Managers, Hedge Funds, Insurers, and Retail Banks.