This Guidance uses machine learning (ML) to help you build a churn prediction model using structured and unstructured data. Customer churn, or customer attrition, measures the number of customers who stop using one of your products or services. A model that forecasts churn identifies behaviors and patterns that indicate churn probability for a set of customers, so you can take preventive action. This Guidance can help business-to-business (B2B) organizations that use customer feedback and relationships to better understand customer satisfaction.
A large language model (LLM) running on Amazon SageMaker processes unstructured data from case management, e-commerce, and file systems, extracting sentiment and recognizing entities and topics.
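As a rough sketch of this step, the snippet below shows a prompt asking the LLM for sentiment, entities, and topics, and a parser for its JSON reply. The prompt wording, field names, and fallback behavior are illustrative assumptions, not code from the Guidance; in the deployed architecture the reply would come from a SageMaker endpoint invocation.

```python
import json

# Hypothetical prompt for extracting insights from one support-case note.
PROMPT_TEMPLATE = (
    "Extract the sentiment (positive/neutral/negative), named entities, "
    "and topics from the following customer note. Reply as JSON with keys "
    "sentiment, entities, topics.\n\nNote: {note}"
)

def parse_llm_insights(raw_reply: str) -> dict:
    """Parse the LLM's JSON reply, falling back to neutral on malformed output."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return {"sentiment": "neutral", "entities": [], "topics": []}
    return {
        "sentiment": data.get("sentiment", "neutral"),
        "entities": data.get("entities", []),
        "topics": data.get("topics", []),
    }

# In the Guidance, raw_reply would come from
# sagemaker-runtime invoke_endpoint(EndpointName=..., Body=...).
reply = '{"sentiment": "negative", "entities": ["Acme Corp"], "topics": ["billing"]}'
insights = parse_llm_insights(reply)
```

The defensive fallback matters because LLM output is not guaranteed to be valid JSON; a downstream AWS Glue job should never fail on one malformed reply.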
An AWS Glue job consolidates structured customer data extracted from e-commerce, customer relationship management (CRM), enterprise resource planning (ERP), and master data management (MDM) systems, together with the unstructured-data insights produced by the LLM, into a single Amazon Simple Storage Service (Amazon S3) bucket.
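A local, plain-Python sketch of the join this Glue job performs: structured records and LLM-derived insights are merged per customer ID. Field names and values are illustrative assumptions; the actual Glue job would operate on full datasets (typically with Spark) and write the result to Amazon S3.

```python
# Illustrative structured rows (e.g., from CRM/ERP extracts).
crm_records = [
    {"customer_id": "C1", "plan": "enterprise", "tenure_months": 26},
    {"customer_id": "C2", "plan": "starter", "tenure_months": 3},
]

# Illustrative unstructured-data insights keyed by customer ID.
llm_insights = {
    "C1": {"sentiment": "negative", "topics": ["billing"]},
}

def consolidate(structured, insights):
    """Merge structured rows with per-customer insights; default to neutral."""
    merged = []
    for row in structured:
        extra = insights.get(row["customer_id"],
                             {"sentiment": "neutral", "topics": []})
        merged.append({**row, **extra})
    return merged

rows = consolidate(crm_records, llm_insights)
# The Glue job would then write `rows` to the single S3 bucket,
# e.g., as CSV or Parquet, ready for the SageMaker pipeline.
```

Defaulting missing customers to a neutral sentiment keeps the consolidated dataset rectangular, which the downstream AutoML training step requires.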
A SageMaker pipeline pre-processes the consolidated data and trains the churn model against a specified metric by using an AutoML job. The AutoML job tries several ML model types, such as neural networks and decision trees, and tunes their hyperparameters to obtain the best model for your data and use case.
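A sketch of the request a pipeline step could use to launch such an AutoML job through the `CreateAutoMLJob` API. The bucket paths, job name, target column, role ARN, and objective metric are placeholder assumptions, not values from the Guidance code.

```python
# Assumed request payload for boto3's sagemaker.create_auto_ml_job.
automl_request = {
    "AutoMLJobName": "churn-automl-job",
    "InputDataConfig": [{
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://example-bucket/churn/train/",  # consolidated data
        }},
        "TargetAttributeName": "churned",  # label column to predict
    }],
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/churn/automl-output/"},
    "ProblemType": "BinaryClassification",
    "AutoMLJobObjective": {"MetricName": "F1"},  # the "specified metric"
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
}
# A real pipeline step would submit this with:
# boto3.client("sagemaker").create_auto_ml_job(**automl_request)
```

F1 is a reasonable objective for churn because churners are usually a minority class, where plain accuracy can be misleading.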
The pipeline creates and registers the churn model in the SageMaker Model Registry, which manages versioning and deployment.
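Registration can be sketched as a `CreateModelPackage` request for the best AutoML candidate. The image URI, model artifact location, group name, and content types below are placeholder assumptions.

```python
# Assumed request payload for boto3's sagemaker.create_model_package.
model_package_request = {
    "ModelPackageGroupName": "churn-model-group",  # groups model versions
    "ModelPackageDescription": "Churn model produced by the AutoML job",
    "InferenceSpecification": {
        "Containers": [{
            "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
            "ModelDataUrl": "s3://example-bucket/churn/model/model.tar.gz",
        }],
        "SupportedContentTypes": ["text/csv"],
        "SupportedResponseMIMETypes": ["text/csv"],
    },
    # Require a human sign-off before the version can be deployed.
    "ModelApprovalStatus": "PendingManualApproval",
}
# boto3.client("sagemaker").create_model_package(**model_package_request)
```

Registering each version in one model package group gives you an auditable history of which churn model produced which batch of scores.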
Amazon SageMaker Clarify helps you understand the model's reasoning behind the churn detection. It generates a report that identifies the features that most influenced the churn score and stores the report in Amazon S3.
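A minimal sketch of the kind of analysis configuration a Clarify explainability job consumes to compute SHAP feature attributions. The headers, baseline row, sample count, and model name are illustrative assumptions; consult the Clarify documentation for the full schema.

```python
# Assumed shape of an analysis_config.json for a Clarify processing job.
clarify_config = {
    "dataset_type": "text/csv",
    "headers": ["tenure_months", "plan", "sentiment", "churned"],
    "label": "churned",
    "methods": {
        "shap": {
            # Baseline record(s) that attributions are measured against.
            "baseline": [[12, "starter", "neutral"]],
            "num_samples": 100,
            "agg_method": "mean_abs",  # aggregate per-feature importance
        }
    },
    "predictor": {
        "model_name": "churn-model",
        "instance_type": "ml.m5.large",
        "initial_instance_count": 1,
    },
}
```

The resulting report (written to Amazon S3) tells decision-makers, for example, whether negative sentiment or short tenure contributed most to a high churn score.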
An AWS Lambda function generates a summary of churn results, which Amazon Simple Notification Service (Amazon SNS) sends to decision-makers over email.
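A sketch of the summarization step such a Lambda function might perform. The 0.5 threshold, field names, and inlined results are assumptions; in the Guidance the scores would be read from the batch output in Amazon S3, and the `sns.publish` call is shown only as a comment.

```python
def summarize_churn(results, threshold=0.5):
    """Build an email-ready summary of customers above the churn threshold."""
    at_risk = [r for r in results if r["churn_score"] >= threshold]
    lines = [f"{len(at_risk)} of {len(results)} customers are at risk of churn:"]
    for r in sorted(at_risk, key=lambda r: -r["churn_score"]):
        lines.append(f"- {r['customer_id']}: score {r['churn_score']:.2f}")
    return "\n".join(lines)

def handler(event, context):
    # In the Guidance, results would be read from the batch-transform
    # output in Amazon S3; they are inlined here for illustration.
    results = [
        {"customer_id": "C1", "churn_score": 0.87},
        {"customer_id": "C2", "churn_score": 0.12},
    ]
    message = summarize_churn(results)
    # boto3.client("sns").publish(
    #     TopicArn="arn:aws:sns:...:churn-alerts",  # placeholder topic
    #     Subject="Churn summary", Message=message)
    return {"statusCode": 200, "body": message}
```

Keeping `summarize_churn` a pure function makes the Lambda handler easy to unit test without any AWS dependencies.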
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
This Guidance incorporates text data to enrich the dataset used to create a SageMaker model that predicts a customer's risk of churn. Amazon S3 stores the churn results as CSV files, and Amazon Athena queries the results without additional operational overhead. Amazon SNS sends automated analysis reports to decision-makers so they can act quickly to reduce the likelihood of customer churn.
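Querying the CSV results with Athena can be sketched as below. The database name, table name, threshold, and output location are placeholder assumptions.

```python
# Assumed SQL over a table mapped to the churn-results CSVs in S3.
query = """
SELECT customer_id, churn_score
FROM churn_results
WHERE churn_score >= 0.5
ORDER BY churn_score DESC
"""

# Assumed request payload for boto3's athena.start_query_execution.
athena_request = {
    "QueryString": query,
    "QueryExecutionContext": {"Database": "churn_db"},
    "ResultConfiguration": {
        "OutputLocation": "s3://example-bucket/athena-results/",
    },
}
# boto3.client("athena").start_query_execution(**athena_request)
```

Because Athena is serverless and reads the files in place, no cluster or database server needs to be provisioned to run ad hoc queries like this one.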
AWS Identity and Access Management (IAM) controls access to data, ML models, and churn insights through granular permissions based on roles. Additionally, SageMaker can only access data through Amazon Virtual Private Cloud (Amazon VPC) endpoints. This means that data does not travel across the public internet, limiting potential points of data exposure.
SageMaker uses distributed training libraries to reduce training time and optimize model scaling. SageMaker also runs batch transform jobs across multiple Availability Zones to reduce the risk of failure; if one Availability Zone becomes unavailable, the job can continue in another. Additionally, Athena, Amazon QuickSight, and AWS Glue are serverless services, so data queries and visualizations scale without you having to provision additional infrastructure.
SageMaker batch inference allows you to process batches of data so you can run churn analysis on a set of customers at a time, rather than requiring you to have an endpoint up and running at all times. To support spikes in batch inference workloads, Lambda provides serverless compute that automatically scales based on demand.
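The batch inference described above can be sketched as a `CreateTransformJob` request that the Lambda function submits on demand. The job name, model name, S3 paths, and instance type are placeholder assumptions.

```python
# Assumed request payload for boto3's sagemaker.create_transform_job.
transform_request = {
    "TransformJobName": "churn-batch-transform",
    "ModelName": "churn-model",  # from the SageMaker Model Registry
    "TransformInput": {
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://example-bucket/churn/inference-input/",
        }},
        "ContentType": "text/csv",
        "SplitType": "Line",  # one customer record per CSV line
    },
    "TransformOutput": {
        "S3OutputPath": "s3://example-bucket/churn/inference-output/",
    },
    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
}
# boto3.client("sagemaker").create_transform_job(**transform_request)
```

The transform instances exist only for the duration of the job, which is what lets you avoid paying for an always-on inference endpoint.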
To help reduce costs, AWS Glue jobs perform extract, transform, and load (ETL) on batches of user data rather than individual records. Additionally, Lambda processes events to start batch transformation analysis, so you spin up compute capacity only as needed rather than keeping a server running at all times. A combination of AWS Glue, Athena, and QuickSight consumes churn insights, providing a cost-effective way to read batched data stored in Amazon S3.
By extensively using serverless services, such as Lambda, AWS Glue, Athena, and QuickSight, you maximize overall resource utilization because compute is used only as needed. These serverless services scale to meet demand, reducing the overall energy required to operate the workload. You can also use the customer carbon footprint tool, available in the AWS Billing console, to calculate and track the environmental impact of the workload over time at an account, Region, and service level.
A detailed implementation guide is provided for you to experiment with in your AWS account. It walks through each stage of the Guidance, including deployment, usage, and cleanup, to prepare it for deployment.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.