AWS for Industries

Using Amazon Athena and Amazon SageMaker for Improved Customer Churn Analysis

Customer churn is a growing concern for many companies, and with good reason. During the pandemic, 75% of US consumers switched brands or stores, and 60% of them plan to carry their new buying patterns and habits into the post-COVID period. In other words, about 45% of all consumers (60% of the 75% who switched) shifted their brand choices for the long term.

But customer churn is hardly new, and it takes many forms, such as switching to competitors, canceling subscriptions, visiting less often because of poor customer service, ghosting brands that offer few touchpoints, and more. The issues surrounding churn are also wide ranging. For example, while snack brand purchase cycles range from one to two weeks, purchase cycles in other product categories can run several weeks or months. Every shopping trip gives the shopper an opportunity to switch brands based on perceived performance, price, availability, or even packaging.

While companies have designed new customer retention strategies, data and analytics provide an opportunity to predict customer churn faster and increase retention. Retaining customers not only builds a loyal customer base, but also ensures cost effectiveness on marketing campaigns designed to acquire new customers.

Furthermore, customer churn is one of the biggest causes of lost company revenue, so efficient churn analysis and prediction are crucial. Reports suggest that reducing customer churn by just 5% can increase profits by 25-95%. Correctly identifying the customers at higher risk of churning gives companies time to plan accordingly and retain them.

An effective churn analysis mechanism has several steps. These include gathering available data on customer behaviors and usage patterns, converting structured data into meaningful insights in order to identify the causes of churn, and designing effective ways to engage with customers and implement customer retention strategies. One of our clients running several workloads on AWS sought to apply these strategies. They wanted to identify the reasons for customer churn and address them in order to increase customer lifetime value.

In this post, we share how we developed a churn analysis mechanism for a leading garden supplies company. Our mechanism helped the client predict churn, maintain customer loyalty, and improve their targeted marketing campaign efficiency.

The Business Context

The client is a leading garden supplies company with a large customer base that subscribes to its products and services on monthly, quarterly, semi-annual, and annual bases. Unpredictable churn was impacting their revenue goals, so the company wanted to prevent churn by identifying the reasons that customers stopped making subsequent purchases. They also wanted to predict which customers were at a higher risk of churning and take preventive actions in order to improve their customer lifetime value.

In addition, the client wanted to improve their sales via direct-to-consumer (DTC) websites and retail stores, as well as drive subscriptions, all while boosting product demand. While they had access to historical data, the company did not have clearly defined strategies for data analysis and high-risk churn prediction.

Using Amazon Athena for Developing Analytical Datasets

We established a clear approach to solve this challenge. We needed to understand the customer segments, predict churn, retain customers, and improve the targeted marketing campaign efficiency.

We integrated 15 different data sources, consisting of customer information and historical purchases, to create a strong audience profile. We collected, cleansed, and prepared the data for segmentation, and then stored it in the client’s Amazon Simple Storage Service (Amazon S3) buckets. Further analytical datasets were developed and modeled on Amazon Athena, which allows data preprocessing from multiple sources. The data was then filtered for outliers, aggregated, sorted, and merged by using Amazon Athena.
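
As a rough illustration of the filtering, aggregation, sorting, and merging described above, the following Python sketch builds an Athena SQL query and submits it through boto3. The table names (orders, customers), their columns, the outlier threshold, the database name, and the S3 output location are all hypothetical; the post does not describe the actual schema or queries.

```python
def build_customer_summary_query(max_order_amount: float = 10000.0) -> str:
    """Build SQL that drops outlier orders, merges in customer data,
    then aggregates and sorts per customer.

    Table names, column names, and the threshold are hypothetical.
    """
    return (
        "SELECT c.customer_id, c.subscription_plan, "
        "COUNT(*) AS order_count, "
        "SUM(o.order_amount) AS total_spent "
        "FROM orders o "
        "JOIN customers c ON o.customer_id = c.customer_id "
        f"WHERE o.order_amount BETWEEN 0 AND {max_order_amount} "
        "GROUP BY c.customer_id, c.subscription_plan "
        "ORDER BY total_spent DESC"
    )


def run_query(sql: str, database: str, output_location: str) -> str:
    """Submit the query to Athena and return the query execution ID."""
    import boto3  # imported here so the query builder works without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


query = build_customer_summary_query()
```

Calling run_query(query, "my_database", "s3://my-bucket/athena-results/") with real values would start the query; both arguments here are placeholders.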

One of the biggest advantages Amazon Athena provides is that it lets us easily and quickly analyze unstructured, semistructured, and structured data that is stored in various formats. Moreover, it integrates with the AWS Glue Data Catalog, which offers persistent metadata storage for data in Amazon S3. Tables can then be created and queried on Amazon Athena based on this centralized metadata store.

Amazon Athena also scales automatically and runs queries in parallel, which keeps performance high. Amazon Athena is serverless, meaning no infrastructure maintenance is required. This allows for hassle-free configuration, and you pay only for the queries that Athena runs. It is easy to set up with just a few clicks in the AWS Management Console or via the AWS Command Line Interface (CLI).

After an initial customer data analysis, the client’s customer data was segmented based on subscription plans at a granular level. A prediction model for each customer segment was created based on the historical analysis of churned customers. This gave the client deep insight into their customer subscription trends.
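
The segmentation step can be pictured with a minimal Python sketch that groups customer records by subscription plan, so that each group can later get its own model. The record layout and field names (customer_id, plan) are assumptions made for illustration.

```python
from collections import defaultdict


def segment_by_plan(customers):
    """Group customer records by subscription plan (hypothetical 'plan' field)."""
    segments = defaultdict(list)
    for customer in customers:
        segments[customer["plan"]].append(customer)
    return dict(segments)


# Hypothetical sample records
customers = [
    {"customer_id": "c1", "plan": "monthly"},
    {"customer_id": "c2", "plan": "annual"},
    {"customer_id": "c3", "plan": "monthly"},
]
segments = segment_by_plan(customers)
```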

Model Training on Amazon SageMaker

Next, we conducted feature engineering for each segment-level model, extracting features based on parameters such as subscription type, number of orders, the amount spent on purchases, and more. We chose XGBoost as the machine learning (ML) algorithm for model training, and then used SHAP (SHapley Additive exPlanations) to explain the predictions.
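
To make the feature engineering step concrete, here is a minimal Python sketch that turns raw order records into one feature row per customer. The input layout and field names are hypothetical, and avg_order_value is an extra illustrative feature that the post does not mention. Rows like these could then be encoded and passed to a model such as XGBoost.

```python
from collections import defaultdict


def build_features(orders, plans):
    """Aggregate raw orders into one feature row per customer.

    'orders' is a list of dicts with hypothetical keys customer_id and amount;
    'plans' maps customer_id to a subscription type.
    """
    totals = defaultdict(lambda: {"num_orders": 0, "total_spent": 0.0})
    for order in orders:
        row = totals[order["customer_id"]]
        row["num_orders"] += 1
        row["total_spent"] += order["amount"]

    features = {}
    for customer_id, row in totals.items():
        features[customer_id] = {
            "subscription_type": plans[customer_id],
            "num_orders": row["num_orders"],
            "total_spent": row["total_spent"],
            # Illustrative extra feature, not taken from the post
            "avg_order_value": row["total_spent"] / row["num_orders"],
        }
    return features


# Hypothetical sample data
orders = [
    {"customer_id": "c1", "amount": 20.0},
    {"customer_id": "c1", "amount": 30.0},
    {"customer_id": "c2", "amount": 100.0},
]
plans = {"c1": "monthly", "c2": "annual"}
features = build_features(orders, plans)
```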

Once the algorithm was determined, we trained and evaluated a separate model for each customer segment. Then, we combined the predictions to identify the customers with the highest churn probability. Model training and prediction were run in Python on Amazon SageMaker to ensure that the models were optimized to operate at scale.
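
Combining the per-segment predictions might look like the following Python sketch, which pools the churn probabilities from every segment model and keeps the highest-risk customers. The data layout is hypothetical, and ranking across segments assumes that the separately trained models produce comparably calibrated probabilities, which the post does not confirm.

```python
import heapq


def top_churn_risks(segment_predictions, k):
    """Return the k customers with the highest predicted churn probability.

    'segment_predictions' maps segment name -> {customer_id: probability}.
    """
    combined = [
        (probability, customer_id, segment)
        for segment, predictions in segment_predictions.items()
        for customer_id, probability in predictions.items()
    ]
    top = heapq.nlargest(k, combined)
    return [
        (customer_id, segment, probability)
        for probability, customer_id, segment in top
    ]


# Hypothetical model outputs for two segments
segment_predictions = {
    "monthly": {"c1": 0.82, "c3": 0.15},
    "annual": {"c2": 0.40},
}
highest_risk = top_churn_risks(segment_predictions, k=2)
```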

We used Amazon SageMaker because it makes it easy to train, tune, evaluate, and deploy models, as well as track model performance. With time, we gained insights into when the model needed rebuilding from scratch. Setup was quick: it required only an AWS account, after which we used Amazon SageMaker notebooks to develop the ML solution.

Amazon SageMaker also provides high scalability, faster training, uptime maintenance, high data security, and more. Furthermore, it includes built-in, optimized ML algorithms, such as XGBoost, that are widely used for training.

Ultimately, we captured the various actions taken by the client’s CRM and marketing teams that effectively prevented churn. We fed these actions back into the mechanism so that it could recommend actions in the event of predicted churn.

The model output was visualized on Tableau, allowing the client to see the customers at higher risk of churning, and then take corrective actions. This let our client make highly accurate calculations of the churn probability. Utilizing the output, we built a dashboard on Tableau with metrics, such as month-over-month retention rate, churn reasons, revenue generated, and more. Ultimately, this led to improved results tracking and huge cost savings for the client.
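
As an example of one dashboard metric, the Python sketch below computes a month-over-month retention rate. It assumes a simple definition that the post does not spell out: the share of customers active in the previous month who are also active in the current month. The customer IDs are hypothetical.

```python
def retention_rate(previous_month, current_month):
    """Share of last month's active customers who are still active this month."""
    previous = set(previous_month)
    if not previous:
        return 0.0  # no baseline customers, so define the rate as zero
    retained = previous & set(current_month)
    return len(retained) / len(previous)


# Hypothetical active-customer lists for two consecutive months
march = ["c1", "c2", "c3", "c4"]
april = ["c1", "c3", "c4", "c5"]
rate = retention_rate(march, april)
```

In this example, three of the four March customers are still active in April, so the rate is 0.75.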

Solution Benefits

The customer churn analysis model created by Sigmoid with AWS services identified 70% of the customers in each segment who were going to churn. It improved churn prediction accuracy 2.5-fold compared with the client’s previous solution. The improved accuracy also lowered marketing costs, because campaigns could be targeted more effectively.
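
The post does not say how the 70% figure was measured. One common reading is recall: the share of customers who actually churned that the model flagged in advance. Under that assumption, a minimal Python sketch of the calculation, using hypothetical customer IDs, looks like this:

```python
def recall(actual_churners, predicted_churners):
    """Fraction of actual churners that the model flagged as likely to churn."""
    actual = set(actual_churners)
    if not actual:
        return 0.0  # nothing to find, so define recall as zero
    return len(actual & set(predicted_churners)) / len(actual)


# Hypothetical data: 10 customers churned, and the model flagged 7 of them
actual = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10"]
predicted = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c11"]
score = recall(actual, predicted)
```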

By identifying loyal customers, the model developed profiles that could be utilized to target lookalike audiences for the same products. Then, the client was able to design new marketing campaigns in order to retain customers and reduce customer churn by 15%.

If you would like more information about our customer churn analysis solution from Sigmoid Analytics on AWS, leave a comment on this blog. To request a demo, visit Sigmoid Analytics or contact your AWS account team today.

Danny Yin

Danny (Yen-Lin) Yin is the Global Technical Lead for AWS Partners in the CPG industry. He joined AWS in 2018 with 18 years of experience in ecommerce application development and operations. Danny helps CPG companies enhance the consumer digital user experience and gain operational efficiency across different lines of business. Danny is also responsible for solutions architecture and technical guidance for CPG technology and consulting partners on AWS. Before he joined AWS, Danny was Director of Digital Engineering at Toys”R”Us, where he successfully migrated the world’s largest toy webstore from an outsourced application to an in-house hybrid cloud application on AWS.

Srishti Deoras

Srishti Deoras is Content Marketing Manager at Sigmoid with a background in tech journalism. She has extensively covered the data science and AI space in the past and is passionate about the technologies that define it.