This Guidance demonstrates how telecommunication companies can use machine learning (ML) to learn and identify fraudulent patterns and flag them for investigation. This can also help you reduce revenue leakage and remove the operational overhead of managing rulesets.
Architecture Diagram
Step 1
At training time, telecom data is batch transferred, or streamed using Amazon Kinesis, into an Amazon Simple Storage Service (Amazon S3) bucket. You can use the AWS Glue Data Catalog to catalog the data.
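As a minimal sketch of the streaming path, the snippet below publishes a call detail record (CDR) to a Kinesis data stream using boto3; the stream name and record fields are hypothetical, and a delivery pipeline (for example, Amazon Data Firehose) would then land the stream in the S3 bucket that AWS Glue catalogs.

```python
import json
import boto3

# Hypothetical stream name; replace with your Kinesis data stream.
STREAM_NAME = "telecom-cdr-stream"

kinesis = boto3.client("kinesis")

def send_cdr(record: dict) -> None:
    """Publish a single call detail record (CDR) to the stream."""
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=str(record["caller_id"]),  # spreads records across shards
    )

send_cdr({"caller_id": "15551230001", "callee_id": "15551230002",
          "duration_sec": 42, "call_type": "international"})
```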
Step 2
Amazon SageMaker Data Wrangler is used to engineer the raw data into features. Data can be sourced directly from Amazon S3 or through Amazon Athena queries. The resulting features are stored in Amazon SageMaker Feature Store.
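One way to write engineered features into Feature Store programmatically is shown below (Data Wrangler can also export to a feature group directly); the feature group name and feature names are hypothetical.

```python
import boto3

featurestore = boto3.client("sagemaker-featurestore-runtime")

def put_features(feature_group: str, features: dict) -> None:
    """Write one record of engineered features to SageMaker Feature Store."""
    featurestore.put_record(
        FeatureGroupName=feature_group,
        Record=[{"FeatureName": k, "ValueAsString": str(v)}
                for k, v in features.items()],
    )

# Hypothetical feature group and features, shown for illustration.
put_features("telecom-fraud-features", {
    "caller_id": "15551230001",
    "intl_calls_24h": 17,
    "avg_call_duration": 35.2,
    "event_time": "2024-01-01T00:00:00Z",
})
```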
Step 3
A custom (classification) model for fraud detection is trained using Amazon SageMaker. The model is tested and validated to ensure it is well regularized and performs well on real-world data.
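As a hedged sketch of this training step, the snippet below uses the SageMaker Python SDK with the built-in XGBoost algorithm as one possible binary classifier; the role ARN, bucket paths, and instance type are placeholders, and your custom model may use a different container and hyperparameters.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Built-in XGBoost shown as one possible fraud/not-fraud classifier.
image = image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1")

estimator = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-fraud-bucket/models/",  # hypothetical bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(
    objective="binary:logistic",  # binary fraud label
    num_round=200,
    eval_metric="auc",
)
estimator.fit({
    "train": TrainingInput("s3://my-fraud-bucket/train/", content_type="text/csv"),
    "validation": TrainingInput("s3://my-fraud-bucket/validation/", content_type="text/csv"),
})
```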
Step 4
Trained models are stored in SageMaker Model Registry to track and manage the model over time.
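Continuing the training sketch above, a trained model can be registered to a model package group in SageMaker Model Registry; the group name is hypothetical, and gating deployment behind manual approval is one common choice.

```python
# Register the trained model so versions can be tracked, compared,
# and approved before deployment. Group name is hypothetical.
model_package = estimator.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.xlarge"],
    transform_instances=["ml.m5.xlarge"],
    model_package_group_name="telecom-fraud-detector",
    approval_status="PendingManualApproval",
)
print(model_package.model_package_arn)
```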
Step 5
Use Amazon SageMaker Model Monitor to monitor model quality over time, including data and model quality, and bias drift.
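A minimal data-quality monitoring sketch with the SageMaker Python SDK follows; the S3 paths, schedule name, and endpoint name are hypothetical, and `role` is the execution role from the training sketch. Model quality and bias-drift monitors follow the same pattern with their respective monitor classes.

```python
from sagemaker.model_monitor import DefaultModelMonitor, CronExpressionGenerator
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Derive baseline statistics and constraints from the training data.
monitor.suggest_baseline(
    baseline_dataset="s3://my-fraud-bucket/train/train.csv",  # hypothetical
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-fraud-bucket/monitoring/baseline/",
)

# Hourly data-quality checks against traffic captured by the endpoint.
monitor.create_monitoring_schedule(
    monitor_schedule_name="fraud-data-quality",
    endpoint_input="fraud-detector-endpoint",  # hypothetical endpoint name
    output_s3_uri="s3://my-fraud-bucket/monitoring/reports/",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```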
Step 6
Once all tests for accuracy and performance have passed, the model is deployed to a SageMaker endpoint to support near real-time inference. SageMaker endpoints provide scalability, efficient operation, and reliability.
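Continuing the sketch, deployment and a test invocation might look like the following; the endpoint name and the CSV feature row are hypothetical.

```python
# Deploy the approved model version to a real-time endpoint.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="fraud-detector-endpoint",  # hypothetical name
)

# Near real-time inference: one CSV row of engineered features.
result = predictor.predict("17,35.2,1,0", initial_args={"ContentType": "text/csv"})
print(result)  # fraud score returned by the classifier
```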
Step 7
At inference time, data from the telco and its partners is streamed through Amazon Kinesis to an AWS Lambda function. Amazon API Gateway provides features to control access to the model endpoint. The fraud detection result produced by the model can then be consumed by the telco.
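A minimal sketch of the Lambda function in this step is shown below, assuming a Kinesis event source and the hypothetical endpoint name from the deployment sketch; record payloads are assumed to already be CSV feature rows.

```python
import base64
import json
import os
import boto3

runtime = boto3.client("sagemaker-runtime")
# Hypothetical endpoint name, normally injected via an environment variable.
ENDPOINT = os.environ.get("ENDPOINT_NAME", "fraud-detector-endpoint")

def handler(event, context):
    """Score incoming Kinesis records against the fraud model endpoint."""
    results = []
    for record in event["Records"]:
        # Kinesis delivers record payloads base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        response = runtime.invoke_endpoint(
            EndpointName=ENDPOINT,
            ContentType="text/csv",
            Body=payload,
        )
        results.append(response["Body"].read().decode("utf-8"))
    return {"statusCode": 200, "body": json.dumps(results)}
```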
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
The telecom's data is used to create models that identify fraud within the telecom operator's business. Model inferences are returned to the telecom's application to help the operator identify and stop fraud, reducing revenue leakage.
Security
All data is encrypted, both in motion and at rest. Data is stored in encrypted S3 buckets, and SageMaker can only access that data through the VPC (not over the internet). Training is done in secure containers, and the results are stored in encrypted S3 buckets. AWS Glue can also employ functions to redact or anonymize data to prevent leakage of personally identifiable information (PII).
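As one illustration of the encryption-at-rest posture, default server-side encryption can be enforced on the data bucket with boto3; the bucket name and KMS key alias below are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Enforce default server-side encryption with a customer managed KMS key
# (key alias is a placeholder) on the data bucket.
s3.put_bucket_encryption(
    Bucket="my-fraud-bucket",  # hypothetical
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "alias/telecom-fraud-key",
            },
            "BucketKeyEnabled": True,
        }]
    },
)
```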
Reliability
SageMaker hosting is used to serve the trained model, taking advantage of multiple Availability Zones and automatic scaling. Lambda and API Gateway are used to ensure availability of the service.
Performance Efficiency
SageMaker endpoints can scale up and down to ensure that only the minimum number of instances needed are running. Instance sizes are evaluated with Amazon SageMaker Inference Recommender to ensure costs are minimized. Serverless services such as Lambda are used to make the process efficient and reduce usage of compute and storage resources.
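A sketch of target-tracking auto scaling for the endpoint follows, using Application Auto Scaling; the endpoint and variant names, capacity bounds, and target value are hypothetical and should be tuned to your traffic.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")
# Hypothetical endpoint and production variant names.
resource_id = "endpoint/fraud-detector-endpoint/variant/AllTraffic"

autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

autoscaling.put_scaling_policy(
    PolicyName="fraud-endpoint-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,  # invocations per instance per minute
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
        },
    },
)
```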
Cost Optimization
As in the performance efficiency pillar, endpoint auto scaling keeps only the minimum number of instances running, and Amazon SageMaker Inference Recommender helps select right-sized instance types so costs are minimized. Lambda's pay-per-invocation pricing is also cost efficient compared to always-on EC2 instances, and it is used here to reduce cost.
Sustainability
By extensively utilizing managed services and dynamic scaling, we minimize the environmental impact of the backend services. All compute instances are right-sized to provide maximum utility.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. Each stage of the Guidance, from deployment through usage to cleanup, is covered to prepare you to run it.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.