Machine Learning for Telecommunication

Machine Learning for Telecommunication deploys a scalable, customizable machine learning (ML) architecture that provides a framework for end-to-end ML workloads in telecommunications. This guidance streamlines ad hoc data exploration, data processing and feature engineering, and ML model building, which includes training, evaluation, and generating predictions from a model deployed to an endpoint.

The guidance also includes a synthetic telecom IP Data Record (IPDR) dataset that demonstrates how to use ML algorithms to train and test models for predictive analysis in telecommunications. You can use the included Jupyter notebooks as a starting point for developing your own custom ML models, or you can customize the notebooks for your own use case.


The diagram below presents the architecture you can build using the example code on GitHub.

Machine Learning for Telecommunication guidance architecture

An Amazon Simple Storage Service (Amazon S3) bucket contains a synthetic IP Data Record (IPDR) dataset, an AWS Glue job converts the dataset, and an Amazon SageMaker instance hosts the ML Jupyter notebooks.

The guidance ingests data from the Amazon S3 bucket into the Amazon SageMaker cluster and runs the Jupyter notebooks on the dataset.
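As an illustration of this ingestion step, the sketch below shows how a notebook might query a Parquet object with Amazon S3 Select through a boto3-style client. It is a minimal sketch and not the guidance's own notebook code: the bucket name, object key, and column names are hypothetical, and the helper functions `build_select_request` and `run_select` are introduced here only for illustration.

```python
import json


def build_select_request(bucket, key, sql):
    """Build the keyword arguments for an S3 Select query over a Parquet object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        "InputSerialization": {"Parquet": {}},
        # Ask for newline-delimited JSON so each row can be parsed on its own.
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
    }


def run_select(client, request):
    """Run the query and return the selected rows as a list of dicts."""
    response = client.select_object_content(**request)
    chunks = []
    # The response payload is a stream of events; only "Records" events carry data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # A row can be split across events, so join the bytes before parsing lines.
    text = b"".join(chunks).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical bucket, key, and column names, used only for illustration.
request = build_select_request(
    "example-ipdr-bucket",
    "parquet/ipdr/part-00000.parquet",
    "SELECT s.subscriber_id, s.bytes_used FROM S3Object s LIMIT 100",
)

# With AWS credentials configured, a real client could be passed in:
#   import boto3
#   rows = run_select(boto3.client("s3"), request)
```

Keeping the request construction separate from the call makes the query easy to inspect or test without AWS access, since any object that exposes `select_object_content` can stand in for the client.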

The notebooks preprocess the data, extract features, and divide the data into training and testing sets. Amazon S3 Select reads the compressed Parquet data that was processed by the AWS Glue job. ML algorithms then process the training dataset to build a model that identifies anomalies and predicts future ones.
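The split-then-detect flow described above can be sketched in a few lines of standard-library Python. This is a hedged illustration only: it substitutes a simple z-score threshold for the ML algorithms in the guidance's notebooks, and the record fields (`session_id`, `bytes_used`), the generated values, and the helper names are all made up for the example.

```python
import random
import statistics


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_baseline(train, feature):
    """Learn the mean and sample standard deviation of one feature."""
    values = [r[feature] for r in train]
    return statistics.mean(values), statistics.stdev(values)


def is_anomaly(record, feature, mean, std, threshold=3.0):
    """Flag a record whose feature lies more than `threshold` deviations from the mean."""
    if std == 0:
        return record[feature] != mean
    return abs(record[feature] - mean) / std > threshold


# Made-up IPDR-like records (not the guidance's synthetic dataset).
rng = random.Random(42)
records = [
    {"session_id": i, "bytes_used": rng.gauss(1000.0, 50.0)}
    for i in range(500)
]

train, test = train_test_split(records)
mean, std = fit_baseline(train, "bytes_used")

# An artificial usage spike, appended to the test set to exercise the detector.
spike = {"session_id": -1, "bytes_used": 5000.0}
flagged = [r for r in test + [spike] if is_anomaly(r, "bytes_used", mean, std)]
```

The baseline is fitted on the training set only, so the test records and the injected spike are scored against statistics they did not contribute to, which mirrors the training and testing split the notebooks perform.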

Version 1.1.1
Last updated: 12/2019
Author: AWS

Machine Learning for Telecommunication Guidance

Use the Machine Learning for Telecommunication guidance out of the box, or as a starting point for building your own machine learning guidance.

Synthetic dataset for training

This guidance includes synthetic demo IP Data Record (IPDR) datasets in Abstract Syntax Notation One (ASN.1) format and call detail record (CDR) format.