Imagine you are a machine learning developer working at a bank. You have been asked to develop a machine learning model that helps analysts at your company manage the volume of news they must read to make investment decisions. The model will be trained on the 20newsgroups dataset, which contains approximately 20,000 documents covering 20 topics.
As part of your model, you need to extract semantic information from the news data, identify similar news articles in the corpus, and recommend similar news items to analysts based on the ones they are reading.
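To make the recommendation idea concrete, here is a minimal sketch of "find the articles most similar to the one being read". It is not the lab's method: it swaps in plain word counts and cosine similarity where the lab uses a trained topic model and a content recommendation model. The toy corpus and the function names (`vectorize`, `cosine`, `recommend`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the lab uses the 20newsgroups dataset instead.
DOCS = [
    "the team won the hockey game last night",
    "the hockey season opens with a home game",
    "new graphics card improves image rendering speed",
    "the space shuttle launch was delayed by weather",
]


def vectorize(text):
    """Represent a document as lowercase word counts (a stand-in for topic vectors)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_index, docs, k=1):
    """Return indices of the k documents most similar to docs[query_index]."""
    vectors = [vectorize(d) for d in docs]
    query = vectors[query_index]
    scored = [
        (cosine(query, vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    # Highest similarity first; ties are broken by the lower document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


if __name__ == "__main__":
    for i in recommend(0, DOCS, k=2):
        print(DOCS[i])
```

Raw word counts only capture shared vocabulary, which is why the lab extracts semantic (topic) information first: two articles on the same subject can be similar even when they share few exact words.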
In this lab, you learn how to create an Amazon SageMaker Notebook instance; download, prepare, and stage a dataset using a Jupyter notebook; train and deploy your topic model; and finally train and deploy the content recommendation model.
In Module 1, you configure the environment that you use during the lab.
Time to Complete Module: 20 Minutes
In this module, you learned about the example ML model you train in this lab. You also set up an AWS account and your lab environment with an Amazon S3 bucket, Amazon SageMaker Notebook instance, and a Jupyter notebook.
You are now ready to start the lab. In the next module, you download, prepare, and stage your dataset.