In this module, you download your dataset, preprocess the dataset, separate the dataset into training and validation, then stage the dataset in your Amazon S3 bucket.
Time to Complete Module: 20 Minutes
In this module, you imported and fetched the dataset you use for your content recommendation system. Then, you prepared the dataset through preprocessing, lemmatization and tokenization. Finally, you split the dataset into training and validation sets, then staged them in your Amazon S3 bucket.
In the next module, you training your topic model with the Amazon SageMaker NTM Algorithm and deploy the model to Amazon SageMaker.