Imagine you are building a website where users see personalized movie title recommendations delivered in real time. These recommendations should be based on each user's browsing and viewing history.

In this lab, you learn how to use Amazon Personalize to train a solution for movie title recommendations. You use the AWS SDK for Python (Boto3) to prepare the data, create a solution and campaign, and deploy the recommendation model in Amazon Personalize.

To make recommendations, Amazon Personalize uses a machine learning model that is trained with your data. The data used to train the model is stored in related datasets in a dataset group. Each model is trained by using a recipe that contains an algorithm for a specific use case. In Amazon Personalize, a trained model is known as a solution version. A solution version is deployed for use in a campaign. Users of your applications can receive recommendations through the campaign. For example, a campaign can show movie recommendations on a website or application where the title shown is based on viewing habits that were part of the dataset.
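The workflow described above — dataset group, recipe, solution, solution version, campaign — maps directly onto AWS SDK for Python (Boto3) calls. The following sketch shows the general shape of that workflow; the dataset group ARN and the solution and campaign names are placeholders, and each create call is asynchronous, so the sketch polls for training to finish before deploying (a real script should add a timeout).

```python
import time

# AWS-managed recipe for user personalization use cases.
RECIPE_ARN = "arn:aws:personalize:::recipe/aws-user-personalization"


def build_and_deploy(personalize, dataset_group_arn):
    """Sketch: train a solution version and deploy it as a campaign.
    `personalize` is a Boto3 client: boto3.client("personalize")."""
    # A solution ties a recipe to the datasets in a dataset group.
    solution_arn = personalize.create_solution(
        name="movie-solution",                  # placeholder name
        recipeArn=RECIPE_ARN,
        datasetGroupArn=dataset_group_arn,
    )["solutionArn"]

    # A solution version is the trained model.
    version_arn = personalize.create_solution_version(
        solutionArn=solution_arn
    )["solutionVersionArn"]

    # Poll until training finishes (simplified; add a timeout in practice).
    while personalize.describe_solution_version(
        solutionVersionArn=version_arn
    )["solutionVersion"]["status"] != "ACTIVE":
        time.sleep(60)

    # A campaign deploys the solution version for real-time recommendations.
    return personalize.create_campaign(
        name="movie-campaign",                  # placeholder name
        solutionVersionArn=version_arn,
        minProvisionedTPS=1,
    )["campaignArn"]
```

Once the campaign is active, your application requests recommendations for a user through the `personalize-runtime` client's `get_recommendations` call, passing the campaign ARN and a user ID.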

In Module 1, you create an Amazon SageMaker notebook instance and attach the policies that this lab requires to your SageMaker role. Finally, you create the Jupyter notebook that you use during the lab.

Time to Complete Module: 20 Minutes


  • Step 1: Create an AWS account

    Use a personal AWS account or create a new AWS account for this lab rather than an organizational account, so that you have full access to the necessary services and do not leave behind any resources from the lab. If you do not delete the resources used in this lab when you are finished, you may incur AWS charges.

  • Step 2: Create an Amazon SageMaker Notebook instance

    An Amazon SageMaker notebook instance is a fully managed machine learning (ML) Amazon Elastic Compute Cloud (Amazon EC2) compute instance that runs the Jupyter Notebook App.

    In this lab, you use the notebook instance to create and manage the Jupyter notebook that you use to prepare data and to train and deploy your movie title personalization model.

    To create an Amazon SageMaker Notebook instance:

    1. Open the Amazon SageMaker console.
    2. Choose Notebook instances, then choose Create notebook instance.
    3. On the Create notebook instance page, for Notebook instance name, type a name for your notebook instance.
    4. For Instance type, choose ml.t2.medium. This is the least expensive instance type that notebook instances support, and it suffices for this exercise.
    5. For IAM role, choose Create a new role.
    6. In the Create an IAM role box, choose Any S3 bucket, then choose Create role.
    7. Choose Create notebook instance.

    In a few minutes, Amazon SageMaker launches an ML compute instance—in this case, a notebook instance—and attaches an ML storage volume to it. The notebook instance has a preconfigured Jupyter notebook server and a set of Anaconda libraries.
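    The console steps above can also be performed programmatically. The sketch below shows the Boto3 equivalent; the instance name is a placeholder, and the role ARN would be the execution role you created in step 6.

```python
def create_lab_notebook(sagemaker, role_arn):
    """Sketch: programmatic equivalent of the console steps above.
    `sagemaker` is a Boto3 client: boto3.client("sagemaker").
    The instance name is a placeholder for this lab."""
    resp = sagemaker.create_notebook_instance(
        NotebookInstanceName="personalize-lab",  # step 3: instance name
        InstanceType="ml.t2.medium",             # step 4: instance type
        RoleArn=role_arn,                        # steps 5-6: IAM role
    )
    return resp["NotebookInstanceArn"]
```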

  • Step 3: Attach policy to SageMaker IAM role

    In the previous step, Amazon SageMaker created an IAM role named AmazonSageMaker-ExecutionRole-*** with the required permissions and assigned it to your instance. To run Amazon Personalize jobs in SageMaker, you need to attach the appropriate IAM policies to this role. For this step, you attach the IAMFullAccess, AmazonPersonalizeFullAccess, and AmazonS3FullAccess AWS managed policies to the SageMaker IAM role.

    To attach the IAM and Amazon Personalize access policies to your SageMaker IAM role:

    1. In the left pane of the SageMaker Console, choose Notebooks, then Notebook instances.
    2. Choose the notebook instance you created in Step 2 to open the details view.
    3. In the Permissions and encryption section, choose the IAM role ARN hyperlink.
    4. On the Role Summary page, choose Attach policies.
    5. In the Filter policies search box, type IAMFull and select the IAMFullAccess policy.
    6. Type PersonalizeFull and select the AmazonPersonalizeFullAccess policy.
    7. Type S3Full and select the AmazonS3FullAccess policy.
    8. Choose Attach policy.

    Your IAM role should appear with the three full access policies you just added: IAMFullAccess, AmazonPersonalizeFullAccess, and AmazonS3FullAccess.
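    The same attachments can be scripted with the IAM API. The sketch below uses the documented AWS managed policy ARNs; the role name is a placeholder for the AmazonSageMaker-ExecutionRole-*** role created in Step 2.

```python
# AWS managed policy ARNs for the three policies attached in this step.
POLICY_ARNS = [
    "arn:aws:iam::aws:policy/IAMFullAccess",
    "arn:aws:iam::aws:policy/service-role/AmazonPersonalizeFullAccess",
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
]


def attach_lab_policies(iam, role_name):
    """Sketch: attach each managed policy to the SageMaker execution role.
    `iam` is a Boto3 client: boto3.client("iam"); `role_name` is the
    name (not ARN) of the role created in Step 2."""
    for arn in POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```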



  • Step 4: Create a Jupyter notebook

    You create a Jupyter notebook in your Amazon SageMaker notebook instance. You also create a cell that gets the IAM role that your notebook needs to run Amazon SageMaker APIs. The cell also specifies the name of the Amazon S3 bucket that you will use to store the datasets for your training data and the model artifacts that an Amazon SageMaker training job outputs.

    To create a Jupyter notebook:

    1. Open the Amazon SageMaker console.
    2. Choose Notebook instances, and then open the notebook instance you created by choosing either Open Jupyter for the classic Jupyter view or Open JupyterLab for the JupyterLab view.
      Note: If you see Pending to the right of the notebook instance in the Status column, your notebook is still being created. The status will change to InService when the notebook is ready for use.
    3. Create the notebook.
      • If you opened the notebook in Jupyter, on the Files tab, choose New, and then choose conda_python3. This preinstalled environment includes the default Anaconda installation and Python 3.
      • If you opened the notebook in JupyterLab, on the File menu, choose New, and then choose Notebook. For Select Kernel, choose conda_python3. This preinstalled environment includes the default Anaconda installation and Python 3.
    4. In the Jupyter notebook, choose File and Save as, and name the notebook.

In this module, you learned about the example Amazon Personalize model you train in this lab. You also set up an AWS account and your lab environment with an Amazon SageMaker Notebook instance, IAM role, and a Jupyter notebook.

You are now ready to start the lab. In the next module, you download and prepare your dataset.