AWS Machine Learning Blog

Measure the Business Impact of Amazon Personalize Recommendations

We’re excited to announce that Amazon Personalize now lets you measure how your personalized recommendations can help you achieve your business goals. After specifying the metrics that you want to track, you can identify which campaigns and recommenders are most impactful and understand the impact of recommendations on your business metrics.

All customers want to track the metric that is most important for their business. For example, an online shopping application may want to track two metrics: the click-through rate (CTR) for recommendations and the total number of purchases. A video-on-demand platform that has carousels with different recommenders providing recommendations may wish to compare the CTR or watch duration. You can also monitor the total revenue or margin of a specified event type, for example when a user purchases an item. This new capability lets you measure the impact of Amazon Personalize campaigns and recommenders, as well as interactions generated by third-party solutions.

In this post, we demonstrate how to track your metrics and evaluate the impact of your Personalize recommendations in an e-commerce use case.

Solution overview

Previously, to understand the effect of personalized recommendations, you had to manually orchestrate workflows to capture business metrics data, and then present the data in a form that supports comparisons. Now, Amazon Personalize has eliminated this operational overhead by allowing you to define and monitor the metrics that you wish to track. Amazon Personalize can send performance data to Amazon CloudWatch for visualization and monitoring, or alternatively into an Amazon Simple Storage Service (Amazon S3) bucket where you can access metrics and integrate them into other business intelligence tools. This lets you effectively measure how events and recommendations impact business objectives, and observe the outcome of any event that you wish to monitor.

To measure the impact of recommendations, you define a “metric attribution” using either the Amazon Personalize console or APIs. A metric attribution is a list of event types that you want to report on. For each event type, you define the metric and the function that you want to calculate (sum or sample count). Amazon Personalize then performs the calculation and sends the generated reports to CloudWatch or Amazon S3.
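To make the two functions concrete, the following is a minimal local sketch of what a sample count and a sum compute over a handful of events. It is only an illustration, not how Amazon Personalize implements the calculation. The sample events and the helper names sample_count and column_sum are made up for this example.

```python
# Hypothetical events. The "margin" value stands in for a column that
# would be looked up from the Items dataset.
events = [
    {"eventType": "View", "margin": 0.0},
    {"eventType": "View", "margin": 0.0},
    {"eventType": "Purchase", "margin": 4.5},
    {"eventType": "Purchase", "margin": 2.0},
]


def sample_count(events, event_type):
    """Count the events of one type (the idea behind a sample count)."""
    return sum(1 for event in events if event["eventType"] == event_type)


def column_sum(events, event_type, column):
    """Add up one numeric column over the events of one type."""
    return sum(event[column] for event in events if event["eventType"] == event_type)


count_views = sample_count(events, "View")
sum_margin = column_sum(events, "Purchase", "margin")
print(count_views, sum_margin)
```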

The following diagram shows how you can track metrics from a single recommender or campaign:

Figure 1. Feature Overview: The interactions dataset is used to train a recommender or campaign. Then, when users interact with recommended items, these interactions are sent to Amazon Personalize and attributed to the corresponding recommender or campaign. Next, these metrics are exported to Amazon S3 and CloudWatch so that you can monitor them and compare the metrics of each recommender or campaign.

Metric attributions also let you provide an eventAttributionSource for each interaction, which specifies the scenario that the user was experiencing when they interacted with an item. The following diagram shows how you can track metrics from two different recommenders using the Amazon Personalize metric attribution.

Figure 2. Measuring the business impact of recommendations in two scenarios: The interactions dataset is used to train two recommenders or campaigns, in this case designated “Blue” and “Orange”. Then, when users interact with the recommended items, these interactions are sent to Amazon Personalize and attributed to the corresponding recommender, campaign, or scenario to which the user was exposed when they interacted with the item. Next, these metrics are exported to Amazon S3 and CloudWatch so that you can monitor them and compare the metrics of each recommender or campaign.

In this example, we walk through the process of defining a metric attribution for your interaction data in Amazon Personalize. First, you define a metric attribution with two metrics to measure the business impact of the recommendations, and you import your data. Then, you create two retail recommenders (the process is the same if you’re using a custom recommendation solution) and send events to track using the metrics. To get started, you only need the interactions dataset. However, since one of the metrics we track in this example is margin, we also show you how to import the items dataset. A code sample for this use case is available on GitHub.

Prerequisites

You can use the AWS Console or supported APIs to create recommendations using Amazon Personalize, for example using the AWS Command Line Interface or AWS SDK for Python.

To calculate and report the impact of recommendations, you first need to set up some AWS resources.

You must create an AWS Identity and Access Management (IAM) role that Amazon Personalize will assume with a relevant assume role policy document. You must also attach policies to let Amazon Personalize access data from an S3 bucket and to send data to CloudWatch. For more information, see Giving Amazon Personalize access to your Amazon S3 bucket and Giving Amazon Personalize access to CloudWatch.
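As a rough sketch of what these documents can look like, the following code builds a trust policy and a permissions policy as Python dictionaries. The service principal personalize.amazonaws.com is the one Amazon Personalize uses to assume roles. The exact list of S3 and CloudWatch actions is an assumption for illustration, so check it against the two linked guides. The function names and the bucket name are hypothetical.

```python
import json


def build_trust_policy():
    """Trust policy that lets Amazon Personalize assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "personalize.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_permissions_policy(bucket_name):
    """Permissions for reading and writing S3 data and sending CloudWatch metrics.

    The action list is an assumption; verify it against the linked guides.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket", "s3:PutObject"],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
            {
                "Effect": "Allow",
                "Action": ["cloudwatch:PutMetricData"],
                "Resource": "*",
            },
        ],
    }


# IAM expects the documents as JSON strings.
trust_policy_json = json.dumps(build_trust_policy())
permissions_policy_json = json.dumps(build_permissions_policy("my-example-bucket"))
```

You could then pass trust_policy_json as the AssumeRolePolicyDocument argument of the IAM create_role call in boto3 and attach the permissions policy to the resulting role.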

Then, you must create some Amazon Personalize resources. Create your dataset group, load your data, and train recommenders. For full instructions, see Getting started.

  1. Create a dataset group. You can use metric attributions in domain dataset groups and custom dataset groups.
  2. Create an Interactions dataset using the following schema:
    {
        "type": "record",
        "name": "Interactions",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
            {
                "name": "USER_ID",
                "type": "string"
            },
            {
                "name": "ITEM_ID",
                "type": "string"
            },
            {
                "name": "TIMESTAMP",
                "type": "long"
            },
            {
                "name": "EVENT_TYPE",
                "type": "string"
            }
        ],
        "version": "1.0"
    }
  3. Create an Items dataset using the following schema:
    {
        "type": "record",
        "name": "Items",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
            {
                "name": "ITEM_ID",
                "type": "string"
            },
            {
                "name": "PRICE",
                "type": "float"
            },
            {
                "name": "CATEGORY_L1",
                "type": ["string"],
                "categorical": true
            },
            {
                "name": "CATEGORY_L2",
                "type": ["string"],
                "categorical": true
            },
            {
                "name": "MARGIN",
                "type": "double"
            }
        ],
        "version": "1.0"
    }
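If you keep a schema as a Python dictionary, you need to serialize it to a JSON string before registering it. The following minimal sketch does that for the Items schema with json.dumps, which also converts Python's True into the JSON literal true. The variable names are ours, and the schema is abbreviated to the fields shown above.

```python
import json

# The Items schema from above, written as a Python dictionary.
items_schema = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "PRICE", "type": "float"},
        {"name": "CATEGORY_L1", "type": ["string"], "categorical": True},
        {"name": "CATEGORY_L2", "type": ["string"], "categorical": True},
        {"name": "MARGIN", "type": "double"},
    ],
    "version": "1.0",
}

# Serialize to the JSON string form that a schema registration call expects.
items_schema_json = json.dumps(items_schema)

# Quick sanity check that the column used by SUM(ITEMS.MARGIN) is present.
field_names = [field["name"] for field in json.loads(items_schema_json)["fields"]]
print(field_names)
```

The resulting string is what you would pass as the schema argument of the boto3 create_schema call, together with a schema name of your choice.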

Before importing our data to Amazon Personalize, we define the metric attribution.

Creating Metric Attributions

To begin generating metrics, you specify the list of events for which you’d like to gather metrics. For each of the event types chosen, you define the function that Amazon Personalize will apply as it collects data. The two functions available are SUM(DatasetType.COLUMN_NAME) and SAMPLECOUNT(), where DatasetType can be the INTERACTIONS or ITEMS dataset. Amazon Personalize can send metrics data to CloudWatch for visualization and monitoring, or alternatively export it to an S3 bucket.

After you create a metric attribution and record events or import incremental bulk data, you’ll incur some monthly CloudWatch cost per metric. For information about CloudWatch pricing, see the CloudWatch pricing page. To stop sending metrics to CloudWatch, delete the metric attribution.

In this example, we create one metric attribution that tracks two metrics:

  1. Count the total number of “View” events using the SAMPLECOUNT() function. This function only requires the INTERACTIONS dataset.
  2. Calculate the total margin when purchase events occur using the SUM(DatasetType.COLUMN_NAME) function. In this case, the DatasetType is ITEMS and the column is MARGIN because we’re tracking the margin for the item when it was purchased. The Purchase event is recorded in the INTERACTIONS dataset. Note that for the margin total to be correct, you must send one purchase event for each individual unit of each item purchased, even for repeats (for example, two shirts of the same type). If your users can purchase multiples of each item when they check out, and you’re only sending one purchase event for all of them, then a different metric will be more appropriate.
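The per-unit requirement for purchase events can be sketched as follows. This is a minimal illustration that assumes your checkout produces a list of (item ID, quantity) pairs. The helper names purchase_events_for_cart and chunked are hypothetical, and the batch size of 10 reflects our understanding of the PutEvents limit on events per call, so confirm it in the API reference.

```python
import time
import uuid


def purchase_events_for_cart(cart, sent_at=None):
    """Expand (item_id, quantity) pairs into one Purchase event per unit."""
    sent_at = int(time.time()) if sent_at is None else sent_at
    events = []
    for item_id, quantity in cart:
        for _ in range(quantity):
            events.append(
                {
                    "eventId": str(uuid.uuid4()),  # unique ID per unit
                    "eventType": "Purchase",
                    "itemId": item_id,
                    "sentAt": sent_at,
                }
            )
    return events


def chunked(items, size=10):
    """Split a list into batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Two shirts of the same type and one hat become three Purchase events.
cart = [("shirt-1", 2), ("hat-9", 1)]
unit_events = purchase_events_for_cart(cart, sent_at=1700000000)
batches = chunked(unit_events)
```

Each batch can then be used as the eventList of a separate PutEvents call.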

The function to calculate sample count is available only for the INTERACTIONS dataset. However, total margin requires you to have the ITEMS dataset and to configure the calculation. For each metric, we specify the eventType that we’ll track and the function used, and we give it a metricName that will identify the metric once we export it. For this example, we’ve named the metrics “countViews” and “sumMargin”.

The code sample is in Python.

import boto3

personalize = boto3.client('personalize')

metrics_list = [
    {
        "eventType": "View",
        "expression": "SAMPLECOUNT()",
        "metricName": "countViews"
    },
    {
        "eventType": "Purchase",
        "expression": "SUM(ITEMS.MARGIN)",
        "metricName": "sumMargin"
    }
]

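We also define where the data will be exported, in this case an S3 bucket.

# role_arn is the ARN of the IAM role created in the prerequisites.
# path_to_bucket is the S3 path where the metric reports are written.
output_config = {
    "roleArn": role_arn,
    "s3DataDestination": {
        "path": path_to_bucket
    }
}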

Then we generate the metric attribution.

response = personalize.create_metric_attribution(
    name = metric_attribution_name,
    datasetGroupArn = dataset_group_arn,
    metricsOutputConfig = output_config,
    metrics = metrics_list
)
metric_attribution_arn = response['metricAttributionArn']

You must give the metric attribution a name, use datasetGroupArn to indicate the dataset group to which the metrics will be attributed, and pass the metricsOutputConfig and metrics objects that we created previously.

Now that the metric attribution is created, you can proceed with the dataset import jobs, which load the items and interactions datasets from your S3 bucket into the dataset group that you previously configured.
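Before you start the import, you may want to confirm that the metric attribution has finished creating. The following is a minimal polling sketch built on the describe_metric_attribution operation. The helper name, the polling interval, and the retry limit are our own choices, and the client is passed in so that the function can be exercised without AWS credentials.

```python
import time


def wait_for_metric_attribution(personalize_client, metric_attribution_arn,
                                poll_seconds=30, max_polls=60):
    """Poll until the metric attribution is ACTIVE, or raise on failure."""
    for _ in range(max_polls):
        description = personalize_client.describe_metric_attribution(
            metricAttributionArn=metric_attribution_arn
        )
        status = description["metricAttribution"]["status"]
        if status == "ACTIVE":
            return status
        if status.endswith("FAILED"):
            raise RuntimeError(f"Metric attribution ended in status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("Metric attribution did not become ACTIVE in time")
```

With the client from the earlier code, you would call wait_for_metric_attribution(personalize, metric_attribution_arn).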

For information on how to modify or delete an existing metric attribution, see Managing a metric attribution.

Importing Data and creating Recommenders

First, import the interaction data to Amazon Personalize from Amazon S3. For this example, we use the following data file. We generated the synthetic data based on the code in the Retail Demo Store project. Refer to the GitHub repository to learn more about the synthetic data and potential uses.

Then, create a recommender. In this example, we create two recommenders:

  1. “Recommended for you” recommender. This type of recommender creates personalized recommendations for items based on a user that you specify.
  2. “Customers who viewed X also viewed” recommender. This type of recommender creates recommendations for items that customers also viewed based on an item that you specify.
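As a sketch of how the two recommenders could be requested through the API, the following code builds the arguments for the create_recommender operation. The recipe ARNs are our best understanding of the ECOMMERCE domain use case identifiers and should be verified against the Amazon Personalize documentation. The helper name, the recommender names, and the dataset group ARN are hypothetical.

```python
# Assumed recipe ARNs for the two ECOMMERCE use cases; verify before use.
RECIPE_ARNS = {
    "recommended_for_you":
        "arn:aws:personalize:::recipe/aws-ecomm-recommended-for-you",
    "viewed_x_also_viewed":
        "arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed",
}


def recommender_request(name, use_case, dataset_group_arn):
    """Build the keyword arguments for a create_recommender call."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": RECIPE_ARNS[use_case],
    }


# Hypothetical dataset group ARN used only for illustration.
example_group_arn = "arn:aws:personalize:us-east-1:111122223333:dataset-group/example"
requests = [
    recommender_request("recommended-for-you", "recommended_for_you", example_group_arn),
    recommender_request("viewed-x-also-viewed", "viewed_x_also_viewed", example_group_arn),
]
```

Each dictionary can be expanded into the call, for example personalize.create_recommender(**requests[0]), and the response contains the recommenderArn.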

Send events to Amazon Personalize and attribute them to the recommenders

To send interactions to Amazon Personalize, you must create an Event Tracker.
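A minimal sketch of creating the event tracker with the create_event_tracker operation follows. The helper name create_tracker is ours, and the client is passed in as a parameter so that the function is easy to test.

```python
def create_tracker(personalize_client, name, dataset_group_arn):
    """Create an event tracker and return its tracking ID and ARN."""
    response = personalize_client.create_event_tracker(
        name=name,
        datasetGroupArn=dataset_group_arn,
    )
    # The tracking ID is what you pass as trackingId in PutEvents calls.
    return response["trackingId"], response["eventTrackerArn"]
```

With the boto3 client from the earlier code, you would call create_tracker(personalize, "my-tracker", dataset_group_arn), where the tracker name is a placeholder.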

For each event, Amazon Personalize can record the eventAttributionSource. Amazon Personalize can infer it from the recommendationId, or you can specify it explicitly. In either case, reports identify it in the EVENT_ATTRIBUTION_SOURCE column. An eventAttributionSource can be a recommender, scenario, or third-party-managed part of the page where interactions occurred.

  • If you provide a recommendationId, then Amazon Personalize automatically infers the source campaign or recommender.
  • If you provide both attributes, then Amazon Personalize uses only the source.
  • If you don’t provide a source or a recommendationId, then Amazon Personalize labels the source SOURCE_NAME_UNDEFINED in reports.
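The three rules above can be summarized in a small local function. This is only an illustration of the precedence, not the service's implementation. The inferred_sources mapping is a hypothetical stand-in for the inference that Amazon Personalize performs from a recommendationId.

```python
UNDEFINED_SOURCE = "SOURCE_NAME_UNDEFINED"


def attribution_label(event, inferred_sources):
    """Return the source label a report would show for this event.

    inferred_sources maps a recommendationId to the campaign or recommender
    that produced it (a stand-in for the service-side inference).
    """
    explicit = (event.get("metricAttribution") or {}).get("eventAttributionSource")
    if explicit:
        # An explicit source wins, even if a recommendationId is also present.
        return explicit
    recommendation_id = event.get("recommendationId")
    if recommendation_id:
        return inferred_sources[recommendation_id]
    return UNDEFINED_SOURCE


inferred = {"rec-001": "recommended-for-you"}
both = {"recommendationId": "rec-001",
        "metricAttribution": {"eventAttributionSource": "homepage-carousel"}}
only_id = {"recommendationId": "rec-001"}
neither = {"itemId": "item-42"}

labels = [attribution_label(e, inferred) for e in (both, only_id, neither)]
print(labels)
```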

The following code shows how to provide an eventAttributionSource for an event in a PutEvents operation.

import boto3

personalize_events = boto3.client('personalize-events')

response = personalize_events.put_events(
    trackingId = 'eventTrackerId',
    userId = 'userId',
    sessionId = 'sessionId123',
    eventList = [{
        'eventId': event_id,
        'eventType': event_type,
        'itemId': item_id,
        # Explicitly attribute this event to a recommender or scenario
        'metricAttribution': {"eventAttributionSource": attribution_source},
        'sentAt': timestamp_in_unix_format
    }]
)
print(response)

Viewing your Metrics

Amazon Personalize sends the metrics to Amazon CloudWatch or Amazon S3:

For all bulk data, if you provide an Amazon S3 bucket when you create your metric attribution, you can choose to publish metric reports to your Amazon S3 bucket. You need to do this each time you create a dataset import job for interactions data.

import boto3

personalize = boto3.client('personalize')

response = personalize.create_dataset_import_job(
    jobName = 'YourImportJob',
    datasetArn = 'dataset_arn',
    dataSource = {'dataLocation':'s3://bucket/file.csv'},
    roleArn = 'role_arn',
    importMode = 'INCREMENTAL',
    publishAttributionMetricsToS3 = True
)

print (response)

When importing your data, select the correct import mode (INCREMENTAL or FULL) and instruct Amazon Personalize to publish the metrics by setting publishAttributionMetricsToS3 to True. For more information on publishing metric reports to Amazon S3, see Publishing metrics to Amazon S3.

For PutEvents data sent via the Event Tracker and for incremental bulk data imports, Amazon Personalize automatically sends metrics to CloudWatch. You can view data from the previous 2 weeks in Amazon CloudWatch – older data is ignored.

You can graph a metric directly in the CloudWatch console by specifying the name that you gave the metric when you created the metric attribution as the search term. For more information on how you can view these metrics in CloudWatch, see Viewing metrics in CloudWatch.
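If you prefer to locate the metric programmatically, one option is to search CloudWatch by metric name with the list_metrics operation. The following sketch does this and follows pagination. The helper name is ours, and because this post does not state which namespace or dimensions Amazon Personalize uses, the code reads them from the results instead of assuming them.

```python
def find_metrics_by_name(cloudwatch_client, metric_name):
    """Return every CloudWatch metric entry with the given metric name."""
    found = []
    kwargs = {"MetricName": metric_name}
    while True:
        page = cloudwatch_client.list_metrics(**kwargs)
        found.extend(page.get("Metrics", []))
        token = page.get("NextToken")
        if not token:
            return found
        # Request the next page of results.
        kwargs["NextToken"] = token
```

For example, find_metrics_by_name(boto3.client('cloudwatch'), 'countViews') returns entries whose Namespace and Dimensions fields you can reuse when requesting statistics for the metric.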

Figure 3: An example of comparing two CTRs from two recommenders viewed in the CloudWatch Console.

Importing and publishing metrics to Amazon S3

When you upload your data to Amazon Personalize via a dataset import job, and you have provided a path to your Amazon S3 bucket in your metric attribution, you can view your metrics in Amazon S3 when the job completes.

Each time that you publish metrics, Amazon Personalize creates a new file in your Amazon S3 bucket. The file name specifies the import method and date. The field EVENT_ATTRIBUTION_SOURCE specifies the event source, that is, the scenario under which the interaction took place. You can set the EVENT_ATTRIBUTION_SOURCE explicitly, for example to identify a third-party recommender. For more information, see Publishing metrics to Amazon S3.
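Once a report is in your bucket, you can compare sources with a few lines of code. The following sketch totals a metric per attribution source from CSV text. Only the EVENT_ATTRIBUTION_SOURCE column name comes from this post. The other column names (METRIC_NAME, EVENT_TYPE, VALUE) and all of the sample rows are assumptions made for illustration, so adjust them to match the header of your actual report.

```python
import csv
import io
from collections import defaultdict

# Made-up sample report; column names other than EVENT_ATTRIBUTION_SOURCE
# are assumptions.
SAMPLE_REPORT = """METRIC_NAME,EVENT_TYPE,VALUE,EVENT_ATTRIBUTION_SOURCE
countViews,View,120,recommended-for-you
countViews,View,80,viewed-x-also-viewed
sumMargin,Purchase,35.5,recommended-for-you
sumMargin,Purchase,12.25,viewed-x-also-viewed
sumMargin,Purchase,3.0,SOURCE_NAME_UNDEFINED
"""


def totals_by_source(report_text):
    """Sum VALUE per (metric name, attribution source) pair."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(report_text)):
        key = (row["METRIC_NAME"], row["EVENT_ATTRIBUTION_SOURCE"])
        totals[key] += float(row["VALUE"])
    return dict(totals)


totals = totals_by_source(SAMPLE_REPORT)
for (metric, source), value in sorted(totals.items()):
    print(metric, source, value)
```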

Summary

Adding a metric attribution lets you track the effect that recommendations have on business metrics. You create these metrics by adding a metric attribution to your dataset group and selecting the events that you want to track, as well as the function to count the events or aggregate a dataset field. Afterward, you can view the metrics that interest you in CloudWatch or in the exported file in Amazon S3.

For more information about Amazon Personalize, see What Is Amazon Personalize?


About the authors

Anna Grüebler is a Specialist Solutions Architect at AWS focusing on Artificial Intelligence. She has more than 10 years of experience helping customers develop and deploy machine learning applications. Her passion is taking new technologies and putting them in the hands of everyone, and solving difficult problems leveraging the advantages of using AI in the cloud.


Gabrielle Dompreh is a Specialist Solutions Architect at AWS in Artificial Intelligence and Machine Learning. She enjoys learning about the new innovations of machine learning and helping customers leverage their full capability with well-architected solutions.