AWS Machine Learning Blog

Amazon Personalize can now unlock intrinsic signals in your catalog to recommend similar items

Today, we’re excited to announce a new similar items recommendation recipe (aws-similar-items) in Amazon Personalize that helps you leverage your users’ interaction histories and what you know about the items in your catalog to deliver relevant recommendations.

Across Amazon, we provide personalized experiences for each of our users, and based on a user’s interests, we change their experiences and the items they see. Visitors are often recommended items that users with similar histories have interacted with. These recommendations are called similar items, and they help users discover items relevant to what they’re watching or purchasing. By taking into account the item a user is engaged with, we can improve engagement and conversion. This new recipe uses co-occurrence in interactions data (how often these items appear together across user histories) and thematic similarity (what is similar about the items in your catalog) when making recommendations to better quantify similarity for less popular or new items in your catalog.

This post shows you how to use our new recipe (aws-similar-items) and illustrates the difference compared to our collaborative filtering-based recipe (SIMS).

With this new recipe, similar item recommendations in Amazon Personalize are no longer limited to using only user-item interactions based on the co-occurrence of an item across users’ interaction histories. Co-occurrence is not the only way to define what is similar; thematic similarity takes advantage of the information contained in what you know about your items. Metadata and detailed descriptions used to describe your items contain valuable information about features relevant to your users. Items with similar features are similar whether customers have interacted with them or not. For example, video content set in the same time period, news articles covering common events, or retail items in the same shopping category are thematically similar independent of how users have interacted with them. Our new recipe unlocks these signals for Amazon Personalize to learn from. You can use the investments made to create rich and concise narratives about items to more effectively engage your users.

In Amazon Personalize, the aws-similar-items recipe uses deep learning based techniques and the knowledge you have about your items to identify similarity. This recipe makes sure that customers are exposed to a wider variety of relevant items, which drives better outcomes. Amazon Personalize enables developers to build applications using machine learning (ML) technology to deliver personalized user experiences with no ML expertise required. We make it easy for developers to build applications capable of delivering a wide array of experiences. Amazon Personalize is a fully managed ML service that goes beyond static rule-based recommendation systems and allows customers across industries to create custom recommenders that provide highly personalized user experiences. You receive results via an Application Programming Interface (API) and only pay for what you use, with no minimum fees or upfront commitments. All data is encrypted to be private and secure, and is only used to create recommendations for your users.

Solution architecture

The notebook that accompanies this post demonstrates how item metadata for item-to-item similarity improves the variety of recommendations. We use one Amazon Personalize dataset group with user-item interaction data and item metadata. We create two solutions using each of our related-items recipes, aws-similar-items and SIMS. The aws-similar-items recipe uses both user-item interaction history and item metadata to identify similar items in your catalog. SIMS only uses the user-item interaction history. We then recommend items based on a common seed item.
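As a rough sketch of this setup, the two solutions could be created with the AWS SDK for Python (boto3). The recipe ARNs follow the standard naming for these two recipes. The solution names, the `name_prefix` value, and the helper functions are placeholders for illustration and are not taken from the notebook. The client is passed in as an argument, so the snippet assumes nothing about credentials or Region.

```python
# Recipe ARNs for the two related-items recipes compared in this post.
RECIPE_ARNS = {
    "similar-items": "arn:aws:personalize:::recipe/aws-similar-items",
    "sims": "arn:aws:personalize:::recipe/aws-sims",
}


def build_solution_requests(dataset_group_arn, name_prefix="pantry"):
    """Return one create_solution request per recipe (names are hypothetical)."""
    return [
        {
            "name": f"{name_prefix}-{label}",
            "datasetGroupArn": dataset_group_arn,
            "recipeArn": recipe_arn,
        }
        for label, recipe_arn in RECIPE_ARNS.items()
    ]


def create_solutions(personalize, dataset_group_arn):
    """Create a solution and a first solution version for each recipe.

    `personalize` is expected to be a boto3 client, for example
    boto3.client("personalize"). Returns {solution name: solution version ARN}.
    """
    version_arns = {}
    for request in build_solution_requests(dataset_group_arn):
        solution_arn = personalize.create_solution(**request)["solutionArn"]
        version = personalize.create_solution_version(solutionArn=solution_arn)
        version_arns[request["name"]] = version["solutionVersionArn"]
    return version_arns
```

Both solutions share the same dataset group, so any difference in the recommendations comes from the recipe rather than the data.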

The following diagram illustrates the architecture we use across this post in examples and comparisons.

To demonstrate the difference in recommendations from two solution versions, we compare the results generated using each recipe. This allows us to evaluate how the inclusion of item metadata changes recommendations based on additional dimensions of similarity.
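One way to run this comparison is sketched below. It assumes each solution version has been deployed to a campaign and that `runtime` is a boto3 `personalize-runtime` client. The function names and the labels used as dictionary keys are hypothetical.

```python
def get_item_ids(runtime, campaign_arn, seed_item_id, num_results=5):
    """Return recommended item IDs for one campaign and one seed item."""
    response = runtime.get_recommendations(
        campaignArn=campaign_arn,
        itemId=seed_item_id,
        numResults=num_results,
    )
    return [item["itemId"] for item in response["itemList"]]


def compare_recipes(runtime, campaign_arns, seed_item_id, num_results=5):
    """Query every campaign with the same seed item and report the overlap.

    `campaign_arns` maps a label (such as "sims") to a campaign ARN.
    Returns the per-label results and the sorted item IDs common to all.
    """
    results = {
        label: get_item_ids(runtime, arn, seed_item_id, num_results)
        for label, arn in campaign_arns.items()
    }
    id_sets = [set(ids) for ids in results.values()]
    common = set.intersection(*id_sets) if id_sets else set()
    return results, sorted(common)
```

Because both campaigns get the same seed item, the overlap shows how much of each list is driven by shared interaction signals.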

Comparing similar items’ inference results using Amazon Prime Pantry’s dataset

SIMS uses collaborative filtering, a technique that is widely used across item-to-item recommender systems. The recipe is based on an item’s co-occurrence statistics derived from user interaction data. Because the predictions are driven purely by these statistics about user behavior, it works well when you have a large set of interactions data. This approach is fast and reliable during training and inference, but lacks support for intrinsic content. This means that SIMS doesn’t use valuable information that accounts for thematic similarities across different items or services in a catalog.
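To make the idea of co-occurrence concrete, here is a toy illustration. It is not the SIMS algorithm, only a simplified pair count over made-up user histories. It shows why a well-represented item gets a clear ranking while an item with a single interaction gets only weak, tied signals.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(user_histories):
    """Count how many user histories contain each unordered pair of items."""
    pair_counts = Counter()
    for items in user_histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def most_cooccurring(seed, pair_counts, top_k=3):
    """Rank other items by how often they appear alongside the seed item."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == seed:
            scores[b] += count
        elif b == seed:
            scores[a] += count
    # Sort by count (descending), then by item ID for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Made-up histories: detergent is popular, soda appears only once.
histories = {
    "u1": ["detergent", "softener", "chips"],
    "u2": ["detergent", "softener"],
    "u3": ["detergent", "chips", "soda"],
    "u4": ["detergent", "softener", "paper towels"],
}
pairs = cooccurrence_counts(histories)
print(most_cooccurring("detergent", pairs))
# [('softener', 3), ('chips', 2), ('paper towels', 1)]
print(most_cooccurring("soda", pairs))
# [('chips', 1), ('detergent', 1)]
```

With one history to learn from, the soda item cannot distinguish a genuinely related item from one that is merely popular.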

Our new Amazon Personalize recipe (aws-similar-items) uses a deep learning architecture that supports item metadata along with user-item interactions. You can use this enhancement to return richer recommendations that are more similar to the seed item.

The screenshots and examples in this section are derived from the notebook hosted in the Amazon Personalize Samples GitHub repository. For this example, we used the Amazon Prime Pantry reviews dataset.

First let’s look at the steps we took in this experiment:

  1. Transform reviews into interactions.
  2. Select the most relevant item features to use as metadata:
    1. Brand
    2. Price
    3. Description to be analyzed as unstructured text using our unstructured text feature
  3. Train two Amazon Personalize solution versions using the SIMS and aws-similar-items recipes.
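The first two steps can be sketched as follows. This is a minimal illustration rather than the notebook's code. The sample rows are made up, and the field names `reviewerID`, `asin`, and `unixReviewTime` are assumed from the public Amazon review data format. The items schema marks DESCRIPTION as unstructured text with `"textual": True`. Treating BRAND as categorical is an assumption made here.

```python
import csv
import io
import json

# Hypothetical sample in the shape of the public Amazon review data;
# the values are made up.
reviews = [
    {"reviewerID": "A1", "asin": "B0001", "overall": 5.0, "unixReviewTime": 1420070400},
    {"reviewerID": "A2", "asin": "B0001", "overall": 3.0, "unixReviewTime": 1420156800},
    {"reviewerID": "A1", "asin": "B0002", "overall": 4.0, "unixReviewTime": 1420243200},
]


def reviews_to_interactions(rows):
    """Map each review to one interaction (user, item, timestamp)."""
    return [
        {
            "USER_ID": row["reviewerID"],
            "ITEM_ID": row["asin"],
            "TIMESTAMP": int(row["unixReviewTime"]),
        }
        for row in rows
    ]


def to_csv(rows, fieldnames):
    """Serialize rows as CSV text, ready to upload for a dataset import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Items schema with the three metadata fields selected in step 2.
items_schema = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "BRAND", "type": ["null", "string"], "categorical": True},
        {"name": "PRICE", "type": "float"},
        {"name": "DESCRIPTION", "type": ["null", "string"], "textual": True},
    ],
    "version": "1.0",
}

interactions = reviews_to_interactions(reviews)
print(to_csv(interactions, ["USER_ID", "ITEM_ID", "TIMESTAMP"]))
print(json.dumps(items_schema, indent=2))
```

The review rating is dropped in this sketch, so every review counts as one interaction regardless of its score.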

Before we look at our recommendations, we consider how many user-item interactions exist for each of our item IDs.

The following screenshots show the top five most and least interacted items in our dataset (left table), as well as metrics about the distribution of number of user-item interactions (right table).

Some items have thousands of interactions and others have very few. Looking at these numbers and knowing that SIMS identifies similarities only using user-item co-occurrences, we hypothesize that recommendations using an item with limited interactions will heavily lean towards the most popular items in the catalog. On the other hand, we expect recommendations from our new aws-similar-items recipe to be less influenced by popularity and to make use of the item metadata provided.
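A per-item interaction count like the one described above can be computed with a few lines of standard-library Python. The sketch below uses made-up interaction rows, and the helper names are illustrative only.

```python
from collections import Counter
from statistics import mean, median


def interaction_counts(interactions):
    """Count the number of interactions recorded for each item ID."""
    return Counter(row["ITEM_ID"] for row in interactions)


def summarize(counts, n=5):
    """Return the n most and least interacted items plus summary statistics."""
    ranked = counts.most_common()  # sorted from most to least interactions
    values = list(counts.values())
    return {
        "most": ranked[:n],
        "least": ranked[-n:][::-1],  # least interacted item first
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


# Made-up interactions: item A is popular, item C is nearly cold.
sample = [{"ITEM_ID": item} for item in ["A", "A", "B", "A", "C", "B", "A"]]
summary = summarize(interaction_counts(sample), n=2)
print(summary["most"])   # [('A', 4), ('B', 2)]
print(summary["least"])  # [('C', 1), ('B', 2)]
```

A large gap between the maximum and the median is a quick sign that popularity may dominate recommendations based on co-occurrence alone.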

Let’s test this hypothesis with a few examples.

The following screenshot shows a commonly known product, laundry detergent. Here it’s specifically Tide Original Liquid Laundry Detergent. This item is one of the most interacted-with items in our dataset, with almost 1,800 user-item interactions.

The following recommendations from the SIMS model are spot on. The price is in range, and the interaction count of the recommended items is very well distributed around the mean and standard deviation for the dataset. Great job SIMS!

The following recommendations are from the aws-similar-items model. Two out of five are in common with the SIMS results, and they are thematically similar.

This makes sense, given that our user-item interactions provide strong signals to Amazon Personalize, and in the case of aws-similar-items, these signals are enhanced with the item metadata.

Now we run the same exercise with a less popular item in our dataset, although one that many consumers know well: Pepsi.

From the item’s full description, you can probably guess what comes naturally to mind: similar items recommendations should align with things that pair well with a soda!

The following screenshot shows SIMS recommendations.

The table tells us that the current item’s user-item interaction count (six) isn’t high enough for the Amazon Personalize SIMS recipe to learn good interactions-only signals. Consequently, the model falls back to recommending popular items. This is the case for a cold item.

But with our new recipe (aws-similar-items), we get the following results.

These recommendations look a lot better. We see a mix of snacks in a similar price range that are commonly bought alongside a soft drink. Then again, users in this dataset may be messy eaters in need of detergent to wash those stains away.

This might be a good place to apply a filter to make sure that only food items are recommended.
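A filter of this kind could look like the sketch below. It assumes the Items dataset has a CATEGORY field, which is not among the three metadata fields used in this experiment, so treat the field name, the values, and the filter name as hypothetical. `personalize` is expected to be a boto3 `personalize` client.

```python
def build_include_filter(field, allowed_values):
    """Build a filter expression that keeps only items whose metadata
    field matches one of the allowed values."""
    quoted = ", ".join(f'"{value}"' for value in allowed_values)
    return f"INCLUDE ItemID WHERE Items.{field} IN ({quoted})"


def create_food_filter(personalize, dataset_group_arn):
    """Create the filter and return its ARN."""
    response = personalize.create_filter(
        name="food-items-only",  # hypothetical filter name
        datasetGroupArn=dataset_group_arn,
        filterExpression=build_include_filter("CATEGORY", ["food"]),
    )
    return response["filterArn"]


print(build_include_filter("CATEGORY", ["food", "beverage"]))
# INCLUDE ItemID WHERE Items.CATEGORY IN ("food", "beverage")
```

The returned ARN can then be passed as the `filterArn` parameter of a `get_recommendations` request.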

As this example has demonstrated, our new aws-similar-items recipe is delivering the results we expected.

Conclusion

The quality of the recommendations provided by Amazon Personalize is only as good as the model and the data made available. Amazon Personalize can now use both user-item interaction histories and item metadata to provide improved similar items recommendations. Detailed descriptions and other attributes provide valuable signals for recommending more similar items, especially for items with limited interaction histories. Key use cases for this new recipe include cold-starting recommendations for new items and ensuring that popularity doesn’t overly bias your results.

We suggest running through the notebook example and conducting more experiments to see how recommendations from these models compare.


About the Authors

Luis Lopez Soria is an AI/ML specialist solutions architect working with the Amazon Machine Learning team. He works with AWS customers to help them adopt machine learning on a large scale. He enjoys playing sports, traveling around the world, and exploring new foods and cultures.

Matt Chwastek is a Senior Product Manager for Amazon Personalize. He focuses on delivering products that make it easier to build and use machine learning solutions. In his spare time, he enjoys reading and photography.

Nghia Hoang is a Senior Machine Learning Scientist at AWS AI Labs working on developing personalized learning methods with applications to recommender systems. His research interests include Probabilistic Inference, Deep Generative Learning, Personalized Federated Learning and Meta Learning.

Manas Apte is a Software Development Manager for Amazon Personalize. He focuses on developing algorithms for recommender systems using deep learning. In his spare time, he enjoys reading and working in his garden.