AWS Business Intelligence Blog

Build a market basket analysis dashboard using nested filters in Amazon QuickSight

Amazon QuickSight is a scalable, serverless, machine learning (ML)-powered business intelligence (BI) solution. As a fully managed service, QuickSight lets you create and publish interactive dashboards that can be accessed from any device and embedded into your applications, portals, and websites.

Many organizations want to uncover correlations between products purchased by customers. Retailers, for example, analyze their sales transactions to gain insights into the purchasing behavior of customers who bought a particular product. This powerful market basket analysis technique helps you predict product sales and understand customer behavior. You can use it in various areas, such as product cross-selling, ecommerce recommendation engines, and inventory management. Traditionally, building market basket analysis dashboards requires data engineering pipelines that can take weeks to implement, because these often depend on ETL (extract, transform, and load) jobs, complex SQL operations, and updates on the data pipeline. The nested filter capability in QuickSight simplifies this process with a no-code interface.

In this post, we show you how to configure nested filters in a QuickSight dashboard and how they can aid in different business use cases within market basket analysis. We show how nested filters can provide more advanced filtering to help solve common challenges with market basket analysis dashboards in four different use cases.

The four use cases described in this blog post will all depend on a dataset containing information on super market sales transaction and customer attributes. The dataset includes details such as customer ID, customer name, age group, product, store ID, price and quantity, as shown in the screenshot below.

Customer_id Name Age Product Store_id Price Quantity
2552 Aaron 41-45 whole milk 1234 2.99 1
4941  Abby 60-65 pip fruit 1234 1.29 2
4501  Abigail 31-35 other vegetables 1234 2.99 1
4119  Ada 41-45 buns 1234 2.99 3
2867  Adam 31-35 citrus fruit 1234 1.99 3
3962  Addison 41-45 fruit 1234 2.99 1
1420  Adrian 21-25 vegetables 1234 2.99 2
4286  Aiden 21-25 brown bread 1234 2.99 1
3709  Alan 31-35 pastry 1234 2.99 1
1495  Alaska 31-35 sugar 1234 1.99 3

Use case 1: Analyze product sales for customers who purchased specific products

In this use case, you can analyze the sales quantity by product for customers who have purchased specific products. Dashboard users can use nested filters in the QuickSight dashboard to select multiple products, such as whole milk and coffee. The dashboard will then display the sales quantity of other products purchased by customers who have also bought whole milk or coffee, as shown in the following screenshot.

This dashboard enables you to identify which products are frequently purchased together with whole milk and coffee by the same customer. This insight allows you to create promotional bundles for these complementary products.

To understand how nested filters works, let’s look at two diagrams.

In the following figure, P represents the set of products and C represents the set of customers in the dataset. x is a subset of P. x includes the chosen products to analyze in the nested filter (for example, whole milk and coffee). y are customers who purchased products x.

The following figure illustrates using customer subset y as input to find their product sales. y are the customers we identified from the previous figure. z is the output of this use case. It shows all products purchased by y. It includes x (products selected in the nested filter) and other products purchased by y.

To help you better understand the nested filter logic for this first use case, the results in the diagram correspond to the output obtained by running the correlated subquery in QuickSight:

SELECT product, sum(quantity)
FROM dataset
WHERE customer IN( 
    SELECT DISTINCT customer 
    FROM dataset
    WHERE product IN(x)
)
GROUP BY product

Complete the following steps to create a nested filter using a supermarket dataset. This example illustrates the sales quantity of products for customers who have purchased either whole milk or coffee.

  1. First, create a visual. In this example, our table shows the Product and Quantity

  1. Choose the visual and choose the filter icon.
  2. Choose ADD and choose the column you want to filter. For this post, we use Customer to build the filter.
  1. When you choose the filter, it will open the filter edit pane. Or you can choose the options menu (three dots) and choose Edit.
  1. For Filter type, choose Nested filter.
  1. For Nested field, choose Product.
  1. Leave Qualifying condition and Nested filter condition as Include, then choose Apply.

The qualifying condition defines whether we want to include or exclude the set of customers who purchased whole milk or coffee in this example. The nested filter condition defines whether we want to include or exclude the filtered products to find the set of customers. We provide other examples with different qualifying conditions and nested filter conditions in the other use cases discussed in this post.

  1. Add a filter control, then you can publish the dashboard and users can start using the nested filter.

Now you have built the first market basket analysis dashboard. Whole milk and coffee were chosen in the product filters in this example. The table on the left shows product sales quantity of those customers who have purchased whole milk or coffee. You can see that except for whole milk, other vegetables is the highest sales product for this particular customer group.

Use case 2: Determine the customer segment who never purchases a specific product

Nested filters provide different options to enable various use cases. In this example, we choose a different qualifying condition for the nested filter. Before the introduction of nested filter, you could only filter out sales of specific products, and not all products bought by a particular customer group. Nested filters allow you to exclude any products purchased by customers who have bought a certain set of products. This facilitates isolating and analyzing the customer segment that has never bought those specific products.

You can select a list of products in the nested filter and identify customers who didn’t purchase any of those products. Again, we use whole milk and coffee in the nested filter, so the filter includes those customers who purchase neither whole milk nor coffee. In another words, customers who have purchased either whole milk or coffee will be excluded.

To understand how the nested filter works in this use case, let’s look at the following diagrams.

In the following figure, P represents the set of products, and C represents the set of customers in the dataset. x is a subset of P. x includes the chosen products to analyze in the nested filter (for example, whole milk and coffee). y are customers who purchased products x.

y is excluded in this use case. In the following figure, C – y includes all customers except y (those who purchased whole milk or coffee). This is the output of this use case. This can help you understand which customer group never buys x. You can add other customer attributes such as age group to understand how many customers are in each age group.

In this second use case, the results in the diagram correspond to the output obtained by running the correlated subquery in QuickSight:

SELECT ageGroup, count(distinct customer) 
FROM dataset
WHERE customer NOT IN( 
    SELECT DISTINCT customer 
    FROM dataset
    WHERE product IN(x)
)
GROUP BY ageGroup

The steps to set up the nested filter for this second use case are similar to the first use case.

  1. First, create a visual. For this use case, we add AgeGroup and # of customer as columns.
  1. Repeat Steps 2–6 shown in the previous section to create a nested filter.
  2. For Qualifying condition, choose Exclude.
  1. Add a filter control, as illustrated in use case 1.

In the following dashboard, the same products (whole milk and coffee) were chosen. This filter excludes customers who have purchased either of these two products. It also shows which customer age group didn’t purchase these two products. With this analysis, you can understand which group of customers have shown very low or zero interest in a specific set of products.

Use case 3: Include customers who have purchased any products other than the selected ones

In this use case, you can choose specific products in nested filter, and customers who have purchased any products other than the selected ones will be included in the results.

The following figure illustrates this use case. P represents the set of products, and C represents the set of customers in the dataset. x represents the products chosen in the nested filter. For instance, coffee in this use case. y represents the set of customers who purchased product x, including those who also bought other products. b represents customers who purchased only x and no other products. C – b is the output of the customer group in this use case. It allows you to understand customer information by excluding a specific customer group. This example excludes customers who have exclusively purchased coffee. By including a customer attribute in the output results, such as age group, you can gain insights into the distribution of customers across different age groups.

The results in the diagram correspond to the output obtained by running the correlated subquery in QuickSight:

SELECT ageGroup, count(distinct customer)
FROM dataset
WHERE customer IN( 
    SELECT DISTINCT customer
    FROM dataset
    WHERE product NOT IN(x)
)

Complete the following steps to create the filters:

  1. Create a visual. For this example, we add AgeGroup and # of customer as columns.
  1. Refer to the previous steps to create a nested filter.
  2. For Nested filter condition, choose Exclude.
  1. Add a filter control.

In this dashboard, we use coffee as the filter. It excludes customers who only purchased coffee, and shows the number of customers by age group.

Use case 4: Identify customers whose purchases are limited to specific products

This use case focuses solely on customers who exclusively purchased a specific subset of products chosen in the nested filter. For instance, if whole milk and coffee are selected, it includes three groups: customers who only bought whole milk, customers who only bought coffee, and customers who only bought whole milk and coffee together. Although this use case is similar to use case 1, there is a key difference: customers who purchased whole milk, coffee, and additional products (like beef) are excluded from use case 4 but are included in use case 1. This helps identify customers solely interested in certain products. If other customer attributes such as age group or occupation data are included, analysts can evaluate relationships between this limited purchasing behavior and customer profile.

The following figure illustrates this use case. P represents the set of products, and C represents the set of customers in the dataset. x represents the products chosen in the nested filter. For instance, coffee in this use case. y represents the set of customers who purchased product x, including those who also bought other products. b represents customers who purchased only x and no other products. This is the output of the customer group in this use case. You can add other attributes such as customer name, age group, and product to understand more details about these customers.

The results in the diagram correspond to the output obtained by running the correlated subquery in QuickSight:

SELECT DISTINCT ageGroup, product, customerName
FROM dataset
WHERE customer NOT IN( 
    SELECT DISTINCT customer 
    FROM dataset
    WHERE product NOT IN(x)
)

Complete the following steps to configure this filter:

  1. Create a visual. For this example, we add AgeGroup, Product, and Name as columns.
  1. Create a nested filter.
  2. For Qualifying condition and Nested filter condition, choose Exclude.
  1. Add a filter control.

In the following dashboard, two products have been selected in the filter control. We have identified the customers who had only purchased whole milk and coffee and never buy other products. They can be a potential group of customers to extend their consumption in other products.

Use case summary

The following table summarized the four use cases, customer groups included in each case, and settings in the nested filter.

Use Cases

Customer Groups Included in Nested Filter

(x represents products chosen in nested filter)

Nested Filter Setting
Qualifying Condition Nested Filter Condition
1. Analyze product sales for customers who purchased specific products
  • Customers who only bought x
  • Customers who bought x as well as others
Include Include
2. Determine customer segment who never purchase a specific product
  • Customers who never bought x
Exclude Include
3. Include customers who have purchased any products other than the selected ones
  • Customers who never bought x
  • Customers who bought x as well as others
Include Exclude
4. Identify customers whose purchases are limited to specific products
  • Customers who only bought x
Exclude Exclude

Conclusion

In this post, we explained how nested filters can help you build a market basket analysis dashboard and understand customer behavior in different use cases.

To learn more about nested filters, refer to Adding nested filters. You can also visit the Amazon QuickSight Community to ask and answer questions, connect with peers, and stay updated on the newest features and resources.


About the Authors

Roy Yung is a Specialist Solutions Architect for Amazon QuickSight. Roy has over 10 years of experience implementing enterprise business intelligence (BI) solutions. Prior to AWS, Roy delivered BI and data platform solutions in the insurance, banking, aviation, and retail industries.

Martina Zerilli is a Data Analytics and ML Consultant in the AWS Professional Services team based in Rome. She helps craft innovative next generation cloud solutions tailored to customers’ strategic objectives.