How Zulily drives discovery shopping using Amazon Kinesis Data Analytics and Amazon DocumentDB

This is a guest post by Sergey Podlazov – Director of Engineering (Shopping Experience) at Zulily, Senthil Kumar, Sr. Solutions Architect, AWS, and Praveen Chamarthi, Sr. Technical Account Manager, AWS

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more.

Zulily offers a unique ecommerce experience to shoppers by offering amazing deals on products for moms, kids, and babies. We have scaled this model to launch thousands of new products every day for our shoppers to discover.

Zulily differs from other retailers because we’re a destination where our customers come seeking inspiration. Our engineering team sought to support that shopping experience by building a high-scale ecommerce platform to delight customers and provide inspiration as they browse our site.

In this post, we review the solution our team at Zulily built to provide an engaging, rich, and meaningful dynamic search experience, using Amazon Kinesis Data Streams, Amazon Kinesis Data Analytics, AWS Lambda, Amazon DocumentDB (with MongoDB compatibility), and Amazon ElastiCache for Redis.

The challenge: Delivering an enhanced customer-focused search experience

A great search is an important part of our customers’ experience on Zulily. Two-thirds of our customers use search, which makes it the second most used feature on our site and app. So we set out to define what a great search experience is all about. We came up with four pillars:

Suggestive – We want to share with our customers things that other customers are looking for, such as what’s trending currently on our site or app.
Relevant – In the context of Zulily, where our products are displayed for up to 72 hours, relevancy means suggesting something to our customers that is in our inventory and is available.
Diverse – Zulily carries national brands as well as some well-known boutique brands in our catalog. We want to ensure our customers are provided with diverse suggestions that include both national and boutique brands.
Personalized – In today’s ecommerce world, personalization means displaying content to our customers that’s relevant to them as individuals.

Before we dive into the details of this new search experience, here’s a quick look at the experience before and after the changes.

The following image shows the previous customer experience.

The following image shows the new and improved customer experience.

The following diagram illustrates the architecture we had in place before making any changes. Customers using Zulily’s mobile app or site sent requests to Elasticsearch’s REST API endpoint and got a response that included search results along with trending search keywords.

The new suggested search feature, called K-Top service, is implemented as a microservice. Data and events from clickstream analytics are inputs for this service. The service outputs trending search keywords, brands, and categories all validated against the inventory. As we built this as a microservice decoupled from the rest of the system, we enabled our engineers to maintain and manage the service independently. The following diagram shows our updated architecture.

Service overview and key components

Clickstream data includes customer page visits, assets that were clicked while browsing, or any clicked links during the visit. The K-Top service primarily needs to extract search keywords from this clickstream that comprises millions of user interactions.

The following diagram shows the service workflow.

The workflow includes the following steps:

Extract the search keywords from the data stream by filtering for search events.
Correlate the search keywords to related brands and categories to make our search richer and more meaningful.
Check inventory for availability and report.
Persist the results in a fast data store, where these results can be rapidly accessed by customers.
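
The ranking and inventory steps above can be illustrated with a minimal Python sketch. It assumes that trending keywords are ranked by simple frequency and that inventory is a plain set of in-stock terms; both are assumptions for illustration, and the names `top_k_available`, `searches`, and `in_stock` are hypothetical rather than taken from the K-Top service.

```python
from collections import Counter


def top_k_available(keywords, in_stock, k=3):
    """Return up to k of the most frequent keywords that are in stock.

    Assumption: popularity is a simple count of normalized keywords.
    """
    # Normalize case and whitespace so "Shoes" and "shoes" count together.
    counts = Counter(kw.strip().lower() for kw in keywords if kw.strip())
    # most_common() sorts by count, highest first; drop unavailable items.
    ranked = [kw for kw, _ in counts.most_common() if kw in in_stock]
    return ranked[:k]


# Hypothetical sample data for illustration only.
searches = ["Shoes", "dress", "shoes", "toys", "dress", "shoes", "lamp"]
in_stock = {"shoes", "toys", "lamp"}

trending = top_k_available(searches, in_stock, k=2)
print(trending)
```

In this sample, "dress" is searched often but is not in the hypothetical inventory set, so it never appears in the suggestions, which mirrors the intent of the inventory check.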

Technical design considerations

In this section, we discuss our reasons for the following design and technology choices to implement this solution:

Extracting search events (querying a data warehouse vs. querying the clickstream in near-real time)
Server vs. serverless
Persistent data store

Extracting search events

Zulily’s first option was to use our existing data warehouse in Google BigQuery to power the search experience. The advantage of using this approach is that the search events are already filtered and stored in a table. This makes it easier to query the data with a simple SELECT statement. However, this table is disconnected from the store, which doesn’t serve the purpose of a dynamic search experience.

The second option was to query data directly from the clickstream using Kinesis Data Streams and Kinesis Data Analytics. The clickstream dataset comprises hundreds of millions of user interactions, page views, and other important data points. We decided to go with this approach because it let us extract, filter, and enrich search terms directly from the clickstream in near-real time.
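
To make the filtering idea concrete, here is a small Python sketch of pulling search terms out of a mixed stream of events. It is not the SQL that runs in Kinesis Data Analytics, and the event schema (the `event_type` and `query` fields) is a hypothetical one chosen for illustration.

```python
from typing import Iterable, Iterator


def extract_search_terms(events: Iterable[dict]) -> Iterator[str]:
    """Yield normalized search terms, skipping non-search events."""
    for event in events:
        # Hypothetical schema: search events carry event_type == "search".
        if event.get("event_type") != "search":
            continue
        term = (event.get("query") or "").strip().lower()
        if term:  # ignore empty queries
            yield term


# Hypothetical clickstream sample mixing page views, clicks, and searches.
clickstream = [
    {"event_type": "page_view", "url": "/home"},
    {"event_type": "search", "query": "  Rain Boots "},
    {"event_type": "click", "asset": "banner-1"},
    {"event_type": "search", "query": ""},
    {"event_type": "search", "query": "backpack"},
]

search_terms = list(extract_search_terms(clickstream))
print(search_terms)
```

Writing the filter as a generator keeps memory use flat, which matters when only a small share of a very large event stream is search traffic.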

Server vs. serverless

Our hosting choices for the service were either server-based choices like Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS), or going completely serverless.

We went serverless with Lambda because we didn’t want to manage the infrastructure. Our team was very confident that Lambda could meet our needs and its native integration with the rest of the application’s service stack made it an easier choice.

Persistent data store

The engineers at Zulily are very comfortable working with MongoDB, but also wanted to leverage a fully managed solution. Amazon DocumentDB allows the team to use their existing MongoDB tools and drivers while removing management tasks such as hardware provisioning, patching, setup, configurations, backups, and scaling. This became our obvious choice for the persistent data store.

Solution architecture

The following diagram shows our solution architecture and AWS services.

The workflow includes the following steps:

Customer interactions (data and events, captured as clickstream data) are recorded and sent to the Kinesis data stream.
A filter is applied on this data stream by building a Kinesis Data Analytics app. Kinesis Data Analytics lets you filter and query data on a data stream using a SQL-like query language. After the data is filtered, you can direct the resulting data to a destination data stream. This is a common pattern where data streams are queued and business logic is then applied to filter and enrich the data as it flows from one data stream to another.
The next step is to look for brands and categories from the search keywords entered by the customer. As the new events appear in the destination data stream (search events), the event transformer Lambda is triggered, which performs a lookup on brands and categories, and then stores the results in the enriched events Amazon DocumentDB data store.
The K-Top producer Lambda function runs on a regular cadence. This function fetches keywords, brands, and categories from Amazon DocumentDB and validates them against available inventory. This check ensures that the products are in stock and that orders can be fulfilled.
After the inventory check, the Lambda function stores the results in a fast data store, ElastiCache for Redis. This data is then available for querying via the Search API.
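
The event transformer step can be sketched as a Lambda handler in Python. Lambda delivers Kinesis records with a base64-encoded payload under `record["kinesis"]["data"]`; everything else here is an assumption for illustration: the `query` field, the `BRAND_CATEGORY_LOOKUP` table, and the optional `store` callable that stands in for a write to Amazon DocumentDB.

```python
import base64
import json

# Hypothetical lookup table; the real brand and category source is not
# described in this post.
BRAND_CATEGORY_LOOKUP = {
    "rain boots": {"brand": "ExampleBrand", "category": "Kids' Shoes"},
}


def handler(event, context, store=None):
    """Decode Kinesis records and attach brand and category to each keyword.

    `store` is a hypothetical callable standing in for a database write.
    """
    enriched = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        keyword = (payload.get("query") or "").strip().lower()
        if not keyword:
            continue
        match = BRAND_CATEGORY_LOOKUP.get(keyword, {})
        enriched.append({
            "keyword": keyword,
            "brand": match.get("brand"),
            "category": match.get("category"),
        })
    if store is not None:
        store(enriched)
    return {"enriched": enriched}


def make_record(payload):
    """Build a minimal Kinesis-style record for local testing."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


sample_event = {"Records": [make_record({"query": "Rain Boots"}),
                            make_record({"query": "lamp"})]}
result = handler(sample_event, None)
print(result)
```

Passing the persistence step in as a callable keeps the handler easy to test locally without a database connection; in a deployed function, that callable would wrap the actual driver call.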

Technical impact

We saw technical impacts in the following areas:

Speed – The design and technology choices we made enabled us to go from UX design to production in 10 weeks, compared to the 16 or more weeks it would have taken with alternative designs.
Focus – By choosing AWS serverless and managed services, the team could focus on building the business logic rather than worry about managing infrastructure.
Design principles – We were able to stick to our design principle: “Design for tomorrow, build for today.”

Customer impact

With the new search experience, more than 75% of customers chose to use the enriched search suggestions, including the suggested keywords, brands, and product categories shown to them. Compared with just 25% of customers who performed raw searches using free-form text queries, this shows strong adoption and acceptance of the new features.

Additionally, the enhanced search allowed us to check our inventories for merchandise availability before suggesting a product or category to our customers, which resulted in improved customer experience.

Conclusion

In this post, we shared how to enrich a simple product search using AWS purpose-built databases and serverless technologies to provide a more engaging customer experience. By using Kinesis Data Streams, Kinesis Data Analytics, Lambda, and Amazon DocumentDB instead of taking on the heavy lifting of traditional infrastructure management, we went from UX design to production in 10 weeks rather than an estimated 16 or more. Once live, the enhanced search experience engaged more than 75% of Zulily customers, which demonstrates this feature’s success over the previous static experience of free-form text queries.

For more information about our solution, watch our presentation at AWS re:Invent 2020.

About the authors

Sergey Podlazov is an experienced engineering leader with a track record of building compelling, personalized shopping experiences for Zulily’s millions of customers around the world. With a strong background in data engineering, Sergey has held leadership roles at both startups and global companies, including indoor mapping company Point Inside and Amazon. He holds a master’s degree in information systems from the University of Kansas. While not at work, he spends time with his family, plays in a rock band, skis, and volunteers with The Russian Community Support Group in Washington state, which celebrates Russian heritage and community in the region.

Senthil Kumar is an AWS Solutions Architect based out of New York, NY. He is passionate about enabling enterprise customers on their digital transformation journey in the cloud and helps architect and build cloud-native solutions.

Praveen Chamarthi is a Senior Technical Account Manager with Amazon Web Services (AWS), with 20+ years of expertise in Operations, Software Configuration Management, Build & Release Management. Praveen works with enterprise customers to design, deploy, and scale cloud applications to achieve their business goals.
