Delivering Ultralow Latency Machine Learning for Amazon Ads

Amazon Ads employed Amazon ElastiCache and Amazon Kinesis to process billions of impressions every day at ultralow latency. Now, the company’s machine learning models recommend relevant products to customers in 20 markets.

Industry Challenge

Amazon Ads strives to deliver relevant and engaging suggestions to online shoppers at a scale of tens of billions of impressions each day. Shopping trends change constantly, which challenges engineers to deliver quality results across the entire catalog. “We need to deal with hundreds of millions of deep learning requests per second for online inferencing within a very small amount of latency,” says Shenghua Bao, senior manager of applied science—Sponsored Products for Amazon Ads. The company uses Amazon Web Services (AWS) offerings to meet this demand.

Amazon Ads’ Solution

Amazon Ads approached the machine learning challenges from two angles: product understanding, which is fairly stable, and trend understanding, which evolves continuously. To develop product understanding, Amazon Ads employs a deep learning approach that learns embedding representations and uses them to calculate the similarity between a product and a query. The embedding model, which has over one billion trainable parameters, is trained on the descriptions and titles of Amazon products. Because Amazon has over a billion products, Amazon Ads needs an efficient way to serve these embeddings.
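To make the idea of embedding similarity concrete, here is a minimal sketch in Python. It assumes cosine similarity as the scoring function, which the story does not specify, and it uses made-up three-dimensional vectors and hypothetical product names; production embeddings would have far more dimensions and come from the trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_products(query_embedding, product_embeddings):
    """Return product IDs ordered from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), product_id)
        for product_id, embedding in product_embeddings.items()
    ]
    scored.sort(reverse=True)
    return [product_id for _, product_id in scored]


# Toy data: the vectors and product names are illustrative assumptions.
products = {
    "trail-shoes": [0.9, 0.1, 0.0],
    "road-shoes": [0.7, 0.3, 0.1],
    "coffee-maker": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]
ranking = rank_products(query, products)
print(ranking)
```

In this toy setup the two shoe products score closest to the query vector, and the unrelated product ranks last.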

The solution must scale up to hundreds of millions of product embedding requests per second, which could easily consume several terabits per second of bandwidth. To scale efficiently, Amazon Ads implemented a hybrid caching approach using Amazon ElastiCache, an in-memory caching service that supports flexible, near-real-time use cases. Embeddings for the most popular products are stored in a local cache, and those for less popular products are stored in a remote cache. Because most requests are served locally, remote cache access is limited and network cost drops significantly.
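The hybrid local and remote cache can be sketched roughly as follows. This is an illustration under stated assumptions, not Amazon Ads' implementation: a plain dictionary stands in for the remote cache, the class name `TwoTierEmbeddingCache` is hypothetical, and least-recently-used eviction is assumed as a simple proxy for keeping popular products local.

```python
from collections import OrderedDict


class TwoTierEmbeddingCache:
    """Small local LRU cache in front of a larger remote store (hypothetical)."""

    def __init__(self, remote_store, local_capacity):
        self.remote_store = remote_store  # dict standing in for a remote cache client
        self.local_capacity = local_capacity
        self.local = OrderedDict()  # insertion order doubles as recency order
        self.remote_reads = 0  # counts network round trips

    def get(self, product_id):
        # Fast path: serve from the local cache and mark the entry as recent.
        if product_id in self.local:
            self.local.move_to_end(product_id)
            return self.local[product_id]
        # Slow path: one remote read, then keep the result locally.
        self.remote_reads += 1
        embedding = self.remote_store.get(product_id)
        if embedding is not None:
            self.local[product_id] = embedding
            if len(self.local) > self.local_capacity:
                self.local.popitem(last=False)  # evict the least recently used entry
        return embedding


# Toy remote store with 100 one-dimensional "embeddings".
remote = {f"p{i}": [float(i)] for i in range(100)}
cache = TwoTierEmbeddingCache(remote, local_capacity=2)

# Skewed workload: "p1" is far more popular than the other products.
for product_id in ["p1", "p1", "p2", "p1", "p1", "p3", "p1"]:
    cache.get(product_id)

print(cache.remote_reads)
```

With this skewed workload, only the first request for each distinct product reaches the remote store; every repeat request for the popular product is answered locally.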

Yet product understanding alone cannot deliver relevant recommendations without near-real-time trend analysis. To process product engagement in near real time, Amazon Ads uses Amazon Kinesis, which offers key capabilities to cost-effectively process streaming data at nearly any scale. Heavy bot traffic can distort the picture of shopping trends, so Amazon Ads developed an in-house bot-traffic detection system and processes online traffic by request to make results less sensitive to interference. Processing traffic by request distributes bot traffic evenly across hosts, reducing the effects of any artificial spikes caused by bots. To address feature publishing congestion, Amazon Ads uses Amazon Simple Queue Service (Amazon SQS) to prioritize near-real-time traffic over batch features, making the best use of compute resources and staying ahead of rapidly changing trends.
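Two of the ideas in this paragraph lend themselves to a short sketch: spreading traffic across hosts by request, and publishing near-real-time features ahead of batch features. The Python below is illustrative only. The host count, the product ID `B000BOT`, the function `host_for`, and the class `PriorityFeaturePublisher` are all hypothetical, and in-memory queues stand in for the Amazon Kinesis and Amazon SQS services named in the story.

```python
import hashlib
from collections import Counter, deque

NUM_HOSTS = 8  # hypothetical fleet size


def host_for(key, num_hosts=NUM_HOSTS):
    """Map a key to a host index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_hosts


# A simulated bot sends 1,000 requests for a single product.
events = [{"request_id": f"req-{i}", "product_id": "B000BOT"} for i in range(1000)]

# Keyed by product, every event lands on one host; keyed by request, they spread out.
load_by_product = Counter(host_for(e["product_id"]) for e in events)
load_by_request = Counter(host_for(e["request_id"]) for e in events)


class PriorityFeaturePublisher:
    """Drain near-real-time features before batch features (hypothetical)."""

    def __init__(self):
        self.realtime = deque()
        self.batch = deque()

    def submit(self, feature, realtime):
        (self.realtime if realtime else self.batch).append(feature)

    def publish(self, budget):
        """Publish up to `budget` features, real-time queue first."""
        published = []
        while len(published) < budget and (self.realtime or self.batch):
            queue = self.realtime or self.batch  # an empty deque is falsy
            published.append(queue.popleft())
        return published


publisher = PriorityFeaturePublisher()
publisher.submit("batch-1", realtime=False)
publisher.submit("rt-1", realtime=True)
publisher.submit("batch-2", realtime=False)
publisher.submit("rt-2", realtime=True)
published = publisher.publish(budget=3)
print(max(load_by_product.values()), max(load_by_request.values()), published)
```

Keying by product concentrates the whole spike on one host, while keying by request spreads it across the fleet. When the publishing budget is smaller than the backlog, the real-time features go out first and the remaining batch feature waits for the next cycle.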


“We need to track shopping trends from tens of billions of product views each day, requiring real-time access within milliseconds at p99 latency. These capabilities are made possible using AWS services.”

Shenghua Bao
Senior Manager, Applied Science—Sponsored Products, Amazon Ads

Benefits of Using AWS

Combining deep product understanding and near-real-time shopping trend analysis facilitates ultralow latency results at scale for Amazon Ads. “We need to serve embedding representations of billions of products online, and we also need to track shopping trends from tens of billions of product views each day, both requiring real-time access within milliseconds at p99 latency,” says Bao. “These capabilities are made possible using AWS services.”
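For readers unfamiliar with the term, p99 latency is the value that 99 percent of requests stay at or under, so it describes the slow tail rather than the average. A minimal sketch of how such a figure could be computed, assuming the nearest-rank percentile definition and invented latency samples:

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Invented data: 98 fast requests, one slower request, and one outlier.
latencies_ms = [1.0] * 98 + [4.0, 30.0]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(p50, p99)
```

In this made-up sample the median is 1 ms and the p99 is 4 ms, while the single 30 ms outlier only shows up at the maximum. A millisecond-level p99 target therefore constrains nearly every request, not just the typical one.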

The scale of deep learning features and models continues to grow as Amazon Ads expands to new marketplaces. “Thanks to AWS technology, we are expanding the machine learning solutions to 20 marketplaces worldwide,” says Bao. Amazon Ads also uses AWS infrastructure to scale down and achieve cost savings after peak shopping seasons, such as the winter holidays.

About Amazon Ads

Amazon Ads offers a range of products and information to help customers—registered sellers, vendors, book vendors, Kindle Direct Publishing (KDP) authors, app developers, and agencies—achieve advertising goals. With insights, reach, and premium entertainment properties from music to streaming, users can connect with the right audiences in the right places, both on and off Amazon.

AWS Services Used

Amazon ElastiCache

Amazon ElastiCache is a fully managed, Redis- and Memcached-compatible service delivering real-time, cost-optimized performance for modern applications.

Amazon Kinesis

Amazon Kinesis cost-effectively processes and analyzes streaming data at any scale as a fully managed service. 

Amazon SQS

Amazon Simple Queue Service (SQS) lets you send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.

Get Started

Organizations of all sizes across all industries are transforming their businesses and delivering on their missions every day using AWS. Contact our experts and start your own AWS journey today.