2022

Amazon Ads Uses AWS for Real-Time Machine Learning in Ad Serving

Hear from Kun Liu, director of machine learning at Amazon Ads, on how the company uses AWS solutions, such as Amazon Elastic Container Service (Amazon ECS), to scale machine learning inferencing systems for real-time ad selection.

Amazon Ads offers a wide range of solutions to help brands and businesses achieve their advertising goals while delivering a relevant and engaging ad experience for consumers. Amazon Ads operates its ad serving infrastructure at massive scale, handling hundreds of millions of ad requests per second (trillions of ads per day) within a latency budget of under 120 milliseconds.
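As a quick sanity check on those scale figures, the per-second rate can be converted to a daily total. The sketch below assumes an illustrative rate of 200 million requests per second; the source says only "hundreds of millions", so the exact number and the function name are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def requests_per_day(requests_per_second: int) -> int:
    """Convert a sustained per-second request rate into a daily total."""
    return requests_per_second * SECONDS_PER_DAY


# Assumed rate for illustration only; the article does not give an exact figure.
assumed_rate = 200_000_000
daily_total = requests_per_day(assumed_rate)

# 200,000,000 * 86,400 = 17,280,000,000,000 (about 17 trillion per day)
print(f"{daily_total:,} requests per day")
```

At any sustained rate in the hundreds of millions per second, the daily total lands in the tens of trillions, which is consistent with the "trillions of ads per day" figure.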

To optimize for higher conversions during ad selection, Amazon Ads runs hundreds of machine learning (ML) models that decide how to score and source relevant ads to show shoppers, predict whether shoppers will click or purchase after viewing an ad, and allocate and price ads during real-time ad serving. As Amazon Ads continued to expand its use of deep learning, it faced challenges selecting the appropriate hardware configurations to support both ultra-low latency inferencing and asynchronous inferencing for ad prediction. Amazon Ads sought a more horizontally scalable solution and wanted to customize software and hardware for each model, including CPU and GPU configurations.

Amazon Ads chose Amazon Web Services (AWS) to reduce time spent managing infrastructure, lower costs, and optimize ad selection by choosing from the broadest and deepest selection of compute and machine learning capabilities to meet its latency and performance requirements. Using Amazon Elastic Container Service (Amazon ECS) and AWS App Mesh, Amazon Ads built a microservice inferencing architecture, which scaled model hosting and optimized hardware and software configurations for each type of inference model. The company chose NVIDIA Triton Inference Servers running on GPU-based Amazon Elastic Compute Cloud (Amazon EC2) G4dn instances for ultra-low latency predictions with deep neural networks. For asynchronous predictions using BERT models, Amazon Ads uses Amazon SageMaker Multi-Model Endpoints running on Amazon EC2 Inf1 instances, which deliver 2.3x higher throughput and up to 70 percent lower cost per inference than comparable current generation GPU-based Amazon EC2 instances.
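The split between the two serving paths can be pictured as a simple lookup from a model's inference mode to its hosting backend. The following is a minimal sketch under stated assumptions, not Amazon Ads' actual implementation: the class names, the `choose_backend` function, and the example model names are all hypothetical, and only the backend pairings come from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SYNC = "sync"    # ultra-low latency predictions during ad serving
    ASYNC = "async"  # asynchronous predictions (BERT models in the article)


@dataclass(frozen=True)
class Backend:
    server: str
    instance_family: str


@dataclass(frozen=True)
class ModelSpec:
    name: str  # hypothetical model identifier
    mode: Mode


# Pairings as described in the article; the data structure itself is an assumption.
BACKENDS = {
    Mode.SYNC: Backend("NVIDIA Triton Inference Server", "G4dn"),
    Mode.ASYNC: Backend("Amazon SageMaker Multi-Model Endpoints", "Inf1"),
}


def choose_backend(model: ModelSpec) -> Backend:
    """Pick the hosting backend for a model based on its inference mode."""
    return BACKENDS[model.mode]


# Hypothetical models, for illustration only.
for spec in (ModelSpec("click-predictor", Mode.SYNC),
             ModelSpec("bert-relevance", Mode.ASYNC)):
    backend = choose_backend(spec)
    print(f"{spec.name}: {backend.server} on {backend.instance_family}")
```

Keeping the mapping in one table reflects the idea of tailoring hardware and software to each type of inference model: adding a new model type means adding one entry rather than changing the serving code.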

This architecture allows Amazon Ads to query billions of learnable parameters across inferencing models while staying cost effective. Amazon Ads can score hundreds of millions of ads per second within a 20-millisecond window. “Using AWS, we are able to iterate much faster than ever before to deliver more complex models. AWS offers tools that reduce heavy lifting, help us spend more time on data science, and less time on data engineering and managing infrastructure,” says Kun Liu, director of machine learning at Amazon Ads. “We are able to improve our shoppers’ engagement on ads, as measured by clicks and purchases by double digits over the last years,” says Liu. “And as a result, we further drive the double-digit improvement of the advertiser return on investment. We can deliver more relevant ads to shoppers and better return for our advertising customers.”

Watch “Under the hood at Amazon Ads” from AWS re:Invent 2021.

AWS Services Used

Amazon Elastic Container Service (Amazon ECS)

Amazon ECS is a fully managed container orchestration service that makes it easy for you to deploy, manage, and scale containerized applications.

AWS App Mesh

AWS App Mesh is a service mesh that provides application-level networking to make it easy for your services to communicate with each other across multiple types of compute infrastructure.

Amazon EC2

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud.

Amazon SageMaker

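Amazon SageMaker is a fully managed service for building, training, and deploying machine learning models. One of its features, Amazon SageMaker JumpStart, helps you quickly and easily get started with machine learning. The solutions are fully customizable and support one-click deployment and fine-tuning of more than 150 popular open source models.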
