AWS for Industries

Yahoo DSP Ad Targeting Uses Amazon SageMaker To Reduce Operating Cost and Increase Audience Reach

Introduction and context

Yahoo offers an omnichannel Demand-Side Platform (DSP) that gives advertisers top-notch technology, massive scale, access to premium supply, and trusted consumer relationships. Demand-side platforms are used in digital advertising to enable advertisers and agencies to purchase ad inventory from multiple ad exchanges and supply-side platforms (SSPs) through a single interface. Yahoo’s DSP offers advanced audience targeting options, allowing advertisers to reach customers based on behavior, location, and other factors. Within its DSP, Yahoo offers a Modeling and Scoring system for predicting user conversions. The term ‘user conversion’ in advertising means that a user interacts with an ad or a product listing and then takes a desired action, one the advertiser has defined as valuable to their business. Yahoo DSP uses machine learning (ML) along with advertiser conversion rules to predict which users are most likely to convert. Near real-time ad events, which include impressions, clicks, and conversions, are associated with an anonymized user and then scored against offline-trained models to assign the user to audience segments. An ‘audience segment’ is a group of users with common or related interests, demographics, or behaviors that is used to target advertising campaigns. These audience segments are then sent to targeting systems for ad serving, so that the users are exposed to the relevant advertisers’ campaigns.

The Yahoo DSP Modeling and Scoring system was previously run on-premises (on-prem), where the workflows for training, testing, troubleshooting, deploying, and governing ML models were based on custom scripts. These manual operations were time-consuming and led to suboptimal utilization of on-premises resources. The workloads consisted of thousands of models making predictions for billions of anonymized user IDs from over five billion events per day. The ML training dataset had hundreds of thousands of features, and the team needed to retrain several hundred models every day to keep them up to date. The team faced the daunting challenge of migrating this massive ML workload to the cloud.

Yahoo DSP partnered with AWS to migrate the Modeling and Scoring workloads to a modernized and optimized platform. This blog describes the scalable and automated solution that was implemented on AWS using MLOps (Machine Learning Operations) best practices.

Solution

The solution has a near real-time (NRT) processing component for the prediction of user segments and an offline processing component for feature extraction and training of models. The NRT component handles billions of events per day. It filters and aggregates the events to reduce the number of calls to the Amazon SageMaker AI Inference endpoints from billions to hundreds of millions. The end-to-end latency, from event ingestion to the user segment becoming available to the ad server, is a few minutes.

The NRT processing workflow (Figure 1) utilizes Amazon Managed Service for Apache Flink to continuously process events from the external data source. The SageMaker AI Inference endpoints provide a fully managed service for hosting the models trained in the offline batch processing component.
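To make the filter-and-aggregate step concrete, here is a minimal PyFlink sketch of a five-minute tumbling-window aggregation. The event schema and the datagen source are illustrative stand-ins for Yahoo’s external event stream, chosen only so the sketch is self-contained, not details of the actual implementation.

```python
# A minimal PyFlink sketch of the filter-and-aggregate step: a five-minute
# tumbling window per user collapses the raw event stream into far fewer
# records before inference. The schema and the datagen source are stand-ins
# for the external event stream, not Yahoo's actual implementation.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TABLE ad_events (
        user_id STRING,
        event_type STRING,  -- e.g. 'impression', 'click', 'conversion'
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH ('connector' = 'datagen')
""")

-- Filter out events without a user, then aggregate per user and window.
aggregated = t_env.sql_query("""
    SELECT
        user_id,
        SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END) AS impression_count_5m,
        SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END) AS click_count_5m,
        TUMBLE_START(event_time, INTERVAL '5' MINUTE) AS window_start
    FROM ad_events
    WHERE user_id IS NOT NULL
    GROUP BY user_id, TUMBLE(event_time, INTERVAL '5' MINUTE)
""")

aggregated.execute().print()
```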

The Offline Batch Processing component (Figure 2) utilizes Amazon EMR for feature and label extraction from the user data store. The models are re-trained weekly using SparkML on Amazon EMR. The extracted features are stored in the Online Feature Store used in NRT processing.
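As an illustration of the weekly retraining step, the following is a minimal SparkML sketch of a pipeline fit over the extracted features. The S3 paths, column names, and the logistic-regression stage are assumptions for illustration; the production job runs on Amazon EMR over the full extracted feature and label set.

```python
# A minimal SparkML sketch of the weekly retraining step. The S3 paths,
# column names, and the logistic-regression stage are illustrative
# assumptions, not Yahoo's actual features or models.
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dsp-weekly-retrain").getOrCreate()

# Assumed layout: one row per user with extracted features and a binary
# "converted" label produced by the offline label-extraction job.
training = spark.read.parquet("s3://example-bucket/extracted-features/")

pipeline = Pipeline(stages=[
    VectorAssembler(
        inputCols=["impression_count", "click_count"],  # stand-in feature columns
        outputCol="features",
    ),
    LogisticRegression(labelCol="converted", featuresCol="features"),
])

model = pipeline.fit(training)
model.write().overwrite().save("s3://example-bucket/models/binary-converter/")
```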

The solution utilizes the unique capability of SageMaker AI Feature Store to keep the Online and Offline features in sync.
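As a sketch of how this works in practice, the following uses the SageMaker Python SDK to create a feature group with the online store enabled, so that ingested records are available for low-latency NRT reads and automatically replicated to the offline store for training. The group name, schema, bucket, and role ARN are all placeholders.

```python
# A minimal sketch, using the SageMaker Python SDK, of creating a feature
# group with both the online and offline stores enabled. All names, the
# bucket, and the role ARN below are illustrative placeholders.
import time

import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()

# Example records: one row per anonymized user with aggregated counters.
df = pd.DataFrame({
    "user_id": ["u-001", "u-002"],
    "impression_count_5m": [12, 3],
    "click_count_5m": [1, 0],
    "event_time": [time.time()] * 2,  # fractional event time
})
df["user_id"] = df["user_id"].astype("string")  # object dtype is not inferable

feature_group = FeatureGroup(name="dsp-user-features", sagemaker_session=session)
feature_group.load_feature_definitions(data_frame=df)

feature_group.create(
    s3_uri="s3://example-bucket/offline-store",  # offline store location
    record_identifier_name="user_id",
    event_time_feature_name="event_time",
    role_arn="arn:aws:iam::111122223333:role/ExampleRole",  # placeholder
    enable_online_store=True,  # online store for NRT reads; offline store stays in sync
)

# Records land in the online store for low-latency reads and are
# automatically replicated to the offline store for batch training.
feature_group.ingest(data_frame=df, max_workers=2, wait=True)
```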

Figure 1: Near real-time (NRT) processing workflow

Near real-time (NRT) processing

Amazon Managed Service for Apache Flink consumes impression and pixel events from an external data source and processes them as follows:

  • Processes incoming events: filters out events that lack relevant details and extracts features such as impression and click counts. A five-minute aggregation over the extracted features reduces the volume of feature data that must be processed downstream. (1)
  • Reads the user’s stored features from the Online Feature Group, updates the record with data from the new events, and builds a feature vector for inference by merging the real-time features with the stored features. (2)
  • Calls the SageMaker AI Inference Endpoint with the feature vector as the payload (3), then determines which segments the user qualifies for by applying custom qualification logic to the inference scores, together with additional decision variables from the Online Feature Group (see the sketch after this list).
  • Sends the user’s segments to the Ad Serving System. (4)
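To illustrate step (3), the sketch below invokes a SageMaker AI inference endpoint with a merged feature vector and applies a simplified per-segment threshold in place of the custom qualification logic. The endpoint name, CSV payload, JSON response shape, and thresholds are assumptions, not details of Yahoo’s system.

```python
# A minimal sketch of step (3): invoke a SageMaker AI inference endpoint with
# the merged feature vector, then apply per-segment score thresholds. The
# endpoint name, payload format, response shape, and thresholds are all
# illustrative assumptions.
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

def qualify_user(feature_vector, segment_thresholds,
                 endpoint_name="dsp-scoring-endpoint"):  # hypothetical name
    """Score one user and return the segments whose thresholds the scores meet."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=",".join(str(x) for x in feature_vector),
    )
    # Assumed response shape: {"segment_name": score, ...}
    scores = json.loads(response["Body"].read())

    # Simplified qualification logic: a plain per-segment score cutoff stands
    # in for the custom logic and extra decision variables described above.
    return [segment for segment, threshold in segment_thresholds.items()
            if scores.get(segment, 0.0) >= threshold]

# Example: merged real-time + Online Feature Group vector for one user.
segments = qualify_user([12, 1, 0.4], {"auto_intenders": 0.7, "travel": 0.5})
```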

Offline batch processing

As part of optimizing and modernizing the batch processing workflow, the Yahoo data science team devised innovative approaches to reduce the number of features and moved from the thousands of models in the on-prem solution to a two-model system. The on-prem system used one binary model per audience, resulting in thousands of models. Through experimentation and multiple iterations, the team discovered the benefits of using two models: one binary model that classifies whether a user is a converter for any audience, and one multilabel classification model that determines the specific audiences (2). The binary model reduces the number of users the multilabel model needs to process, which increases throughput and reduces cost. The offline feature extraction workflow extracts features for approximately a billion users and loads them into the Online Feature Store (3). Amazon Managed Workflows for Apache Airflow (Amazon MWAA) schedules the offline label and feature extraction and the model training. Each trained SparkML model is then converted to the MLeap inference format and deployed to an Amazon SageMaker AI Inference Endpoint.
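To show the shape of this cascade, here is a minimal sketch assuming scikit-learn-style stand-in models; the production models are SparkML models served on SageMaker AI endpoints, and the gate threshold and the `audiences` attribute are hypothetical.

```python
# A minimal sketch of the two-model cascade, assuming scikit-learn-style
# stand-in models (the production models are SparkML models served on
# SageMaker AI endpoints). The gate threshold and the `audiences` attribute
# on the multilabel model are hypothetical.
def assign_audiences(users, binary_model, multilabel_model, gate_threshold=0.5):
    """Return {user_id: [audience, ...]} using the two-stage cascade."""
    features = [u["features"] for u in users]

    # Stage 1: the cheap binary gate scores every user for "converts for
    # any audience" and filters out unlikely converters.
    convert_probs = binary_model.predict_proba(features)
    gated = [u for u, p in zip(users, convert_probs) if p[1] >= gate_threshold]
    if not gated:
        return {}

    # Stage 2: the costlier multilabel model runs only on the gated subset,
    # which is what cuts inference volume and cost.
    label_matrix = multilabel_model.predict([u["features"] for u in gated])
    return {
        u["user_id"]: [aud for aud, hit in zip(multilabel_model.audiences, row) if hit]
        for u, row in zip(gated, label_matrix)
    }
```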

Outcomes

With the new solution, the Yahoo DSP Ad Targeting team achieved approximately 50% cost savings and an improved conversion rate compared to the on-premises solution. By leveraging SageMaker AI capabilities such as Feature Store and Inference Endpoints, together with other AWS services such as Amazon Managed Service for Apache Flink and Amazon EMR, Yahoo streamlined its machine learning processes and eliminated the need for manual intervention. This allowed the data scientists to focus on innovation and iterate rapidly on different ML models and algorithms. As an example, superior targeting algorithms resulted in a 10% increase in audience reach. With the automated workflows enabled by Amazon SageMaker AI, the Yahoo team developed the entire pipeline with just three people (one data scientist and two ML engineers), an 80% efficiency gain compared to the fifteen people previously required.

Long-term goals

While this solution reduced cost and improved performance for streaming models, many batch predictive audiences are still on the legacy system. Yahoo plans to utilize the same pipelines to replace the legacy system completely. This project is currently ongoing.

Conclusion

In this post, we explained how Yahoo DSP, a leading player in the media and digital advertising industry, uses Amazon SageMaker AI to reduce costs and achieve efficiency gains with limited resources while improving key performance indicators. The solution outlined in this blog also shows how Yahoo streamlined the DSP scoring system and leveraged the scale, resilience, security, and innovation available in the cloud.

Gil Barretto

Gil Barretto is a principal software engineer at Yahoo Ad Systems. He has over 20 years of experience in software development, including more than 10 years developing big data and ML solutions that help advertisers reach their target audiences. In his free time, he enjoys hiking and building a Lego city with his children.

Jack Wang

Jack Wang is a senior engineer at Yahoo Home and Eco-systems, with over 10 years of experience designing and building end-to-end big data and AI/ML batch and streaming solutions. Previously, as a mobile computing advocate, he worked as a columnist and as an Android/Windows Mobile engineer at ASUS starting in the early 2000s. He is currently interested in building agent-based local inference and parameter-efficient fine-tuning (PEFT) for generative AI models.

Mecit Gungor

Mecit Gungor is an AI/ML Specialist Solutions Architect at AWS, helping customers design and build AI/ML solutions at scale. He covers a wide range of AI/ML use cases for telecommunication customers and currently focuses on generative AI, LLMs, and training and inference optimization. He can often be found hiking in the wilderness or playing board games with his friends in his free time.

Suneel Joshi

Suneel Joshi is a Senior Solutions Architect at Amazon Web Services. He provides advocacy and guidance to customers as they plan and build solutions in the cloud. He is a DevOps and machine learning enthusiast. Among other things, he helps customers build intelligence into their applications using AI services.

Venu Nagineni

Venu Nagineni is an AI enthusiast and builder with a passion for democratizing artificial intelligence. He specializes in helping organizations unlock business value through cutting-edge AI technologies. In his spare time, he moonlights as an expert plant whisperer, a fearless DIY home renovator, and a LEGO architect extraordinaire.