GumGum Saves 62% on Real-Time Bidding and Machine Learning with Amazon EC2 Spot Instances

2020

GumGum Inc. (GumGum) is an advertising technology company that uses computer vision and natural language processing (NLP) to deliver contextually relevant advertising campaigns for brands and agencies around the world. To do this, the company runs an advertising exchange and supply-side advertising platform that conducts 30 billion transactions and processes 100 TB of data per day.

Since 2010, GumGum has run its advertising technology, machine learning, and data processing systems on Amazon Web Services (AWS). “All of our growth and meaningful work has been accomplished on AWS,” says Vaibhav Puranik, senior vice president of engineering at GumGum. “Today we have thousands of publishers sending us billions of bid requests to show ads. As AWS grew, so did our adoption of AWS services.”

“Amazon EC2 Spot Instances are an integral part of our architecture. Anything we launch these days, we design to run on Amazon EC2 Spot Instances.”

Vaibhav Puranik
Senior Vice President of Engineering, GumGum

Cost Efficiency for Compute-Intensive Ad Tech Workloads

GumGum recently embarked on an organization-wide cost-optimization initiative, focusing on reducing costs for its compute-intensive advertising technology workloads such as real-time bidding, advertising analytics, and contextual advertising analysis.

The company swapped its CPU instances for GPU clusters on Amazon Elastic Compute Cloud (Amazon EC2), which are now processing thousands of events simultaneously, saving GumGum $12,000 per month. The company also prioritized using stateless architecture, enabling it to leverage Amazon EC2 Spot Instances, which let deployments take advantage of unused Amazon EC2 capacity at heavily discounted rates.

“Amazon EC2 Spot Instances are an integral part of our architecture,” says Puranik. “Anything we launch these days, we design to run on Amazon EC2 Spot Instances.” The company also uses an open-source library called AutoSpotting to manage its Amazon EC2 Spot Instance clusters, which keeps them highly available and automatically provisions new Amazon EC2 Spot Instances to handle surges in traffic. “By moving compute to Amazon EC2 Spot Instances, GumGum has saved 62 percent on compute costs,” Puranik adds.

GumGum also uses AWS Savings Plans, which offer lower prices than Amazon EC2 On-Demand Instances in exchange for a commitment to use a certain amount of compute power for a 1- or 3-year period. With a single AWS Savings Plans commitment, the company saved $7,000 per month. GumGum also reviewed its data-retention policies and storage tiering within Amazon Simple Storage Service (Amazon S3), saving a further $45,000 per month on file storage.
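As a quick tally of the dollar figures quoted in this section, the sketch below adds up the three itemized monthly savings. It assumes the figures are independent and all apply in the same month, which the case study does not state explicitly. The 62 percent Spot Instance saving is left out because it is given only as a percentage.

```python
# Monthly savings figures quoted in this case study (USD per month).
monthly_savings = {
    "GPU clusters replacing CPU instances": 12_000,
    "AWS Savings Plans commitment": 7_000,
    "Amazon S3 retention and tiering": 45_000,
}

# Assumption: the three figures are independent and apply concurrently.
total_monthly = sum(monthly_savings.values())
total_annual = total_monthly * 12

print(f"Itemized monthly savings: ${total_monthly:,}")  # $64,000
print(f"Annualized: ${total_annual:,}")                 # $768,000
```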

AWS Machine Learning Powers GumGum’s Contextual Analysis

GumGum uses machine learning to find contextually relevant content and block brand-threatening content near advertising inventory—analyzing images, video, and text on 20 million unique webpages per day.

“We analyze all of these components separately, and then we do late fusion, which means that we take scores from all of these analyses, marry them together, and come up with one score for a page,” says Puranik. Those scores help GumGum determine which ads will perform best on each page and ultimately help it maximize ad pricing and inventory yield for its customers. The company’s contextual analysis framework consists of deep learning models built with PyTorch and TensorFlow that run on Amazon EC2 instances.
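The late-fusion step Puranik describes can be illustrated with a minimal sketch. The case study does not say how GumGum combines the per-modality scores, so the weighted average, the weight values, and the function name `late_fusion` below are all assumptions chosen for illustration.

```python
from typing import Dict

# Hypothetical modality weights; the source does not state the real ones.
ASSUMED_WEIGHTS: Dict[str, float] = {"image": 0.3, "video": 0.3, "text": 0.4}


def late_fusion(scores: Dict[str, float],
                weights: Dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Combine per-modality scores into one page score (weighted average).

    Modalities absent from `scores` (for example, a page with no video)
    are skipped, and the remaining weights are renormalized.
    """
    present = {m: s for m, s in scores.items() if m in weights}
    if not present:
        raise ValueError("no scored modality with a known weight")
    total_weight = sum(weights[m] for m in present)
    return sum(weights[m] * s for m, s in present.items()) / total_weight


# A page with image and text content but no video.
page_score = late_fusion({"image": 0.9, "text": 0.5})
print(round(page_score, 3))
```

Each modality is scored by its own model first and only the scores are merged, which is what distinguishes late fusion from approaches that merge raw features before scoring.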

GumGum uses Amazon DynamoDB, a key-value and document database that delivers single-digit millisecond performance at any scale, to store inferences generated by GumGum’s contextual analysis and brand-safety systems. Because GumGum has to operate and scale within milliseconds-wide margins, it also uses Amazon DynamoDB Accelerator as a speed-boosting cache for Amazon DynamoDB. This allows GumGum’s ad servers to access stored results with just 2–3 ms of latency. The company saves as much as 70 percent with Amazon DynamoDB over the Apache Cassandra–based system that it replaced.

For the compute instances powering its machine learning applications, GumGum switched to Amazon EC2 Spot Instances and Amazon EC2 G4 Instances, which feature FP16 capabilities for training and up to 100 Gbps of networking throughput. The switch enabled “four times faster inferences and made for a smaller model, which means we could perform more inferences in parallel,” says Corey Gale, DevOps manager at GumGum.  

Aiming to Analyze 200 Million Pages per Day

GumGum plans to expand its use of computer vision and NLP running on AWS to drive additional revenue for its contextual analytics product, Verity. Today Verity is “packaged as an API for publishers and DSPs who want to take advantage of our contextual determination and brand-safety capabilities,” says Puranik. GumGum has also challenged itself to increase its daily processing from 20 million to 200 million pages.
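To put the tenfold goal in perspective, the sketch below converts the daily page counts into average pages per second. It assumes traffic is spread evenly across the day, which real workloads rarely are, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_second(pages_per_day: int) -> float:
    """Average sustained rate, assuming a perfectly even load over 24 hours."""
    return pages_per_day / SECONDS_PER_DAY


current_rate = pages_per_second(20_000_000)
target_rate = pages_per_second(200_000_000)

print(f"Current: about {current_rate:.0f} pages per second")
print(f"Target:  about {target_rate:.0f} pages per second")
```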

GumGum Ad Server Reference Architecture

GumGum Verity Reference Architecture


About GumGum

GumGum Inc. is an advertising technology company that uses computer vision and NLP to deliver contextually relevant advertising campaigns for brands and agencies in under 100 ms as it auctions online advertising space to the highest bidder.

Benefits of AWS

  • Supports processing 20 million pages a day
  • Saves 62% in compute costs
  • Saves 70% in database costs
  • Saves more than $45,000 a month in storage costs
  • Provides 4x faster deep learning inference
  • Expedites processing within milliseconds-wide margins

AWS Services Used

Amazon DynamoDB

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale.

Amazon DynamoDB Accelerator

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second.

Amazon EC2

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.

Amazon EC2 Spot Instances

Amazon EC2 Spot Instances let you take advantage of unused EC2 capacity in the AWS Cloud.

