PredictHQ Helps Businesses Optimize Revenue from Events with Demand Intelligence
Machine Learning (ML) Supports Forecasting Accuracy
Companies can only prepare for what they know is coming, and events such as games, concerts, and trade shows are a major driver of demand. The estimated economic impact of a National Basketball Association (NBA) all-star game, for example, is around $100 million. For the many businesses serving gameday fans, such as taxis and food outlets, anticipating demand, or the lack of it, can mean thousands of dollars in profit or loss.
PredictHQ is a demand intelligence company that processes 2 billion data points from all over the globe to help customers forecast demand more accurately. Founded in New Zealand and headquartered in San Francisco, PredictHQ’s team of 30-plus data scientists has built more than 1,600 ML models to verify, standardize, and enrich data at scale.
Microservices Pipeline Processes 25 Million Events
Born on the Amazon Web Services (AWS) Cloud, PredictHQ runs its ML production models on Amazon Elastic Compute Cloud (Amazon EC2). Its microservices pipeline processes data from 25 million events, with results relayed to customers through a series of application programming interfaces (APIs).
One of the main reasons for selecting AWS is the number of AWS Regions and Availability Zones. Some of PredictHQ’s largest customers, such as Domino’s Pizza, are based in the US. “Migrating between AWS Regions and expanding or shrinking our resource utilization is relatively simple on AWS,” says Glen Alexander, vice president of Engineering at PredictHQ.
PredictHQ is also using Amazon Relational Database Service (Amazon RDS) to automate database administration and Amazon Redshift to support analytics. Alexander adds, “The wide range of solutions AWS offers, such as database management services, simplifies our operation because we don’t have to go to a different provider, or run infrastructure ourselves.”
No-Code Approach to API Deployment
Although PredictHQ’s solutions are technologically advanced, its customers don’t have to be. Companies with lean IT teams benefit from PredictHQ’s no-code approach, using products such as Control Center, a web-based solution with a user-friendly dashboard allowing customers to search and locate specific events within their region.
Live TV events are just one of the 19 event types that PredictHQ tracks. One customer that operates sports bars across the US uses demand intelligence data from Control Center to see which televised games are coming up, along with the expected audience and popularity of each event. Armed with this information, the company can adjust staffing and order enough food and beverages to meet demand from televised sports games.
Saving Valuable Data Science Hours
Reducing forecasting errors saves businesses millions of dollars. A leading fast-food chain is tapping into $8.5 million in labor optimization and supply chain efficiencies each year by using PredictHQ to improve staffing, supply chain, and revenue operations.
Technologically sophisticated customers and data scientists can also take advantage of PredictHQ’s data pipeline for demand mapping and feature engineering. Though such customers could build their own pipelines, doing so would require hundreds—if not thousands—of valuable data science hours.
“You can’t rely solely on historical internal data; you need third-party data to postulate what’s going to happen. And we’re well known for our data quality and coverage,” Alexander explains. PredictHQ performs extensive data cleansing and enrichment within its pipeline to remove duplicates and “spam,” which could otherwise lead to inflated, false-positive demand forecasts. For example, from January to October 2021, 51.89 percent of events received in PredictHQ’s pipeline were missing critical event information, were spam, virtual events, or duplicates, or had an incorrect status.
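The kind of cleansing described above can be illustrated with a minimal Python sketch. It is not PredictHQ's implementation: the `RawEvent` record, its field names, the `VALID_STATUSES` set, and the exact-match duplicate key are all assumptions made for illustration, and a real pipeline would rely on ML models rather than simple rules.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class RawEvent:
    # Hypothetical record shape; field names are illustrative only.
    title: str
    start: Optional[str]       # ISO date string, None if missing
    location: Optional[str]    # venue or city, None if missing
    status: str                # e.g. "active", "cancelled"
    is_virtual: bool = False
    is_spam: bool = False      # assumed to be set by an upstream classifier


# Assumed set of acceptable status values (not taken from the source).
VALID_STATUSES = {"active", "postponed", "cancelled"}


def clean_events(events: Iterable[RawEvent]) -> List[RawEvent]:
    """Drop events that are incomplete, spam, virtual, invalid, or duplicated."""
    seen = set()
    kept = []
    for ev in events:
        # Missing critical event information.
        if not ev.start or not ev.location:
            continue
        # Spam and virtual events do not drive local demand.
        if ev.is_spam or ev.is_virtual:
            continue
        # Simplification: treat an unrecognized status as an incorrect one.
        if ev.status not in VALID_STATUSES:
            continue
        # Naive duplicate key: normalized title, start date, and location.
        key = (ev.title.strip().lower(), ev.start, ev.location.strip().lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(ev)
    return kept
```

Filtering before forecasting matters because every duplicate or spam record that survives would be counted as extra expected attendance.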
Harnessing Compute Power for ML Model Churn
By running its ML models using Kubernetes on the AWS Cloud, PredictHQ can perform distributed data analysis at scale. Its internal data processing pipeline that scientists use for ML model development needs to process terabytes of data at once, autoscaling up to hundreds of nodes to retrieve results as fast as possible.
“We want our data scientists to run their ML models more than once to churn it for maximum accuracy, and to do that we need a massive amount of compute power,” Alexander says. “Running these models on a local machine would take days. With AWS, we can do it in minutes as we can utilize larger nodes, or whatever is optimized for a particular service. That’s how we ensure our pipelines are as fast as possible but also at an agreeable price.”
Controlling Costs with Variable Instance Types
PredictHQ uses a range of Amazon EC2 instance types and sizes, many from the M5 instance family, to closely match compute and memory requirements per workload. Its customer-facing data pipeline is updated daily, sometimes hourly, to check for changes in event data that could affect demand. To control costs, PredictHQ uses Amazon EC2 Spot Instances for data processing that doesn’t need to happen in real time.
However, for time-sensitive data, PredictHQ turns to Amazon EC2 Reserved Instances. Severe weather updates, airport delays, and other data that can significantly raise or lower sales are priority processing jobs.
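The split between interruptible and priority work can be sketched as a small routing rule. This is an illustrative assumption, not PredictHQ's scheduler: the `Job` type, the source names, and the `"spot"` and `"reserved"` pool labels are hypothetical, and no AWS API is called.

```python
from dataclasses import dataclass

# Hypothetical source names for data that must be processed promptly.
TIME_SENSITIVE_SOURCES = {"severe_weather", "airport_delay"}


@dataclass(frozen=True)
class Job:
    name: str
    source: str
    realtime: bool = False  # True if results are needed immediately


def capacity_pool(job: Job) -> str:
    """Pick a capacity pool label for a job.

    "reserved" stands for always-available capacity covered by Reserved
    Instances; "spot" stands for cheaper capacity that can be interrupted.
    Both labels are illustrative.
    """
    if job.realtime or job.source in TIME_SENSITIVE_SOURCES:
        return "reserved"
    return "spot"
```

For example, `capacity_pool(Job("storm-feed", "severe_weather"))` selects the reserved pool, while a nightly backfill job falls through to Spot, where an interruption only delays the result.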
Processes 1.6 Million API Requests Per Day
PredictHQ’s internal and external pipelines have maintained 99.999 percent uptime over the past year, despite extreme usage peaks during model testing or in the run-up to major events. Its solutions process 1.6 million API requests per day, with a peak load of 200 requests per second, but stability has remained constant on AWS.
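To put these figures in perspective, the short calculation below derives the average request rate, the peak-to-average ratio, and the downtime that 99.999 percent uptime allows in a year. It uses only the numbers quoted above; the variable names and the 365-day year are assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year

requests_per_day = 1_600_000
peak_rps = 200
uptime = 0.99999

# Average load if requests were spread evenly across the day.
average_rps = requests_per_day / SECONDS_PER_DAY

# How much larger the peak is than the average.
peak_to_average = peak_rps / average_rps

# Downtime budget implied by the uptime figure.
downtime_minutes_per_year = (1 - uptime) * MINUTES_PER_YEAR

print(f"average: {average_rps:.1f} requests/second")
print(f"peak is {peak_to_average:.1f}x the average")
print(f"allowed downtime: {downtime_minutes_per_year:.1f} minutes/year")
```

The average works out to roughly 18.5 requests per second, so the 200-requests-per-second peak is close to 11 times the typical load, and five-nines uptime leaves a little over five minutes of downtime per year.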
As the business continues to build its customer base, it’s exploring the use of Amazon CloudFront to optimize the browsing experience for clients in the US. PredictHQ is also deepening the partnership with AWS to offer its services through AWS Marketplace. “There’s a lot of synergy between our offerings and AWS. Our alignment enables customers to access our solutions quickly and accelerate their time to value,” Alexander concludes.
To learn more, visit aws.amazon.com/machine-learning/
PredictHQ is a global demand intelligence company helping businesses understand the impact of events for more accurate and profitable forecasting. Its data pipeline reaches 25 million events across 20,000 cities, all accessed by one API.
Benefits of AWS
- Offers no-code approach to API deployment
- Scales for peak loads of 200 API requests per second
- Controls costs and improves performance with variable instance types
- Maintains 99.999% uptime
- Optimizes labor and revenue with fewer forecasting errors
- Runs complex ML models in minutes
- Saves valuable data scientist time
AWS Services Used
Amazon Elastic Compute Cloud (EC2)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2 Spot Instances
Amazon EC2 Spot Instances let you take advantage of unused EC2 capacity in the AWS cloud. Spot Instances are available at up to a 90% discount compared to On-Demand prices.
Amazon Relational Database Service (RDS)
Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud.