AWS for Industries
21st Century Forecasting for the Travel and Hospitality Industry
Last year was unprecedented for the travel and hospitality industry and for travelers and guests, providers and suppliers, employees and stakeholders alike. As a result of the disruption, companies that relied on moving average forecasting or looked to the same period last year have found themselves at the helm of a rudderless ship. Without accurate forecasts, travel and hospitality companies cannot determine when or how many travelers or guests will arrive. They don’t know how to staff appropriately, adjust their inventory positions, model the impact of promotions or events on business performance, optimize pricing and revenue management, or project revenue and cash flow. William Cox, Senior Data Scientist at Grubhub, summarized this industry dilemma well, saying, “Oversupply increases operating costs while undersupply decreases customer satisfaction.” This means that accurate forecasting is essential to an industry focused on enhancing customer experiences and optimizing operational efficiencies. Yet unpredictability has made accurate forecasting more challenging than ever.
My peers and I frequently talk to industry customers who, despite unpredictability, are trying to find balance in their forecasting efforts. Given the diversity of customers we support, we have a unique perspective, one that allows us to recognize common themes and solutions. More and more customers have asked us to share our insights, so we compiled them into a new white paper entitled “21st Century Forecasting for the Travel and Hospitality Industry.” Among other things, it covers:
· Customer success stories to inspire you
· Partner-led innovation to accelerate outcomes
· Guidance from industry practitioners for constructing a more resilient forecasting discipline for your organization
We hope this white paper will be a useful resource for travel and hospitality companies as they address current challenges and those yet to come. 21st Century Forecasting for the Travel and Hospitality Industry is available for download.
In preparation for the publication of the white paper, I sat down (virtually) with some Grubhub team members—Data Scientist Gayan Seneviratna, Director of Engineering Sagar Sahasrabudhe, and Senior Data Scientist William Cox—to understand how they handle this complex forecasting opportunity.
Steven M. Elinson: While your business is recognized publicly by many, at a high level can you describe the unique characteristics of your forecasting challenge?
William Cox: Grubhub is a leading online and mobile food-ordering and delivery marketplace. Grubhub strengthens restaurants’ online presence, helps them market their business to local diners, and handles delivery if they want freedom from managing their own drivers. Operational excellence is at the core of getting deliveries completed on time, and having a good forecast is fundamental for smooth operations. Orders arrive online as demand, with a short horizon of delivery expectations. In order to fulfill them, we need to have some predictions upfront so we can ensure that the right number of delivery drivers are on the road. There are multiple dimensions to the forecasting business opportunity at Grubhub.
SME: Many of the travel and hospitality companies that we speak with share a desire to forecast at the 15-minute increment level. Can you tell me more about the time horizons that Grubhub is concerned about?
Gayan Seneviratna: Demand predictions across different time horizons have different use cases, supporting operational, tactical, and strategic decision making. For example, getting the daily demand forecast out a couple of months enables the driver onboarding teams to project how many new drivers they need to onboard. This becomes especially important in high-growth markets. On the other hand, given that deliveries from inception to completion last around 40 minutes, it is important to get the demand for shorter time windows, such as at the half-hour level, a few days in advance. This enables the operations team to project how many drivers are needed from the available pool to support an efficient operation. In addition, very short-term forecasts are crucial to make smarter runtime decisions. These forecasts tend to be made at around the five-minute level. This is essential because new information is revealed as time progresses, and the forecast generated earlier needs to be updated. New information could be an organic boost in demand or can be driven by some event that was not accounted for in the earlier forecast, such as inclement weather. Our forecasting system needs to be able to account for all these use cases across the various time horizons.
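To make the three granularities mentioned above concrete (five-minute, half-hour, and daily), here is a minimal sketch of how the same stream of order timestamps could be counted at each level. It is an illustration only, not Grubhub's code: the `bucket_counts` helper and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_counts(order_times, bucket):
    """Count orders per time bucket of width `bucket` (a timedelta)."""
    epoch = datetime(1970, 1, 1)
    counts = Counter()
    for t in order_times:
        # Floor each timestamp to the start of the bucket it falls in.
        n = (t - epoch) // bucket
        counts[epoch + n * bucket] += 1
    return dict(sorted(counts.items()))


# Hypothetical order timestamps, for illustration only.
orders = [
    datetime(2021, 3, 1, 18, 2),
    datetime(2021, 3, 1, 18, 4),
    datetime(2021, 3, 1, 18, 7),
    datetime(2021, 3, 1, 18, 31),
    datetime(2021, 3, 2, 12, 15),
]

five_minute = bucket_counts(orders, timedelta(minutes=5))   # runtime decisions
half_hour = bucket_counts(orders, timedelta(minutes=30))    # driver scheduling
daily = bucket_counts(orders, timedelta(days=1))            # driver onboarding
```

Each resulting series would feed a forecast with a different horizon; the finer the bucket, the sparser and noisier the counts become.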
SME: Impressive how you cover multiple spans of time all at the same time. Do you have similar challenges around all of the geographies in which you operate?
Sagar Sahasrabudhe: In the world of forecasting, we refer to the dispersion of the forecast across geographies as spatial granularity. Spatial considerations are critical in forecasting at Grubhub. Nationwide demand forecasting aids certain strategic decision making, but for efficient operations it needs to be done more granularly. This poses a question about scale. Is it better to forecast for all of Chicago or for smaller neighborhoods? Or should we forecast at the restaurant level and aggregate? Forecasting models lose accuracy if the data becomes sparse, for example, if there are only a few orders per restaurant, and they can also take too long to predict thousands of units. But high-accuracy models built over large geographical areas will not help operations understand where to get drivers. And at any scale, forecasting for a new market is challenging because there is no historical data to learn from. These tradeoffs and concerns guide the choice of our models and design of our forecast system.
SME: Forecasting at Grubhub is beginning to sound like a Led Zeppelin song—the forecast is essentially a journey through both time and space?
WC: Forecasting can’t be limited to only a point value of the organic demand that we expect. Instead, it needs to be a function of actions that can be shaped by business needs. For example, promotions can help generate more demand than what we may usually see. In such a case, producing a demand curve based on intensity of promotions becomes important. It can be used to optimize the promotional investment relative to how much demand can be supported without increasing delivery costs and cancellations. This requires our models to learn from either past promotion experience in one market or from similar promotions executed in other markets.
SME: This seems like a tremendous amount of work, being performed at an incredible scale with intense velocity. How does Grubhub achieve all of this?
Collective team: In order to accurately forecast demand, we have researched and implemented different forecasting models. Some use standard autoregressive statistics like ARMA, allowing them to account for recent trends. Others, like our DeepAR models, learn patterns across many regions simultaneously, making forecasts in newer regions possible as extrapolation from older ones. These and other models allow us to address the various needs of temporal and spatial granularity. Each of our models is retrained on all available order data at the moment of forecasting. Those order time series are retrieved from Amazon Simple Storage Service (Amazon S3), where our data lake is stored in the form of Hive-accessible Parquet directories. We retrain our models and forecast with them daily. Our in-house forecasting library is flexible enough to support both research (backtesting) and implementation (prediction), and thereby eliminates any differences between them. However, the batch training required for production is CPU-intensive, since we have an ensemble of many models across many delivery regions. As such, our production systems use Amazon Elastic MapReduce (Amazon EMR) clusters and Dask computing to parallelize the prediction effort. When complete, these forecasts are written back to S3 tables in Parquet format, where downstream scheduling systems can ingest them.
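The pattern described above, an ensemble of models run independently for each delivery region and spread across workers, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Grubhub's implementation: two toy models (`seasonal_naive` and `moving_average`) stand in for ARMA and DeepAR, Python's built-in `ThreadPoolExecutor` stands in for Dask on Amazon EMR, and the region names and order counts are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def seasonal_naive(history, season=7):
    """Forecast the next value as the value observed one season ago."""
    return history[-season]


def moving_average(history, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    return mean(history[-window:])


def forecast_region(item):
    """Run every model on one region's history and average the predictions."""
    region, history = item
    predictions = [model(history) for model in (seasonal_naive, moving_average)]
    return region, mean(predictions)


# Hypothetical daily order counts per delivery region.
histories = {
    "region_a": [100, 110, 120, 130, 140, 150, 160, 105, 115, 125],
    "region_b": [20, 22, 24, 26, 28, 30, 32, 21, 23, 25],
}

# Regions are independent of one another, so they can be processed in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    forecasts = dict(pool.map(forecast_region, histories.items()))
```

Because each region's forecast depends only on that region's history, the same function could in principle be handed to a distributed scheduler without changing the model code.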
Our forecasting team has devoted significant effort to ensuring that forecasts are well-monitored and fail-safe. Extensive automated data validation is performed for every market’s forecast before it is stored. For example, we check whether the forecast is drastically different from yesterday’s. We also maintain reasonable backup models in case the more advanced models fail to predict. During all of this, our production system logs events and messages to our S3 buckets. Most importantly, we employ a human-in-the-loop approach that allows market managers to make adjustments to forecasts based on dynamic situations without historical basis (unusual weather patterns, national events, or new marketing promotions). Alongside our ensemble of models, this permits our forecasts to be adjustable as a demand curve. The combination of our custom forecasting software, speed-oriented production system, and human-monitored procedures allows Grubhub to properly forecast and serve our customers as best we can.
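A rough sketch of the validation-and-backup idea described above might look like the following. Everything here is an assumption made for illustration: the 50% change threshold, the function names, and the toy models are hypothetical and are not taken from Grubhub's system.

```python
def is_plausible(forecast, yesterday, max_relative_change=0.5):
    """Return True if the forecast is non-negative and close to yesterday's."""
    if forecast < 0:
        return False
    if yesterday == 0:
        return forecast == 0
    return abs(forecast - yesterday) / yesterday <= max_relative_change


def forecast_with_fallback(primary, backup, history, yesterday):
    """Use the primary model unless it fails or gives an implausible value."""
    try:
        value = primary(history)
        if is_plausible(value, yesterday):
            return value, "primary"
    except Exception:
        pass  # A failed primary model falls through to the backup below.
    return backup(history), "backup"


# Toy models, for illustration only.
def last_value(history):
    return history[-1]


def spiking_model(history):
    return history[-1] * 10


def broken_model(history):
    raise RuntimeError("model failed to predict")


history = [100, 104, 98, 102]
accepted = forecast_with_fallback(last_value, last_value, history, yesterday=100)
rejected = forecast_with_fallback(spiking_model, last_value, history, yesterday=100)
failed = forecast_with_fallback(broken_model, last_value, history, yesterday=100)
```

Returning the name of the model that produced the value makes it easy to log when the backup was used, which a market manager could then review before making manual adjustments.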
—
Ultimately, the art and science of forecasting requires the right organizational culture and resources, with procedures that support the continuous improvement of the forecasting process and technology that enables the capture of genuine patterns and relationships in data while removing noise.
Download the AWS Travel and Hospitality Forecasting white paper for more insights and guidance.