AWS for M&E Blog

PGA TOUR win probability model powered by AWS

Driving advanced golf analytics

In the world of professional sports, data-driven insights have become a game-changer for players and fans alike, revolutionizing the way we understand player performance, game strategies, and the very dynamics of the sport itself. The game of golf, long associated with tradition and precision, has recently embraced the power of advanced technology to provide unparalleled insights into the game.

In 2020, the PGA TOUR set out to tackle the challenge of predicting player performance and tournament outcomes. The effort began as a curiosity-driven exploration of win and cut probabilities, which provide a measure of a player’s overall skill and predicted outcome for upcoming play.

Inspired by models used in other sports, Mike Vitti, PGA TOUR Senior Vice President of Data Science and Technology Solutions, and his team set out to determine win and cut probabilities in golf. Unlike team sports, where the winner is one of two teams competing on a familiar field, golf pits each player against a full field of competitors on a demanding, ever-changing course under varied environmental conditions. To overcome these challenges, the team needed to build a comprehensive probability model that considered both recent player-specific performance and course-specific elements. The model would have to be highly adaptive, accounting for changing conditions and for player performance both in the recent past and over the duration of the tournament. Moreover, because the course changes each week, there is rarely a constant baseline to rely on, which makes developing a consistent performance model all the more challenging.

Introducing PGA TOUR Win, Cut Probability

The PGA TOUR introduced Win, Cut Probability, a novel ML-powered analytics model that provides real-time insights into the likelihood of outcomes during tournaments. Powered by Amazon Web Services (AWS) advanced data analytics, cloud infrastructure, and serverless compute, the Win, Cut Probability model measures a player’s chances of advancing in and winning a tournament. Its operation comprises two core components. The first simulates the entire tournament before it starts. To keep the simulation accurate, it draws on a rolling 12-month performance history, with recent play given more weight. This process estimates each player’s probability of winning before the event even begins. Because skill runs so deep across the field of PGA TOUR players, this probability typically ranges between 0% and 8%.
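The pre-tournament step described above can be illustrated with a minimal Python sketch. It is not the PGA TOUR's model: the exponential recency weighting, the `half_life_days` value, the normally distributed round scores, and the function names are all assumptions chosen only to show how a recency-weighted 12-month history could feed a Monte Carlo estimate of win probability.

```python
import random
from collections import Counter


def recency_weighted_mean(rounds, half_life_days=90.0):
    """Average score over the last 12 months, weighting recent rounds more.

    rounds: list of (days_ago, strokes_vs_field) tuples; lower is better.
    half_life_days is a hypothetical tuning parameter, not a published value.
    """
    recent = [(d, s) for d, s in rounds if d <= 365]  # rolling 12-month window
    if not recent:
        raise ValueError("no rounds in the last 12 months")
    weights = [0.5 ** (d / half_life_days) for d, _ in recent]
    return sum(w * s for w, (_, s) in zip(weights, recent)) / sum(weights)


def simulate_win_probabilities(skill, n_sims=10_000, n_rounds=4,
                               round_sd=2.8, seed=0):
    """Monte Carlo estimate of each player's chance of winning.

    skill: dict of player -> expected strokes per round relative to the field.
    Round scores are drawn from a normal distribution (an assumption).
    """
    rng = random.Random(seed)
    players = list(skill)
    wins = Counter()
    for _ in range(n_sims):
        totals = {p: sum(rng.gauss(skill[p], round_sd) for _ in range(n_rounds))
                  for p in players}
        wins[min(players, key=totals.get)] += 1  # lowest total wins
    return {p: wins[p] / n_sims for p in players}


if __name__ == "__main__":
    history = {
        "Player A": [(7, -2.0), (30, -1.5), (200, 0.5)],
        "Player B": [(14, 0.0), (60, -0.5), (300, -3.0)],
    }
    skill = {p: recency_weighted_mean(r) for p, r in history.items()}
    print(simulate_win_probabilities(skill, n_sims=2_000))
```

In a full field of up to 150 players, the same procedure spreads the probability mass thinly, which is consistent with pre-tournament values in the single digits.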

While the pre-tournament modeling is impressive, the real magic unfolds in the second component, which runs during the tournament and is where AWS computational muscle plays a pivotal role. As play progresses, the predicted scores on the leaderboard are replaced with actual recorded scores after every shot. The win probability for every player and the projected cut line must be recalculated to accommodate these changes. To do this, advanced analytics run 10,000 simulations and recalculate win probability every 15 seconds using AWS cloud capabilities, which are instrumental in enabling this complex and continuous re-calibration. “What used to be a highly manual, time-consuming process to calculate projections, is now done with tremendous speed and accuracy leveraging AWS,” said Vitti.
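As a rough illustration of this in-play recalculation, the Python sketch below keeps each player's recorded score fixed and simulates only the holes still to be played, then derives win probability, cut probability, and a projected cut line. Everything specific here is an assumption: the `live` data layout, the normal model for remaining holes, the `hole_sd` value, the `cut_rank` parameter, and the restriction to the first two rounds (before the cut is made). It is a sketch of the general idea, not the production model.

```python
import math
import random
from statistics import median

CUT_HOLES = 36    # assumption: the cut falls after two 18-hole rounds
TOTAL_HOLES = 72  # assumption: a four-round event


def simulate_holes(rng, n_holes, per_hole_mean, hole_sd):
    """Simulated strokes to par over n_holes, rounded to whole strokes."""
    return round(rng.gauss(n_holes * per_hole_mean,
                           hole_sd * math.sqrt(n_holes)))


def recalculate(live, cut_rank, n_sims=10_000, hole_sd=0.7, seed=None):
    """Re-run the simulation from the current leaderboard.

    live: player -> {"to_par": recorded score, "holes_played": <= 36,
                     "per_hole_mean": expected strokes to par per hole}
    cut_rank: players at or better than this rank (and ties) make the cut.
    Returns (win probabilities, cut probabilities, median projected cut line).
    """
    rng = random.Random(seed)
    players = list(live)
    wins = dict.fromkeys(players, 0)
    cuts = dict.fromkeys(players, 0)
    cut_lines = []
    for _ in range(n_sims):
        # Recorded strokes stay fixed; only the holes left before the cut vary.
        at_cut = {p: s["to_par"] + simulate_holes(
                      rng, CUT_HOLES - s["holes_played"],
                      s["per_hole_mean"], hole_sd)
                  for p, s in live.items()}
        ranked = sorted(at_cut.values())
        cut_line = ranked[min(cut_rank, len(ranked)) - 1]
        cut_lines.append(cut_line)
        survivors = [p for p in players if at_cut[p] <= cut_line]
        final = {}
        for p in survivors:
            cuts[p] += 1
            final[p] = at_cut[p] + simulate_holes(
                rng, TOTAL_HOLES - CUT_HOLES, live[p]["per_hole_mean"], hole_sd)
        best = min(final.values())
        # Ties for first are broken at random, standing in for a playoff.
        wins[rng.choice([p for p in survivors if final[p] == best])] += 1
    return ({p: wins[p] / n_sims for p in players},
            {p: cuts[p] / n_sims for p in players},
            median(cut_lines))
```

Calling `recalculate` again each time new scoring data arrives mirrors the 15-second refresh described above: as more holes are recorded, less of each simulated total is random, so the probabilities tighten over the course of play.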

Player’s Cut Probability

This dynamic re-calibration allows the PGA TOUR to provide real-time win probability estimates to fans via broadcast, web, and mobile app, enhancing fan engagement and immersion in the tournament. Spectators can witness the fluctuating probabilities throughout play and gain insights into how player performances affect overall outcomes. Players can additionally track cut line shifts, providing them with visibility into their projected likelihood of advancing in the tournament. Notably, this technology has also proved to be invaluable for operational purposes, as it aids in developing competition models for different PGA TOUR events.

Probability trends during the course of a tournament

Under the hood: The genesis of the simulation model

PGA TOUR tournaments are typically played from Thursday to Sunday, with play starting at 8 a.m. and running until 6 p.m. The size of the field varies from 30 to as many as 150 players in any given tournament. Because of this variation, the compute needed to run the model changes with the number of players, the time of day, and the day of the week.

Once Mr. Vitti and his team had the model functioning as desired, they needed to deploy and scale the application efficiently. Knowing there would be spikes in consumption as tournaments progressed and paused, the team looked to AWS. The PGA TOUR leverages AWS serverless architecture for the elasticity it provides. Going serverless helped the PGA TOUR run the model on right-sized infrastructure, accommodating scale and reducing overall cost. AWS serverless simplifies the deployment of cloud applications by shifting the management of the underlying compute resources to AWS. Tasks such as server management, resource allocation, and scaling are offloaded to AWS, so the PGA TOUR can focus on fine-tuning the model, accelerate its time to production, and lower costs.

Going serverless with AWS also helped the PGA TOUR deploy quickly with only a small, nimble team. In fact, a single developer implemented the model in three months, leveraging the agility that serverless architecture provides and accelerating time to market.

Player cut line

AWS infrastructure provides the PGA TOUR with the ideal platform to run the Win, Cut Probability model. Leveraging Amazon ECS with AWS Fargate, Amazon Redshift, Amazon DynamoDB, Amazon EC2 instances, and Amazon S3 buckets, the system supports data storage, processing, and dissemination seamlessly. By leveraging AWS serverless compute to run the thousands of simulations and process nearly four billion records, the PGA TOUR team can take advantage of AWS elasticity, spinning up the required compute power on demand, ensuring the model’s efficiency without the need for constant hardware investments.

The AWS serverless compute model removes the operational overhead of infrastructure management, making it easy to deploy, manage, and scale the containerized workloads without the complexity of managing a control plane, nodes, or instances. The PGA TOUR is able to run their model quickly and scale as needed, and since AWS handles the capacity planning and auto-scaling, compute usage is optimized, allowing them to reduce capacity in the evenings and when competitions are paused.

Win probability architecture

Before a tournament starts, relevant historical data is migrated to DynamoDB using Data Migration containers for faster access. Once the data is migrated, the controller spawns 50 child containers, and each child container runs 200 iterations of the probability model, for 10,000 simulations in total. When the tournament begins, the 50 containers run continuously during play, receiving updated scoring information every 15 seconds and computing updated probabilities from it.
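The fan-out of 50 containers running 200 iterations each can be mimicked locally with a short Python sketch. Threads stand in for the child containers here, and the in-memory merge stands in for whatever storage the real system uses between workers and controller. The function names, the per-worker seeding, and the simplified one-draw-per-player scoring model are assumptions for illustration only.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

N_WORKERS = 50               # child containers, per the description above
ITERATIONS_PER_WORKER = 200  # 50 * 200 = 10,000 simulations in total


def run_worker(worker_id, expected_scores, score_sd=5.0):
    """One child's share of the work: 200 simulated tournaments.

    expected_scores: player -> expected final score to par (hypothetical input).
    Each worker gets its own seed so the batches are independent.
    """
    rng = random.Random(worker_id)
    wins = Counter()
    for _ in range(ITERATIONS_PER_WORKER):
        totals = {p: rng.gauss(mu, score_sd)
                  for p, mu in expected_scores.items()}
        wins[min(totals, key=totals.get)] += 1  # lowest score wins
    return wins


def aggregate(expected_scores):
    """Controller role: fan out to the workers and merge their win counts."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=8) as pool:
        partials = pool.map(lambda i: run_worker(i, expected_scores),
                            range(N_WORKERS))
        for partial in partials:
            merged.update(partial)
    total = sum(merged.values())
    return {p: merged[p] / total for p in expected_scores}, total


if __name__ == "__main__":
    probs, total = aggregate({"Player A": -8.0, "Player B": -5.0, "Player C": 0.0})
    print(total, probs)
```

Splitting the run into small independent batches is what lets the workload scale horizontally: each batch needs only the current scores and a seed, so adding or removing workers changes throughput without changing the merge logic.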

The controller checks the database once a minute and aggregates the projections. Reporting and analytics are output using Amazon Aurora PostgreSQL. Results are then delivered through the GolfTech API to broadcast channels.

Conclusion: A new horizon in golf analytics

The PGA TOUR Win, Cut Probability model is a testament to the marriage of sports expertise and cutting-edge technology, and it exemplifies how data-driven insights and modern compute technologies can transform traditional sports into immersive experiences. Developing the Win Probability capabilities has enabled the PGA TOUR to explore future opportunities, such as giving fans improved tournament schedules and player pairings and giving players more opportunities to compete at the signature events. As the journey continues, the intersection of AWS capabilities and the PGA TOUR’s vision holds the promise of an exciting and data-rich future for golf tournaments around the world.

Ari Entin

Ari Entin is principal of AWS sports marketing communications, based in Silicon Valley. He joined Amazon in 2021 from Facebook where he led AI communications and marketing. He has driven integrated media campaigns for top-tier consumer electronics, sports and entertainment, and technology companies for decades.

Neelam Patel

Neelam Patel is a Customer Solutions Manager at AWS, leading key generative AI and cloud modernization initiatives. Neelam works with key executives and technology owners to address their cloud transformation challenges and helps customers maximize the benefits of cloud adoption. She has an MBA from Warwick Business School, UK, and a bachelor’s degree in Computer Engineering from India.

Harry Lauer

Harry Lauer leads product marketing for Amazon ECS and AWS Fargate and is based in Seattle, Washington. He has driven large-scale marketing and GTM programs for start-ups and hyper-growth organizations, through IPO and acquisition, as well as for Fortune 100 companies.