AWS Machine Learning Blog
Bundesliga Match Fact Ball Recovery Time: Quantifying teams’ success in pressing opponents on AWS
In football, ball possession is a strong predictor for team success. It’s hard to control the game without having control over the ball. In the past three Bundesliga seasons, as well as in the current season (at the time of this writing), Bayern Munich is ranked first both in the table and in ball possession percentage, with Dortmund second in both. The active tactics and playing styles that facilitate high possession values through ball retention have been widely discussed. Terms like Tiki-Taka were established to describe a playing style characterized by a precise short passing game with frequent, long spells of possession for the attacking team. However, to arrive at high possession rates, teams also need to adapt their defense to quickly win back a ball lost to the opponent. Terms like high-press, middle-press, and low-press are often used to describe how much room a defending team allows its opponents when they move towards its goal before it applies pressure on the ball.
The recent history of Bundesliga club FC Köln emphasizes the effect of different pressing styles on a team’s success. Since Steffen Baumgart took over as coach at FC Köln in 2021, the team has managed to lift itself from the bottom and has established a steady position in the middle of the table. When analyzing the team statistics after the switch in coaches, one aspect stands out specifically: with 54 pressing situations per game, the team was ranked first in the league, winning the ball back in a third of those situations. This proved especially successful when attacking in the opponent’s half of the pitch. With an increased number of duels per match (+10% compared to the previous season), the Billy Goats managed to finish last season in a strong seventh place, securing a surprising spot in the UEFA Europa Conference League.
Our previous Bundesliga Match Fact (BMF) Pressure Handling sheds light on how successful different players and teams are in withstanding this pressure while retaining the ball. To understand how actively and successfully a defending team applies pressure, we need to understand how long it takes them to win back a lost ball. Which Bundesliga teams are fastest in winning back lost possessions? How does a team’s ability to quickly regain possession develop over the course of a match? Are their recovery times diminished when playing stronger teams? And finally, are short recovery times a necessary ingredient to a winning formula?
Introducing the new Bundesliga Match Fact: Ball Recovery Time.
How it works
Ball Recovery Time (BRT) calculates the amount of time it takes for a team to regain possession of the ball. It indicates how hungry a team is to win the ball back and is reported as the average recovery time in seconds.
Throughout a match, the positions of the players and the ball are tracked by cameras around the pitch and stored as coordinates in a positional data stream. This allows us to calculate which player has ball possession at any given moment in time. It’s no surprise that the ball possession alternates between the two teams over the course of a match. However, less obvious are the times where the ball possession is contested and can’t be directly assigned to any particular team. The timer for ball recovery starts counting from the moment the team loses possession until they regain it. The time when the ball’s possession is not clear is included in the timer, incentivizing teams to favor clear and fast recoveries.
The following example shows a sequence of alternating ball possessions between team A and B. At some point, team A loses ball possession to team B, which starts the ball recovery time for team A. The ball recovery time is calculated until team A regains the ball.
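The timer logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: it assumes possession has already been resolved per frame into a team label or `None` for contested frames, and it assumes the timer starts at the first frame in which the opponent is assigned the ball. The function name `recovery_times` and the labels `"A"` and `"B"` are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Sequence

FRAME_RATE_HZ = 25  # sampling rate of the positional data stated in this post


def recovery_times(possession: Sequence[Optional[str]], team: str,
                   frame_rate_hz: int = FRAME_RATE_HZ) -> List[float]:
    """Return the duration in seconds of every completed recovery spell of `team`.

    `possession` holds one entry per frame: a team label, or None when the
    ball is contested (assumed input format, not the actual data schema).
    """
    spells: List[float] = []
    last_owner: Optional[str] = None  # last team with clear possession
    lost_at: Optional[int] = None     # frame index of the loss while the timer runs
    for frame, owner in enumerate(possession):
        if owner is None:
            continue  # contested: a running timer simply keeps counting
        if owner != team and last_owner == team:
            lost_at = frame  # the opponent took the ball: start the timer
        elif owner == team and lost_at is not None:
            spells.append((frame - lost_at) / frame_rate_hz)  # ball regained
            lost_at = None
        last_owner = owner
    return spells


# Team A holds the ball for 2 s, loses it for 4 s of clear opponent possession
# plus 1 s of contested play, regains it, then loses it again for 2 s.
frames = ["A"] * 50 + ["B"] * 100 + [None] * 25 + ["A"] * 50 + ["B"] * 50 + ["A"] * 10
times_a = recovery_times(frames, "A")
print(times_a)        # [5.0, 2.0]
print(mean(times_a))  # 3.5
```

Note that the contested second is counted towards team A's first recovery spell, which mirrors the rule that unclear possession is included in the timer.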
As already mentioned, FC Cologne has been the league leader in the number of pressing situations since Steffen Baumgart took office. This style of play is also evident when you look at the ball recovery times for the first 24 match days in the 2022/23 season. Cologne achieved an incredible ball recovery time of 13.4 seconds, which is the fourth fastest in the league. On average, it took them only 1.4 seconds longer to recover a lost ball than the fastest team in the league, Bayern Munich, who got the ball back from their opponents after an average of 12 seconds.
Let’s look at certain games played by Cologne in the 2022/23 season. The following chart shows the ball recovery times of Cologne for various games. At least two games stand out in particular. On the first match day, they faced FC Schalke, also known as the Miners, and managed an exceptionally low BRT of 8.3 seconds. This was aided by a red card for Schalke in the first half when the game was still tied 0:0. Cologne’s quick recovery of the ball subsequently helped them prevail 3:1 against the Miners.
Also worth mentioning is the Cologne derby against Borussia Mönchengladbach on the ninth match day. In that game, Cologne took 21.6 seconds to recover the ball, which is around 60% slower than their season average of 13.4 seconds. A yellow-red card just before halftime certainly made it difficult for the Billy Goats to speed up recovering the ball from their local rival Borussia. At the same time, Borussia managed to win the ball back from Cologne after just 13.7 seconds on average, resulting in a convincing 5:2 win for Borussia over their perennial rivals from Cologne.
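The comparisons quoted above are simple arithmetic on the average recovery times. The following short sketch reproduces them from the figures given in this post; the helper name `relative_increase` is hypothetical.

```python
def relative_increase(value: float, baseline: float) -> float:
    """Return how much larger `value` is than `baseline`, in percent."""
    return (value - baseline) / baseline * 100


season_avg = 13.4  # Cologne, first 24 match days of 2022/23 (seconds)
derby = 21.6       # Cologne against Borussia Mönchengladbach (seconds)
bayern = 12.0      # fastest team in the league (seconds)

print(round(relative_increase(derby, season_avg), 1))  # 61.2 (percent)
print(round(season_avg - bayern, 1))                   # 1.4 (seconds)
```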
How it’s implemented
Positional data from an ongoing match, which is recorded at a sampling rate of 25 Hz, is utilized to determine the time taken to recover the ball. To ensure real-time updates of ball recovery times, we have implemented Amazon Managed Streaming for Apache Kafka (Amazon MSK) as a central solution for data streaming and messaging. This allows for seamless communication of positional data and various outputs of Bundesliga Match Facts between containers in real time.
The following diagram illustrates the end-to-end workflow for Ball Recovery Time.
The match-related data is collected and ingested using DFL’s DataHub. Metadata of the match is processed within the AWS Lambda function MetaDataIngestion, while positional data is ingested using the AWS Fargate container called MatchLink. Both the Lambda function and the Fargate container publish the data for further consumption in the relevant MSK topics. The core of the Ball Recovery Time BMF resides within a dedicated Fargate container called BMF BallRecoveryTime. This container operates throughout the corresponding match and obtains all necessary data continuously through Amazon MSK. Its logic responds instantly to positional changes and constantly computes the current ball recovery times.
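To make the idea of continuous computation concrete, the following is a rough sketch of how a consumer could update recovery times one message at a time. It is illustrative only and makes several assumptions: messages are JSON with the hypothetical fields `ts` (seconds) and `owner` (a team label or null when contested), the Kafka consumer is abstracted as a plain iterable of bytes, and the names `RecoveryTracker` and `process_stream` are invented for this example. The actual message schema and container logic are not described in this post.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class RecoveryTracker:
    """Running ball recovery statistics for one team (hypothetical helper)."""
    team: str
    last_owner: Optional[str] = None
    lost_at: Optional[float] = None  # timestamp of the loss while the timer runs
    spells: List[float] = field(default_factory=list)

    def update(self, ts: float, owner: Optional[str]) -> Optional[float]:
        """Process one frame; return the new running average when a spell ends."""
        if owner is None:
            return None  # contested ball: the timer keeps running
        average = None
        if owner != self.team and self.last_owner == self.team:
            self.lost_at = ts  # possession lost to the opponent
        elif owner == self.team and self.lost_at is not None:
            self.spells.append(ts - self.lost_at)  # possession regained
            self.lost_at = None
            average = sum(self.spells) / len(self.spells)
        self.last_owner = owner
        return average


def process_stream(messages: Iterable[bytes], teams: List[str]) -> Dict[str, List[float]]:
    """Feed every message to one tracker per team and collect the running averages."""
    trackers = {team: RecoveryTracker(team) for team in teams}
    averages: Dict[str, List[float]] = {team: [] for team in teams}
    for raw in messages:
        frame = json.loads(raw)
        for team, tracker in trackers.items():
            average = tracker.update(frame["ts"], frame["owner"])
            if average is not None:
                averages[team].append(average)  # in production: publish to a topic
    return averages


owners = ["A", "B", "B", None, "A", "B", "A"]
stream = [json.dumps({"ts": float(i), "owner": o}).encode() for i, o in enumerate(owners)]
print(process_stream(stream, ["A", "B"]))  # {'A': [3.0, 2.0], 'B': [1.0]}
```

Keeping the state per team in a small object means each incoming frame is handled in constant time, which fits a stream that arrives at 25 frames per second.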
After the ball recovery times have been computed, they’re transmitted back to the DataHub for distribution to other consumers of Bundesliga Match Facts. Additionally, the ball recovery times are sent to a specific topic in the MSK cluster, where they can be accessed by other Bundesliga Match Facts. A Lambda function retrieves all recovery times from the relevant Kafka topic and stores them in an Amazon Aurora Serverless database. This data is then utilized to create interactive, near-real-time visualizations with Amazon QuickSight.
Summary
In this post, we demonstrated how the new Bundesliga Match Fact Ball Recovery Time makes it possible to quantify and objectively compare the speed of different Bundesliga teams in winning back a lost ball possession. This allows commentators and fans to understand how early and successfully teams apply pressure to their opponents.
The new Bundesliga Match Fact is the result of an in-depth analysis by a team of football experts and data scientists from the Bundesliga and AWS. Noteworthy ball recovery times are shown in the live ticker of the respective matches in the official Bundesliga app and website. During live matches, ball recovery times are also provided to commentators through the data story finder and visually shown to fans at key moments in broadcast.
We hope that you enjoy this brand-new Bundesliga Match Fact and that it provides you with new insights into the game. To learn more about the partnership between AWS and Bundesliga, visit Bundesliga on AWS!
We’re excited to learn what patterns you will uncover. Share your insights with us: @AWScloud on Twitter, with the hashtag #BundesligaMatchFacts.
About the Authors
Javier Poveda-Panter is a Senior Data Scientist for EMEA sports customers within the AWS Professional Services team. He enables customers in the area of spectator sports to innovate and capitalize on their data, delivering high-quality user and fan experiences through machine learning and data science. He follows his passion for a broad range of sports, music, and AI in his spare time.
Tareq Haschemi is a consultant within AWS Professional Services. His skills and areas of expertise include application development, data science, machine learning, and big data. He supports customers in developing data-driven applications within the cloud. Prior to joining AWS, he was also a consultant in various industries such as aviation and telecommunications. He is passionate about enabling customers on their data/AI journey to the cloud.
Jean-Michel Lourier is a Senior Data Scientist within AWS Professional Services. He leads teams implementing data-driven applications side by side with AWS customers to generate business value out of their data. He’s passionate about diving into tech and learning about AI, machine learning, and their business applications. He is also an enthusiastic cyclist, taking long bike-packing trips.
Fotinos Kyriakides is an ML Engineer with AWS Professional Services. He focuses his efforts in the fields of machine learning, MLOps, and application development, in supporting customers to develop applications in the cloud that leverage and innovate on insights generated from data. In his spare time, he likes to run and explore nature.
Luuk Figdor is a Principal Sports Technology Advisor in the AWS Professional Services team. He works with players, clubs, leagues, and media companies such as the Bundesliga and Formula 1 to help them tell stories with data using machine learning. In his spare time, he likes to learn all about the mind and the intersection between psychology, economics, and AI.
Jakub Michalczyk is a Data Scientist at Sportec Solutions AG. Several years ago, he chose math studies over playing football, as he came to the conclusion that he wasn’t good enough at the latter. Now he combines both these passions in his professional career by applying machine learning methods to gain a better insight into this beautiful game. In his spare time, he still enjoys playing seven-a-side football, watching crime movies, and listening to film music.