Between 2005 and 2015, Simon Rolfes played 288 Bundesliga games as a central midfielder, scored 41 goals and won 26 caps for Germany. Currently, Rolfes serves as Sporting Director at Bayer 04 Leverkusen, where he oversees and develops the pro player roster, the scouting department and the club’s youth development. Simon also writes weekly columns on Bundesliga.com about the latest Bundesliga Match Facts powered by AWS. There he offers his expertise as a former player, captain, and TV analyst to highlight the impact of advanced statistics and machine learning on the world of football. Here, Rolfes, together with Bundesliga data scientist Gabriel Anzer, analyzes the importance of some of the new Bundesliga Match Facts powered by AWS that fans can see during the 20/21 season. Luuk Figdor and Mart Noten of the AWS Professional Services team then detail the AWS technology behind these advanced stats.
When I was playing for Bayer Leverkusen, we had a tremendous striker named Stefan Kießling – one of the best goal-scorers in the Bundesliga when he was playing. In five out of six seasons between 2008-2014, he had double digit goals and was a key part of our offense every year.
We didn’t have Attacking Zones, one of the new Bundesliga Match Facts powered by AWS, when I played, but it would have shown that Stefan was excellent in the center of the pitch and dangerous from everywhere. Teams knew they needed to key on him, which also opened our offense to strike from the wings as teams centered their defense around him.
This is how we saw it playing out, but with Attacking Zones this data would have been captured to show how Stefan’s presence may have opened the right wing, for example, in one game, the left in another, or even that we were aggressive in the center of the pitch. This would have been useful for fans to see our team’s style of play and how we were attacking.
Match Facts have been designed to give fans deeper insights, providing them with a glimpse behind the “tactical curtain” and bringing them closer to the action on the pitch. These insights are based on analyzing 3.6 million data points that are collected every game.
Attacking Zones allows fans to see where teams focus their offense to create scoring opportunities. An Attacking Zone is defined as an area on the field that a team uses to attempt to score a goal. This shows fans where a team is attacking and which side that team views as offering the best opportunity to score.
What the Bundesliga and AWS have done with this advanced stat is to break down the final third of the pitch in front of the opponent’s goal, and divide it into four vertical zones of equal size. Each time a player enters the final third of the pitch – either by dribbling or passing – the algorithm, built on AWS Fargate, assigns an attack on the corresponding zone.
Through graphical elements, fans can now see which Attacking Zone a team favors and what share of attacks (percentage) target that area. This helps give insight into the game-planning and strategy of the coaches. Perhaps a team sees a defensive weakness they are looking to exploit, or maybe it’s just making sure that the best player is on one side of the field or another, and the team continues to feed him opportunities.
In the match Bayern München (FCB) vs Sport-Club Freiburg (SCF) on Match Day 16 (15-17 January) of the 2020/2021 season, we find that FCB predominantly attacked through the right side of the pitch, with 48 attacks from the right versus 28 on the left. SCF, on the other hand, attacked mostly on the left, with 23 attacks versus only 13 on the right. This was likely caused by Serge Gnabry, and later Leroy Sané, being very active offensively but leaving a lot of room behind them, which allowed Freiburg to counterattack.
Another team that is quite unique in their attacking formation is RB Leipzig. The data shows that teams attack only 33% of the time through the middle of the pitch. RB Leipzig in their match versus Schalke 04 on Match Day 3 initiated 50% of their attacks through the middle. Schalke clearly was not able to protect the center of the pitch and in the end lost, 4-0. Another observation is that RB Leipzig predominantly attack over the left side (35%). This can be directly attributed to Angelino, RB Leipzig's left back, having a fantastic season with 4 goals and 6 assists to date.
Aren’t we fortunate to have Match Facts like this now, to bring us closer to the pitch, but also to the planning room to see how a coach is instructing his team to play? I know I am.
Luuk Figdor and Mart Noten, members of the AWS Professional Services team who worked with Bundesliga to bring these Match Facts to life, will now explain how this advanced stat came to fruition.
Attacking Zones delivers an overview of where a team focuses its attacks. The final third of the pitch – the attacking third – gets divided into four parallel sectors, and Attacking Zones shows in real time what percentage of a team’s attacks takes place in each sector. How does it do that?
Throughout each match, the Bundesliga is constantly receiving data about the positions of players on the pitch (based on x-y coordinates) and the speed at which they move. These “Positional Packages” are sent 25 times per second to AWS, where AWS Fargate, a serverless compute engine for containers, infers the position and direction of each player’s movement. This in turn can be used to track when a team enters an attacking zone.
Once the movement of players is known, the next step is to identify which player is in possession of the ball. This is done using a set of algorithms that transform the positional packages into what is called a ball possession event, or individual ball possession. Individual ball possession (IBP) is calculated using an adapted version of the algorithm first proposed by Link et al. and measures which player is currently in possession of the ball. Attacking Zones relies heavily on the calculation of these ball possession events. More information on how ball possession is computed can be found in the Most Pressed Player blog post, but simply put, it is calculated by looking at the location of the ball, the proximity of the nearest players and a number of parameters such as the height of the ball, the change in angle of the ball’s direction, and a minimum number of frames that a player is within a certain distance of the ball.
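To make the idea concrete, here is a deliberately simplified sketch of a proximity-based possession check. It is not the adapted Link et al. algorithm used in production: the `Frame` structure, the function names and all threshold values (`max_dist`, `max_height`, `min_frames`) are assumptions chosen for illustration, and the change-in-angle criterion is left out entirely.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One hypothetical positional frame (the real feed delivers 25 per second)."""
    ball: tuple      # (x, y, z) in metres; z is the height of the ball
    players: dict    # player id -> (x, y) in metres


def nearest_player(frame):
    """Return (player_id, distance) for the player closest to the ball."""
    ball_xy = frame.ball[:2]
    pid = min(frame.players, key=lambda p: math.dist(frame.players[p], ball_xy))
    return pid, math.dist(frame.players[pid], ball_xy)


def possession_per_frame(frames, max_dist=1.5, max_height=2.0, min_frames=3):
    """Label each frame with the player in possession, or None.

    A player counts as being in possession once they have been the nearest
    player, within max_dist of a ball at or below max_height, for at least
    min_frames consecutive frames. All thresholds are illustrative guesses.
    """
    labels = []
    candidate, streak = None, 0
    for frame in frames:
        pid, dist = nearest_player(frame)
        controllable = dist <= max_dist and frame.ball[2] <= max_height
        if controllable and pid == candidate:
            streak += 1
        elif controllable:
            candidate, streak = pid, 1
        else:
            candidate, streak = None, 0
        labels.append(candidate if streak >= min_frames else None)
    return labels
```

Requiring a minimum streak of frames filters out moments where the ball merely passes close to a player without being controlled.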
In order to get an objective definition of an attack, first the field needs to be separated into three horizontal areas, of roughly 35 meters each, by creating distinct x and y coordinate thresholds. These areas and their respective thresholds are calculated by dividing the actual pitch size by three as not every pitch has the same size. According to the DFL statutes, a pitch must be 105 meters long and 68 meters wide. There are, however, allowances for those dimensions to drop no lower than 100m and 64m respectively.
These three areas are then classified as the defending area, the midfield area and the attacking third. For every incoming ball possession event, the algorithm can use these threshold coordinates to identify the area of the pitch in which the game is currently being played.
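The threshold calculation can be sketched as follows. This is a minimal illustration, not the production code: it assumes a coordinate system in which x runs along the length of the pitch from 0 at one goal line to the full length at the other, and the names `Pitch`, `third_thresholds` and `classify_area` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pitch:
    """Pitch dimensions in metres; defaults are the DFL standard size."""
    length: float = 105.0
    width: float = 68.0


def third_thresholds(pitch):
    """Return the two x thresholds that split the pitch length into thirds."""
    return pitch.length / 3, 2 * pitch.length / 3


def classify_area(x, pitch, attacking_positive_x=True):
    """Classify an x coordinate from the point of view of the attacking team.

    If the team attacks toward x = 0, the coordinate is mirrored first so the
    same thresholds apply in both halves of the match.
    """
    if not attacking_positive_x:
        x = pitch.length - x
    first, second = third_thresholds(pitch)
    if x < first:
        return "defending"
    if x < second:
        return "midfield"
    return "attacking"
```

On a standard pitch the thresholds fall at 35 and 70 meters; on the smallest permitted pitch (100 meters long) they shift to roughly 33.3 and 66.7 meters, which is why they are derived from the actual pitch size rather than hard-coded.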
With the attacking third defined, this area can then be divided into four distinct vertical attacking zones, each roughly 17 meters wide. By dividing the pitch this way, the flow of the game can be traced each time new positional information is ingested into the AWS environment. The Attacking Zones algorithm running on AWS Fargate identifies if teams are passing the ball in their own defending area, or if they are moving the play forward. Once a player successfully crosses the threshold for the attacking third with the ball, the algorithm identifies in which Attacking Zone that occurred and marks a new entry.
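Mapping a position to one of the four zones is a simple division of the pitch width. The sketch below assumes that y runs across the pitch from 0 at one touchline to the full width at the other, and that zones are numbered 1 to 4 starting at y = 0. The function name and the numbering direction are assumptions, not taken from the production system.

```python
def attacking_zone(y, pitch_width=68.0, n_zones=4):
    """Return the 1-based attacking zone for a y coordinate in metres.

    Each zone is pitch_width / n_zones wide (17 m on a standard pitch).
    A ball exactly on the far touchline is assigned to the last zone.
    """
    if not 0.0 <= y <= pitch_width:
        raise ValueError(f"y={y} lies outside the pitch width {pitch_width}")
    zone_width = pitch_width / n_zones
    return min(int(y // zone_width), n_zones - 1) + 1
```

For example, `attacking_zone(40.0)` falls into zone 3 on a standard 68-meter-wide pitch, because 40 lies between the thresholds at 34 and 51 meters.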
Looking at the historic data of the Bundesliga 2020/2021 season so far, the Attacking Zones algorithm identifies a clear preference for the wings over attacks through the center. The data shows that, out of the roughly 20,500 attacking entries counted this season, the two wings account for 35% and 33% of the entries, respectively. But what counts as a successful entry into the attacking third?
Attacking Zone entries are counted when ball possession remains with the same team while it progresses into the attacking third. This means that a pass must not have been touched, not even lightly, by a player of the opposing team. A second way to count a new Attacking Zones entry is when a player dribbles from the midfield area into the attacking third. This is identified by analyzing the sequence of ball possession events that precede a player being in ball possession in the attacking third.
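The counting rule can be illustrated with a short sketch over a sequence of ball possession events. It rests on several assumptions that are not confirmed by the production system: the `PossessionEvent` fields are hypothetical, x is taken to be already normalized so that it increases toward the goal the possessing team is attacking, a standard 105 by 68 meter pitch is used, and any touch by an opponent is assumed to appear as a possession event of the other team, which breaks the chain.

```python
from collections import Counter
from dataclasses import dataclass

PITCH_LENGTH = 105.0  # metres, assumed standard pitch
PITCH_WIDTH = 68.0


@dataclass(frozen=True)
class PossessionEvent:
    """Hypothetical ball possession event."""
    team: str
    player: str
    x: float  # metres, increasing toward the goal this team attacks
    y: float  # metres across the pitch


def in_attacking_third(x):
    return x >= 2 * PITCH_LENGTH / 3


def zone_of(y, n_zones=4):
    """1-based zone index across the pitch width."""
    return min(int(y // (PITCH_WIDTH / n_zones)), n_zones - 1) + 1


def count_entries(events):
    """Count entries keyed by (team, zone, kind), kind is 'pass' or 'dribble'."""
    entries = Counter()
    for prev, cur in zip(events, events[1:]):
        same_team = prev.team == cur.team
        crossed = not in_attacking_third(prev.x) and in_attacking_third(cur.x)
        if same_team and crossed:
            kind = "dribble" if prev.player == cur.player else "pass"
            entries[(cur.team, zone_of(cur.y), kind)] += 1
    return entries


def zone_shares(entries, team):
    """Percentage of a team's entries per zone, as shown to fans."""
    per_zone = Counter()
    for (entry_team, zone, _kind), n in entries.items():
        if entry_team == team:
            per_zone[zone] += n
    total = sum(per_zone.values())
    return {zone: 100.0 * n / total for zone, n in per_zone.items()} if total else {}
```

Comparing only consecutive events keeps the rule easy to follow: an entry is recorded only when the event just before the crossing belongs to the same team, so an intercepted or deflected pass never counts.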
The first image below shows a player dribbling from midfield into the attacking third. The ball is portrayed as the red dot. The left image shows the player in possession being positioned in the midfield area with the intention to move forward into the attacking third.
The right image shows the next frame. The x-y coordinate values place the player in the attacking third, and it is verified that the player has crossed the thresholds into an attacking zone with the ball, so the attacking entry count for Attacking Zone 3 is incremented.
The second example below shows a player passing the ball into the attacking third. Again, the ball is portrayed as a red dot. In the left image, the player with the ball is located in the midfield area. He intends to pass the ball to a teammate in the attacking third. In the next frame ingested from the pitch (the right image), the algorithm verifies that the ball has been passed to a teammate without an opposing player touching it. Because of this, an attack entry is added to Attacking Zone 2.
For a real-life example, let’s look at “Der Klassiker” of Bayern München vs. Borussia Dortmund from March 6 earlier this season. During this match, a large number of attack entries was counted: for both teams combined, the algorithm detected a total of 115 attacking entries over 90 minutes. Out of this total, close to 70% of the attacks were started by a pass and 30% by a dribble. These statistics coincide with the tactic of Borussia Dortmund coach Edin Terzić to use long balls to reach Norwegian target man Erling Haaland. Borussia Dortmund used this tactic with great success, gaining 10 attack entries in the first 30 minutes and scoring two goals by opening up the pitch in all four attacking zones. Unfortunately for Dortmund, after Haaland’s substitution the team could not keep up the momentum, and it finished with a total of only 21 attack entries over the entire match.
Bayern München, on the other hand, quickly found their footing in the match after a slow start and reasserted their dominance. With a total of 94 attacking entries, the data shows that Bayern was dominant after that initial phase of the match. In 55 of those attacking entries, the same set of players seems to be involved. Kingsley Coman was the player with the most attacking entries (18 out of the 94). This is even more impressive taking into account that he was substituted in minute 66 of the game. At the same time, the data also shows that Leroy Sané, Niklas Süle and Thomas Müller were heavily involved during attacking entries for FC Bayern Munich. Thanks to the team effort of FC Bayern Munich’s players, the 0-2 lead of Borussia Dortmund was turned into a 4-2 win for the team from Munich.
Below are two examples of how this Match Fact appears during a broadcast and is showcased to fans.
We’re excited about the launch of Attacking Zones and its use in modern-day football. As teams look to exploit defensive weaknesses to gain control of the ball and, ultimately, score, Attacking Zones allows fans to see where teams focus their offense to create scoring opportunities. These insights are available to all fans throughout the game in the Bundesliga app and are used by the Bundesliga's broadcasters in live matches to elaborate on the offensive plan of each team. As these new Match Facts powered by AWS will help fans uncover more of the tactics behind the game, we’re excited to learn what patterns you will find. Share your insights with us: @AWScloud on Twitter, with hashtag #BundesligaMatchFacts.