AWS and NFL Pull Back the Curtain on Defensive Strategy

Think like an NFL defensive coordinator with this new ML-powered Next Gen Stat that reveals the theory behind plays, bringing fans one step closer to the huddle

We’ve all been there: it’s Monday night. You’re watching your team play at the top of their game in a matchup you’ve been waiting for all season. The win is safely minutes away when, out of nowhere, your omniscient, impenetrable defense allows a scoring drive with a seemingly utter lack of coverage, one that makes you jump out of your seat looking for answers to the question, ‘How did that happen!?’

Beneath that disappointment is likely a reluctant understanding that the defense was outmaneuvered, perhaps weeks ago, in the mind of a prescient coach. Because the game of football is played on many levels: there’s the blood, sweat and tears on the field, of course, but there’s also the chess-like game of intellects, coaches versus coaches, using decades of experience, immense arsenals of data, and perfectly tuned players to orchestrate strategies informed by deep thinking and statistical analysis. 

For fans, much of this game, the theoretical one, has remained invisible and out of reach. Video games and fantasy leagues chase the feeling of meaningful participation; while neither is “real,” their popularity speaks to a huge appetite for a deeper understanding of why games unfold the way they do.

So how do you give fans access to the theory and strategy—the “why”—behind what happens on the field between September and February? Historically, football stats have focused on the offensive side of the game, which makes sense. Scoring is compelling and a player’s value is clearly expressed in terms of how they contribute to a play. But as technology and fans have gotten smarter, an unmet need in statistical storytelling around defensive performance has become abundantly clear. 

“There's always been a lacking ability to tell statistical stories about the defensive side of the ball,” 

Mike Band, Next Gen Stats

“And not only the defensive side of the ball in terms of production and stats, but in strategy, in play calling, from the coaches’ side of it. So, as we grow our understanding of the defensive side of football, the complexity of the game also evolves.” This complexity is driven by technological advancements and engineering inventions, created with the goal of delivering a fan experience that goes deeper into the theory of actual games, illuminating the mystery and nuances of defensive strategy. In other words, AWS and Next Gen Stats are using data to open doors for fans to watch games like a defensive coordinator and experience football in a smarter, more immersive way.

The launch of the new stat, called coverage classification, is an exciting foray into a vast and relatively unexplored statistical frontier. A first-of-its-kind defensive stat, coverage classification identifies which kind of defensive coverage a team runs on a given play, using data collected via a microchip in each player’s shoulder pads.
  • COVER 0
  • Coverage Type: Man

    All-or-nothing man coverage with no deep safety. Each defender in coverage is responsible for an eligible receiver. Since 2018, defenses have blitzed on 90% of Cover 0 plays outside of the red zone and on 62% of Cover 0 plays in the red zone.

  • COVER 1
  • Coverage Type: Man

    A single-high centerfield safety with man coverage underneath. With a 4-man pass rush, an extra defender is free to spy the QB or bracket a top receiver. 

  • COVER 2 MAN
  • Coverage Type: Man

    Two safeties split the deep area of the field in half (1/2) with man coverage underneath. If all five eligible receivers run a route, the defense can send a max of four rushers. 

  • COVER 2
  • Coverage Type: Zone

    Two safeties split the deep area of the field in half (1/2), while the remaining coverage defenders occupy five underneath zones. CBs are responsible for the flats; Slot defenders are responsible for hook/curl zones; MLB is responsible for the middle hole zone. 

  • COVER 3
  • Coverage Type: Zone

    Three deep defenders split the backend of the defense into thirds (1/3), while the remaining coverage defenders occupy three or four zones, depending on whether the defense blitzes.

  • COVER 4
  • Coverage Type: Zone

    Also known as “Quarters” coverage. Four defenders are each responsible for one deep quarter (1/4) of the field, while the remaining coverage defenders occupy three underneath zones.

  • COVER 6
  • Coverage Type: Zone

    Also known as “Quarter-Quarter-Half” coverage. One side of the defense is occupied by two deep safeties, while the other half is occupied by one deep safety. The remaining coverage defenders occupy underneath zones based on the receiver strength. 

  • PREVENT
  • Coverage Type: Prevent

    A special coverage type saved for situations with little time remaining in a half. A number of coverage defenders align at extreme depths off the line of scrimmage, sacrificing underneath completions in order to prevent big plays and TDs.
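
For readers who want to work with these categories in code, the eight coverage schemes listed above can be written down as a small lookup table. This is only an illustrative sketch: the `Coverage` class, its field names, and the labels "Cover 2 Man" and "Prevent" are assumptions made for this example, not the Next Gen Stats data format. The deep-defender counts come from the descriptions above, and Prevent is left as `None` because its description gives no fixed number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Coverage:
    """One coverage scheme (hypothetical structure for illustration)."""
    name: str
    coverage_type: str             # "Man", "Zone", or "Prevent"
    deep_defenders: Optional[int]  # None when no fixed count is described


# Values transcribed from the glossary above.
COVERAGES = [
    Coverage("Cover 0", "Man", 0),      # no deep safety
    Coverage("Cover 1", "Man", 1),      # single-high safety
    Coverage("Cover 2 Man", "Man", 2),  # two deep safeties, man underneath
    Coverage("Cover 2", "Zone", 2),     # two deep safeties, zones underneath
    Coverage("Cover 3", "Zone", 3),     # deep thirds
    Coverage("Cover 4", "Zone", 4),     # deep quarters
    Coverage("Cover 6", "Zone", 3),     # two deep on one side, one on the other
    Coverage("Prevent", "Prevent", None),
]


def by_type(coverage_type: str) -> list:
    """Return the names of all coverages of the given type."""
    return [c.name for c in COVERAGES if c.coverage_type == coverage_type]


print(by_type("Man"))
print(by_type("Zone"))
```

Grouping by type this way shows why "man or zone" alone is a coarse label: several distinct schemes sit under each heading.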

"Going beyond just man or zone, a computational model powered by AI and machine learning (ML) is able to make a diagnosis in fractions of a second with stunning accuracy."

By analyzing data points like field position and nearly imperceptible body movements (hips turned slightly to the left, a cornerback standing a few inches closer to the line of scrimmage), this Next Gen Stat can say with a high degree of accuracy that, for example, the Seahawks used a Cover 3 scheme, just moments after the play ends.

The process of teaching an AI how to identify defensive coverages is kind of like a game of Pictionary. A lot of trial and error goes into deciding which data points to look at and which to ignore, but the more details that go into the picture, the easier it’s going to be to guess what the drawing is. Unlike a human brain, AI is able to look at all the data at the same time in no predefined order. This allows it to figure out the best way to prioritize each data point without any preconceived notions about value or importance. Engineers then use algorithmic designs and metrics to tell the AI what to draw and let it decide the fastest and most accurate way to get to that picture. From there, the ML component learns the difference between the eight different variations of the picture (the different coverage schemes) and is able to deliver a diagnosis.

The degree of difficulty involved with developing and accurately expressing defensive stats is extremely high. The sheer volume of raw data is one factor. A given play lasts about 5 seconds, and in order to make an accurate analysis, the computational model needs to be fed 5 frames per second, each of which contains “images” that describe each player’s field position. These images include a player’s distance from the line of scrimmage, speed, acceleration, orientation to the quarterback, and position, speed, acceleration, and orientation relative to every other player on the field. Multiply that by the 22 players on the field and over 100 plays per game and soon you have millions of rows of data that need to be processed fast.
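
To get a feel for how quickly that adds up, here is a rough sketch of the counting, using the figures quoted above (22 players, about 5 seconds per play, 5 frames per second, 100 plays per game). The `PlayerFrame` fields and the `pairwise_rows` helper are hypothetical, and treating every ordered pair of players as one row is just one assumed way to represent "relative to every other player"; the real pipeline may be organized differently.

```python
import math
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class PlayerFrame:
    """One player's tracking data in one frame (hypothetical fields)."""
    player_id: int
    x: float                # yards along the field
    y: float                # yards across the field
    speed: float
    accel: float
    orientation_deg: float  # facing direction, 0-360


def pairwise_rows(frame):
    """Build one row per ordered pair of players: a's view of b."""
    rows = []
    for a, b in permutations(frame, 2):
        rows.append({
            "from": a.player_id,
            "to": b.player_id,
            "distance": math.hypot(b.x - a.x, b.y - a.y),
            "rel_speed": b.speed - a.speed,
            "rel_accel": b.accel - a.accel,
            "rel_orientation_deg": (b.orientation_deg - a.orientation_deg) % 360.0,
        })
    return rows


# Figures quoted in the article.
PLAYERS = 22
SECONDS_PER_PLAY = 5
FRAMES_PER_SECOND = 5
PLAYS_PER_GAME = 100

# A placeholder frame: the positions are made up, only the count matters here.
frame = [PlayerFrame(i, float(i), float(i % 5), 0.0, 0.0, 0.0) for i in range(PLAYERS)]

rows_per_frame = len(pairwise_rows(frame))            # 22 * 21 ordered pairs
frames_per_play = SECONDS_PER_PLAY * FRAMES_PER_SECOND
rows_per_game = rows_per_frame * frames_per_play * PLAYS_PER_GAME

print(rows_per_frame, frames_per_play, rows_per_game)
```

Under these assumptions a single game already produces more than a million pairwise rows, which is consistent with the "millions of rows" figure once more than one game is involved.
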

Another factor is what to do with the data in order to make it mean something. What metrics and rules need to be created in order for a computational model to actually make sense of the millions of data points being fed into it? After experimenting with different types of model architecture, the ML Solutions Lab found a faster way to process the data without sacrificing accuracy.

The computational neural network model is able to look at each image in each frame, understand the relationships within each frame, and track how those relationships change over time.

By looking at each “pixel” of each image, the model then pieces together a story about how they relate to each other. The result is an incredibly accurate model that learned how to correctly infer coverage classification from scratch in a matter of hours. After it was trained, it could recognize the defense in under a second.
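
The article does not describe the model's actual architecture, so the following is only a toy stand-in that mirrors the three steps named above: summarize the relationships within each frame, track how that summary changes over the play, and turn the result into a probability for each of the eight coverage classes. Every function name here is hypothetical, the hand-written features replace what a neural network would learn, and the weights are random and untrained, so the output is meaningless as a prediction.

```python
import math
import random

NUM_CLASSES = 8  # the eight coverage schemes named in the article


def frame_summary(frame):
    """Relationships within one frame: mean and max pairwise distance."""
    dists = [math.dist(a, b) for i, a in enumerate(frame) for b in frame[i + 1:]]
    return [sum(dists) / len(dists), max(dists)]


def temporal_features(play):
    """How the per-frame summary changes from the first frame to the last."""
    summaries = [frame_summary(f) for f in play]
    first, last = summaries[0], summaries[-1]
    deltas = [end - start for start, end in zip(first, last)]
    return last + deltas  # four features in total


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(play, weights):
    """One linear layer plus softmax over the coverage classes."""
    feats = temporal_features(play)
    scores = [sum(w * f for w, f in zip(row, feats)) for row in weights]
    return softmax(scores)


# Made-up input: 25 frames of 22 (x, y) positions, plus random weights.
rng = random.Random(0)
play = [[(rng.uniform(0, 53.3), rng.uniform(0, 20)) for _ in range(22)]
        for _ in range(25)]
weights = [[rng.gauss(0, 0.1) for _ in range(4)] for _ in range(NUM_CLASSES)]

probs = classify(play, weights)
best = max(range(NUM_CLASSES), key=probs.__getitem__)
print(best, round(probs[best], 3))
```

In a trained system the weights would be learned from labeled plays, and the per-frame and over-time steps would be learned layers rather than fixed formulas.
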

The NFL and AWS have been building up to the accomplishment of launching a meaningful defensive stat for years. The experimental partnership between the two has grown step by step, starting with Completion Probability, which launched in 2018, and continuing with passing score, the fourth down decision guide, expected rushing yards, and more. According to Band, “over the years, AWS has brought solutions that [the NGS team] would have never come up with on our own.” Since the first Big Data Bowl in 2018, NGS engineers and research and analytics groups have leaned on AWS to evolve their thinking and technological capabilities. “What's really been great,” according to Band, “is that if we bring football expertise and experience productionizing stats, and the AWS ML team brings solutions and techniques that we've never seen before, we have continued to have success every time we've set a goal to create a new metric.”

Behind this work is the idea that stats can take fans beyond the “what” of the game and transport them into the world of “why.” Stats add color and detail to the fan experience, make it more immersive, and open opportunities to feel like a participant in the action. The energy and excitement that draw millions in every year feel that much closer and more real the more you know and the more you understand the reasons behind the action unfolding on the field. Without asking you to invest any additional time or effort in a game you already love, innovative stats like coverage classification can reveal the logic that goes into strategic decisions and give fans more information to create meaningful stories that weren’t visible before.

So as you tune in to watch this football season, go ahead and dig into Next Gen Stats. Grow your expertise and watch as new stories about what’s happening on and off the field come to life before your eyes. You might end up impressing yourself, and your friends, with what you’re able to see when you can use data to get beyond “what” happens during a game and uncover the “why.” At the very least, it might soften the blow when your defense suffers a surprise scoring drive.