AWS for Industries

AWS and NFL’s new special teams Next Gen Stat is ready for kickoff

The newest advanced metric tackles the hidden dynamics of punt and kickoff returns. The NFL Next Gen Stats team is here to give us the inside scoop.

We have all witnessed a returner getting tackled a nanosecond after receiving a punt or kickoff. Holding onto the ball, let alone gaining a chunk of yardage, is a huge win. The odds of a return touchdown are, let’s say, on more of the miraculous side. Only 0.6% of NFL kickoffs (six out of 1,013) and 0.3% of punts (three out of 952) were returned for touchdowns during the 2022 regular season. But that’s exactly why it’s a thrill to see a returner beat the odds, weaving through wave after wave of improbability. It’s an art. An extreme outlier in reality. And it can sway the game in the blink of an eye.

Unless things go extremely right or extremely wrong, however, a blink of an eye is about how much attention special teams often get. “The intricacies of this battle for ball control and field position remain hidden from advanced analysis,” says Mike Band of NFL Next Gen Stats. “Yet, kickoffs and punts make up roughly one-fifth of the game, and can often have a major impact on field position and game flow.”

To address this gap, AWS machine learning (ML) engineers and the NFL’s Next Gen Stats group co-developed Expected Return Yards, the first-ever set of advanced stat models focused on kickoff and punt returns. Over the last five years, this partnership has cooked up analytics that dig deeper into the offensive and defensive sides of the ball. Now, they are applying those learnings and ML techniques to special teams.

This is all part of a larger effort to help fans experience and understand every aspect of the game as it happens, says Band. “If we can tell stories about each component within the game, it allows us to tell whatever story is transpiring on the field. Not just for the sake of it, but to bring the fans closer to the sideline.”

Expected Return Yards predicts the yardage a returner will gain if they field a kickoff or punt. Imagine freezing time right when a returner receives the ball. “This stat model estimates how many yards the returner is expected to gain once they receive the ball based on the X and Y speed, acceleration, and orientation direction of every player on the field at that timestamp…” (those numbers are derived from player tracking data sent from chips in the players’ pads, by the way) “…and from that timestamp, the model makes an estimate of the probability distribution of how many yards that player would gain if they were to return the punt or kickoff,” explains Band.
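To make the idea of a predicted distribution concrete, here is a minimal sketch of how a single expected-yards number and a touchdown probability could be read off one. It assumes the headline figure is the probability-weighted mean of the distribution; the function names and the probabilities are invented for illustration and are not output from the Next Gen Stats models.

```python
# Hypothetical predicted distribution for one return: {yards gained: probability}.
# These numbers are made up for illustration only.
predicted = {0: 0.10, 5: 0.30, 10: 0.35, 20: 0.20, 75: 0.05}


def expected_yards(distribution):
    """Probability-weighted mean of the yardage outcomes."""
    return sum(yards * p for yards, p in distribution.items())


def prob_at_least(distribution, threshold):
    """Total probability of gaining at least `threshold` yards."""
    return sum(p for yards, p in distribution.items() if yards >= threshold)


# Assume the returner fields the ball 75 yards from the end zone,
# so a gain of 75 or more yards is a touchdown.
yards_to_goal = 75
print("expected yards:", expected_yards(predicted))
print("touchdown probability:", prob_at_least(predicted, yards_to_goal))
```

Note how a rare 75-yard outcome pulls the average well above the most likely gain, which is one reason the shape of the whole distribution matters more than any single number.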

When the team started experimenting, they explored combining punt and kickoff data to train one single model. But it performed poorly. It turns out that while punts and kickoffs use similar parameter sets, the underlying data differ considerably. Player locations on the field, the proximity of defenders when the returner catches the ball, returner speed and acceleration, and players’ real-time positions relative to one another were dissimilar enough between punts and kickoffs that a single model ran into problems.

“In order to generate the best distribution of predictions, you have to create a model specific to that situation. It’s like comparing apples to only apples in a model and then you create a similar model to compare oranges to oranges,” says Band. So, the stat was split into two distinct models, which performed much better.

Another key challenge came when trying to generate a probability percentage of a return touchdown. Punts and kickoffs happen all the time in a game, but rarely result in scores. Only six kickoffs were returned for touchdowns during the 2022 regular season, while only three punt returns resulted in touchdowns (out of approximately 1,000 returns each). In the typical ML modeling process, in which algorithms automatically look for patterns in data to make predictions, such rare outcomes are treated as outliers and are often discounted. This set up crucial questions for the Next Gen Stats team: with so few examples, how do you predict the outlier events in football like a return touchdown? Can we capture the true touchdown probability on a punt return or a kick return that aligns with reality?

It took a lot of borrowing, experimenting, adapting—and yes, even ML models training other ML models—to find the best solution. Expected Return Yards began with the foundational architecture of existing expected yards stat models. Engineers then tinkered with different techniques to correctly model the unusual event of a return touchdown. They finally found a breakthrough using a novel ML method originally designed for time-series forecasting called Spliced Binned-Pareto (SBP) distribution. In a nutshell, SBP models data in a way that accounts for rare events by extending both ends of a distribution. This same method could also be applied to account for extreme rainfall in flood predictions, where an uncommon event has a huge impact on the overall performance of the model.
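For the curious, here is a rough sketch of the general idea behind an SBP-style distribution: a binned (histogram) body with a generalized Pareto tail spliced onto each end, so that extreme outcomes keep a small but nonzero probability. This is a simplified illustration under assumed values, not the production model. The bin edges, bin probabilities, tail mass, and tail parameters below are all hypothetical.

```python
def gpd_pdf(excess, shape, scale):
    """Generalized Pareto density for an excess >= 0 (assumes shape > 0)."""
    if excess < 0:
        return 0.0
    return (1.0 / scale) * (1.0 + shape * excess / scale) ** (-1.0 / shape - 1.0)


def binned_quantile(edges, probs, q):
    """Quantile of a piecewise-uniform (binned) distribution."""
    cumulative = 0.0
    for left, right, p in zip(edges[:-1], edges[1:], probs):
        if p > 0 and cumulative + p >= q:
            return left + (q - cumulative) / p * (right - left)
        cumulative += p
    return edges[-1]


def binned_pdf(edges, probs, x):
    """Density of the binned distribution alone (zero outside the bins)."""
    for left, right, p in zip(edges[:-1], edges[1:], probs):
        if left <= x < right:
            return p / (right - left)
    return 0.0


def sbp_pdf(x, edges, probs, tail_mass, upper, lower):
    """Binned body between the tail thresholds, Pareto tails beyond them.

    `upper` and `lower` are (shape, scale) pairs for the two tails.
    """
    lo = binned_quantile(edges, probs, tail_mass)
    hi = binned_quantile(edges, probs, 1.0 - tail_mass)
    if x > hi:
        return tail_mass * gpd_pdf(x - hi, *upper)
    if x < lo:
        return tail_mass * gpd_pdf(lo - x, *lower)
    return binned_pdf(edges, probs, x)


def upper_tail_prob(x, edges, probs, tail_mass, upper):
    """P(X > x) for x at or beyond the upper threshold."""
    hi = binned_quantile(edges, probs, 1.0 - tail_mass)
    shape, scale = upper
    excess = max(x - hi, 0.0)
    return tail_mass * (1.0 + shape * excess / scale) ** (-1.0 / shape)


# Hypothetical return-yardage bins (yards) and their probabilities.
edges = [-10.0, 0.0, 10.0, 20.0, 30.0]
probs = [0.1, 0.4, 0.4, 0.1]
tail_mass = 0.05                      # 5% of probability in each tail
upper, lower = (0.2, 8.0), (0.1, 3.0)  # assumed (shape, scale) per tail

# The bins alone give a 60-yard return zero probability; the spliced tail does not.
print(binned_pdf(edges, probs, 60.0), sbp_pdf(60.0, edges, probs, tail_mass, upper, lower))
print(upper_tail_prob(60.0, edges, probs, tail_mass, upper))
```

The design point is that the body of the distribution stays flexible, since it is just bins, while the tails follow a parametric form that can be fit even when long returns are scarce in the training data.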

Let’s say the expected return yards distribution for a play is 3–15 yards. As you move past 15 yards and get closer and closer to the end zone, jagged blips in the data pop up from the probability of bigger yardage gains. This has to do with the pivotal moment a returner makes it past each wave of defenders, says Band. “It is what we imagined from the football perspective—that the probability of scoring a touchdown has to do with whether or not you got past a big group of defenders and then the possibility that you ran past another defender further down the field. The SBP method better captured those blips of possibility, which led to better touchdown probability estimates above our baseline model.”

A relatively new process called transfer learning also played a role in crafting the stat. A small dataset usually doesn’t produce a well-performing model; the more examples you have to train on, the higher the accuracy tends to be. Transfer learning boosts performance by taking a model already trained on one task and reusing it as the starting point for training a model on a similar task. Leaning into this method, the team taught and tuned the new Expected Return Yards models using other expected yardage models (rushing and yards after catch) already in production.
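Here is a toy sketch of the transfer learning idea, assuming a simple linear model in place of the neural networks the real stats use. A model is first trained on a large "source" dataset, then its learned weights become the starting point for a brief round of training on a tiny "target" dataset. All data, names, and settings are invented for illustration.

```python
def fit_linear(points, weight=0.0, bias=0.0, steps=200, lr=0.01):
    """Gradient descent on mean squared error for y = weight * x + bias."""
    n = len(points)
    for _ in range(steps):
        grad_w = sum(2 * (weight * x + bias - y) * x for x, y in points) / n
        grad_b = sum(2 * (weight * x + bias - y) for x, y in points) / n
        weight -= lr * grad_w
        bias -= lr * grad_b
    return weight, bias


def mse(points, weight, bias):
    """Mean squared error of the linear model on `points`."""
    return sum((weight * x + bias - y) ** 2 for x, y in points) / len(points)


# Source task: plenty of examples (here, y = 2.0x + 1.0).
source = [(k / 10, 2.0 * (k / 10) + 1.0) for k in range(100)]

# Target task: only three examples of a similar relationship (y = 2.1x + 1.5).
target = [(1.0, 3.6), (4.0, 9.9), (7.0, 16.2)]

# Train on the source task, then fine-tune briefly on the target task.
pretrained = fit_linear(source, steps=2000)
fine_tuned = fit_linear(target, *pretrained, steps=20)

# Baseline: the same brief training on the target task, starting from zero.
from_scratch = fit_linear(target, steps=20)

print("fine-tuned error:  ", mse(target, *fine_tuned))
print("from-scratch error:", mse(target, *from_scratch))
```

With the same small training budget, the warm-started model ends up closer to the target relationship than the one trained from scratch, because the source task has already taught it most of what it needs.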

An interesting development from this process is that the student will become the teacher. Discoveries made with these new models will be used to refine and enhance the very models that trained them. “Our next venture is to apply this newer architecture to our existing expected rushing yards model and our existing expected yards after catch models. We can refine them and look at the potential biases in them. And in this case, we’ve better accounted for outlier outcomes: the probability of an outlier that’s more in line with reality. And so now, we have work to do to improve our offensive-focused models with our latest learnings. Every modeling venture we’ve gone through, success or failure, we’ve learned, can apply to our existing work. It’s a constant feedback loop,” says Band.

Want a deeper dive into the data science behind Expected Return Yards? Check out this Q&A with Amazon ML Solutions Lab engineers.

“There are so many games within the game, so many insights to find, and so many stories left to be told.” – Mike Band, Next Gen Stats

Stats are just the start

So there you have it—a reliable prediction of expected punt or kickoff return yardage along with the probability of a touchdown. While exciting, this is just a sliver of the special teams story this stat can begin to tell. Which returners are the most consistent in creating yards? Is a punt returner too conservative or aggressive in signaling for a fair catch? Which gunners are most effective at limiting the space for a punt return? These are all insights provided in the new model that can peel back the layers of this largely ignored battle of position.

Machine learning momentum

Expected Return Yards is the next step in giving fans a fresh look at everything unfolding on the field, and in the minds of coaches. Since 2018, the NFL and AWS have engineered a suite of advanced analytics digging into every side of football. With each new model, the understanding of machine learning and neural networks as they relate to the game grows. And as learning and technology grow, so does the partnership’s confidence in tackling never-before-seen stats. “It’s led us to knowing that when we go into a project of such big magnitude, that we have a high probability of success, and then we have a high probability of coming out with a good and usable model,” explains Band.

The future of fandom

So, what can we expect next? The trajectory of AWS and Next Gen Stats is “really about bringing newer insights derived from newer tech throughout the entire fan experience,” says Band. “It’s about giving fans a live look at the heartbeat of the game and following the rollercoaster of what happens during any given play.”

When it comes to the advanced sports analytics world, we’re only in the first seconds of the first quarter. Existing stat models will be refined and will grow. New models will reveal new stories on every side of the ball. Innovative applications of the stats will further transform the fan experience on and off the field. As the technology evolves, so too will our ability to analyze the game second by second or, to be more exact, millisecond by millisecond.

Learn more about Next Gen Stats powered by AWS, and see more fan engagement solutions leveraging machine learning and other capabilities in the cloud.