For the latest graphic in our F1 Insights series, powered by AWS, we will be showcasing an insight that forecasts future events using machine learning methodology.
A key question that is often asked on a Friday evening is ‘Where do you think the cars will be in qualifying based on the practice results?’ Journalists and fans usually spend endless hours analysing every inch of the practice sessions, trying to come up with the answers. Instead, with this F1 Insight powered by AWS we will use machine learning and analytical methodology in an attempt to give us that answer in the most mathematically robust way possible. The ML model, run on Amazon SageMaker, will essentially take the practice data from the event in question and use historical data of how teams progress between Friday practice and Saturday qualifying to try to arrive at a data-driven answer to what the qualifying results will actually look like. Using machine learning methodologies to ‘predict’ the future is becoming more and more commonplace, so using it in Formula 1 seems like an obvious choice, and AWS an obvious partner to work with.
The problem we were faced with as engineers and data scientists when trying to derive the algorithms that will forecast the qualifying pace is that from Friday practice to Saturday qualifying, many parameters can change. Some of these are within the control of the teams (i.e. how they choose to configure their cars between the Friday and Saturday), and some of them are an actual evolution of the circuit. The former group (those that the teams control) is much harder to predict. For example, the fuel level that is carried on a Friday and the fuel level that is carried on a Saturday could be different. As this particular parameter can account for more than 1 second of lap time between the two days, it is important that we build accurate algorithms to be able to predict this change. There are other parameters too, and the model needs to recognise these and forecast the overall net change from one day to the next in order to give us a robust prediction.
The first thing we need to explain, then, is the detail of the changes in question. In order to do this, we should consider how the teams configure and run their cars during Friday practice, what they are trying to get out of the practice and the test programme they run in order to do that.
F1 teams tend to run in free practice with two principal targets: the first is to do their homework and prepare as diligently as possible for the important qualifying and race sessions; the second is to test performance or reliability updates which will be deployed in either the current or following races. The latter has become increasingly important as track testing in modern day Formula 1 has reduced drastically over the last decade.
Focusing on the former, teams use Practice 2 (P2) for both qualifying and race practice. In general, they will use the Practice 3 (P3) session for more qualifying-based tuning. Practice 1 (P1) is predominantly used for car testing and/or to find the right car setup.
This new F1 Insight powered by AWS will focus on predicting the qualifying session ranking and lap times based on the results of both the P2 and P3 practice timing and historical data.
If we look closely at a P2 session, we can see that the teams tend to carry out their qualifying simulations with a different fuel level and Power Unit (PU) mode from those they actually use in the Saturday qualifying session. Each team is free to make these car configuration choices as it wishes, and it is common for teams to run slightly different fuel levels for what each of them terms “low fuel.” On the other hand, within a team both cars will carry the same level of fuel. For example, the drivers of team X start their Friday qualifying simulations with 35kg of fuel, whereas the two drivers of team Y start their runs with 20kg. If we consider that each kilogram of fuel carried increases the lap time by about 0.03 seconds per lap (with some variation from track to track), then the 15kg difference reported above would result in a 0.45 seconds per lap difference between the teams from the fuel difference alone. The teams will usually run between 20 and 40kg in the Friday qualifying simulations. This range corresponds to a difference of about 0.6 seconds per lap, so we can conclude that using raw lap time only from the P2 session can be very misleading.
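The fuel arithmetic above can be written out as a short Python sketch. The 0.03 seconds per kilogram figure and the 35kg, 20kg and 40kg loads come from the text; the function names and the idea of correcting a lap to a reference fuel load are illustrative assumptions, not a description of the production model.

```python
# Approximate lap-time cost of fuel quoted in the text (varies by track).
FUEL_EFFECT_S_PER_KG = 0.03


def fuel_gap(fuel_a_kg, fuel_b_kg, effect=FUEL_EFFECT_S_PER_KG):
    """Lap-time difference in seconds caused purely by the fuel difference."""
    return (fuel_a_kg - fuel_b_kg) * effect


def fuel_corrected_lap(lap_time_s, fuel_kg, reference_fuel_kg,
                       effect=FUEL_EFFECT_S_PER_KG):
    """Estimate the lap time the car would set at the reference fuel load.

    Hypothetical helper: assumes the fuel effect is linear in mass.
    """
    return lap_time_s - (fuel_kg - reference_fuel_kg) * effect


# Team X at 35 kg versus team Y at 20 kg, as in the example above.
print(round(fuel_gap(35, 20), 2))  # 0.45
# The usual 20 kg to 40 kg range of Friday fuel loads.
print(round(fuel_gap(40, 20), 2))  # 0.6
```

Correcting every car to a common reference load in this way is one simple means of making P2 lap times comparable, provided the fuel loads are known or can be estimated.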
The reason the teams run with more fuel in practice than they do in qualifying sessions is because they want to have more than just the single timed lap on Saturday to try and optimise their lap times. In addition, teams tend to mask their true pace from their competitors to maintain a strategic advantage. We should consider, however, that teams also want to maintain certain stability in car configuration and operating model from track to track. This means that although there will be differences between the teams, we are able to learn patterns within the teams due to this relative stability in practice fuel levels at different circuits.
Another key change between practice and qualifying is the PU mode (i.e. the amount of power that the driver will draw on during his qualifying simulations on Friday). The reality is that the teams tend not to overstress the PU in a practice session due to the fact that they have a very limited number of units to call on during the season. As a rule, they will utilise a less powerful mode in practice than they will in the actual qualifying session. Like the fuel mass discussion, teams will tend to run different PU levels (with respect to maximum) to each other and this can account for a significant difference in lap times between the teams.
The other thing that changes between practice and qualifying is the track. The track improves as the cars run on it, due to the cleaning effect and the rubber that is deposited on the surface, which increases the grip level. The lap time tends to decrease throughout the weekend due to this effect. The track improvement is the same for all teams, but the lap time gained from the increased grip is not necessarily equal across all the cars.
In short, there are critical parameters that change and develop between practice and qualifying. Some are due to decisions taken by the team, and one is due to the natural effect of track improvement. The model and algorithm are required to learn these and forecast changes from Friday to Saturday in order to give an accurate output in the prediction.
The model used for this insight uses a branch of machine learning known as supervised learning. Essentially, the model learns from historical data in terms of how much each team improves from Friday to Saturday. It uses this forecast and the practice session results to ‘predict’ the actual qualifying result.
In fact, this is achieved using two modelling methods, one that takes into consideration what is happening as described above explicitly and one that uses an implicit approach.
The first type of model considers the median of each car’s improvement based on historical data, as well as the team’s fuel and power adjustments as a variance around this median value.
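As a rough illustration of the median-based idea, the Python sketch below adds each team's median historical improvement to its practice lap time and sorts the result. All team names and numbers are made up for the example, and the fuel and power adjustments described above are left out, so this is a simplification under stated assumptions rather than the model itself.

```python
from statistics import median


def predict_qualifying(practice_times, historical_deltas):
    """Predict qualifying lap times as practice time plus median past delta.

    practice_times: team -> best practice lap time in seconds.
    historical_deltas: team -> list of past (qualifying - practice) deltas
    in seconds; negative values mean the team went faster in qualifying.
    Returns (team, predicted_time) pairs ordered from fastest to slowest.
    """
    predictions = {
        team: lap + median(historical_deltas[team])
        for team, lap in practice_times.items()
    }
    return sorted(predictions.items(), key=lambda item: item[1])


# Hypothetical data: team Y is slower in practice but historically
# improves more between Friday and Saturday.
practice = {"X": 80.0, "Y": 80.2}
history = {"X": [-1.0, -1.2, -0.8], "Y": [-1.5, -1.6, -1.4]}

for team, lap in predict_qualifying(practice, history):
    print(team, round(lap, 3))
```

The median is used here because it is less sensitive than the mean to a single unusual weekend in a team's history.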
The second type of model uses a regression technique. It learns from the historical differences in lap time between practice and qualifying, using the teams, the drivers, the circuit, and the weather conditions (in particular whether the session was wet) as variables, in order to forecast the change at future events.
From these two modelling techniques we are able to derive a predicted lap time for qualifying.
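The text does not say which regression technique is used, so the Python sketch below substitutes a plain linear regression on one-hot encoded variables, fitted by gradient descent, purely to show the shape of the idea. It uses only the team and a wet-session flag (drivers and circuit are omitted for brevity), and all names, data and hyperparameters are hypothetical.

```python
def encode(team, wet, teams):
    """Bias term, one indicator per team, and a wet-session flag."""
    return [1.0] + [1.0 if team == t else 0.0 for t in teams] + [1.0 if wet else 0.0]


def fit_linear(rows, targets, lr=0.1, epochs=5000):
    """Fit least-squares weights with full-batch gradient descent."""
    weights = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, targets):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xi in enumerate(x):
                grad[j] += err * xi / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def predict_delta(weights, team, wet, teams):
    """Predicted (qualifying - practice) lap-time delta in seconds."""
    return sum(w * xi for w, xi in zip(weights, encode(team, wet, teams)))


# Hypothetical history: (team, wet session?, qualifying - practice delta).
teams = ["X", "Y"]
history = [("X", False, -1.0), ("X", True, -0.4),
           ("Y", False, -1.5), ("Y", True, -0.9)]

rows = [encode(team, wet, teams) for team, wet, _ in history]
targets = [delta for _, _, delta in history]
weights = fit_linear(rows, targets)

print(round(predict_delta(weights, "Y", False, teams), 3))
```

Adding the predicted delta to a car's practice lap time would then give a predicted qualifying lap time, in the same way as for the median-based sketch.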
From the Car Performance F1 Insights powered by AWS, we are able to predict, before the event, how the different teams will perform, and we can predict the lap time differences (or deltas) between each car. This pre-event method makes an assumption that no performance upgrade will be deployed.
We use this pre-event knowledge in order to give a confidence level to the prediction. This means that if a car that is usually towards the back of the field in qualifying suddenly finds itself in fifth in P2, then it is likely that the team changed its process. An example of this would be if the team ran with less fuel, artificially improving lap times. By indicating a low confidence in such cases we are simply saying that the performance is unexpected. We need to see the results of the actual qualifying session to see if this “anomaly” persists.
However, if this “anomaly” still persists into the qualifying session, it is more likely that the team has deployed a significant upgrade at that particular race.
A typical output, showing the expected grid and lap time gaps in qualifying, is shown in the chart below:
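One simple way to picture the confidence check is to compare the position expected before the event with the position seen in practice, and to flag a low confidence when the two differ by more than some threshold. The Python sketch below does exactly that; the function name and the threshold of five places are assumptions made for illustration, not values taken from the actual insight.

```python
def prediction_confidence(expected_position, practice_position, max_shift=5):
    """Return "low" when practice pace is far from pre-event expectations.

    expected_position: position implied by the pre-event car performance
    prediction. practice_position: position actually seen in practice.
    max_shift is a hypothetical threshold on the allowed difference.
    """
    shift = abs(expected_position - practice_position)
    return "low" if shift > max_shift else "high"


# A car expected around 16th that appears in fifth in P2 is unexpected.
print(prediction_confidence(16, 5))  # low
# A car expected fourth that appears in fifth is in line with expectations.
print(prediction_confidence(4, 5))   # high
```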
The methodology returns a relatively elegant solution in terms of the equation as explained here below:
We are really looking forward to seeing how this new Qualifying Pace insight performs against reality. It is the perfect graphic in that we will only have to wait 24 hours to compare the ranking prediction with the actual result. Of course, as with all mathematical models, we cannot forecast the future with 100 percent accuracy, but we hope that this particular graphic will give a robust and analytical insight into where we think the qualifying positions will drop out. Because the algorithms are trained on historical data, the accuracy of this insight will undoubtedly continue to improve as more and more data comes in. We hope that you have as much fun with the output of this insight as we have had putting it together.