In each Formula One race over 1000 laps are run (barring significant misfortunes), which provides a huge amount of data to analyse. The intelligentF1 model provides a means of analysing it, uncovering trends in underlying pace and details of car performance which are not immediately obvious and are rarely commented upon, especially when they are something the teams would rather not tell you about.

The model has a number of elements: fuel load, tyre type and age, tyre degradation with a phase 2 accelerated degradation model, pitstop time loss and a methodology to handle safety car deployment. When the models for these elements are correct, the races of the cars in clear air can be simulated – exactly as is done by the strategy guys at the teams (except that they have much, much more data) – and the strategy permutations which show which cars are really *racing* each other can be understood.
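As a rough illustration of how these elements might combine, here is a minimal sketch in Python. It is not the intelligentF1 implementation: the additive form, the function names and every parameter (`base_time`, `fuel_effect`, `rate1`, `rate2`, `cliff_age`, `pit_loss`) are assumptions made for the example. The two-piece linear degradation follows the form described in the comments below, and safety cars are left out entirely.

```python
def tyre_degradation(age, rate1, rate2, cliff_age):
    """Time loss in seconds for a tyre that is `age` laps old.

    Two-piece linear: a gentle phase 1 slope up to `cliff_age`,
    then a steeper phase 2 slope (all values are assumptions).
    """
    if age <= cliff_age:
        return rate1 * age
    return rate1 * cliff_age + rate2 * (age - cliff_age)


def simulate_race(stint_lengths, base_time, fuel_effect,
                  rate1, rate2, cliff_age, pit_loss):
    """Return clear-air lap times for one car.

    Assumes the car carries exactly enough fuel for the race and that
    each lap of fuel on board costs `fuel_effect` seconds per lap.
    The whole pitstop loss is charged to the first lap of each new stint.
    """
    total_laps = sum(stint_lengths)
    lap_times = []
    laps_done = 0
    for stint_index, length in enumerate(stint_lengths):
        for age in range(length):
            laps_of_fuel = total_laps - laps_done
            lap = (base_time
                   + fuel_effect * laps_of_fuel
                   + tyre_degradation(age, rate1, rate2, cliff_age))
            if stint_index > 0 and age == 0:
                lap += pit_loss  # new tyres, but time lost in the pits
            lap_times.append(lap)
            laps_done += 1
    return lap_times


# Illustrative numbers only, not fitted to any real race.
laps = simulate_race([3, 2], base_time=90.0, fuel_effect=0.1,
                     rate1=0.05, rate2=0.3, cliff_age=2, pit_loss=20.0)
```

Different tyre compounds would simply use different degradation parameters per stint; that extension is omitted to keep the sketch short.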

**So what data is simulated?**

As the laptime data comes in during the race, the intelligentF1 model builds a Race History Chart with all the cars’ data. This can be manipulated to follow a given car over a given set of laps. An example Race History Chart is shown below for the first five cars in the 2011 Italian Grand Prix.

The horizontal axis is lap number, and the vertical axis shows the time each car is in front of (or behind) a reference average race time. This reference average time is often taken as the race time for the winner, such that the line representing the winner finishes at zero as it does for Vettel in the chart above. As this reference time is arbitrary, it can be set to different values to best view the race performances of different cars – this has the effect of shifting the lines up and down the graph.
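The calculation behind each line on the chart can be sketched in a few lines of Python. The function name and the lap times here are made up for illustration; the only thing taken from the description above is that each point is the reference race time at that lap minus the car's own cumulative time.

```python
from itertools import accumulate


def race_history(lap_times, reference_lap_time=None):
    """Return one chart value per lap: reference time minus cumulative time.

    Positive values mean the car is ahead of the reference pace.
    If no reference is given, the car's own average lap time is used,
    so its trace finishes at zero (as the winner's does).
    """
    cumulative = list(accumulate(lap_times))
    if reference_lap_time is None:
        reference_lap_time = cumulative[-1] / len(lap_times)
    return [reference_lap_time * (lap + 1) - elapsed
            for lap, elapsed in enumerate(cumulative)]


# Illustrative lap times in seconds; lap 4 contains a pitstop.
winner = [90.0, 89.0, 88.0, 110.0, 87.0]
trace = race_history(winner)
```

With these numbers the trace climbs for three laps, drops sharply on the pitstop lap and ends at zero. Passing a different `reference_lap_time` shifts and tilts every line by the same amount, which is why the choice of reference is arbitrary.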

The performance of each car can be seen in the lines running from left to right. The key for all cars is on the right. So we can see Vettel pulling away at the front, Schumacher and Hamilton battling and Button passing them at the end of the first stint and then passing Alonso in the final stint. The sudden drops in the lines are due to the time lost in pitstops. Up to this point, this is the same as the data which can be found in other places, such as FORIX.

**How is the data simulated?**

Now we have this data, we can simulate the races using the intelligentF1 model. The same Race History Chart is shown below with the addition of the dashed lines showing the intelligentF1 model simulation of the race.

The pace of each car is given by the gradient of the line: the faster the car, the steeper the gradient. The intelligentF1 model matches the gradient of the line in a stint where the car is in clear air (not always available) and can then predict the pace of the cars at other stages of the race. As can be seen, some gradients match and others don’t.

**So how do you know it’s right?**

There is a big difference between being right, and being good enough to understand the data and what it really means. There will be variations in a driver’s performance from lap to lap, as well as mistakes, spins, overtaking manoeuvres, pitstop errors and times when a car is trapped behind a slower car. None of this can be predicted. However, these are the best cars and drivers in the world, and this means that there is actually a high level of consistency in their performance (outside these occasional unpredictable events), and this is usually at a level very close to their optimum. Therefore, when a car is in clear air, the intelligentF1 model is able to infer the underlying pace of the car/driver combination, and extrapolate this to other stages of the race. Consistency across a race, or across stints from a number of drivers, provides validation of the fuel and tyre models. For example, consider the simulations of the races of Alonso and Alguersuari at the 2011 Italian Grand Prix in the chart below. The intelligentF1 model simulations are very good, showing that the model’s prediction of underlying pace is accurate enough to start doing more interesting things.

**What can it tell you?**

Where the intelligentF1 model is powerful is in allowing assessment of the data where things are not so simple and the traces do not match the predictions. Let’s take some examples from the 2011 Italian Grand Prix at Monza, as this was a nice straightforward two-stop race. Firstly, let’s look at the races of Vettel, Schumacher and Hamilton. I have also put Alonso’s trace on the figure below.

The intelligentF1 model matches Vettel’s first stint, but then suggests he should be going about 0.5s faster in the remaining stints. So either Seb was so seriously underfuelled that he needed to back off for two-thirds of the race (which would be a horrible mistake), or he was cruising, matching his pace to the fastest cars behind – the McLarens. The intelligentF1 model suggests here that he could have won by 30s if he’d wanted. Could Hamilton have beaten him if he had got to the first corner first? This pace advantage suggests it would be very unlikely.

Now let’s look at Schumacher’s trace. His pace is nice and consistent for much of the first stint, until Hamilton really starts pressing, and he starts struggling with his tyres. The second stint is where it gets interesting. Michael’s pace is much slower due to the pressure from Lewis, and is quite inconsistent. However, once Lewis goes past, Michael’s pace then increases beyond what would be expected from the first stint. The most likely explanation for this is that in going slower he was stressing his tyres less, and then had more tyre life left at the end of the stint than would be expected. Mercedes then let him go for as long as he could maintain this pace, resulting in a longer stint and the recovery of most of the time lost in battling Lewis. He loses a bit in the final lap before his pitstop, though.

Once past, Hamilton was able to use his pace to open a gap, to stay ahead at the final stops and to start chasing down the Ferrari ahead of him. However, by looking at the first couple of laps after his stop, it is clear that he was not chasing as hard as he might have been until alerted to the possibilities by a radio call from the team. The intelligentF1 model projects that had he used his full pace, he would have caught Alonso with about three laps to go. An opportunity missed.

**Where does it develop from here?**

The intelligentF1 model will be applied to upcoming Formula One races to simulate the underlying pace of the cars and drivers, to analyse strategic decisions, and to paint a clearer picture of why the race turned out the way it did. Perhaps it can even be done in real time…

Omkar Nene

November 20, 2011

Hey James

I am Omkar, a 4th-year student at I.I.T Bombay (Indian Institute of Technology). Like you, I am a passionate F1 fan, greatly intrigued by the dynamics of the sport. Hence I took up the project (Optimal race strategy for Formula 1 cars) as part of my Mathematical Modelling and Analysis course this semester. I have a preliminary model, for which I have written code in MATLAB. (It takes into account stochastic components like rain and driver mistakes as well.)

My major roadblock is getting the tyre degradation curves (is there a way of predicting the degradation and phases using the free practice data?). Right now I just use a two-piece linear function to approximate the time losses and the phase change point, with different gradients for hard/soft compounds and wet/dry conditions.

Also, as of now it is a single-car model, which I intend to extend to multiple cars to make it realistic. Ideally one would want the model to be live and adaptive. I realise you have the same goal in mind.

P.S Any formal route of contacting you (mail id ) ??

intelligentf1

November 21, 2011

Hi Omkar, it’s nice to see that there are other people having a go at this. I think it is a good idea to compare notes.

I’d be interested to see what you do for wet/changeable conditions, as the evolution of the track is very difficult to deal with – are you looking for a wet/dry tyre change decision point in your optimal strategy? I’d also be interested in whether allowing for driver mistakes changes the strategy – is it worth compromising on the optimum strategy in case your driver makes a mistake? Interesting question.

For the tyre degradation curves – it depends how accurate you think you are. The amount of data (in terms of laptimes) from free practice is very small – and the drivers are often dropping off the pace to get space. I don’t think you can get a reliable picture of the tyre degradation for a car – you can get some idea by looking at the overall data for all cars, but I don’t tend to fit a degradation curve to free practice as I don’t think it’s reliable. The teams can look at the forces and the loads through the car lap-by-lap in free practice, so they can see what loads the tyres are able to take – this gives a much more accurate picture of the tyre grip loss with use. With laptime data only, we’re guessing.

My degradation is also a two piece linear fit. If you look at the real race data, I think that it is hard to justify anything more as the noise-to-signal is very high. It does OK, and it certainly adds to the understanding of the race.

I’ll send you a mail, so we can swap ideas.

Brezeck

May 1, 2012

Hi James. This is an interesting website with a lot of great work on Formula One. I am Brezeck Wang from Hobart College in Geneva, NY. I am also currently working on a mathematical model of a Formula One race. Again, like Omkar, I could not really sort things out about tyre degradation. I thought about using a linear fit on data but could not find any. Would you please share some insights with me? How can I find the data?

Thank you very much.

intelligentf1

May 3, 2012

Hi Brezeck – sorry it’s taken so long to reply.

It depends what you are trying to do. All the laptimes are available from the FIA – I have the times for 2011 and 2012 and can send them to you if you don’t have them.

To fit the data I make a number of (I hope physically sound) assumptions – the difference in pace between equivalent tyres can be attributed to fuel, and once fuel is accounted for, then you can back out the tyre effects. Once you start looking at the data in detail, you’ll find that individual laptimes have a very high noise-to-signal ratio, and so I tend to use cumulative times (as you can see from the race history charts I post everywhere) to fit the curves for fuel/degradation effects. You have to be careful – there are many fits that can work for short stints (equivalent to being able to put many straight line fits through noisy data), so you have to look at the longer stints, and also to make sure that the fits work reasonably well across a number of cars. Without that baseline, the data is insufficient to draw meaningful conclusions.
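To make the point about cumulative times concrete, here is a minimal sketch of one way such a fit could be set up. This is an assumption for illustration, not the fitting procedure used on the blog: it assumes lap k of a stint takes `a + b*k` seconds (with `b` the net per-lap trend once fuel burn-off and tyre wear are combined), so the cumulative time after n laps is `a*n + b*n*(n-1)/2`, and it fits `a` and `b` to the cumulative times by ordinary least squares.

```python
def fit_stint(lap_times):
    """Fit cumulative stint time T_n = a*n + b*n*(n-1)/2 by least squares.

    Returns (a, b): the lap time on the first lap of the stint and the
    net change in lap time per lap. Both names are assumptions.
    """
    if len(lap_times) < 2:
        raise ValueError("need at least two laps to fit two parameters")

    # Accumulate the normal-equation sums for regressors x1 = n, x2 = n(n-1)/2.
    s11 = s12 = s22 = s1y = s2y = 0.0
    elapsed = 0.0
    for n, lap in enumerate(lap_times, start=1):
        elapsed += lap  # cumulative time after n laps
        x1 = float(n)
        x2 = n * (n - 1) / 2.0
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * elapsed
        s2y += x2 * elapsed

    # Solve the 2x2 system with Cramer's rule.
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s2y * s12) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


# Noise-free illustrative stint: 100.0s on the first lap, 0.1s slower each lap.
a, b = fit_stint([100.0 + 0.1 * k for k in range(20)])
```

Errors in cumulative times are correlated from lap to lap, so this simple fit is only a starting point; the caution above about short stints and checking across several cars still applies.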

There are many things that you simply cannot do from laptime data – it is extremely coarse. There is a big danger in trying to infer stuff that cannot be justified – and sometimes it can be justified and it is still wrong…

If you let me know a little more about what you are trying to do – I’ll see if I can provide more specific help – which I’ll probably do via email.

Brezeck

August 3, 2012

James, apologies for the LATE reply. I thought I subscribed to the comment but somehow I did not. My project was just a semester-end final project so it was really simple. I did exactly what you said when I was trying to get the data. That worked out well for me, at least it showed the dynamics behind this sport.

Great blog and I should have checked back often. Keep it up!

LR

July 25, 2012

Hi James, what a very nice feeling that was to find your blog and know that someone else is also playing with this data!

I have been working on a similar model, although with a few crucial differences.

I would be very interested in discussing some ideas about modelling laptime variation during a stint. Could you please drop me an email?

Also I wonder if you know some good source of pitstop data for the year 2010.

Thank you! And again, keep up the good work, this is a brilliant blog!

intelligentf1

July 26, 2012

It’s always nice to find someone else doing something similar. And thanks for the nice comments.

I don’t model stationary time in pitstops, so I infer the loss times directly from the laptime data (so they include in/out lap losses). I think FORIX has all the pitstop data which is provided by the FIA for 2010.

I’ll get in touch.

workinmotorsport

April 16, 2013

Hi James

I fell upon your blog as I was wondering whether anyone had created a site like this. I really like what you have done here, it is really similar to some of the analysis that the real teams do.

I’m in the process of creating a blog, http://www.jobinf1.com, putting out advice on how people can get a job in Formula 1 and to give them an initial insight into what some of the jobs involve. I think I might point people to your blog if they want to be a race strategist as they will need to be interested in this kind of analysis.

Keep up the good work !

Richard

Michal

November 26, 2013

Perhaps a little bit off topic. I have a question about pitstop duration and laptimes. I’m working on a toy app to visualise an F1 race, and I see something in the data that I do not understand.

Take the last GP, for instance, Brazil 2013.

When I look at the pitstops, let’s say the pitstops of Webber.

He stopped on lap 23, and the duration is 25.012 seconds. I would expect the laptime of lap 24 to be much longer than the laptime of lap 23. But the difference is only 10.974 s. In other words, I cannot find this lost 25.012 seconds in the laptimes.

I’m obviously missing something and I would appreciate if you could explain it to me.

Best Regards,

Michal

intelligentf1

November 26, 2013

No problem. Here are Webber’s laptimes from the laps around your problem.

| Lap | Laptime |
| --- | --- |
| 20 | 1’18.701 |
| 21 | 1’18.590 |
| 22 | 1’18.901 |
| 23 | 1’23.733 |
| 24 | 1’34.707 |
| 25 | 1’16.957 |
| 26 | 1’17.242 |

Before the stop he is doing about 1’18.7s laps. He stops on lap 23, and has to slow down to get into the pits and to obey the speed limit, so the lap is slower. Once he crosses the speed limit line the pitstop time starts. He crosses the line in the pits (laptime 23 is complete, and he’s lost about 5s on the previous lap before even making his stop). Then he stops and continues in the race. He trips the pitstop timer at the speed limit line on the way out, and the pitstop took 25s. When he comes round at the end of lap 24, we see his lap was about 11s slower than lap 23, but about 17.5s slower than laps 25-26, which were run on the same tyres as the outlap.

So if we compare laps 22 and 23, we see he lost about 5s on the way in; and comparing laps 24 and 25 we see that he lost about 17.5s on the way out including the stop. Which makes 22.5s. Not 25s. Why the difference? It’s because there is no direct relation between the pit time and the lap time loss. The lap time loss measures the difference between the time taken to run on the normal circuit and the time taken to run through the pits, which is not the same as the pit time, which measures the time spent in the pit lane. This is important as it means that you cannot use the pit lane time to measure the time lost in pit stops – you have to use the laptimes. There will always be an offset.

This is probably easiest to see with Massa’s drivethrough. He lost 4s on the in-lap and 10s on the outlap, but his pitlane time was 17s. Again the difference is about 3s, so it is pretty consistent.
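The Webber arithmetic above can be reproduced with a short Python sketch. The laptimes and the 25.012s pit lane time are the ones quoted in this thread; the helper name `to_seconds` and the choice to average laps 25 and 26 as the new-tyre reference are assumptions made for the example.

```python
def to_seconds(laptime):
    """Convert a laptime such as "1'18.701" to seconds."""
    minutes, seconds = laptime.replace("’", "'").split("'")
    return int(minutes) * 60 + float(seconds)


# Webber's laps around his stop, Brazil 2013, as listed above.
webber = {
    22: "1'18.901",
    23: "1'23.733",  # in-lap: ends at the timing line in the pit lane
    24: "1'34.707",  # out-lap: includes the stop itself
    25: "1'16.957",
    26: "1'17.242",
}
times = {lap: to_seconds(value) for lap, value in webber.items()}

# Loss on the way in: in-lap against the last normal lap on old tyres.
in_lap_loss = times[23] - times[22]

# Loss on the way out: out-lap against the following laps on the same tyres.
new_tyre_pace = (times[25] + times[26]) / 2
out_lap_loss = times[24] - new_tyre_pace

total_loss = in_lap_loss + out_lap_loss
pit_lane_time = 25.012  # from the pitstop timing data quoted above
offset = pit_lane_time - total_loss

print(f"in-lap loss  {in_lap_loss:.1f}s")
print(f"out-lap loss {out_lap_loss:.1f}s")
print(f"total loss   {total_loss:.1f}s")
print(f"offset       {offset:.1f}s")
```

This gives roughly 4.8s in, 17.6s out and 22.4s in total, leaving an offset of about 2.6s against the pit lane time, in line with the rounded figures in the reply.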