F1 Strategy Analysis

[This article was first published on Sport Data Science, and kindly contributed to R-bloggers].

I was recently browsing Reddit and found this AMA from a former Mercedes strategy engineer.

The most surprising thing was that most of the race strategy was calculated using VBA in Excel. This isn’t some start-up outfit; this is the mighty Mercedes, winners of the last 7 constructors’ and drivers’ championships. In this blog I’m going to go about building a calculator in R. Maybe then some Formula 1 teams will buy it off me and I’ll be rich! I guarantee it will be better than one based in Excel.

In the first part I’m going to look at the fundamentals of F1 strategy; in the second part I will look at some actual data from actual races.

What are the fundamentals of F1 strategy? Well, it’s to minimise the total race time.

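```{r}
library(tidyverse)  # tibble, dplyr, tidyr and ggplot2 are used throughout

start_laptime <- 95    # lap time on lap 1 (s)
fueldeg       <- 0.06  # lap time gained per lap as fuel burns off (s)
laps          <- 1:70  # race distance in laps

tyre1 <- "soft"
tyre2 <- "medium"
tyre3 <- "hard"
```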

First things first, I set up some standard variables for this exploration of the fundamentals of Formula 1 strategy. They are not tied to any particular race; I have just selected some base numbers to get an idea of the topic. The start lap time is the lap time for the first lap of the race. The fuel deg is the lap time gained each lap as fuel burns off and the car gets lighter.
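As a quick sanity check on those two inputs, here is a minimal base R sketch of the fuel effect on its own. It assumes, as the simulation in this post does, that every lap is `fueldeg` seconds quicker than the one before. The helper name `fuel_laptime` is mine, purely for illustration.

```r
start_laptime <- 95    # lap time on lap 1 (s)
fueldeg       <- 0.06  # lap time gained per lap as fuel burns off (s)

# Hypothetical helper: lap time from the fuel effect alone, ignoring tyres
fuel_laptime <- function(lap, start = start_laptime, gain = fueldeg) {
  start - (lap - 1) * gain
}

fuel_laptime(c(1, 35, 70))
# 95.00 92.96 90.86
```

So with these base numbers the car is a little over 4 seconds a lap quicker at the end of the race than at the start, before tyres are considered.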





f1_sim_s <- tibble(laps, start_laptime, fueldeg, tyre = tyre1)  # one table per compound
f1_sim_m <- tibble(laps, start_laptime, fueldeg, tyre = tyre2)
f1_sim_h <- tibble(laps, start_laptime, fueldeg, tyre = tyre3)

f1_sim <- bind_rows(f1_sim_s, f1_sim_m, f1_sim_h)  # 3 compounds x 70 laps

f1_sim2 <- f1_sim %>% mutate(lapt = start_laptime - (laps - 1) * fueldeg)  # fuel-corrected lap time

ggplot(f1_sim2, aes(x = laps, y = lapt)) +
  geom_point(colour = "#6d009c") +
  labs(x = "Lap Number", y = "Laptime (s)", title = "Formula 1 Full Race Laptime") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))

Above you can see the code used to create 3 data frames for the 3 different tyres available on a Formula 1 weekend. I then merge them into 1 big data frame of 70 laps for each of the 3 tyres. Plotting that over the race shows lap time falling in a linear fashion, identically for all 3 tyres. That is not reality, as the 3 tyres are different compounds: the soft tyre is the quickest but least durable, whereas the hard tyre is the slowest but most durable.

soft_tyre   <- tibble(tyre = "soft",   ds = 0,    dr = 0.2,  dc = 0.1)   # ds, dr in seconds
medium_tyre <- tibble(tyre = "medium", ds = 0.75, dr = 0.08, dc = 0.08)  # dc as a fraction per lap
hard_tyre   <- tibble(tyre = "hard",   ds = 1.5,  dr = 0.06, dc = 0.07)

tyre_stats <- bind_rows(soft_tyre, medium_tyre, hard_tyre)

In the above code I add some stats to each of the tyres that will dictate how they race. DS is the difference to the soft: how much slower the tyre is than the base, which is the soft tyre. DR is the base degradation rate and DC is the rate at which that degradation grows each lap. The tyres have what is termed the cliff built into them. This seemed to me like the compound interest model, and therefore I have applied the same method to tyre degradation.
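To make the compound interest idea concrete, here is a small base R sketch that evaluates the tyre penalty `ds + dr * (1 + dc)^(age - 1)` for each compound and looks for the tyre age at which the soft falls behind the medium. The parameter values are the ones from this post; the helper `tyre_penalty` and the 40 lap horizon are my own choices for illustration.

```r
# Tyre parameters from this post (ds and dr in seconds, dc as a fraction per lap)
tyres <- data.frame(
  tyre = c("soft", "medium", "hard"),
  ds   = c(0, 0.75, 1.5),
  dr   = c(0.2, 0.08, 0.06),
  dc   = c(0.1, 0.08, 0.07)
)

# Hypothetical helper: time lost to the tyre at a given age (1 = first lap on the set)
tyre_penalty <- function(age, ds, dr, dc) ds + dr * (1 + dc)^(age - 1)

ages <- 1:40
penalty <- sapply(seq_len(nrow(tyres)), function(i) {
  tyre_penalty(ages, tyres$ds[i], tyres$dr[i], tyres$dc[i])
})
colnames(penalty) <- tyres$tyre   # 40 x 3 matrix, one column per compound

# First tyre age at which the soft is slower than the medium
soft_cliff <- ages[which(penalty[, "soft"] > penalty[, "medium"])[1]]
soft_cliff
```

With these numbers the soft starts 0.75 s a lap quicker than the medium but is the slower tyre from a tyre age of 19 laps onwards, which is the cliff in action.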

f1_sim3 <- f1_sim2 %>%
  left_join(tyre_stats, by = "tyre") %>%
  mutate(tl  = lapt + ds,                  # add the compound offset to the soft
         deg = dr * (1 + dc)^(laps - 1),   # degradation compounds with tyre age
         dl  = tl + deg)                   # lap time including degradation

cols <- c("soft" = "#6d009c", "medium" = "#eb9b34", "hard" = "#009e45")

ggplot(f1_sim3, aes(x = laps, y = dl, col = tyre)) +
  geom_point() +
  ylim(90, 110) +                          # laps slower than 110 s are dropped from the plot
  scale_color_manual(values = cols) +
  labs(x = "Lap Number", y = "Laptime (s)", title = "Formula 1 Full Race Laptime") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))

f1_sim_group <- f1_sim3 %>%
  group_by(tyre) %>%
  summarise(raceT = sum(dl) / 60)          # total race time in minutes

ggplot(f1_sim_group, aes(x = reorder(tyre, raceT), y = raceT, fill = tyre)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values = cols) +
  labs(x = "Tyre", y = "Total Race Time (min)", title = "Formula 1 Tyre Race Time") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))

In the code above I calculate the lap times for each tyre as if you were to run that same tyre for the whole of this particular race.

When the lap times on each tyre are reviewed, the “cliff” is visible, and in fact the soft tyre may be faster to start with but quickly becomes the slowest tyre.

Over a whole race distance the hard tyre is the fastest, with the medium a little slower and the soft significantly slower; at 134 minutes total race time, the soft is over the 2 hour time limit for an F1 race. So easy game: just run the race on the hard tyre for an easy victory. Sadly it’s not so simple. When is life ever! The rules state that you must use 2 of the compounds in a race, which introduces a mandatory pit stop, and you need to optimise it to find the ideal lap on which to pit, depending on the tyre the race is started on.
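The single-compound totals quoted above can be cross-checked without building any data frames, because the fuel term is an arithmetic series and the degradation term is a geometric series. The sketch below is a base R illustration under the same parameters; `race_time_min` and `n_laps` are names I have made up for it.

```r
n_laps        <- 70
start_laptime <- 95
fueldeg       <- 0.06

# Hypothetical helper: total race time (min) on one set of a single compound
race_time_min <- function(ds, dr, dc, n = n_laps) {
  fuel_part <- n * start_laptime - fueldeg * n * (n - 1) / 2   # arithmetic series
  tyre_part <- n * ds + dr * ((1 + dc)^n - 1) / dc             # geometric series
  (fuel_part + tyre_part) / 60
}

totals <- c(
  soft   = race_time_min(0,    0.2,  0.1),
  medium = race_time_min(0.75, 0.08, 0.08),
  hard   = race_time_min(1.5,  0.06, 0.07)
)
round(sort(totals), 1)
```

The ordering comes out as hard, then medium, then soft, with the soft total just under 135 minutes.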

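```{r}
# One scenario per candidate pit lap
recs2 <- tibble(recs = 1:70)

# Base (fuel-only) lap times for a car starting on the soft
f1_sim_s2 <- f1_sim2 %>% filter(tyre == "soft")

# Every scenario paired with every lap: 70 x 70 = 4900 rows.
# cross_join() needs dplyr 1.1.0 or later; older versions used
# full_join(f1_sim_s2, by = character()) for the same result.
recs3 <- recs2 %>% cross_join(f1_sim_s2)
```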

In order to calculate the optimum pit lap for a 70 lap race, I’m going to calculate the total race time for the driver stopping on each possible lap. Therefore the data frame is 4900 rows long, which is 70 scenarios * 70 laps.
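The same sweep can be sketched in base R as a plain function of the pit lap, which makes the logic easier to see than the joins do. This is only an illustration under the parameters already introduced: the helpers `stint_times` and `one_stop_time` are hypothetical, the new tyre is assumed to be fitted from the pit lap onwards, and no pit stop time loss is included (it is the same for every pit lap, so it does not move the optimum).

```r
n_laps        <- 70
start_laptime <- 95
fueldeg       <- 0.06
tyres <- list(
  soft   = c(ds = 0,    dr = 0.2,  dc = 0.1),
  medium = c(ds = 0.75, dr = 0.08, dc = 0.08),
  hard   = c(ds = 1.5,  dr = 0.06, dc = 0.07)
)

# Hypothetical helper: lap times for one stint, given the race laps it covers
stint_times <- function(race_laps, tyre) {
  p   <- tyres[[tyre]]
  age <- seq_along(race_laps)                 # tyre age restarts at 1 each stint
  start_laptime - (race_laps - 1) * fueldeg +
    p[["ds"]] + p[["dr"]] * (1 + p[["dc"]])^(age - 1)
}

# Total time (s) when the second compound is fitted from lap `pit_lap` onwards
one_stop_time <- function(pit_lap, first = "soft", second = "medium") {
  first_laps  <- seq_len(pit_lap - 1)
  second_laps <- pit_lap:n_laps
  sum(stint_times(first_laps, first)) + sum(stint_times(second_laps, second))
}

pit_laps <- 1:n_laps
totals   <- sapply(pit_laps, one_stop_time)
pit_laps[which.min(totals)]                   # fastest lap on which to pit
```

Passing `first = "soft", second = "hard"` or `first = "medium", second = "hard"` to `one_stop_time` would give the other two one stop strategies in the same way.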


f1_stint <- f1_sim_s %>%
  rename(recs = laps) %>%
  mutate(laps = recs, ntyre = "medium")     # in scenario `recs` the stop is on lap `recs`

colnames(tyre_stats)[1] <- "lap_tyre"       # rename the key so it matches the join below

f1_sim_4 <- recs3 %>%
  left_join(f1_stint, by = c("recs", "laps")) %>%
  group_by(recs) %>%
  fill(ntyre, .direction = "down") %>%      # every lap from the stop onwards is on the new tyre
  mutate(lap_tyre = if_else(is.na(ntyre), tyre.x, ntyre)) %>%
  group_by(recs, lap_tyre) %>%
  mutate(tyre_age = row_number()) %>%       # tyre age restarts at 1 after the stop
  left_join(tyre_stats, by = "lap_tyre") %>%
  mutate(deg = dr * (1 + dc)^(tyre_age - 1),
         dl  = lapt + deg + ds) %>%
  group_by(recs) %>%
  summarise(tott = sum(dl) / 60) %>%        # total race time in minutes, pit loss not included
  mutate(strat = "sm")

ggplot(f1_sim_4, aes(x = recs, y = tott)) +
  geom_point(col = "#6d009c") +
  labs(x = "Pit Lap", y = "Total Race Time (min)", title = "Formula 1 Race Time Soft/Medium Strategy") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))
 

Above you can see the total race time for a strategy starting on the soft tyre and pitting for a medium. As you can see, the minimum race time is around lap 30, which would be the optimum lap to stop in this scenario. Formula 1 strategy is all about getting to the smallest race time. On a one stop strategy there are basically 3 options: soft to medium, soft to hard, and medium to hard. Teams don’t often start on the hard tyre, as that would mean losing places off the grid at the start.

I copied the same process for the 3 different strategies and plotted them on the same graph to compare race times and optimum pit laps. In this scenario the soft/medium strategy is the fastest, with the medium/hard the slowest overall. The medium/hard will offer more flexibility and a wider pit window, as it has the smallest difference between the best possible and worst possible race time.

min_t <- f1_sim_8 %>%                       # f1_sim_8 is not shown in this post; presumably the three
  group_by(strat) %>%                       # per-strategy tables (built like f1_sim_4) bound together
  slice(which.min(tott))

ggplot(min_t, aes(x = reorder(strat, tott), y = tott, fill = strat)) +
  geom_col() +
  coord_flip() +
  labs(x = "Strategy", y = "Total Race Time (min)", title = "Formula 1 Strategy Race Time") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))

min_t <- min_t %>% rename(mint = tott)      # best total time per strategy

f1_sim_9 <- f1_sim_8 %>%
  left_join(min_t, by = "strat") %>%
  mutate(delta = (tott - mint) * 60) %>%    # tott is in minutes, so convert the gap to seconds
  filter(delta < 3)                         # keep pit laps within 3 s of the best

ggplot(f1_sim_9, aes(x = recs.x, y = strat, col = strat)) +
  geom_point() +
  xlim(0, 70) +
  guides(colour = guide_legend(title = "Strategy")) +
  labs(x = "Lap", y = "Strategy", title = "Formula 1 Strategy Pit Windows") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))
  # scale_color_manual(values = cols) is left out: cols is keyed by compound name, not by strategy code

The code above finds the minimum time for each strategy and then finds the laps that are within 3 seconds of the minimum time. This gives a pit window: a range of laps on which the driver can pit and still be reasonably close to the possible minimum. The strategy can therefore be adjusted for what is going on on the track. Clearly the pit window for the medium/hard strategy is a lot later than the other two. The soft/hard strategy also has an earlier pit window due to the greater durability of the hard tyre.
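A pit window can also be read straight off a vector of total race times. The base R sketch below does this for the soft to medium strategy only, working in seconds throughout. It is an illustration rather than the code behind the plot: `tyre_cost` is a hypothetical helper, the fuel effect is left out because it is identical for every pit lap, and the tyre numbers are the ones defined earlier in this post.

```r
n_laps <- 70

# Tyre time (s) for a soft to medium one stop, new tyre fitted from lap `pit_lap`
tyre_cost <- function(pit_lap) {
  soft_age   <- seq_len(pit_lap - 1)
  medium_age <- seq_len(n_laps - pit_lap + 1)
  sum(0.2 * 1.1^(soft_age - 1)) + sum(0.75 + 0.08 * 1.08^(medium_age - 1))
}

cost   <- sapply(1:n_laps, tyre_cost)
delta  <- cost - min(cost)        # seconds lost versus the best pit lap
window <- which(delta < 3)        # pit laps within 3 s of the optimum
range(window)                     # first and last lap of the pit window
```

Because `cost` is indexed by pit lap, `which()` returns the laps directly; widening or narrowing the threshold shows how sensitive the window is to the tolerance chosen.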

Two stop or one stop?

Finally, a one stop is not the only permitted strategy; in reality you could do an 8 stop strategy if you wanted. The other common strategy, though, is a 2 stop. Below is the code that identifies the fastest 2 stop strategy and then puts that together with the fastest one stop strategy. I then calculate the difference between the 2 strategies through the whole race.

```{r}
# Candidate first-stop laps
ter2 <- tibble(ter = 1:70)

# Every (stop1, stop2) pair with 2 <= stop1 < stop2 <= 70
ter3 <- ter2 %>%
  mutate(stp2 = 70 - ter) %>%
  uncount(stp2) %>%                        # one row per possible second stop
  group_by(ter) %>%
  mutate(stop2 = ter + row_number()) %>%
  ungroup() %>%
  filter(ter > 1) %>%
  mutate(strt = row_number()) %>%          # strategy id
  select(strt, stop1 = ter, stop2)

ter4 <- ter3 %>%
  pivot_longer(cols = c(stop1, stop2), names_to = "stop", values_to = "lap")

# Expand every option to the full 70 laps
ter7 <- tibble(strt = seq_len(nrow(ter3))) %>%
  mutate(n = 70) %>%
  uncount(n) %>%
  group_by(strt) %>%
  mutate(lap = row_number())

# Total race time (s) for each option, running soft, soft, medium
ter8 <- ter7 %>%
  left_join(ter4, by = c("lap", "strt")) %>%
  mutate(strat = if_else(lap == 1, "start", stop)) %>%
  mutate(lap_tyre = if_else(strat == "start", "soft",
                            if_else(strat == "stop1", "soft", "medium"))) %>%
  group_by(strt) %>%
  fill(lap_tyre, .direction = "down") %>%
  fill(strat, .direction = "down") %>%
  group_by(strt, strat) %>%
  mutate(stintl = row_number()) %>%        # tyre age within the stint
  left_join(tyre_stats, by = "lap_tyre") %>%
  mutate(deg  = dr * (1 + dc)^(stintl - 1),
         lapt = start_laptime - fueldeg * (lap - 1) + ds + deg) %>%
  group_by(strt) %>%
  summarise(tott = sum(lapt))
```

There were 2346 options for 2 stop strategies, and this code calculates the total race time for all of them. It found the fastest was to stop on laps 22 and 43.
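Both of those figures can be checked with a compact base R sketch that enumerates the stop pairs with `combn()` instead of `uncount()`. It assumes the same soft, soft, medium sequence and the tyre parameters defined earlier; `stint_cost` is a hypothetical helper, and the fuel effect and pit loss are left out because they are identical for every two stop option.

```r
n_laps <- 70

# Tyre time (s) for a stint of `len` laps on a compound with the given parameters
stint_cost <- function(len, ds, dr, dc) {
  sum(ds + dr * (1 + dc)^(seq_len(len) - 1))
}

# Every pair of stop laps with 2 <= stop1 < stop2 <= 70, one column per option
stops <- combn(2:n_laps, 2)

# Soft, soft, medium: each new set is fitted from the stop lap onwards
two_stop_cost <- apply(stops, 2, function(s) {
  stint_cost(s[1] - 1,          0,    0.2,  0.1) +
  stint_cost(s[2] - s[1],       0,    0.2,  0.1) +
  stint_cost(n_laps - s[2] + 1, 0.75, 0.08, 0.08)
})

ncol(stops)                          # number of two stop options
stops[, which.min(two_stop_cost)]    # stop laps of the fastest option
```

The count matches `choose(69, 2)`, and the fastest pair splits the soft running into two equal 21 lap stints before a 28 lap medium stint.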

```{r}
# Lap-by-lap times for the best one stop (soft to medium, stopping on lap 29)
best_one <- recs3 %>%
  left_join(f1_stint, by = c("recs", "laps")) %>%
  group_by(recs) %>%
  fill(ntyre, .direction = "down") %>%
  mutate(lap_tyre = if_else(is.na(ntyre), tyre.x, ntyre)) %>%
  group_by(recs, lap_tyre) %>%
  mutate(tyre_age = row_number()) %>%
  left_join(tyre_stats, by = "lap_tyre") %>%
  mutate(deg = dr * (1 + dc)^(tyre_age - 1),
         dl  = lapt + deg + ds) %>%
  ungroup() %>%
  filter(recs == 29) %>%
  mutate(lap_one = if_else(is.na(tyre.y), dl, dl + 25)) %>%   # 25 s pit loss on the stop lap
  select(lap = laps, lap_one)

# ter9 is not defined in this post; it is presumably the lap-by-lap times of the
# fastest two stop strategy, with columns lap and lap2
strat_comp <- ter9 %>%
  left_join(best_one, by = "lap") %>%
  mutate(delta   = lap2 - lap_one,       # positive when the two stop lap is slower
         cumdiff = cumsum(delta))        # running gap between the strategies (s)

ggplot(strat_comp, aes(x = lap, y = cumdiff)) +
  geom_line(col = "#6d009c", linewidth = 2) +   # linewidth replaces size for lines in ggplot2 3.4.0+
  labs(x = "Lap", y = "Gap (s)", title = "Gap Between 1 and 2 Stop Strategy") +
  theme(panel.background = element_rect(fill = "#c9c9c9"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(colour = "#9c9a9a"))
```

At the start they are both on the same time, then the 2 stop strategy pits first and there is a big lead for the one stop. This gets chipped away until the one stop pits. The gap is pretty even until the two stop pits again, and then the gap slowly drops off until the 2 stop overtakes 4 laps from the end and wins by 4 seconds. This is very much a perfect scenario and the difference is only 4 seconds. It takes no account of any traffic or how difficult it might be to overtake.

Hope you enjoyed this initial exploration of the fundamentals of Formula 1 strategy.
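For anyone wanting to reproduce the shape of that gap without the intermediate tables, here is a self-contained base R sketch. It rests on several assumptions of mine: the one stop pits on lap 29 and the two stop on laps 22 and 43, a 25 second pit loss is added on each stop lap, and the helper `lap_times` is hypothetical rather than the code used for the plot.

```r
n_laps        <- 70
start_laptime <- 95
fueldeg       <- 0.06
pit_loss      <- 25     # assumed time lost per stop (s)

# Hypothetical helper: per-lap times for a strategy given stop laps and compounds
lap_times <- function(stop_laps, compounds) {
  ds <- c(soft = 0,   medium = 0.75, hard = 1.5)
  dr <- c(soft = 0.2, medium = 0.08, hard = 0.06)
  dc <- c(soft = 0.1, medium = 0.08, hard = 0.07)
  lap   <- 1:n_laps
  stint <- findInterval(lap, stop_laps) + 1       # stint number for each lap
  tyre  <- compounds[stint]
  age   <- ave(lap, stint, FUN = seq_along)       # tyre age within the stint
  times <- start_laptime - (lap - 1) * fueldeg +
    ds[tyre] + dr[tyre] * (1 + dc[tyre])^(age - 1)
  on_stop_lap <- lap %in% stop_laps
  times[on_stop_lap] <- times[on_stop_lap] + pit_loss
  unname(times)
}

one_stop <- lap_times(29, c("soft", "medium"))
two_stop <- lap_times(c(22, 43), c("soft", "soft", "medium"))

gap <- cumsum(two_stop - one_stop)   # positive: the two stop is behind on track
tail(gap, 1)                         # gap at the flag
```

Under these assumptions the gap jumps in favour of the one stop at the first stop and ends a few seconds in favour of the two stop, which is the same narrow margin described above.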
