(This article was first published on **Rstats – quantixed**, and kindly contributed to R-bloggers)

I’ve previously crunched times for local Half and Full Marathons here on *quantixed*. Last weekend was the Kenilworth Half Marathon (2018) over a new course. I thought I’d have a look at the distributions of times and paces of the runners. The times are available here. If the Time and Category columns for the finishers are saved as a CSV file, the script below will generate the plots.
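
As a rough sketch of the input the script expects, here is a tiny stand-in for the results file. Only the column names (`Time`, `Category`) and the two winning times come from this post; the other rows and the name `df_demo` are made up for illustration, and the `tz = "UTC"` argument is my own addition.

```r
# Hypothetical rows for illustration; only the column names and the two
# winning times quoted in the post come from the source.
csv_text <- "Time,Category
01:12:51,MV35
01:23:46,F sen
01:45:10,M sen
02:05:33,FV45"

df_demo <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Parse the HH:MM:SS strings. strptime() attaches today's date, which is
# harmless because only the time of day is plotted. tz = "UTC" is an
# assumption added here to sidestep daylight-saving gaps.
df_demo$Time <- as.POSIXct(strptime(df_demo$Time, format = "%H:%M:%S", tz = "UTC"))

str(df_demo)
```

Any file with those two columns in that format should load the same way with `read.csv(file_name)`.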

Aggregated stats for the race are here. The beeswarm plot nicely shows the distribution of runners' times and paces per category. There’s a bimodality to some of the age groups, which is interesting. You can see from the average times that people get slower as they get older, as expected.
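
If you want the averages as numbers rather than as points on a plot, something like the following should work. This is a minimal sketch on made-up finish times in seconds (the data frame `demo` and its values are hypothetical, not the race results), using base R's `aggregate()`.

```r
# Hypothetical finish times in seconds, not the real race data
demo <- data.frame(
  Category = c("M sen", "M sen", "MV45", "MV45", "F sen", "F sen"),
  Seconds  = c(5400, 6000, 6300, 6900, 5700, 6600),
  stringsAsFactors = FALSE
)

# Mean finish time per category
mean_by_cat <- aggregate(Seconds ~ Category, data = demo, FUN = mean)

# Convert the mean seconds back to an HH:MM:SS label for reading
mean_by_cat$MeanTime <- format(
  as.POSIXct(mean_by_cat$Seconds, origin = "1970-01-01", tz = "UTC"),
  "%H:%M:%S"
)

print(mean_by_cat)
```

With the real data, the same call on a seconds column would give one mean per age category, which is what the summary points in the plots show.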

There was a roughly 2:1 split of M:F runners, with a similar proportion in all categories. The ratio is similar for DNSers (those who did not start). The winners were Andrew Savery of Leamington C A & C (MV35) in 01:12:51 and Polly Keen of Nuneaton Harriers (F sen) in 01:23:46.
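
The M:F split can be checked with a couple of tables. The sketch below uses invented category labels chosen to give a 2:1 split; it assumes, as the script does, that female categories start with "F", and it further assumes that stripping the leading M or F leaves the age band, which may not hold for every race's category naming.

```r
# Hypothetical category labels, picked so that each age band is 2:1 M:F
cats <- c("M sen", "M sen", "F sen",
          "MV35", "MV35", "FV35",
          "MV45", "MV45", "FV45")

# Same rule as the script: categories starting with "F" are female
gender <- ifelse(startsWith(cats, "F"), "F", "M")

# Assumption: dropping the leading M/F (and any space) leaves the age band
age_band <- sub("^[MF]\\s*", "", cats)

# Overall M:F ratio
mf_ratio <- sum(gender == "M") / sum(gender == "F")

# Proportion of each gender within each age band (rows sum to 1)
by_band <- prop.table(table(age_band, gender), margin = 1)

print(mf_ratio)
print(round(by_band, 2))
```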

Congrats to everyone who ran and thanks to the organisers and all the supporters out on the course.

    library(ggplot2)
    library(ggbeeswarm)

    # pick the csv of results; it needs Time (HH:MM:SS) and Category columns
    file_name <- file.choose()
    df1 <- read.csv(file_name, header = TRUE, stringsAsFactors = FALSE)
    # aggregate M and F to a new category called Gender
    df1$Gender <- ifelse(startsWith(df1$Category, "F"), "F", "M")
    # format Time column to POSIXct
    df1$Time <- as.POSIXct(strptime(df1$Time, format = "%H:%M:%S"))
    # midnight today, used as the zero point for elapsed times
    orig_var <- as.POSIXct("00:00:00", format = "%H:%M:%S")

    # finishing time by category
    # (fun replaces fun.y, which is deprecated from ggplot2 3.3.0)
    p1 <- ggplot(data = df1, aes(x = Category, y = Time, color = Category)) +
      geom_quasirandom(alpha = 0.5, stroke = 0) +
      stat_summary(fun = mean, geom = "point", size = 2, aes(group = 1)) +
      scale_y_datetime(date_labels = "%H:%M:%S", limits = c(orig_var, NA))
    p1

    # instead of finishing time, let's look at pace (min/km)
    # seconds per km, stored as an offset from midnight so it plots on a datetime axis
    pace_secs <- as.numeric(difftime(df1$Time, orig_var, units = "secs")) / 21.1
    df1$Pace <- orig_var + pace_secs
    p2 <- ggplot(data = df1, aes(x = Category, y = Pace, color = Category)) +
      geom_quasirandom(alpha = 0.5, stroke = 0) +
      stat_summary(fun = mean, geom = "point", size = 2, aes(group = 1)) +
      scale_y_datetime(date_labels = "%M:%S", limits = c(orig_var, NA))
    p2

    # calculate speeds rather than pace (units fixed to hours so the result is km/h)
    df1$Speed <- 21.1 / as.numeric(difftime(df1$Time, orig_var, units = "hours"))
    p3 <- ggplot(data = df1, aes(x = Category, y = Speed, color = Category)) +
      geom_quasirandom(alpha = 0.5, stroke = 0) +
      stat_summary(fun = mean, geom = "point", size = 2, aes(group = 1)) +
      ylim(0, NA) +
      ylab("Speed (km/h)")
    p3

    # now make the same plots but by Gender rather than Category
    p4 <- ggplot(data = df1, aes(x = Gender, y = Time, color = Gender)) +
      geom_quasirandom(alpha = 0.5, stroke = 0) +
      stat_summary(fun = mean, geom = "point", size = 2, aes(group = 1)) +
      scale_y_datetime(date_labels = "%H:%M:%S", limits = c(orig_var, NA))
    p4
    p5 <- ggplot(data = df1, aes(x = Gender, y = Pace, color = Gender)) +
      geom_quasirandom(alpha = 0.5, stroke = 0) +
      stat_summary(fun = mean, geom = "point", size = 2, aes(group = 1)) +
      scale_y_datetime(date_labels = "%M:%S", limits = c(orig_var, NA))
    p5
    p6 <- ggplot(data = df1, aes(x = Gender, y = Speed, color = Gender)) +
      geom_quasirandom(alpha = 0.5, stroke = 0) +
      stat_summary(fun = mean, geom = "point", size = 2, aes(group = 1)) +
      ylim(0, NA) +
      ylab("Speed (km/h)")
    p6

    # save all six plots
    ggsave("raceTimeByCat.png", plot = p1)
    ggsave("racePaceByCat.png", plot = p2)
    ggsave("raceSpeedByCat.png", plot = p3)
    ggsave("raceTimeByGen.png", plot = p4)
    ggsave("racePaceByGen.png", plot = p5)
    ggsave("raceSpeedByGen.png", plot = p6)
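
To make the pace and speed conversions concrete, here is the arithmetic worked through for the men's winning time of 01:12:51, using the same 21.1 km distance as the script. The helper `hms_to_secs()` and the variable names are hypothetical and exist only for this example.

```r
distance_km <- 21.1  # distance used in the script

# Hypothetical helper: convert "HH:MM:SS" to seconds
hms_to_secs <- function(x) {
  parts <- as.numeric(strsplit(x, ":", fixed = TRUE)[[1]])
  sum(parts * c(3600, 60, 1))
}

secs <- hms_to_secs("01:12:51")           # 4371 seconds

# Pace: seconds per km, shown as M:SS (seconds truncated, not rounded)
pace_secs  <- secs / distance_km
pace_label <- sprintf("%d:%02d",
                      as.integer(pace_secs %/% 60),
                      as.integer(floor(pace_secs %% 60)))

# Speed: km per hour
speed_kmh <- distance_km / (secs / 3600)

pace_label                                # "3:27" per km
round(speed_kmh, 1)                       # 17.4 km/h
```

Pace and speed are reciprocals of each other, so the pace and speed plots carry the same information on different scales.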

**Edit 2018-09-12T18:52:43Z** I wasn’t happy with the plots, so I added a few more lines to look at gender as well as category, and to show speed as well as pace and finishing time.

—

The post title is taken from “Pledging My Time”, a track from Blonde on Blonde by Bob Dylan.
