R: Analysis of a Sport Event

[This article was first published on Statistik Stuttgart » R, and kindly contributed to R-bloggers.]

Inspired by a post by an R-blogger, I became curious about the races at my athletic club. So I started R and analysed the LAC Degerloch Volkslauf 2010, a 10 km race near Stuttgart-Hoffeld. In the following lines, I present this statistical examination. The data can be found at: data.

First, I converted the data file into a CSV file and adapted an R script for reading and converting the data:

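library(ggplot2)
# Read the semicolon-separated results file
results <- read.csv("Volkslauf10km.csv", sep = ";")
# Convert a time string of the form hours:minutes:seconds into minutes
FUN <- function(s) {
    sum(as.integer(strsplit(s, ':')[[1]]) * c(60, 1, 1/60))
}
results$Minuten <- sapply(as.character(results$Ergebnis), FUN)
# Age classes (AK) containing "W" are women; all others are counted as men
results$Geschlecht <- "Männer"
results$Geschlecht[grep("W", results$AK)] <- "Frauen"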
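
As a quick check of the time conversion, the same calculation can be tried on a few hand-made strings. This is only a minimal sketch: it assumes the times have the form hours:minutes:seconds, and the name to_minutes and the sample values are made up for illustration, not taken from the race data.

```r
# Convert "h:mm:ss" strings to minutes (assumed format: hours:minutes:seconds).
to_minutes <- function(s) {
  parts <- as.integer(strsplit(s, ":", fixed = TRUE)[[1]])
  sum(parts * c(60, 1, 1 / 60))
}

# Hypothetical finishing times, not taken from the race data.
examples <- c("0:45:30", "1:02:15")
minutes <- vapply(examples, to_minutes, numeric(1), USE.NAMES = FALSE)
print(minutes)
```

The two sample strings come out as 45.5 and 62.25 minutes, which matches a calculation by hand.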

Next, I split the runners into men and women and plotted the year of birth against the finishing time.

# Year of birth (Jhg) against finishing time, one panel per gender
ggplot(results, aes(Jhg, Minuten)) + theme_bw() +
    geom_point() + facet_wrap(~ Geschlecht) +
    geom_smooth() + xlab("Jahrgang") +
    ylab("Minuten")

From this plot, one can suppose that men are faster on average than women and that age may have an influence. So we additionally assign the runners to decades of birth.
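
The code for the decade assignment is not shown in the post, so the following is only a sketch of one way it could be done. It assumes that the year of birth is a four-digit number; the vector jhg, its values and the label format are hypothetical.

```r
# Hypothetical years of birth; in the post these would come from results$Jhg.
jhg <- c(1938, 1955, 1962, 1971, 1979, 1984)

# Integer division by 10 drops the last digit, e.g. 1971 -> 197 -> 1970.
decade_start <- (jhg %/% 10) * 10

# Label the decades and keep them in chronological order for faceting.
dekaden <- factor(paste0(decade_start, "s"),
                  levels = paste0(sort(unique(decade_start)), "s"))
print(table(dekaden))
```

With the race data, the same idea could be applied to the Jhg column and the result stored as results$Dekaden, again assuming that Jhg holds four-digit years of birth.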

# Histogram of finishing times; assumes results has a Dekaden column (decade of birth)
ggplot(results, aes(Minuten)) + theme_bw() +
    geom_histogram(binwidth = 2) +
    facet_grid(Geschlecht ~ Dekaden) +
    ylab("Anzahl") + xlab("Minuten")

It seems that men are on average faster than women. Another interesting point is that runners born in the 1970s are underrepresented compared with those born in the 1980s and 1960s. Maybe this effect has career or family reasons.
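
How strongly each decade is represented can also be read from a simple cross-table instead of from the histograms. The sketch below uses a small made-up data frame with the same column names as above; the values are hypothetical and not the race results.

```r
# Hypothetical toy data with the same column names as in the post.
toy <- data.frame(
  Geschlecht = c("Männer", "Männer", "Frauen", "Männer", "Frauen", "Männer"),
  Dekaden    = c("1960s", "1970s", "1980s", "1980s", "1960s", "1960s")
)

# Number of runners per gender (rows) and decade of birth (columns).
counts <- table(toy$Geschlecht, toy$Dekaden)
print(counts)

# Total number of runners per decade.
print(colSums(counts))
```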

In the following lines, I compare the performance of the different decades.

# Same histogram with decades as rows, so the decades can be compared within each gender
ggplot(results, aes(Minuten)) + theme_bw() +
    geom_histogram(binwidth = 2) +
    facet_grid(Dekaden ~ Geschlecht) +
    ylab("Anzahl") + xlab("Minuten")

It looks as if there is a stronger drop in performance from the 1960s to the 1930s cohorts than among the younger decades. It seems that most runners maintain their performance until they pass the age of 50.
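
The visual impression could be backed up with a number per group, for example the median finishing time per decade and gender. The following is a minimal sketch on made-up values, not the race results; it assumes columns named as above.

```r
# Hypothetical toy data; the real analysis would use the results data frame.
toy <- data.frame(
  Geschlecht = c("Männer", "Männer", "Männer", "Frauen", "Frauen", "Frauen"),
  Dekaden    = c("1960s", "1960s", "1940s", "1960s", "1940s", "1940s"),
  Minuten    = c(48, 52, 61, 55, 66, 70)
)

# Median finishing time for every combination of decade and gender.
medians <- aggregate(Minuten ~ Dekaden + Geschlecht, data = toy, FUN = median)
print(medians)
```

The median is used here rather than the mean because a few very slow finishers would otherwise pull the value for a small group upwards.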
