Statistical Analysis of the LAC Degerloch Volkslauf 2010

June 15, 2011

(This article was first published on this and that about applied statistics » R, and kindly contributed to R-bloggers)

Inspired by a post by a fellow R-blogger, I became curious about the races held by my athletics club. So I started R and analysed the LAC Degerloch Volkslauf 2010, a 10 km race near Stuttgart-Hoffeld. Below I present this statistical examination. The data can be found at: data.

First, I converted the data file into a CSV file and adapted an R script for reading and converting the data:

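results <- read.csv("Volkslauf10km.csv", sep = ";")
# Convert a time string to minutes. The original function body was not preserved; this version ASSUMES "mm:ss" or "h:mm:ss".
FUN <- function(s) {
  p <- rev(as.numeric(strsplit(s, ":")[[1]]))  # seconds, minutes, (hours)
  sum(p * c(1 / 60, 1, 60)[seq_along(p)])
}
results$Minuten <- sapply(as.character(results$Ergebnis), FUN)
# Default to "Männer"; age classes (AK) containing "W" are women
results$Geschlecht <- "Männer"
results$Geschlecht[grep("W", results$AK)] <- "Frauen"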

Next, I split the runners into men and women and plotted the finishing time against the year of birth.

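library(ggplot2)
# Finishing time against year of birth, one panel per gender.
# Each "+" must end a line; a leading "+" would start a new expression.
ggplot(results, aes(Jhg, Minuten)) + theme_bw() +
  geom_point() + facet_wrap(~ Geschlecht) +
  geom_smooth() + xlab("Jahrgang") +
  ylab("Minuten")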


The plot above suggests that men are faster on average than women and that age may have an influence. We therefore additionally assign the runners to decades of birth.
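
The code that creates the decade column is not shown in this post. A minimal sketch of one way to do it, assuming Jhg holds four-digit birth years (the toy data frame below is hypothetical and only stands in for the race results):

```r
# Hypothetical toy data; in the article, Jhg comes from the CSV file
toy <- data.frame(Jhg = c(1938, 1964, 1971, 1985, 1990))

# Integer division by 10 drops the last digit, multiplying by 10 restores
# the decade (1964 -> 1960). A factor keeps the facets in ascending order.
toy$Dekaden <- factor(toy$Jhg %/% 10 * 10)

print(toy)
```

Applied to the race data, the same expression would be used on results$Jhg, again under the assumption that the birth years are stored with four digits.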

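# Histogram of finishing times in 2-minute bins; rows: gender, columns: decade.
# Assumes results already has a column Dekaden with each runner's decade of birth.
ggplot(results, aes(Minuten)) + theme_bw() +
  geom_histogram(binwidth = 2) +
  facet_grid(Geschlecht ~ Dekaden) +
  ylab("Anzahl") + xlab("Minuten")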

Men again appear to be faster on average than women. Another interesting observation is that runners born in the 1970s are under-represented compared with those born in the 1980s and 1960s. Career or family commitments may be the reason.
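
The representation of each decade can also be read from a simple cross-tabulation rather than from the histograms. A small sketch with hypothetical toy data (the column names follow the article, the values are invented for illustration):

```r
# Hypothetical toy data, not the actual race results
toy <- data.frame(
  Dekaden    = c(1960, 1960, 1970, 1980, 1980, 1980),
  Geschlecht = c("Männer", "Frauen", "Männer", "Männer", "Männer", "Frauen")
)

# Number of runners per gender (rows) and decade of birth (columns)
counts <- table(toy$Geschlecht, toy$Dekaden)
print(counts)
```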

Next, I compare the performance of the different decades.

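# Same histogram with the facets transposed; rows: decade, columns: gender.
# Assumes results already has a column Dekaden with each runner's decade of birth.
ggplot(results, aes(Minuten)) + theme_bw() +
  geom_histogram(binwidth = 2) +
  facet_grid(Dekaden ~ Geschlecht) +
  ylab("Anzahl") + xlab("Minuten")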

It looks as though performance drops more sharply from the 1960s back to the 1930s birth cohorts than across the younger cohorts. Most runners seem to hold their performance until they pass the age of 50.
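
The visual impression could be backed up with a summary number per group. A minimal sketch, assuming the median finishing time is a reasonable measure of typical performance; the toy values are hypothetical and not taken from the race:

```r
# Hypothetical toy data, not the actual race results
toy <- data.frame(
  Minuten    = c(42, 44, 50, 52, 61, 47, 55, 66),
  Geschlecht = c("Männer", "Männer", "Männer", "Männer", "Männer",
                 "Frauen", "Frauen", "Frauen"),
  Dekaden    = c(1980, 1960, 1950, 1950, 1930, 1980, 1960, 1940)
)

# Median finishing time for every combination of decade and gender
med <- aggregate(Minuten ~ Dekaden + Geschlecht, data = toy, FUN = median)
print(med)
```

The median is used here instead of the mean because a few very slow finishers would otherwise dominate the small groups of older runners.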
