# R: Analysis of a Sport Event

June 15, 2011

(This article was first published on Statistik Stuttgart » R, and kindly contributed to R-bloggers)

Inspired by a post from another R-blogger, I became curious about the races held by my athletics club. So I started R and analysed the LAC Degerloch Volkslauf 2010, a 10 km race near Stuttgart-Hoffeld. The following lines present this statistical examination. The data can be found at: data.

First, I converted the data file into a CSV file and adapted an R script for reading and converting the data:

```r
library(ggplot2)
# Read the semicolon-separated results file
results <- read.csv("Volkslauf10km.csv", sep = ";")
# Convert a result string (hours:minutes:seconds) into minutes
FUN <- function(s) {
  sum(as.integer(strsplit(s, ":")[[1]]) * c(60, 1, 1/60))
}
results$Minuten <- sapply(as.character(results$Ergebnis), FUN)
# Default to "Männer" (men); age classes (AK) containing "W" are "Frauen" (women)
results$Geschlecht <- "Männer"
results$Geschlecht[grep("W", results$AK)] <- "Frauen"
```
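To make the conversion step concrete, here is a minimal sketch of how a single result string becomes a number of minutes. It assumes the results are formatted as `H:MM:SS`, which the weights `c(60, 1, 1/60)` in the script suggest. The function name `to_minutes` and the sample strings are made up for illustration and are not taken from the race data.

```r
# Convert an "H:MM:SS" string to minutes (the format is an assumption).
to_minutes <- function(s) {
  parts <- as.integer(strsplit(s, ":", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3)  # expect hours, minutes, seconds
  sum(parts * c(60, 1, 1 / 60))
}

# Hypothetical sample values, not real race results
times <- c("0:45:30", "1:02:15")
minutes <- vapply(times, to_minutes, numeric(1), USE.NAMES = FALSE)
print(minutes)
```

Using `vapply` instead of `sapply` is a small design choice: it fails loudly if a string does not yield exactly one numeric value.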

Next, I split the runners into men and women and plotted the year of birth against the finishing time.

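```r
# Year of birth (Jhg) against time in minutes, one panel per gender.
# A trailing "+" is needed so that R continues the expression on the next line.
ggplot(results, aes(Jhg, Minuten)) + theme_bw() +
  geom_point() + facet_wrap(~ Geschlecht) +
  geom_smooth() + xlab("Jahrgang") +
  ylab("Minuten")
```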

The plot suggests that men are on average faster than women and that age may have an influence. So we additionally assign the runners to decades of birth.
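The assignment to decades is not shown in the script. One possible way to do it, sketched here under assumptions, is integer division of the birth year. The column name `Jahrzehnt` (decade) and the toy data frame `runners` are hypothetical; only `Jhg` and `Geschlecht` come from the script above.

```r
# Toy data using the script's column names; all values are invented.
runners <- data.frame(
  Jhg        = c(1938, 1964, 1969, 1975, 1982, 1988),
  Geschlecht = c("Männer", "Frauen", "Männer", "Männer", "Frauen", "Männer")
)

# Integer division by 10 drops the last digit, e.g. 1964 becomes 1960
runners$Jahrzehnt <- (runners$Jhg %/% 10) * 10

# Number of runners per decade and gender
counts <- table(runners$Jahrzehnt, runners$Geschlecht)
print(counts)
```

A table like `counts` shows at a glance how strongly each decade is represented.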

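```r
# Histogram of finishing times in 2-minute bins ("Anzahl" = count).
# The column is called Minuten, as created in the first script.
ggplot(results, aes(Minuten)) + theme_bw() +
  geom_histogram(binwidth = 2) +
  ylab("Anzahl") + xlab("Minuten")
```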

It seems that men are on average faster than women. Another interesting point is that runners born in the 1970s are under-represented compared with those born in the 1980s and 1960s. This may be due to career or family reasons.

In the following lines I compare the performance of the different decades.

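```r
# Histogram of finishing times in 2-minute bins ("Anzahl" = count).
# The column is called Minuten, as created in the first script.
ggplot(results, aes(Minuten)) + theme_bw() +
  geom_histogram(binwidth = 2) +
  ylab("Anzahl") + xlab("Minuten")
```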

There appears to be a stronger drop in performance from the 1960s to the 1930s cohorts than between the younger cohorts. It seems that most runners maintain their performance until they pass the age of 50.
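A comparison like this could also be checked numerically. The following is only a sketch with invented values, not the race data: it computes the median time per decade of birth and the change from each decade to the next-older one. The names `runners`, `Jahrzehnt` and `change` are hypothetical.

```r
# Toy data using the script's column names; all values are invented.
runners <- data.frame(
  Jhg     = c(1935, 1948, 1955, 1962, 1967, 1974, 1981, 1986),
  Minuten = c(68.0, 61.5, 55.0, 50.5, 49.5, 48.0, 47.0, 49.0)
)
runners$Jahrzehnt <- (runners$Jhg %/% 10) * 10

# Median finishing time per decade of birth
med <- aggregate(Minuten ~ Jahrzehnt, data = runners, FUN = median)

# Order from youngest to oldest cohort, then take the difference
# between each decade and the next-younger one
med <- med[order(med$Jahrzehnt, decreasing = TRUE), ]
med$change <- c(NA, diff(med$Minuten))
print(med)
```

The median is used here rather than the mean so that a few very slow or very fast runners do not dominate a small decade group.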
