Better modelling and visualisation of newspaper count data

February 19, 2013
In this post I outline how count data may be modelled using a negative binomial distribution in order to more accurately present trends in time series count data than using linear methods. I also show how to...

Predictors, responses and residuals: What really needs to be normally distributed?

February 18, 2013
Introduction Many scientists are concerned about normality or non-normality of variables in statistical analyses. The following and similar sentiments are often expressed, published or taught: "If you want to do statistics, then everything needs to be normally distributed." "We normalized…

Veterinary Epidemiologic Research: Linear Regression

February 14, 2013
This post describes linear regression as presented in the book Veterinary Epidemiologic Research, working through the book's examples in R. Regression analysis is used for modeling the relationship between a single variable Y (the outcome, or dependent variable), measured on a continuous or near-continuous scale, and one or more predictors (independent or explanatory variables), X. If

Taking Expectations to the Next Level

January 31, 2013
Higher Expectations I came across this post on Thursday and found it to be quite interesting. Clearly rental prices vary according to where you live. That isn't too surprising. I started thinking a bit more about it and thought that Boston and the nearby communities would have to...

January 30, 2013
A Problem A major problem in secondary data analysis is that you didn't get to decide what data was collected. Let's say you were interested in how many times a student has read the Twilight books. Specifically, you want to know how effective the ads for...

The "golden age" of a football player

January 28, 2013
It's been some time since my last post on football. And we're talking about European soccer here. I finally managed to write some functions that allow me to extract player stats from www.transfermarkt.de. The site tracks lots of stats in the world of soccer. For each player, there is information about the dominant foot, height, age, the estimated...

Data science = failure of imagination

January 8, 2013
From: http://www.r-bloggers.com/data-driven-science-is-a-failure-of-imagination/ I think I like this distinction between Bayesian and Frequentist statistics: "we are nearly always ultimately curious about the Bayesian probability of the hypothesis ...

Generation of E-Learning Exams in R for Moodle, OLAT, etc.

December 20, 2012
(Guest post by Achim Zeileis) Development of the R package exams for automatic generation of (statistical) exams in R started in 2006 and version 1 was published in JSS by Grün and Zeileis (2009). It was based on standalone Sweave exercises, that can be combined …

Matrix Algebra Useful for Statistics

December 16, 2012
I was having a conversation with an acquaintance about courses that were particularly useful in our work. My forestry degree involved completing 50 compulsory + 10 elective† courses; if I had to choose courses that were influential and/or really useful they would be Operations Research, Economic Evaluation of Projects, Ecology, 3 Calculus and 2 Algebras.