Here's a simple way to make a bar plot with three kinds of error bars: standard deviation, standard error of the mean, and a 95% confidence interval. The key step is to precalculate the statistics before passing them to ggplot2. Continue reading →

I recently enjoyed reading O’Hara, R. B., & Kotze, D. J. (2010). Do not log-transform count data. Methods in Ecology and Evolution, 1(2), 118–122. doi:10.1111/j.2041-210X.2010.00021.x. The article prompted me to think about processes involving discrete events and how these might be presented graphically. I am not talking about counts (which are well represented by a

Boris Chen, a data scientist for the New York Times, has been running a weekly blog since August with statistical analysis of NFL players, as fodder for fantasy football players around the country. Here's how he describes what he does: "My model pulls aggregated expert rankings from fantasypros, and I pass that data into a machine learning clustering algorithm..."

Color is often used to display an extra dimension in plots of scientific data. Unfortunately, not everyone decodes color in exactly the same way. This is especially true for those with color vision deficiency, which affects up to 8 percent of the population in its two most common forms. As a result, it has been estimated that the...

Over the last couple of days I have read several times about the stabilization of house prices, which had been dropping due to the crisis. You get hit by numbers such as the change against Q2 2013 or Q3 2012, accompanied by reasons why this or that quarter may be special, so the changes may be off. To be...

I’ve made quite a few blog posts about neural networks and some of the diagnostic tools that can be used to ‘demystify’ the information contained in these models. Frankly, I’m kind of sick of writing about neural networks but I wanted to share one last tool I’ve implemented in R. I’m a strong believer that

Get data that fit before you fit data. Why verify? Garbage in, garbage out. How to verify: The example data used here is daily (adjusted) prices of stocks. By some magic that I’m yet to fathom, market data can be wondrously wrong even without the benefit of the possibility of transcription errors. It doesn’t seem … Continue reading...
