# stats

### The unicorn problem

October 13, 2012

Let’s say your goal is to observe all known species in a particular biological category. Once a week you go out and collect specimens to identify, or maybe you just bring your binoculars to do some spotting. How long will it take you to cross off every species on ... [Read more...]

### New Zealand school performance: beyond the headlines

September 24, 2012

I like the idea of having data on school performance, not to directly rank schools (hard, to say the least, at this stage) but because we can start having a look at the factors influencing test results. I imagine the opportunity in ... [Read more...]

### Mid-August flotsam

August 20, 2012

Reached mid-semester point, with quite a few new lectures to prepare. Nothing extremely complicated but, as always, the tricky part is finding a way to make it meaningful and memorable. Sometimes, and this is one of those times, I sound ... [Read more...]

### Split-plot 1: How does a linear mixed model look like?

June 24, 2012

I like statistics and I struggle with statistics. I often get frustrated when I don’t understand, and I really struggled to make sense of Kruschke’s Bayesian analysis of a split-plot, particularly because ‘it didn’t look like’ a split-plot to ... [Read more...]

### R, Julia and genome wide selection

April 24, 2012

“You are a pussy,” emailed my friend. “Sensu cat?” I replied. “No. Sensu chicken,” blurbed my now ex-friend. What was this about? He read my post on R, Julia and the shiny new thing, which prompted him ... [Read more...]

### Recovering Marginal Effects and Standard Errors of Interactions Terms Pt. II: Implement and Visualize

March 9, 2012

In the last post I presented a function for recovering marginal effects of interaction terms. Here we implement the function with simulated data and plot the results using ggplot2. #---Simulate Data and Fit a linear model with an... [Read more...]

### Recovering Marginal Effects and Standard Errors from Interaction Terms in R

March 5, 2012

When I fit models with interactions, I often want to recover not only the interaction effect but also the marginal effect (the main effect + the interaction) and of course the standard errors. There are a couple of ways to do this in R but I ended writ... [Read more...]
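As a rough illustration of the idea in this excerpt, here is a minimal sketch of recovering a marginal effect and its standard error from a fitted model with an interaction. It is not the author's R function (which is not shown on this page); it is written in Python with NumPy, and the helper name `marginal_effect`, the simulated data, and the coefficient ordering are all assumptions made for the example. For a model y = b0 + b1·x + b2·z + b3·x·z, the marginal effect of x at a given z is b1 + b3·z, and its variance is Var(b1) + z²·Var(b3) + 2z·Cov(b1, b3).

```python
import numpy as np


def marginal_effect(beta, cov, i_main, i_inter, z):
    """Marginal effect of x at moderator value z, with its standard error.

    beta    : estimated coefficients
    cov     : estimated covariance matrix of beta
    i_main  : index of the main-effect coefficient of x
    i_inter : index of the x:z interaction coefficient
    (hypothetical helper, for illustration only)
    """
    effect = beta[i_main] + z * beta[i_inter]
    var = (cov[i_main, i_main]
           + z ** 2 * cov[i_inter, i_inter]
           + 2 * z * cov[i_main, i_inter])
    return effect, np.sqrt(var)


# Simulated data (assumed true values: b1 = 2.0, b3 = 1.5)
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n).astype(float)  # binary moderator
y = 1.0 + 2.0 * x + 0.5 * z + 1.5 * x * z + rng.normal(size=n)

# Ordinary least squares with columns: intercept, x, z, x*z
X = np.column_stack([np.ones(n), x, z, x * z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])   # residual variance
cov = sigma2 * np.linalg.inv(X.T @ X)       # covariance of the estimates

for z0 in (0.0, 1.0):
    eff, se = marginal_effect(beta, cov, i_main=1, i_inter=3, z=z0)
    print(f"z = {z0:.0f}: marginal effect of x = {eff:.3f} (SE {se:.3f})")
```

At z = 0 the marginal effect reduces to the main effect and its usual standard error; at other values of z the covariance term matters, which is why simply adding the two standard errors would be wrong.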

### Functional ANOVA using INLA

January 13, 2012

[Update alert: INLA author Håvard Rue found a problem with the code below. See here] Ramsay and Silverman’s Functional Data Analysis is a tremendously useful book that deserves to be more widely known. It’s full of ideas of neat things one can do when part of a ... [Read more...]

### Iowa: Was the fix in? (a statistical analysis of the results)

January 4, 2012

Summary/TL;DR: Either the first precincts to report were widely unrepresentative of Iowa as a whole, or something screwy happened. Background: Yesterday was the first primary for the 2012 U.S. presidential elections. When I logged off the internet last night, the results in Iowa showed a dead heat between ... [Read more...]

### Tall big data, wide big data

December 12, 2011

After attending two one-day workshops last week I spent most days paying attention to (well, at least listening to) presentations in this biostatistics conference. Most presenters were R users, although Genstat, Matlab and SAS fans were also present and not one ... [Read more...]

### R, academia and the democratization of statistics

December 12, 2011

I am not a statistician but I use statistics, teach some statistics and write about applications of statistics in biological problems. Last week I was in this biostatistics conference, talking with a Ph.D. student who was surprised about this situation ... [Read more...]

### My oh my

December 6, 2011

Noted without comment. Visit Biostatistics Ryan Gosling !!! for more gems like the one above. [Read more...]

### On the (statistical) road, workshops and R

December 3, 2011

Things have been a bit quiet at Quantum Forest during the last ten days. Last Monday (Sunday for most readers) I flew to Australia to attend a couple of one-day workshops; one on spatial analysis (in Sydney) and another one ... [Read more...]

### Do we need to deal with ‘big data’ in R?

November 22, 2011

David Smith at the Revolutions blog posted a nice presentation on “big data” (oh, how I dislike that term). It is a nice piece of work and the Revolution guys manage to process a large number of records, starting with ... [Read more...]

### Teaching with R: the tools

November 1, 2011

I bought an Android phone, nothing fancy, just my first foray into the smartphone world, which is a big change coming from the dumb phone world(*). Everything is different and I am back at being a newbie; this is what ... [Read more...]

### Power Tools for Aspiring Data Journalists: R

October 31, 2011

Picking up on Paul Bradshaw’s post A quick exercise for aspiring data journalists which hints at how you can use Google Spreadsheets to grab – and explore – a mortality dataset highlighted by Ben Goldacre in DIY statistical analysis: experience the thrill of touching real data, I thought I’d describe ... [Read more...]

### Covariance structures

October 26, 2011

In most mixed linear model packages (e.g. asreml, lme4, nlme) one needs to specify only the model equation (the bit that looks like y ~ factors...) when fitting simple models. We explicitly say nothing about the covariances that complete ... [Read more...]

### Queueing up in R, continued

October 20, 2011

Shown above is a queueing simulation. Each diamond represents a person. The vertical line up is the queue; at the bottom are 5 slots where the people are attended. The size of each diamond is proportional to the log of the time it will take them to be attended. Color is ... [Read more...]

### Maximum likelihood

October 13, 2011

This post is one of those ‘explain to myself how things work’ documents, which are not necessarily completely correct but are close enough to facilitate understanding. Background: Let’s assume that we are working with a fairly simple linear model, where ... [Read more...]

### Waiting in line, waiting on R

October 13, 2011

I should state right away that I know almost nothing about queuing theory. That’s one of the reasons I wanted to do some queuing simulations. Another reason: when I’m waiting in line at the bank, I tend to do mental calculations for how long it should take me ... [Read more...]