Blog Archives

Inter-ocular trauma test

November 17, 2016

I’ve recently been thinking about the role statistics can play in answering questions. I think it came up on the NSSD podcast a few weeks ago. Basically, problems can be divided into three classes: those that don’t need statistics because the answer is obvious (problems...


Using tidytext to make sentiment analysis easy

November 15, 2016

Last week I discovered the R package tidytext and its very nice e-book detailing usage. Julia Silge and David Robinson have significantly reduced the effort it takes for me to “grok” text mining by making it “tidy.” It certainly helped that a lot of the...


Easy Cross Validation in R with `modelr`

November 11, 2016

When estimating a model, the quality of the model fit will always be higher in-sample than out-of-sample. A model will always fit the data that it is trained on, warts and all, and may use those warts and statistical noise to make predictions. As...


Parallel Simulation of Heckman Selection Model

April 22, 2015

One of the, if not the, fundamental problems in observational data analysis is the estimation of the value of the unobserved choice. If the \(i^{\text{th}}\) unit chooses the value of \(t\) on the basis of some factors \(\mathbf{x_i}\), which may include...


The Problem with Propensity Scores

April 14, 2015

Are Propensity Scores Useful? Effect estimation for treatments using observational data isn't always straightforward. For example, it is very common that patients who are treated with a certain medication or procedure are healthier than those who are not treated. Those who aren't treated may not be...


Frequentist German Tank Problem

March 20, 2014

The German Tank Problem: The Frequentist Way. Many things are given a serial number, and often that serial number, logically, starts at 1 and increases by 1 for each new unit. For example, German tanks in World War II had several parts with serial numbers. By collecting...


Stop using bivariate correlations for variable selection

March 19, 2014

Something I've never understood is the widespread calculation and reporting of univariate and bivariate statistics in applied work, especially when it comes to model selection. Bivariate statistics are, at best, useless for multivariate model selection and, at worst, harmful. Since nearly all...


Bayesian Search Models

March 13, 2014

Bayesian Search Theory: The US had a pretty big problem on its hands in 1966. Two planes had hit each other during an in-flight refueling and crashed. Normally, this would be an unfortunate thing and terrible for the families of those involved in the crash but otherwise fairly limited...


Instrumental Variables Simulation

January 9, 2014

Instrumental variables are an incredibly powerful tool for dealing with unobserved heterogeneity within the context of regression, but the language used to define them is mind-bending. Typically, you hear something along the lines of “an instrumental variable is a variable that is correlated with x but uncorrelated with...


Penalizing P Values

November 19, 2013

Ioannidis' paper suggesting that most published results in medical research are not true is now high-profile enough that even my dad, an artist who wouldn't know a test statistic if it hit him in the face, knows about it. It has even...

