# Blog Archives

## Distribution of T-Scores

March 2, 2013

Like most of my posts, these code snippets derive from various other projects.  This example shows a simulation of how one can determine whether a set of t statistics is distributed properly.  This can be useful when sampling known populations (e.g. U.S. census or hospital populations) or populations that will soon be known
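A minimal sketch of this kind of check, assuming a normal population. The population mean, standard deviation, sample size, and number of replications below are made up for illustration and are not taken from the original project. It draws repeated samples, computes a one-sample t statistic for each, and compares the results with the theoretical t distribution.

```r
set.seed(1)

# Hypothetical settings; none of these values come from the original post.
n_sims <- 5000   # number of simulated samples
n      <- 20     # observations per sample
mu     <- 50     # known population mean
sigma  <- 10     # known population standard deviation

# One-sample t statistic for each simulated sample, using the known mean.
t_stats <- replicate(n_sims, {
  x <- rnorm(n, mean = mu, sd = sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))
})

# Kolmogorov-Smirnov comparison against a t distribution with n - 1 df.
ks <- ks.test(t_stats, "pt", df = n - 1)

# Share of statistics beyond the two-sided 5% critical value.
tail_share <- mean(abs(t_stats) > qt(0.975, df = n - 1))
```

If the t statistics are behaving, `tail_share` should sit close to 0.05 and the test in `ks` should not flag a departure from the reference distribution.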

## Bootstrap Confidence Intervals

February 1, 2013

Here is an example of nonparametric bootstrapping.  It’s a powerful technique that is similar to the jackknife. The bootstrap, however, re-samples the data with replacement. It can be less efficient than a parametric approach when the parametric model holds, but it gets the job done. This can be used in a variety of situations ranging from variance estimation to model selection. John
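As a rough illustration of the re-sampling idea, here is a sketch of a percentile bootstrap interval for a median. The data, the choice of statistic, and the number of replicates are all assumptions made for the example, not the setup from the full post.

```r
set.seed(42)

# Hypothetical data: 50 draws from a skewed distribution.
x <- rexp(50, rate = 1 / 10)

# Resample with replacement and recompute the statistic each time.
n_boot <- 2000
boot_medians <- replicate(n_boot, median(sample(x, replace = TRUE)))

# Percentile interval: the middle 95% of the bootstrap distribution.
ci <- quantile(boot_medians, probs = c(0.025, 0.975))

# Bootstrap estimate of the standard error of the median.
se <- sd(boot_medians)
```

The percentile interval is only one of several bootstrap intervals. It is used here because it is the shortest to write down.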

## Binomial Confidence Intervals

January 22, 2013

This stems from a couple of binomial distribution projects I have been working on recently.  It’s widely known that there are many different flavors of confidence intervals for the binomial distribution.  The reason for this is that there is a coverage problem with these intervals (see Coverage Probability).  A 95% confidence interval isn’t always (actually
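To make the coverage problem concrete, the sketch below computes the exact coverage probability of the textbook Wald (normal approximation) interval for a given sample size and true proportion. The function name `wald_coverage` and the example values are my own choices for illustration. The full post may well look at other interval types.

```r
# Exact coverage of the Wald interval for sample size n and true proportion p.
# wald_coverage is a hypothetical helper written for this illustration.
wald_coverage <- function(n, p, conf = 0.95) {
  x     <- 0:n                               # every possible number of successes
  p_hat <- x / n
  z     <- qnorm(1 - (1 - conf) / 2)
  half  <- z * sqrt(p_hat * (1 - p_hat) / n) # half-width of the interval
  covers <- (p_hat - half <= p) & (p <= p_hat + half)
  # Add up the binomial probabilities of the outcomes whose interval contains p.
  sum(dbinom(x[covers], size = n, prob = p))
}

cov_small <- wald_coverage(n = 20, p = 0.1)
cov_large <- wald_coverage(n = 1000, p = 0.5)
```

For small `n`, or `p` near 0 or 1, the result can fall noticeably short of the nominal level, which is the reason so many alternative intervals exist.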

## Y2K38: Our Own Mayan Calendar…Again

December 21, 2012

It’s not quite the end of the world as we know it.  We made it through December 21, 2012 unscathed. It won’t be the last time we make it through such a pseudo-calamity.  After all, we have built our own end of the world before (e.g. Y2K). Next up: January 19, 2038.
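The 2038 date falls out of simple arithmetic: it is the last second that a signed 32-bit count of seconds since January 1, 1970 (UTC) can represent. A small sketch in R, which happens to use 32-bit integers itself:

```r
# Largest value a signed 32-bit integer can hold: 2^31 - 1.
max_int32 <- .Machine$integer.max

# Interpreted as seconds since 1970-01-01 00:00:00 UTC.
last_second <- as.POSIXct(max_int32, origin = "1970-01-01", tz = "UTC")
format(last_second, "%Y-%m-%d %H:%M:%S")   # "2038-01-19 03:14:07"

# One more second does not fit: R's integer arithmetic returns NA with a warning.
overflowed <- suppressWarnings(max_int32 + 1L)
is.na(overflowed)   # TRUE
```

R itself stores date-times as doubles, so the overflow shown here is about the 32-bit integer type rather than about R's own clock.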

## True Significance of a T Statistic

December 17, 2012

This example is more of a statistical exercise that shows the true significance and the density curve of simulated random normal data.  The code can be changed to generate data using either a different mean and standard deviation or a different distribution altogether. This extends the idea of estimating pi by generating random normal data to determine

## Estimating Pi

December 11, 2012

Recently I’ve been working on some jackknife and bootstrapping problems.  While working on those projects I figured it would be a fun distraction to take the process and estimate pi.  I’m sure this problem has been tackled countless times but I have never bothered to try it using a Monte Carlo approach.  Here is the
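For readers who have not seen a Monte Carlo estimate of pi before, here is one common version as a sketch. It scatters uniform random points over the unit square and counts the share that land inside the quarter circle. This is an assumption on my part about the general idea, and the full post may generate its random data differently.

```r
set.seed(2012)

# Number of random points; an arbitrary choice for this illustration.
n <- 100000

# Uniform points in the unit square.
x <- runif(n)
y <- runif(n)

# A point is inside the quarter circle of radius 1 when x^2 + y^2 <= 1.
inside <- x^2 + y^2 <= 1

# The quarter circle has area pi / 4, so scale the hit rate by 4.
pi_hat <- 4 * mean(inside)

# Rough standard error of the estimate from the binomial hit rate.
se_hat <- 4 * sqrt(mean(inside) * (1 - mean(inside)) / n)
```

The standard error shrinks with the square root of `n`, so each extra digit of accuracy costs roughly 100 times more points.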

## Mean Value from Grouped Data

December 7, 2012

Occasionally, I will get requests from clients to calculate the mean. Most of the time it’s a simple request, but from time to time the data were originally collected as grouped data. A common approach is to take the midpoint of each of the groups and just assume that all respondents within that group average out to the
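The midpoint approach can be sketched in a few lines. The brackets and counts below are invented for the example and do not come from any client data.

```r
# Hypothetical grouped data: bracket limits and respondent counts.
lower <- c(18, 25, 35, 45, 55)
upper <- c(25, 35, 45, 55, 65)
count <- c(12, 30, 25, 20, 13)

# Treat every respondent in a group as sitting at the group's midpoint.
midpoint <- (lower + upper) / 2

# Weight each midpoint by the number of respondents in its group.
grouped_mean <- sum(midpoint * count) / sum(count)

# Base R's weighted.mean() gives the same result.
check <- weighted.mean(midpoint, count)
```

The estimate is only as good as the midpoint assumption, so open-ended groups such as "65 and over" need an explicit decision about what value to use.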

## Importing Data Into R from Different Sources

December 6, 2012

I have found that I get data from many different sources.  These sources range from simple .csv files to more complex relational databases, to structured XML or JSON files.  I have compiled the different approaches that one can use to easily access these datasets.

Local Column Delimited Files

This is probably the most common and

## Plotting Likert Scales

December 4, 2012

Graphs can provide an excellent way to emphasize a point and to quickly and efficiently show important information. Sadly, poor graphs can be a good way to waste space in an article, take up time in a presentation, and waste a lot of ink all while providing little to no information. Excel has made it

## Earthquakes Over the Past 7 Days

November 29, 2012

This is a brief example that uses the maps package in R and highlights a source of data.  This is real-time data and it comes from the U.S. Geological Survey.  This shows the location of earthquakes with magnitude of at least 1.0 in the lower 48 states.

    library(maps)
    library(maptools)
    library(rgdal)
    # Read the 7-day, magnitude 1+ earthquake catalog from the USGS
    eq <- read.table(file="http://earthquake.usgs.gov/earthquakes/catalogs/eqs7day-M1.txt", fill=TRUE, sep=",", header=TRUE)
    plot.new()