Blog Archives

Bootstrap Confidence Intervals

February 1, 2013

Here is an example of nonparametric bootstrapping. It’s a powerful technique that is similar to the jackknife. Whereas the jackknife recomputes the estimate while leaving out one observation at a time, the bootstrap re-samples the data at random with replacement. It is generally less efficient than a parametric approach when the parametric model is correct, but it gets the job done. It can be used in a variety of situations ranging from variance estimation to model selection. John

Read more »
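
As a rough illustration of the re-sampling idea in the bootstrap excerpt above, here is a minimal sketch of a percentile confidence interval for a mean. The simulated sample `x`, the replicate count `B`, and the choice of the percentile method are all assumptions made for this example; the full post may use different data or a different interval.

```r
set.seed(42)                       # fixed seed so the sketch is reproducible
x <- rexp(50, rate = 1)            # hypothetical sample; any numeric vector works
B <- 2000                          # number of bootstrap replicates (arbitrary choice)

# Re-sample the data with replacement B times, recomputing the mean each time
boot_means <- replicate(B, mean(sample(x, size = length(x), replace = TRUE)))

# Percentile interval: the middle 95% of the bootstrap distribution
ci <- quantile(boot_means, probs = c(0.025, 0.975))
print(ci)
```

Swapping `mean` for another statistic (a median, a regression coefficient) is the only change needed to bootstrap a different estimate.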

Binomial Confidence Intervals

January 22, 2013

This stems from a couple of binomial distribution projects I have been working on recently.  It’s widely known that there are many different flavors of confidence intervals for the binomial distribution.  The reason for this is that there is a coverage problem with these intervals (see Coverage Probability).  A 95% confidence interval isn’t always (actually

Read more »
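
The coverage problem mentioned in the excerpt above can be checked directly. The sketch below computes the exact coverage probability of a nominal 95% interval by summing the binomial probabilities of every outcome whose interval contains the true proportion. The Wald (normal approximation) interval and the values `n = 20`, `p = 0.1` are assumptions picked for illustration; the excerpt does not say which intervals the post compares.

```r
# Exact coverage probability of the nominal Wald interval for a given n and p
wald_coverage <- function(n, p, conf = 0.95) {
  x <- 0:n                                   # every possible number of successes
  phat <- x / n                              # observed proportion for each outcome
  z <- qnorm(1 - (1 - conf) / 2)             # normal critical value
  half <- z * sqrt(phat * (1 - phat) / n)    # Wald half-width
  covered <- (phat - half <= p) & (p <= phat + half)
  sum(dbinom(x, n, p)[covered])              # probability mass of covering outcomes
}

cov_20 <- wald_coverage(n = 20, p = 0.1)
print(cov_20)
```

For this small sample and a proportion near the boundary, the computed coverage falls below the nominal 0.95, which is the kind of shortfall that motivates the alternative intervals.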

Y2K38: Our Own Mayan Calendar…Again

December 21, 2012

It’s not quite the end of the world as we know it. We made it through December 21, 2012 unscathed, and it won’t be the last time we make it through such a pseudo-calamity. After all, we have built our own end of the world before (e.g. Y2K). Next up: January 19, 2038.

Read more »
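
The January 19, 2038 date in the excerpt above can be reproduced with a short sketch. It assumes the usual explanation of the Y2K38 problem, namely time stored as a signed 32-bit count of seconds since 1970-01-01 UTC; the variable names are made up for this example.

```r
# Largest value a signed 32-bit integer can hold, read as seconds since the Unix epoch
max_seconds <- 2^31 - 1

# Convert that second count to a calendar time in UTC
rollover <- as.POSIXct(max_seconds, origin = "1970-01-01", tz = "UTC")
rollover_text <- format(rollover, "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(rollover_text)   # "2038-01-19 03:14:07"
```

One second later the counter no longer fits in 32 bits, which is where affected systems would wrap around.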

True Significance of a T Statistic

December 17, 2012

This example is more of a statistical exercise that shows the true significance and the density curve of simulated random normal data.  The code can be changed to generate data using either a different mean and standard deviation or a different distribution altogether. This extends the idea of estimating pi by generating random normal data to determine

Read more »

Estimating Pi

December 11, 2012

Recently I’ve been working on some jackknife and bootstrapping problems.  While working on those projects I figured it would be a fun distraction to take the process and estimate pi.  I’m sure this problem has been tackled countless times but I have never bothered to try it using a Monte Carlo approach.  Here is the

Read more »
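
For readers who have not seen a Monte Carlo estimate of pi, here is one common version as a minimal sketch. It assumes the textbook approach of scattering uniform points over the unit square and counting the share that land inside the quarter circle; the excerpt above is cut off before it describes the method, so the full post may do this differently.

```r
set.seed(1)                        # fixed seed so the sketch is reproducible
n <- 100000                        # number of random points (arbitrary choice)

# Uniform points in the unit square
x <- runif(n)
y <- runif(n)

# A point is inside the quarter circle of radius 1 when x^2 + y^2 <= 1
inside <- x^2 + y^2 <= 1

# The quarter circle has area pi / 4, so the hit rate times 4 estimates pi
pi_hat <- 4 * mean(inside)
print(pi_hat)
```

The error shrinks roughly with the square root of `n`, so each extra digit of accuracy costs about a hundred times more points.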

Mean Value from Grouped Data

December 7, 2012

Occasionally, I get requests from clients to calculate the mean. Most of the time it’s a simple request, but from time to time the data were originally collected in groups. A common approach is to take the midpoint of each of the groups and just assume that all respondents within that group average out to the

Read more »
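
The midpoint approach described in the excerpt above can be sketched in a few lines. The bracket boundaries and respondent counts below are invented for illustration and do not come from the post.

```r
# Hypothetical grouped data: bracket limits and the number of respondents in each
lower <- c(0, 25, 50, 75)
upper <- c(25, 50, 75, 100)
count <- c(10, 20, 15, 5)

# Treat every respondent in a bracket as if they sat at the bracket midpoint
midpoint <- (lower + upper) / 2

# Mean of the midpoints, weighted by how many respondents fall in each bracket
grouped_mean <- weighted.mean(midpoint, w = count)
print(grouped_mean)   # 45
```

The result is only as good as the assumption that responses average out to the midpoint within each bracket, which is weakest for wide or open-ended groups.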

Importing Data Into R from Different Sources

December 6, 2012

I have found that I get data from many different sources.  These sources range from simple .csv files to more complex relational databases to structured XML or JSON files.  I have compiled the different approaches that one can use to easily access these datasets. Local Column Delimited Files: This is probably the most common and

Read more »
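
As a small example of the local delimited-file case named in the excerpt above, the sketch below writes a tiny comma-separated file to a temporary path and reads it back with `read.csv`. The file contents and column names are hypothetical; in practice `path` would point at an existing file.

```r
# Create a small CSV in a temporary location so the sketch is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.0", "3,2.5"), path)

# Read the comma-delimited file; the first row supplies the column names
dat <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)
str(dat)
```

For other delimiters, `read.table` or `read.delim` with an explicit `sep` argument follows the same pattern.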

Plotting Likert Scales

December 4, 2012

Graphs can provide an excellent way to emphasize a point and to quickly and efficiently show important information. Sadly, poor graphs can be a good way to waste space in an article, take up time in a presentation, and waste a lot of ink all while providing little to no information. Excel has made it

Read more »

Earthquakes Over the Past 7 Days

November 29, 2012

This is a brief example that uses the maps package in R and highlights a source of data.  This is real-time data and it comes from the U.S. Geological Survey.  This shows the location of earthquakes with a magnitude of at least 1.0 in the lower 48 states.

library(maps)
library(maptools)
library(rgdal)
eq = read.table(file="http://earthquake.usgs.gov/earthquakes/catalogs/eqs7day-M1.txt", fill=TRUE, sep=",", header=TRUE)
plot.new()

Read more »

Hurricane Sandy Land Wind Speed and Kriging

November 28, 2012

NJ Hurricane Sandy Landfall Data: These data come from the National Climatic Data Center (NCDC).  Using the above link will download all of the data collected by the NCDC on the day of Hurricane Sandy.  The data can also be obtained directly from the source at http://cdo.ncdc.noaa.gov/qclcd/QCLCD. The purpose of this post is not a discussion

Read more »