# Blog Archives

## R Code for Election Posterior Distribution From a Random Sample

January 4, 2015

I wrote a summary article a couple of years ago discussing some probability aspects of the 2012 Presidential general election, with a particular focus on exit polling. I’ve had a few people email me asking for the code I used in some of the examples. I have used this code since before the 2008 elections so

## Recent Articles

August 19, 2014

I have uploaded a few papers I have written and presented at some national conferences over the past several years.  Currently, all the articles relate to election research.

## The Birthday Simulation

May 21, 2014

There is nothing novel or unique about this problem. This just extends it to measure the probability of three or more people sharing the same birthday using simulation approaches. For two people it’s fairly straightforward: with a group of about 23 people, the probability that two people share the same birthday is about 0.5. For
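As a rough illustration of the simulation idea in this excerpt, here is a minimal Python sketch (not the R code from the post). It assumes 365 equally likely birthdays and ignores leap years; the function name `shared_birthday_prob` and its parameters are hypothetical choices made for this example.

```python
import random
from collections import Counter


def shared_birthday_prob(group_size, min_shared=2, trials=20_000, seed=1):
    """Estimate P(at least `min_shared` people share a birthday) by simulation.

    Assumes 365 equally likely birthdays (an illustrative simplification).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(trials):
        # Draw one birthday per person and count how often each day occurs.
        counts = Counter(rng.randrange(365) for _ in range(group_size))
        if max(counts.values()) >= min_shared:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(shared_birthday_prob(23, min_shared=2))  # two people sharing
    print(shared_birthday_prob(23, min_shared=3))  # three people sharing
```

Setting `min_shared=3` gives the three-or-more case, which has no simple closed form, so simulation is a convenient way to approximate it.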

## Connecting TOAD For MySQL, MySQL Workbench, and R to Amazon AWS EC2 Using SSH Tunneling

January 7, 2014

I often use Amazon EC2 to store and retrieve data when I need either additional storage or higher computing capacity.  In this tutorial I’ll share how to connect to a MySQL database so that one can retrieve the data and do the analysis.  I tend to use either TOAD for MySQL or MySQL Workbench to run

## Probabilities and P-Values

December 2, 2013

P-values seem to be the bane of a statistician’s existence. I’ve seen situations where entire narratives are written without p-values and provide only the effects. P-values can also be used as a data reduction tool, but ultimately they reduce the world into a binary system: yes/no, accept/reject. Not only that but the binary threshold is

## Some Options for Testing Tables

November 18, 2013

Contingency tables are a very good way to summarize discrete data.  They are quite easy to construct and reasonably easy to understand. However, there are many nuances with tables and care should be taken when making conclusions related to the data. Here are just a few thoughts on the topic. Dealing with sparse data On

## Spatial Clustering With Equal Sizes

November 4, 2013

This is a problem I have encountered many times, where the goal is to take a sample of spatial locations and apply constraints to the clustering algorithm. In addition to providing a predetermined number of K clusters, the number of elements within each cluster needs to be held at a fixed size. An application of this algorithm is

## Tracking the 2013 Hurricane Season

October 21, 2013

With hurricane season at its end, it’s only appropriate to do a brief summary of the activity this year. It’s been a surprisingly low-key season as far as hurricanes are concerned. There have been only a few hurricanes and the barometric pressure of any hurricane this season has not even come close

## Beta Distribution and the NJ U.S. Senate Election

October 14, 2013

The beta distribution is a highly flexible distribution and applies to many situations and environments. It applies well when the quantity of interest is a percentage. The upcoming New Jersey U.S. Senate election on Wednesday fits that criterion quite well. So here I applied the beta distribution to some pre-election polls where the numbers were obtained through the

## Random Sequence of Heads and Tails: For R Users

October 10, 2013

Rick Wicklin made a post on the SAS blog today on how to tell if a sequence of coin flips is random. I figured it was only fair to port the SAS/IML code over to R. Just as in Rick Wicklin’s example, this is the Wald-Wolfowitz test for randomness. I tried to
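For readers unfamiliar with the test named in this excerpt, below is a minimal Python sketch of the Wald-Wolfowitz runs test using the large-sample normal approximation. It is not the SAS/IML or R code from the posts; the function name `runs_test` is hypothetical, and the input is assumed to be a sequence with exactly two distinct symbols, such as a string of `H` and `T`.

```python
import math


def runs_test(seq):
    """Wald-Wolfowitz runs test via the normal approximation.

    Returns (number_of_runs, z_statistic, two_sided_p_value).
    Assumes `seq` contains exactly two distinct symbols.
    """
    n = len(seq)
    n1 = sum(1 for x in seq if x == seq[0])  # count of the first symbol
    n2 = n - n1                              # count of the other symbol
    if n1 == 0 or n2 == 0:
        raise ValueError("sequence must contain both outcomes")

    # A new run starts wherever two neighbouring symbols differ.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

    # Mean and variance of the run count under the randomness hypothesis.
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if var == 0:
        raise ValueError("sequence too short for the normal approximation")

    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return runs, z, p_value


if __name__ == "__main__":
    print(runs_test("HTHHTTHTHHHTTHTTTHHT"))
```

A large positive z means too many runs (the sequence alternates more than chance would suggest), while a large negative z means too few runs (long streaks).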