**distributed ecology**, and kindly contributed to R-bloggers)

So here’s what I did. I used data from the Pomeroy rankings and the Log 5 rule to calculate a binomial probability of one team beating another. The problem with this is that you don’t have any data to actually model; it’s only a predictive probability. What I wanted to do was use a simple Bayesian beta-binomial model and integrate prior information from my more NCAA-savvy friends. I used this model to simulate the NCAA bracket 10,000 times for each friend, and 10,000 times with a flat prior.
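As a rough illustration of the Log 5 step, here is a minimal sketch. It assumes each team's rating is expressed as an expected winning percentage between 0 and 1; the function name `log5` and the example ratings are made up for illustration and are not taken from the original code.

```r
# Log 5 rule: probability that team A beats team B, given each team's
# expected winning percentage against an average opponent.
log5 <- function(p_a, p_b) {
  (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)
}

# Hypothetical ratings for two teams (illustrative values only)
rating_a <- 0.95
rating_b <- 0.80

p_a_beats_b <- log5(rating_a, rating_b)
p_b_beats_a <- log5(rating_b, rating_a)

# The two probabilities are complementary
print(c(A = p_a_beats_b, B = p_b_beats_a))
```

A team facing an exactly average opponent (rating 0.5) simply gets its own rating back, which is a handy sanity check on the formula.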

In each of those runs, I first used the Pomeroy probability to simulate 34 games between any two teams meeting in the tournament. I used that simulated data in the beta-binomial model as my number of successes and failures. I then asked my friends to provide me with two probabilities for each team: the probability of making it to the Final Four and the probability of winning it all. When teams met in the simulation, I calculated the odds of one team beating the other, converted those odds into a probability, and used an algorithm to solve for the parameters of a beta distribution. I scaled the beta parameters to reflect the confidence my experts had in their picks. In the tables you can see that priors can have a pretty heavy influence on picks. Here are my tables; each number represents the probability that a team makes it to a given round of the tournament.
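To make the steps for a single matchup concrete, here is a minimal sketch under several assumptions. The post does not say how the odds were computed or which algorithm solved for the beta parameters, so this sketch takes the odds as the ratio of the two teams' individual odds and picks beta parameters from a prior mean and an assumed "prior sample size" (`strength`). All function names and input values are hypothetical.

```r
set.seed(1)  # only so this sketch is reproducible

# Assumption: odds of A over B are the ratio of each team's own odds,
# then converted back to a probability.
odds_to_prob <- function(prior_a, prior_b) {
  odds <- (prior_a / (1 - prior_a)) / (prior_b / (1 - prior_b))
  odds / (1 + odds)
}

# Conjugate beta-binomial update. The prior is Beta(prior_p * strength,
# (1 - prior_p) * strength); a larger strength means a more confident expert.
beta_binomial_posterior <- function(wins, n_games, prior_p, strength) {
  alpha0 <- prior_p * strength
  beta0  <- (1 - prior_p) * strength
  list(alpha = alpha0 + wins, beta = beta0 + (n_games - wins))
}

# Hypothetical inputs (not from the post)
p_model  <- 0.70  # model probability that team A beats team B
prior_a  <- 0.30  # friend's probability for team A
prior_b  <- 0.10  # friend's probability for team B
n_games  <- 34    # simulated games, as in the post
strength <- 10    # assumed weight given to the friend's opinion

# Step 1: simulate the "data" from the model probability
wins <- rbinom(1, size = n_games, prob = p_model)

# Step 2: turn the friend's picks into a prior and update it
prior_p <- odds_to_prob(prior_a, prior_b)
post    <- beta_binomial_posterior(wins, n_games, prior_p, strength)

# Step 3: draw a win probability from the posterior and play the game
p_draw <- rbeta(1, post$alpha, post$beta)
a_wins <- rbinom(1, size = 1, prob = p_draw) == 1

# Flat prior for comparison: prior_p = 0.5, strength = 2 gives Beta(1, 1)
flat <- beta_binomial_posterior(wins, n_games, prior_p = 0.5, strength = 2)
```

Repeating steps 1 to 3 for every matchup in the bracket, and the whole bracket many times, gives the share of runs in which each team reaches each round. With `strength` small relative to the 34 simulated games the data dominate; raising it lets the friend's prior pull the picks harder.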

Here is the code in R for the model
