## A survival guide to Data Science with R, from Graham Williams

February 21, 2014
Graham Williams is the Lead Data Scientist at the Australian Taxation Office, and the creator of Rattle, an open-source GUI for data mining with R. (Check out some recent reviews/demos of Rattle on this blog here and here.) Dr Williams continues his many contributions to the R community with One Page R, a "Survival Guide to Data Science with...

## Interactive exploration of a prior’s impact

February 21, 2014
Probably the most frequent criticism of Bayesian statistics sounds something like “It’s all subjective – with the ‘right’ prior, you can get any result you want.” To address this criticism, it has been suggested to do a sensitivity analysis (or robustness analysis) that demonstrates how the choice of priors affects the conclusions drawn
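
To make the idea of a prior sensitivity analysis concrete, here is a minimal sketch. It is not taken from the post: it assumes a conjugate Beta-Binomial model, and the data (7 successes in 10 trials) and the three priors are hypothetical values chosen only for illustration. It compares the posterior mean obtained under each prior.

```python
def posterior_mean(successes, trials, alpha, beta):
    """Posterior mean of a success probability with a Beta(alpha, beta) prior
    after observing `successes` out of `trials` binomial trials."""
    return (alpha + successes) / (alpha + beta + trials)


# Hypothetical data, chosen only for illustration.
successes, trials = 7, 10

# Hypothetical priors as (alpha, beta) pairs.
priors = {
    "flat": (1.0, 1.0),
    "skeptical": (2.0, 8.0),
    "optimistic": (8.0, 2.0),
}

# Sensitivity analysis: same data, different priors.
sensitivity = {
    name: posterior_mean(successes, trials, a, b)
    for name, (a, b) in priors.items()
}

for name, mean in sensitivity.items():
    print(f"{name:>10}: posterior mean = {mean:.3f}")
```

With so few observations the posterior mean moves noticeably with the prior; repeating the comparison with a larger hypothetical sample shows the priors mattering less.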

## Forecasting within limits

February 21, 2014
It is common to want forecasts to be positive, or to require them to be within some specified range. Both of these situations are relatively easy to handle using transformations. Positive forecasts: To impose a positivity constraint, simply work on the log scale. With the forecast package in R, this can be handled by specifying the Box-Cox parameter...
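
The positivity trick can be sketched without any forecasting library. The example below is an assumption-laden illustration, not the post's method and not the forecast package's API: it fits a straight-line trend to the logged series by least squares, extrapolates on the log scale, and back-transforms with `exp`, so every forecast is positive by construction. The function name `log_scale_forecast` and the series `history` are hypothetical.

```python
import math


def log_scale_forecast(series, horizon):
    """Fit a linear trend to log(series) and back-transform the extrapolation.

    Because the forecasts are exp(...) of a real number, they are always
    strictly positive, even when the trend is steeply downward.
    """
    if any(y <= 0 for y in series):
        raise ValueError("the log transform needs strictly positive data")
    logs = [math.log(y) for y in series]
    n = len(logs)
    t_mean = (n - 1) / 2
    z_mean = sum(logs) / n
    # Ordinary least squares for a line through (t, log y).
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (z - z_mean) for t, z in enumerate(logs))
    slope = sxy / sxx
    intercept = z_mean - slope * t_mean
    # Extrapolate on the log scale, then map back to the original scale.
    return [math.exp(intercept + slope * (n + h)) for h in range(horizon)]


# Hypothetical declining series, chosen only for illustration.
history = [120.0, 95.0, 80.0, 61.0, 50.0, 41.0, 30.0, 26.0]
forecasts = log_scale_forecast(history, horizon=20)
print(min(forecasts) > 0)
```

A linear trend fitted on the original scale would eventually cross zero for a series like this one; on the log scale the forecasts decay towards zero instead. Note that the back-transformed point forecasts are not means on the original scale.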

## Books and lessons about ggplot2

February 19, 2014
I recently got an email from a person at Packt Publishing, who suggested I write a book for them about ggplot2. My answer, which is perfectly true, is that I have neither the time nor the expertise to do that. What I didn’t say is that 1) a quick web search suggests that Packt doesn’t

## evaluating stochastic algorithms

February 19, 2014
Reinaldo sent me this email a long while ago: “Could you recommend me a nice reference about measures to evaluate stochastic algorithms (in particular focus in approximating posterior distributions).” And I hope he is still reading the ‘Og, despite my lack of prompt reply! I procrastinated and procrastinated in answering this question as I did not

## Voting Twice in France

February 19, 2014
$P_i\sim\mathcal{B}(N_i,p_i)$

On the Monkey Cage blog, Baptiste Coulmont (a.k.a. @coulmont) recently uploaded a post entitled “You can vote twice ! The many political appeals of proxy votes in France”, coauthored with Joël Gombin (a.k.a. @joelgombin), and myself. The study was initially written in French as mentioned in a previous post. Baptiste posted additional information on his blog (http://coulmont.com/blog/…) and I also wanted to post some lines of code,...

## Regression with multiple predictors

February 18, 2014
(This article was first published on Digithead's Lab Notebook, and kindly contributed to R-bloggers) Now that I'm ridiculously behind in the Stanford Online Statistical Learning class, I thought it would be fun to try to reproduce the figure on page 36 of the slides from chapter 3 or page 81 of the book. The result is a curvaceous surface...

## ggplot2: Cheatsheet for Visualizing Distributions

February 18, 2014
In the third and last of the ggplot series, this post will go over interesting ways to visualize the distribution of your data.

## AntWeb – programmatic interface to ant biodiversity data

February 18, 2014
Data on more than 10,000 species of ants recorded worldwide are available through the California Academy of Sciences' AntWeb, a repository that boasts a wealth of natural history data, digital images, and specimen records on ant species from a large community of museum curators. Digging through some of the earliest announcements of AntWeb, I came across...

## Tutorials- Statistical and Multivariate Analysis for Metabolomics

February 17, 2014
I recently had the pleasure of participating in the 2014 WCMC Statistics for Metabolomics Short Course. The course was hosted by the NIH West Coast Metabolomics Center and focused on statistical and multivariate strategies for metabolomic data analysis. A variety of topics were covered using eight hands-on tutorials, which focused on: data quality overview